this post was submitted on 08 Jun 2024
126 points (100.0% liked)

technology

On the road to fully automated luxury gay space communism.

Spreading Linux propaganda since 2020

A user on the online forum 4chan has leaked a massive 270GB of data belonging to The New York Times. This leak includes the source code for the newspaper’s digital operations.

Here are some other findings we can confirm:

  • The leak does have the original source code of the game Wordle, which the NY Times acquired in 2022.
  • The leak includes a dated WordPress database of 1,500 NY Times Education site users. The database contains names and surnames, email addresses, and hashed passwords. You should expect it to be added to HIBP shortly.
  • Several folders contain internal communications from NY Times Slack channels.
  • The Times uses various machine learning algorithms and NLP techniques/scripts for its services.
  • Many authentication credentials are exposed, including authentication URLs with their passwords, secret keys, and API tokens. The majority are well protected, but plenty of these secrets need immediate attention. We have also seen private user keys used for authentication.
  • There are a lot of details about internal NY Times architecture from a software development point of view.

So far, it is difficult to say whether the NY Times will need to reset the passwords of every registered user of its site.

It’s worth pointing out that this leak appears to involve data from The New York Times’s IT/infrastructure/website organization rather than the news organization composed of reporters. In media companies, these two entities are largely separate. The IT/infrastructure team handles the technical aspects of the website and digital operations, while the news organization manages reporting and editorial content.

top 26 comments
[–] [email protected] 76 points 2 months ago (3 children)

Several folders contain internal communications from NY Times Slack channels.

My body is ready

[–] [email protected] 15 points 2 months ago (2 children)

When will that stuff be released?

[–] [email protected] 12 points 2 months ago* (last edited 2 months ago)

I haven't looked over the leak myself, but it is available online. Not sure if I can/should post the torrent though. But if you look around I'm sure you can find it.

[–] [email protected] 6 points 2 months ago

¯\_(ツ)_/¯

[–] [email protected] 61 points 2 months ago

The leak does have the original source code of the game Wordle, which the NY Times acquired in 2022.

sicko-wholesome

[–] [email protected] 36 points 2 months ago (3 children)

The New York Times has over 5,000 source code repositories

that seems like a lot

[–] [email protected] 33 points 2 months ago* (last edited 2 months ago)

Something weird about that figure. Branches within repos maybe, otherwise those are mostly junk or their supply chain attack security requirements had them cloning and building themselves the repos of every open source library they've ever used for vulnerability scans.

[–] [email protected] 19 points 2 months ago* (last edited 2 months ago) (1 children)

the article links to this list of repos https://files.catbox.moe/jx7ksm.txt and says it is 6200 lines long.

i am not familiar enough with this kind of development to know if this is a reasonable structure for this kind of large project. anyone?

[–] [email protected] 18 points 2 months ago* (last edited 2 months ago)

Looks like they put each of their modules in a separate repo. This wouldn't be a single project. NYTimes is a pretty huge operation. They obviously have their website but they also have apps, infrastructure to ingest and process whatever media they get, infrastructure for ads, games, security (lol), user account management, billing, legal, etc etc.

it's possible this is organized differently in their source control and it appears kinda disorganized because we're looking at it flattened.

[–] [email protected] 11 points 2 months ago* (last edited 2 months ago) (1 children)

It's probably just every repository name on their spurce control management server. Users can usually create their own repositories whenever. So a bunch of these could just be random little experiments or side projects people made.

[–] [email protected] 6 points 2 months ago

Spruce Control is all the rage these days, but what about Douglass Fir Control?

[–] [email protected] 36 points 2 months ago (1 children)

...so? Isn't that just like a normal website

[–] [email protected] 49 points 2 months ago (1 children)

Several folders contain internal communications from NY Times Slack channels.

There potentially could be something interesting in there

[–] [email protected] 4 points 2 months ago

magnet link is in this thread if you want to try to dig up some liberalism: https://boards.4chan.org/t/thread/1310643#p1310643

I have no goddamn idea why the tracker links are like that. average 4chan stuff I guess

[–] [email protected] 33 points 2 months ago (1 children)

the source code that serves up news content is 500 MB.

the rest is for interactive pop ups and mobile layout breaking, random, spontaneous, invisible click boxes to make it so you accidentally activate an ad when trying to watch, pause, close or otherwise interact with a video.

it is some of the most cutting edge website complicating code ever written.

[–] dch82 1 points 2 months ago

500MB? What the heck? That’s literally 350 or so floppies or so many win95 installs

[–] [email protected] 25 points 2 months ago* (last edited 2 months ago) (1 children)

Horse's mouth: (CW: everything and more, this is the bubonic plague of brainrot you are exposing yourself to if you click this link) https://archived.moe/g/thread/100843783/

Magnet: here

No peer currently has over 85.4% downloaded and there are no seeds. It looks like some of it might be lost.

[–] [email protected] 1 points 2 months ago (1 children)

Horse's mouth:

"It's the NYTs, they have way more dirty laundry than that. Show us the memos about deposing world leaders the feds don't like or gaslighting people about border security and refugee violence. Show us the big donations from the cccp to ignore crimes against humanity."

they're so fucking stupid

[–] [email protected] 2 points 2 months ago (1 children)

It's a horse. What do you expect, lavender breath?

They are so incredibly moronic it's amazing.

[–] [email protected] 1 points 2 months ago (1 children)

It's a horse. What do you expect, lavender breath?

lmao good one

did you figure out the hrt stuff btw? I can help you out if you need it

[–] [email protected] 1 points 2 months ago (1 children)

I have a cart in all day chemist and it accepts my card, I'm just scared to order it. I think I'm good.

[–] [email protected] 1 points 2 months ago

okay, good. I hope you get the courage to go through with the order. Just double check every part, lots of good communities for diy. And you can always ask here for help.

[–] [email protected] 17 points 2 months ago (1 children)

So do we think this is a random person or?

Every leak that specifically happens on 4chan I tend to assume is a cia op that uses 4chan because they want to promote it.

[–] [email protected] 19 points 2 months ago

Either that or it's where people go to leak shit from inside. Wouldn't rule out this being some rogue comrade.