this post was submitted on 13 Jun 2023
80 points (100.0% liked)

[–] [email protected] 17 points 1 year ago* (last edited 1 year ago) (3 children)

I already had to use the cached version of a Reddit thread today to solve a technical issue I had with the Rust compiler. There is so much valuable content there that is well indexed by search engines; let's hope they don't lock down the site even further to prevent AIs from training on their data.

Although, in that case, Lemmy can take over as the searchable internet forum.

[–] [email protected] 7 points 1 year ago (1 children)

I wonder if the Internet Archive has preserved many of Reddit's old posts and comments? No one seems to have mentioned it.

[–] [email protected] 5 points 1 year ago

I know there were at least a few projects not affiliated with the IA that were basically mirror copies of Reddit. No idea what has happened to them at this point; I haven't checked in a long time.

[–] [email protected] 6 points 1 year ago (1 children)

If they actually want to restrict AI training, they also have to restrict search engines. I may be behind the times, but I would have thought those kinds of questions usually go to a Stack Overflow sort of site.

[–] [email protected] 2 points 1 year ago

If they want to restrict AI training, they'll need to prevent AIs from viewing the website at all. Removing the API just removes the low-bandwidth, low-impact way of gathering the data. Scripts can just as easily scrape the HTML as they can use an API, but that's a lot more resource-intensive on Reddit's side. Heck, this is the whole reason free public APIs became a thing in the first place.
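For illustration, here's a rough Python sketch of the difference. The thread URL is made up, and the `.json` suffix and `div.md p` selector are assumptions about how old Reddit pages are laid out, not anything guaranteed by the site:

```python
# Rough sketch (illustrative only): fetching a thread via Reddit's JSON
# endpoint vs. scraping the rendered HTML page.
import requests
from bs4 import BeautifulSoup

# Hypothetical thread URL, just for the example.
THREAD = "https://old.reddit.com/r/rust/comments/abc123/example_thread"
HEADERS = {"User-Agent": "demo-script/0.1"}

# API-style: a single, compact JSON response that's cheap for the server
# to produce and for the client to parse.
api_resp = requests.get(THREAD + ".json", headers=HEADERS, timeout=10)
listing = api_resp.json()

# Scrape-style: pull down the whole rendered page and parse it.
# Much heavier on both ends, but it needs no API access at all.
html_resp = requests.get(THREAD, headers=HEADERS, timeout=10)
soup = BeautifulSoup(html_resp.text, "html.parser")
comment_paragraphs = [p.get_text() for p in soup.select("div.md p")]

print(len(api_resp.content), "bytes of JSON vs",
      len(html_resp.content), "bytes of HTML")
```

Either way the data comes out, which is the point: killing the API only changes how expensive the collection is, mostly for Reddit itself.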

[–] [email protected] 1 points 1 year ago

I'm pretty sure it's only a matter of time until an LLM can solve any sort of obscure compiler issue. If organic data growth happens outside of Reddit, Reddit isn't going to be of much use once search engines catch up to those other sources.