News and Discussions about Reddit
Welcome to !reddit. This is a community for all news and discussions about Reddit.
The rules for posting and commenting, besides the rules defined here for lemmy.world, are as follows:
Rules
Rule 1- No brigading.
**You may not encourage brigading any communities or subreddits in any way.**
YSKs are about self-improvement on how to do things.
Rule 2- No illegal or NSFW or gore content.
**No illegal or NSFW or gore content.**
Rule 3- Do not seek mental, medical and professional help here.
Do not seek mental, medical and professional help here. Breaking this rule will not get you or your post removed, but it will put you at risk, and possibly in danger.
Rule 4- No self promotion or upvote-farming of any kind.
That's it.
Rule 5- No baiting or sealioning or promoting an agenda.
Posts and comments which, instead of being of an innocuous nature, are specifically intended (based on reports and in the opinion of our crack moderation team) to bait users into ideological wars on charged political topics will be removed and the authors warned - or banned - depending on severity.
Rule 6- Regarding META posts.
Provided it is about the community itself, you may post non-Reddit posts using the [META] tag on your post title.
Rule 7- You can't harass or disturb other members.
If you vocally harass or discriminate against any individual member, you will be removed.
Likewise, if you are a member of, a sympathiser with, or otherwise resemble a movement that is known to largely hate, mock, discriminate against, and/or want to take the lives of a group of people, and you were provably vocal about your hate, then you will be banned on sight.
Rule 8- All comments should try to stay relevant to their parent content.
Rule 9- Reposts from other platforms are not allowed.
Let everyone have their own content.
:::spoiler Rule 10- Majority of bots aren't allowed to participate here.
You know... I don't understand why AI would need access to an API. Can't they just crawl the web? HTML5 was designed with AI in mind iirc. But I'm no expert, so I'm probably talking BS here.
They can get all the data they want without the API, and I think they will from here on out. But if a site happens to provide data in a convenient API format for free, they might use it. Or they may not; I don't know what evidence Reddit used to decide that LLMs training on Reddit data were doing it via the API. It's possible that all Reddit ever intended to do was kill third party apps (and it seems like they're still settling for killing most, and maiming the stragglers).
I think it's only a matter of time before Reddit sets up rate limits and blocks unregistered views like Twitter in another misguided attempt to stop AI from pulling in massive amounts of user data.
Problem is that there's nothing stopping an AI company from just creating thousands of dummy accounts to get as much data as they need before each hits the daily rate limit anyways.
Short of locking all data away that is older than a specific time, it's a losing battle no matter how you look at it. And going that far is sure to doom the service anyways.
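To put rough numbers on the per-account limit point, here is a minimal sketch of a daily quota keyed by account. Everything in it is hypothetical: `DailyQuota`, `served_in_one_day`, and the limit of 100 are made up for illustration and are not how Reddit's limiter actually works. It only shows that a cap applied per account does not cap the total.

```python
from collections import defaultdict


class DailyQuota:
    """Hypothetical per-account daily request quota (an assumption, not Reddit's limiter)."""

    def __init__(self, limit_per_day: int):
        self.limit_per_day = limit_per_day
        self.used = defaultdict(int)  # (account, day) -> requests already served

    def allow(self, account: str, day: int) -> bool:
        """Serve the request only if this account is still under its quota for the day."""
        key = (account, day)
        if self.used[key] >= self.limit_per_day:
            return False
        self.used[key] += 1
        return True


def served_in_one_day(quota: DailyQuota, accounts: list[str], attempts_each: int) -> int:
    """Count requests that get through when every account tries attempts_each times."""
    return sum(
        quota.allow(account, day=0)
        for account in accounts
        for _ in range(attempts_each)
    )
```

With a limit of 100, one account gets 100 requests through no matter how often it retries, while ten accounts get 1,000 through between them. The total grows linearly with the number of accounts, which is why a limiter would have to key on something harder to multiply than an account name.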
I think Reddit would like to keep people clicking through from Google searches, which Twitter apparently doesn't care about, but I suppose they might do it anyway. And yeah, I think bots are basically inevitable, one way or another.
On the other hand, with increasing amounts of bot-generated content on Reddit, it might not be long before no one wants to train LLMs on Reddit anymore, lol.