But what fraction of AI-generated content is the threshold for collapse? And how is that fraction measured? How much new input is necessary to make sure the model does not overlook it?
All good questions. I'm not sure if anyone has actually run simulations to estimate the percentage.
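For anyone who wants to poke at the question, here is a minimal toy sketch, not a claim about real LLMs. It assumes the "model" is just a Gaussian refitted each generation to a training set where some fraction of samples come from the previous generation's model and the rest are fresh real data drawn from N(0, 1). The function name `simulate` and its parameters (`synthetic_fraction`, `generations`, `n`, `seed`) are made up for illustration.

```python
import random
import statistics


def simulate(synthetic_fraction, generations=500, n=20, seed=0):
    """Return the fitted standard deviation after repeated retraining.

    Toy assumption: the real data distribution is N(0, 1) and the
    'model' is a Gaussian fitted by maximum likelihood each generation.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0 matches the real distribution
    n_synthetic = round(n * synthetic_fraction)
    for _ in range(generations):
        # Synthetic samples are drawn from the previous generation's model.
        data = [rng.gauss(mu, sigma) for _ in range(n_synthetic)]
        # The remainder is fresh real data.
        data += [rng.gauss(0.0, 1.0) for _ in range(n - n_synthetic)]
        # Refit the model on the mixed training set.
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
    return sigma


if __name__ == "__main__":
    for fraction in (0.0, 0.5, 0.9, 1.0):
        print(fraction, simulate(fraction))
```

In this toy setup the fitted spread shrinks toward zero when every sample is synthetic, while a steady share of real data keeps it from vanishing. That says nothing about where the threshold sits for an actual language model, which is the part nobody in this thread can answer yet.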
oh the irony...
Best term I've heard for this is "Hapsburg AI"
It would be ironic if companies had to pay people to generate clean content to train the AIs.
Or they could just scrape the Internet from before the 2020s haha.
The ultimate irony if it goes this way. The singularity is inherently impossible; instead of an exponential explosion of machine intelligence, there is an exponential implosion.
You know, that’s a very interesting perspective. This way, artificial intelligence might find an equilibrium point, and going past that could turn out to be really difficult. As far as LLMs are concerned, we might reach that point very soon. However, if you combine several different types of models to make something that can think before speaking, then we’re once again exploring uncharted territory.
True, and an interesting idea. I wonder though if this will show in the long term that pure computational methods cannot produce sentience at the level of human beings. I still wonder where AI would be now if analog and fuzzy computing had taken off in the 1950s instead of digital/binary computing.
I’ve posted a few GPT-generated poems on Reddit. If some next-generation LLM uses those as its training data, it’s not going to get much better than whatever LLM I used back then.
Actually, it might just improve a little bit, but not much. I was the editor of those poems, which means I didn’t accept just any random garbage GPT gave me. It involved a few iterations until the poem was good enough for me. Still probably nowhere near what an actual poet would have written.
My poems were mainly intended to discuss very concrete topics in an entertaining manner, whereas real poems made by real poets tend to use complex symbolism and discuss topics that reflect on the nature of the very core of the human soul.
This seems inevitable