this post was submitted on 28 Jul 2023
162 points (94.0% liked)

Technology


OpenAI just admitted it can't identify AI-generated text. That's bad for the internet and it could be really bad for AI models.::In January, OpenAI launched a system for identifying AI-generated text. This month, the company scrapped it.

[email protected] 2 points 1 year ago (1 children)

Unless I'm mistaken, aren't GANs mostly old news? Most of the current SOTA image generation models and LLMs are diffusion-based, transformer-based, or both. GANs can still generate some pretty darn impressive images, even ones from a few years ago, but they proved hard to steer and were often trained to generate a single kind of image.

[email protected] 1 points 1 year ago* (last edited 1 year ago)

I haven't been in decision analytics for a while (and people smarter than me are working on the problem), but I meant more along the lines of the "model collapse" issue. Just because a human gives a thumbs up or down doesn't make the output human-written training data when it gets fed back. Eventually the stuff it outputs becomes "most likely prompt response that this user will thumbs up and accept". (Note: I'm assuming the thumbs up/down ratings are being pulled back in as model feedback.)

Per my understanding, that's not going to remove the core issue, which is this:

Any sort of AI detection arms race is doomed. There is ALWAYS new 'real' video for training, and even if GANs are a bit outmoded, the core concept of using synthetically generated content to train is a hot thing right now. Technically, whoever creates fake videos to train on would have a bigger training set than the checkers.
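
To make the arms-race point concrete, here's a rough toy sketch, not a model of any real detector. It assumes real and fake content each boil down to a single number drawn from a unit-variance Gaussian, that the detector always uses the best possible threshold, and that the faker closes half the remaining gap each round. The function names and the `shrink` factor are made up for illustration.

```python
import math


def detector_accuracy(gap):
    """Best possible accuracy for telling N(0, 1) from N(gap, 1).

    Assumes equal numbers of real and fake samples and a threshold
    placed at the midpoint between the two means, which gives
    accuracy Phi(gap / 2), where Phi is the standard normal CDF.
    """
    return 0.5 * (1.0 + math.erf(abs(gap) / (2.0 * math.sqrt(2.0))))


def arms_race(initial_gap=3.0, rounds=8, shrink=0.5):
    """Toy loop: each round the faker retrains against the detector.

    `shrink` is a hypothetical factor for how much of the remaining
    gap between fake and real the faker closes per round.
    """
    gap = initial_gap
    accuracies = []
    for _ in range(rounds):
        accuracies.append(detector_accuracy(gap))
        gap *= shrink  # faker moves its output closer to the real data
    return accuracies


if __name__ == "__main__":
    for round_number, accuracy in enumerate(arms_race()):
        print(f"round {round_number}: detector accuracy {accuracy:.3f}")
```

Under these assumptions even an ideal detector drifts toward a coin flip, because the only thing it can key on is the gap the faker keeps closing.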

Since we see model collapse when we feed too much of this back to the model, we're in a bit of an odd place.
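
A quick toy illustration of that feedback loop, assuming the "model" is just a Gaussian that gets refit each generation using only samples drawn from the previous generation's fit. Real LLM training is nothing this simple, and the function name and parameters are invented for the sketch, but it shows the basic mechanism: the spread of the outputs tends to shrink when a model is trained on its own output.

```python
import random
import statistics


def simulate_collapse(generations=200, sample_size=10, seed=0):
    """Refit a Gaussian to its own samples, generation after generation.

    Returns the fitted standard deviation at each generation, starting
    from the original "human" distribution N(0, 1).
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0
    history = [sigma]
    for _ in range(generations):
        # Draw synthetic "training data" from the current model only.
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        # Refit the model to that synthetic data (maximum-likelihood fit).
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)
        history.append(sigma)
    return history


if __name__ == "__main__":
    history = simulate_collapse()
    for generation in (0, 50, 100, 200):
        print(f"generation {generation}: std dev {history[generation]:.6f}")
```

Each refit loses a little of the tails, and with no fresh real data coming in, nothing ever puts them back.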

We haven't even had an LLM available for a full year, but we're already having trouble distinguishing its output from human writing.

Making waffles so I only did a light Google, but I don't really think ChatGPT is leveraging GANs for its main algos, just that the GAN concept could easily be applied to LLM text to make telling the two apart even harder.

We're probably going to need a lot more tests and interviews on critical reasoning and logic skills. That's probably how it should have been all along, but it'll be weird while that shift happens.

sorry if grammar is fuckt - waffles