511
submitted 1 month ago by [email protected] to c/[email protected]
you are viewing a single comment's thread
view the rest of the comments
[-] [email protected] 5 points 1 month ago

The training doesn't use csam, 0% chance big tech would use that in their dataset. The models are somewhat able to link concept like red and car, even if it had never seen a red car before.

[-] [email protected] 3 points 1 month ago

Well, with models like SD at least, the datasets are large enough and the employees are few enough that it is impossible to have a human filter every image. They scrape them from the web and try to filter with AI, but there is still a chance of bad images getting through. This is why most companies install filters after the model as well as in the training process.

[-] [email protected] 6 points 1 month ago

You make it sound like it is so easy to even find such content on the www. The point is, they do not need to be trained on such material. They are trained on regular kids, so they know their sizes, faces, etc. They're trained on nude bodies, so they also know how hairless genitals or flat chests look like. You don't need to specifically train a model on nude children to generate nude children.

this post was submitted on 21 May 2024
511 points (95.7% liked)

Technology

55938 readers
3288 users here now

This is a most excellent place for technology news and articles.


Our Rules


  1. Follow the lemmy.world rules.
  2. Only tech related content.
  3. Be excellent to each another!
  4. Mod approved content bots can post up to 10 articles per day.
  5. Threads asking for personal tech support may be deleted.
  6. Politics threads may be removed.
  7. No memes allowed as posts, OK to post as comments.
  8. Only approved bots from the list below, to ask if your bot can be added please contact us.
  9. Check for duplicates before posting, duplicates may be removed

Approved Bots


founded 1 year ago
MODERATORS