this post was submitted on 21 Nov 2024

150 points (97.5% liked)

Child safety org launches AI model trained on real child sex abuse images (arstechnica.com)

submitted 1 day ago by [email protected] to c/[email protected]

77 comments

Today, a prominent child safety organization, Thorn, in partnership with a leading cloud-based AI solutions provider, Hive, announced the release of an AI model designed to flag unknown CSAM at upload. It's the earliest AI technology striving to expose unreported CSAM at scale.

(page 2) 27 comments

[–] [email protected] 15 points 1 day ago (11 children)

This seems like a potential actual good use of AI. Can't have been much fun to train it though.

And is there any risk of people turning these kinds of models around and using them to generate images?

[–] [email protected] 14 points 1 day ago

And is there any risk of people turning these kinds of models around and using them to generate images?

There isn't really much fundamental difference between an image detector and an image generator. The way image generators like Stable Diffusion work is essentially by generating a starting image that's nothing but random static and telling the generator "find the cat that's hidden in this noise."

It'll probably take a bit of work to rig this child porn detector up to generate images, but I could definitely imagine it happening. It's going to make an already complicated philosophical debate even more complicated.

[–] [email protected] 8 points 1 day ago

I think image generators in general work by iteratively changing random noise and checking it with a classifier, until the resulting image has a stronger and stronger finding of “cat” or “best quality” or “realistic”.

If this classifier provides fine-grained descriptive attributes, that’s a nightmare. If it just detects yes or no, that’s probably fine.

[–] [email protected] 7 points 1 day ago

Nobody would have been looking directly at the source data. The FBI or whoever provides the dataset to approved groups, but after that you just say "use all the images in this folder" and it goes. But I don't even know if they actually provide real full-resolution images, or just perceptual hashes, or downsampled images.

And while it's possible to use the dataset to generate new images assuming the training data had full-res images, like I said, I know they investigate the people making the request before allowing access. And access is probably supervised and audited.

[–] [email protected] 8 points 1 day ago (1 children)

This is a great development, albeit with a lot of soul-crushing work behind it, I assume. People who have to look at CSAM, or whatever the acronym is, have a miserable job, so I'm very supportive of trying to automate that away from people.

[–] [email protected] 1 points 1 day ago* (last edited 1 day ago)

Yeah, I’m happy for AI to take this particular horrifying job from us. Chances are it will be overtuned (too strict), but if there’s a reasonable appeals process I could see it saving a lot of people the trauma of having to regularly view the worst humanity has to offer without major drawbacks.

[–] [email protected] -1 points 1 day ago

... robo chocolate?

[–] [email protected] 2 points 1 day ago (1 children)

I think all CSAM should be destroyed out of respect for the victims, not proliferated. I don't care who is hanging onto this material or for what purpose.

[–] [email protected] -5 points 1 day ago (2 children)

At this point, how does it differ from generating AI-powered CP? Morons.

[–] [email protected] 10 points 1 day ago (1 children)

Uh, well, this one tells you if an image looks like it or not. It doesn’t generate images.

[–] [email protected] -2 points 1 day ago (1 children)

If it knows whether an image looks like it, it can generate something like it. That's just one step further.

[–] [email protected] 2 points 1 day ago (1 children)

Correct, this kind of software is trained on CP data. So such models can be easily used to generate CP instead of recognizing it, which makes them very dangerous indeed.

Same idea as the current models that are trained to recognize cars: these models can also be used to generate a car from noise as a starting point.

[–] [email protected] 4 points 1 day ago (1 children)

I'm pretty sure you can’t just run it in reverse like that. There’s a whole different training and operation methodology you have to use to support generating images rather than just classifying them.

[–] [email protected] 4 points 1 day ago (11 children)

It differs in basically being something completely different. This is a classification model; it doesn't have generative capabilities. Even if you were to get the model and its weights and tried to reverse-engineer an "input" that it would classify as CP, it would most likely look like pure noise to you.

Moron
