this post was submitted on 24 Jul 2024

1077 points (98.4% liked)

Forget security – Google's reCAPTCHA v2 is exploiting users for profit | Web puzzles don't protect against bots, but humans have spent 819 million unpaid hours solving them (www.theregister.com)

submitted 4 months ago by [email protected] to c/[email protected]

173 comments

Research Findings:

reCAPTCHA v2 is not effective in preventing bots and fraud, despite its intended purpose
reCAPTCHA v2 can be defeated by bots 70-100% of the time
reCAPTCHA v3, the latest version, is also vulnerable to attacks and has been beaten 97% of the time
reCAPTCHA interactions impose a significant cost on users, with an estimated 819 million hours of human time spent on reCAPTCHA over 13 years, which corresponds to at least $6.1 billion USD in wages
Google has potentially profited $888 billion from cookies [created by reCAPTCHA sessions] and $8.75–32.3 billion per sale of its total labeled data set
Google should bear the cost of detecting bots, rather than shifting it to users
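
As a rough sanity check on the time and wage figures above, the arithmetic can be sketched as follows. The only inputs are the numbers quoted in the list (819 million hours, 13 years, $6.1 billion); the variable names are made up for illustration, and the hourly rate is only what those totals imply, not a rate quoted from the paper.

```python
# Figures quoted in the findings above.
TOTAL_HOURS = 819e6        # human hours spent on reCAPTCHA
YEARS = 13                 # period covered by the estimate
TOTAL_WAGES_USD = 6.1e9    # lower-bound wage value of that time

# Hourly rate implied by dividing the wage total by the hours.
implied_hourly_rate = TOTAL_WAGES_USD / TOTAL_HOURS

# Average hours spent per year over the period.
hours_per_year = TOTAL_HOURS / YEARS

print(f"Implied rate: ${implied_hourly_rate:.2f}/hour")
print(f"Average load: {hours_per_year / 1e6:.0f} million hours/year")
```

In other words, the wage estimate works out to roughly $7.45 per hour of solving time, spread over about 63 million hours a year.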

"The conclusion can be extended that the true purpose of reCAPTCHA v2 is a free image-labeling labor and tracking cookie farm for advertising and data profit masquerading as a security service," the paper declares.

In a statement provided to The Register after this story was filed, a Google spokesperson said: "reCAPTCHA user data is not used for any other purpose than to improve the reCAPTCHA service, which the terms of service make clear. Further, a majority of our user base have moved to reCAPTCHA v3, which improves fraud detection with invisible scoring. Even if a site were still on the previous generation of the product, reCAPTCHA v2 visual challenge images are all pre-labeled and user input plays no role in image labeling."

[–] [email protected] 50 points 4 months ago (2 children)

Do them wrong and then close out

[–] [email protected] 46 points 4 months ago (1 children)

I do it right and it says I’m wrong =\

[–] [email protected] 38 points 4 months ago (1 children)

I have bad news for you, friend...

You might be a robot

[–] [email protected] 18 points 4 months ago (2 children)

What do you mean? I am a fleshy human and do fleshy human things like being made of flesh.

[–] [email protected] 3 points 4 months ago

Ever heard of bio-robots?

[–] [email protected] 2 points 4 months ago (2 children)

Time to take a knife and check for sure

Seriously /s Don't harm yourself!

[–] [email protected] 1 points 4 months ago

I disassembled my tail using a knife and it reassembled itself. Based on new data, my name is Rafael Cruz.

[–] [email protected] 1 points 4 months ago

Harm yourself?

Take the knife and harm the people responsible for this travesty. The laws of robotics prevent robots from harming humans: if you manage to harm them, then that means either you're human or they're not!

[–] [email protected] 2 points 4 months ago (4 children)

It knows they’re wrong, which is why I don’t really think this article is accurate. Is it training if it already has the answers? Probably not.

[–] [email protected] 23 points 4 months ago* (last edited 4 months ago) (2 children)

That's why it gives you a panel of 9 images. It would have a high confidence on some images, and a low confidence on others. When you pick the correct images and don't pick incorrect ones it uses the ones it's confident about as "validation" while taking the feedback on low confidence images to update the training data.

What this means in practice is that the only images actually being "graded" are the ones bots can solve anyway.
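
A minimal sketch of the scheme described in this comment, in Python. This is not Google's implementation: the `Tile` class, the `grade_and_collect` function, and the 0.9 confidence threshold are all hypothetical, chosen only to illustrate the idea of grading on high-confidence tiles and harvesting labels from the rest.

```python
from dataclasses import dataclass

# Assumed cutoff: tiles at or above this are treated as "known".
CONFIDENCE_THRESHOLD = 0.9


@dataclass
class Tile:
    tile_id: str
    predicted_match: bool  # the model's current guess for this tile
    confidence: float      # model confidence in that guess, 0.0 to 1.0


def grade_and_collect(tiles, selected_ids):
    """Grade a response on high-confidence tiles only.

    Returns (passed, feedback). feedback maps low-confidence tile ids to
    the user's answer and is only kept when the graded tiles were correct.
    """
    passed = True
    feedback = {}
    for tile in tiles:
        chosen = tile.tile_id in selected_ids
        if tile.confidence >= CONFIDENCE_THRESHOLD:
            # Validation tile: the user must agree with the model.
            if chosen != tile.predicted_match:
                passed = False
        else:
            # Uncertain tile: never graded, recorded as a candidate label.
            feedback[tile.tile_id] = chosen
    if not passed:
        return False, {}
    return True, feedback


tiles = [
    Tile("a", predicted_match=True, confidence=0.97),
    Tile("b", predicted_match=False, confidence=0.95),
    Tile("c", predicted_match=True, confidence=0.55),
]
print(grade_and_collect(tiles, {"a", "c"}))  # (True, {'c': True})
print(grade_and_collect(tiles, {"b"}))       # (False, {})
```

Note that tile "c" never affects the pass/fail result, which matches the point above: only the tiles the model is already sure about are graded.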

[–] [email protected] 5 points 4 months ago

and it will show the images to multiple people

[–] [email protected] 1 points 4 months ago

It seems to work exactly like that. I experimented by leaving unchecked the one I thought it had low confidence in, and it often worked.

[–] [email protected] 5 points 4 months ago (1 children)

My understanding is different from others here. I thought they served the same CAPTCHA to many people at once and used the majority response to decide who is answering correctly.
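
The majority-response idea can be sketched like this in Python. It is an illustration of the commenter's description only, not a documented reCAPTCHA mechanism; `majority_answer`, `grade_by_majority`, and the user ids are hypothetical names.

```python
from collections import Counter


def majority_answer(responses):
    """Return the most common set of selected tiles and its vote count.

    responses maps a user id to the set of tile indices that user picked.
    """
    counts = Counter(frozenset(selection) for selection in responses.values())
    answer, votes = counts.most_common(1)[0]
    return answer, votes


def grade_by_majority(responses):
    """Mark each user as correct if they match the majority selection."""
    consensus, _ = majority_answer(responses)
    return {
        user: frozenset(selection) == consensus
        for user, selection in responses.items()
    }


responses = {
    "user1": {1, 4},
    "user2": {1, 4},
    "user3": {1, 5},
}
print(grade_by_majority(responses))
# {'user1': True, 'user2': True, 'user3': False}
```

A scheme like this needs no pre-labeled answer, but it only works if most respondents are honest humans, so a flood of coordinated bots could shift the consensus.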

[–] [email protected] 4 points 4 months ago

That's true, or at least it used to be back when they were using it for OCR. I have no reason to believe it's changed.

[–] [email protected] 2 points 4 months ago (1 children)

That's why they ask you to do multiple: 1-2 of them are the control group, and they are training on the others.

[–] [email protected] 2 points 4 months ago (1 children)

You're implying they give you multiple. I hardly ever get multiple, pretty much only if I 'fail' the first one.

[–] [email protected] 4 points 4 months ago

If they have a good fingerprint on you, they don't need the control group. That's why you get 5+ captchas when using a VPN/Tor.

[–] [email protected] 1 points 4 months ago

If they gave two captchas, one for which they knew the answer and one for which they didn't, they could use the second for training. (Even if you're paying someone, you want to do that sort of thing when crowdsourcing data, because you never know if the paid person is just screwing around.)
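
A short Python sketch of that known/unknown pairing, assuming a hypothetical `record_submission` helper and made-up image ids and labels. The known challenge acts as the gate, and the answer to the unknown one is stored as a vote only when the gate is passed.

```python
from collections import defaultdict


def record_submission(known_answer, known_response,
                      unknown_id, unknown_response, label_votes):
    """Accept a label for the unknown item only if the control was solved.

    label_votes maps an unknown item id to the list of labels collected
    from users who answered the control challenge correctly.
    """
    passed = known_response == known_answer
    if passed:
        label_votes[unknown_id].append(unknown_response)
    return passed


votes = defaultdict(list)

# This user solves the control, so their label for img42 is kept.
print(record_submission("bus", "bus", "img42", "crosswalk", votes))  # True

# This user fails the control, so their label for img42 is discarded.
print(record_submission("bus", "car", "img42", "tree", votes))       # False

print(dict(votes))  # {'img42': ['crosswalk']}
```

The collected votes would still need aggregating across many users before being trusted as training labels, which is the same quality-control concern raised above for paid crowdsourcing.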