this post was submitted on 09 Jan 2024
527 points (98.2% liked)

Technology


‘Impossible’ to create AI tools like ChatGPT without copyrighted material, OpenAI says

Pressure grows on artificial intelligence firms over the content used to train their products

[–] [email protected] 5 points 7 months ago (10 children)

Too bad

Why do they have free rein to store and use copyrighted material as training data? AIs don’t learn as a human would, and comparisons can’t be made between the learning processes.

[–] [email protected] -1 points 7 months ago* (last edited 7 months ago) (8 children)

Why do you have free rein to do the same?

AIs don’t learn as a human would, and comparisons can’t be made between the learning processes.

I think you're going to have a hard time proving a financial distinction between them

[–] [email protected] 3 points 7 months ago (7 children)

You don’t need to prove a financial difference. They are fundamentally different systems that function in different ways. They cannot be compared 1:1, and laws cannot be applied 1:1. New regulations need to be added around AI use of copyrighted material.

[–] [email protected] 0 points 7 months ago (1 children)

I agree. For instance, it should be secured in law that you can train AI on anything, to avoid frivolous discussions like this.

Output is what should be moderated by law.

[–] [email protected] 1 points 7 months ago (1 children)

No

Why are you entitled to use everyone else’s work? It should be secured in law that licensing applies to training data to avoid frivolous discussions like this. Then it’s an entirely opt-in solution, which works to the benefit of everyone except the people stealing data.

Output doesn’t matter since it’s pretty well settled it’s not derivative work (as much as I disagree with that statement).

[–] [email protected] 2 points 7 months ago (1 children)

the people stealing data

No one is doing this

Output doesn’t matter since it’s pretty well settled it’s not derivative work

Cool, discussion over.

[–] [email protected] 0 points 7 months ago (1 children)

It is stealing data. In order to train on it they have to store the data. That’s a copyright violation. There’s no way to interpret it as not stealing data.

[–] [email protected] 0 points 7 months ago (1 children)

It is not stealing. The data is still there. It is, at worst, copyright violation.

[–] [email protected] 2 points 7 months ago (1 children)

Copyright violation is stealing

[–] [email protected] 0 points 7 months ago

Stealing means someone has been deprived of their property, which is not the case for copyright violations.
