this post was submitted on 03 Sep 2024
1578 points (97.8% liked)

Technology

(page 3) 50 comments
[–] [email protected] 17 points 2 months ago* (last edited 2 months ago)

"because it's supposedly "impossible" for the company to train its artificial intelligence models — and continue growing its multi-billion-dollar-business — without them."

Oh no! The poor rich can't get richer fast enough :(

[–] bizza 16 points 2 months ago (1 children)

Copyright is a pain in the ass, but Sam Altman is a bigger pain in the ass. Send him to prison and let him rot. Then put his tears in a cup and I'll drink them

[–] [email protected] 16 points 2 months ago

Oh no. Anyway...

[–] [email protected] 14 points 2 months ago (1 children)

And I can't eat without shoplifting...

[–] [email protected] 14 points 2 months ago

Shut it down then, and stop stealing other people's shit

[–] [email protected] 14 points 2 months ago (4 children)

that guy in that picture looks like the "unwanted house guest" from those memes from 10 years ago

[–] [email protected] 14 points 2 months ago

Wow, that's a shame. Anyway, take all his money and throw him in a ditch someplace.

[–] [email protected] 13 points 2 months ago (9 children)

What irks me most about this claim from OpenAI and others in the AI industry is that it's not based on any real evidence. Nobody has tested the counterfactual approach he claims wouldn't work, yet the experiments that came closest--the first StarCoder LLM and the CommonCanvas text-to-image model--suggest that, in fact, it would have been possible to produce something very nearly as useful, and in some ways better, with a more restrained training data curation approach than scraping outbound Reddit links.

All that aside, copyright clearly isn't the right framework for understanding why what OpenAI does bothers people so much. It's really about "data dignity", which is a relatively new moral principle not yet protected by any single law. Most people feel that they should have control over what data is gathered about their activities online, as well as what is done with that data after it's been collected. Even if they publish or post something under a Creative Commons license that permits derived uses of their work, they'll still get upset if it's used as an input to machine learning. This is true even if the generative models thereby created are not created for commercial reasons, but only for personal or educational purposes that clearly constitute fair use. I'm not saying that OpenAI's use of copyrighted work is fair; I'm just saying that even in cases where the use is clearly fair, there's still a perceived moral injury, so I don't think it's wise to lean too heavily on copyright law if we want to find a path forward that feels just.

[–] [email protected] 12 points 2 months ago

It's impossible for me to make money without robbing a bank. Please let me do that, parliament, it would be so funny

[–] [email protected] 12 points 2 months ago (1 children)

Honestly, copyright is shit. It was created on the basis of an old way of doing things, one where big publishers and big studios mass-produce physical copies of a given 'product'. George R. R. Martin, Warner Studios & co are rich. Maybe they have everything to lose without their copy'right', but that isn't the population's problem. We live in an era where everything is digital and easily copiable, and we might as well start acting like it.

I don't care if Sam Altman is evil, this discussion is fundamental.

[–] [email protected] 17 points 2 months ago (3 children)

How did GRRM get rich again?

Oh yeah, he sold books he worked on for decades. Totally the same as WB.

[–] [email protected] 12 points 2 months ago

What kind of a pathetic statement is that?

[–] [email protected] 11 points 2 months ago

We can't make money paying for "AI", going to theaters, or paying for streaming services.

So I guess everybody gets a piracy!

[–] [email protected] 11 points 2 months ago

Sounds like they need better bootstraps.

Or at least a business model.

[–] [email protected] 10 points 2 months ago

Idk, usually people shut down their business if it can't make a profit...

[–] [email protected] 10 points 2 months ago

Aww poor shit company and their poor money problems.

[–] [email protected] 10 points 2 months ago

“Too fucking bad”

[–] [email protected] 10 points 2 months ago (2 children)

So I got a crazy idea - hear me out - how about we just abolish copyright completely, for everyone?

I mean, it works in China pretty well.

[–] [email protected] 10 points 2 months ago (3 children)

https://en.wikipedia.org/wiki/Intellectual_property_in_China

Looks like there are still copyright laws in China. What are you on about?

[–] [email protected] 9 points 2 months ago* (last edited 2 months ago) (3 children)

The internet has been primarily derivative content for a long time. As much as some haven't wanted to admit it, it's true. These fancy algorithms now take it to an exponential level.

Original content had already become scarce as monetization ramped up. And then this generation of AI algorithms arrived.

For the several years before LLMs became a thing, the internet was basically just regurgitating data from API calls or scraping someone else's content and re-presenting it in your own way.

[–] [email protected] 9 points 2 months ago

Ok... Is that supposed to be a good reason?
