this post was submitted on 29 Sep 2023

438 points (93.5% liked)


Authors Are Furious After Finding Their Works on List of Books Used To Train AI (www.themarysue.com)

submitted 1 year ago by [email protected] to c/[email protected]

146 comments

Authors using a new tool to search a list of 183,000 books used to train AI are furious to find their works on the list.


[–] [email protected] 8 points 1 year ago (2 children)

just reinforcement learning models

...like the naturally occurring neural networks are.

[–] [email protected] 32 points 1 year ago (3 children)

The brain does not work the way you think… (I work in the field, bio-informatics). What you call “neural networks” come from an early misunderstanding of how the brain stores information. It’s a LOT more complicated and frankly, barely understood.

[–] [email protected] 12 points 1 year ago (1 children)

Yeah, accurately simulating a single pyramidal neuron requires an eight-layer deep neural network:

https://www.cell.com/neuron/pdf/S0896-6273(21)00501-8.pdf

[–] [email protected] 3 points 1 year ago

that was an interesting read, thank you

[–] [email protected] 4 points 1 year ago (2 children)

It’s a LOT more complicated and frankly, barely understood.

Yet you confidently state that the brain doesn't work the way LLMs do?

Obviously it doesn't work exactly the same way that LLMs do, if only because of the completely different substrates. But when you get to more nebulous concepts like "creativity" and "inspiration" it's not so clear.

[–] [email protected] 5 points 1 year ago

Where the brain and a neural net differ is in learning via backpropagation, which seems to be done differently in the brain, as there is no mechanism to go backwards through the network and jiggle the weights.

That aside, they seem to work very similarly once they are trained, as the knowledge they are able to extract from data ends up being basically the same as what a human would be able to extract. There is surprisingly little weirdness in AI and a surprising amount of human-like capabilities.
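For anyone wondering what "go backwards through the network and jiggle the weights" looks like in practice, here is a minimal sketch of backpropagation on a toy two-weight network. Everything in it (the names `forward`, `backprop_step`, `train`, the tanh hidden unit, the learning rate, the single training example) is made up for illustration; it is not how any particular LLM is trained, just the bare mechanism.

```python
import math


def forward(w1, w2, x):
    """Forward pass: input -> tanh hidden unit -> linear output."""
    h = math.tanh(w1 * x)
    y = w2 * h
    return h, y


def backprop_step(w1, w2, x, target, lr=0.1):
    """One gradient-descent step using the chain rule, output to input."""
    h, y = forward(w1, w2, x)
    loss = 0.5 * (y - target) ** 2

    # Backward pass: start from the error at the output...
    dy = y - target
    grad_w2 = dy * h
    # ...and push it back through the hidden unit (derivative of tanh is 1 - h^2).
    dh = dy * w2
    grad_w1 = dh * (1.0 - h * h) * x

    # "Jiggle" each weight a little in the direction that reduces the loss.
    return w1 - lr * grad_w1, w2 - lr * grad_w2, loss


def train(steps=200):
    """Repeat the step on one illustrative example and record the loss."""
    w1, w2 = 0.5, 0.5  # arbitrary starting weights
    losses = []
    for _ in range(steps):
        w1, w2, loss = backprop_step(w1, w2, x=1.0, target=0.8)
        losses.append(loss)
    return losses


if __name__ == "__main__":
    losses = train()
    print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.6f}")
```

The point is that the error signal travels in the reverse direction of the forward pass, and every weight update needs that backward-flowing signal. That is the step with no obvious counterpart in biological neurons.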

[–] [email protected] 0 points 1 year ago

People have a definite fear of being defined as machines... not sure why we think we're so special.

[–] [email protected] -4 points 1 year ago (1 children)

So it's barely understood, but this definitely is not it. Got it.

[–] [email protected] -2 points 1 year ago (1 children)

But you, random stranger on the internet, know better than the guy who literally works in the field. Got it.

[–] [email protected] 0 points 1 year ago

I do? Where did I claim that?

[–] [email protected] 10 points 1 year ago (1 children)

Tell you what, you get a landmark legal decision classifying LLMs as people and then we'll talk.

Until then it's software being fed content in a way not permitted by its license, i.e. the makers of that software are committing copyright infringement.

[–] [email protected] 8 points 1 year ago* (last edited 1 year ago) (1 children)

What exactly was not permitted by the license? Reading?

[–] [email protected] 13 points 1 year ago (4 children)

Using it to (create a tool to) create derivatives of the work on a massive scale.

[–] [email protected] 8 points 1 year ago* (last edited 1 year ago)

An AI model is not a derivative work. It does not contain the copyrighted expression, just information about the copyrighted expression.

[–] [email protected] 7 points 1 year ago

Wikipedia: In copyright law, a derivative work is an expressive creation that includes major copyrightable elements of a first, previously created original work.

I think you may be off a bit on what a derivative work is. I don't see LLMs spouting out major copyrightable elements of books. They can give a summary, sure, but CliffsNotes would like to have a word if you think that's copyright infringement.

[–] [email protected] 5 points 1 year ago (1 children)

Better tell that to Google and its search index, book scanning project and knowledge graph.

[–] [email protected] -1 points 1 year ago* (last edited 1 year ago)

I didn't know those were LLMs, TIL.

[–] [email protected] -3 points 1 year ago (1 children)

Well, when that happens we have laws. So no problems.

[–] [email protected] 2 points 1 year ago (2 children)

Would you be okay with applying that argument for any crime?

[–] [email protected] 3 points 1 year ago

I would be, and I don't understand why you think this would be a problem. I wouldn't want the government to be preventing activities that there weren't any actual laws prohibiting.

[–] [email protected] 0 points 1 year ago (1 children)

Ever heard of the early 21st century classic Minority Report?

[–] [email protected] 4 points 1 year ago

You're missing the point. I'll make your example more specific.

Well when fraud/rape/murder happens we have laws. So no problems.

Those things happen. Creating an LLM based on copyrighted material without permission happens - it's not a hypothetical. But even then, giving a punishment after the fact does not make the initial crime "no problem", as you put it.