Technology

Amazon cloud boss echoes NVIDIA CEO on coding being dead in the water: "If you go forward 24 months from now, it's possible that most developers are not coding" (www.windowscentral.com)

submitted 3 months ago by [email protected] to c/[email protected]

[–] [email protected] 8 points 3 months ago

An inherent flaw in the transformer architecture (what essentially all current LLMs use under the hood) is that the cost of self-attention grows quadratically with context length. Every token is compared against every other token, so in a straightforward implementation the model needs 4 times as much memory for the attention scores over its last 1000 tokens as it needed for the last 500. When coding anything complex, the amount of code one has to consider quickly grows beyond these limits. At least, if you want it to work.
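
To make the 4x figure concrete, here is a rough back-of-the-envelope sketch. It assumes a naive attention implementation that stores the full score matrix (one score per pair of tokens); the function names, the single head and the 2 bytes per score are illustrative assumptions, not numbers from any particular model.

```python
def attention_score_bytes(context_len: int,
                          num_heads: int = 1,
                          bytes_per_score: int = 2) -> int:
    """Bytes needed to hold the full score matrix of naive self-attention.

    Assumption: every token attends to every token, so each head stores
    context_len * context_len scores. num_heads and bytes_per_score are
    hypothetical defaults chosen only for illustration.
    """
    if context_len < 0:
        raise ValueError("context_len must be non-negative")
    return num_heads * context_len * context_len * bytes_per_score


def growth_ratio(short_len: int, long_len: int) -> float:
    """How many times more score memory long_len needs than short_len."""
    return attention_score_bytes(long_len) / attention_score_bytes(short_len)


if __name__ == "__main__":
    # Doubling the context quadruples the score matrix each time.
    for n in (500, 1000, 2000, 4000):
        print(f"{n:>5} tokens: {attention_score_bytes(n):>12,} bytes")
    print("1000 vs 500 tokens:", growth_ratio(500, 1000))
```

Doubling the context always multiplies the score matrix by 4, whatever constants you plug in, which is why adding memory only buys a modest increase in usable context.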

This is a fundamental flaw with transformer-based LLMs, an inherent limit on the complexity of the tasks they can 'understand'. It isn't feasible to just keep throwing memory at the problem; a fundamental change in the underlying model structure is required. This is a subject of intense research, but no replacement has emerged yet.

Transformers themselves were old hat and well studied long before these models broke into the mainstream with DALL-E and ChatGPT.