this post was submitted on 27 Dec 2024
369 points (95.1% liked)
Technology
So if you give a human and a system 10 tasks, and the human completes 3 correctly, 5 incorrectly, and fails to complete 2 altogether... and then you give those same 10 tasks to the software and it does 9 correctly and fails to complete 1, what does that mean? In general I'd say the tasks need to be defined, as I can give people very many tasks right now that language models can solve and they can't, yet language models aren't "AGI" in my opinion.
Any cognitive task. Not "9 out of the 10 you were able to think of right now".
"Any" is very hard to benchmark, and it is also not how humans are tested.
Agree. And these tasks can't be tailored to the AI in order for it to have a chance. It needs to drive to work, fix the computers/plumbing/whatever there, earn a decent salary and return with some groceries and cook dinner. Or at least do something comparable to a human. Just wording emails and writing boilerplate computer code isn't enough in my eyes, especially since it even struggles to do that. It's the "general" that is missing.
On the same hand... "Fluently translate this email into 10 random and distinct languages" is a task that 99.999% of humans would fail but that a language model should be able to handle.
Agree. That's a super useful thing LLMs can do. I'm still waiting for Mozilla to integrate Japanese and a few other (distant to me) languages into my browser. And it's a huge step up from Google Translate. It can do (to a degree) proverbs, nuance, tone... There are a few things AI or machine learning can do very well, outperforming any human by a decent margin.
On the other hand, we're talking about general intelligence here. And translating is just one niche task. By definition that's narrow intelligence. But it's indeed very useful to have, and I hope this will connect people and broaden their (and my) horizons.
This is more about robotics than AGI. A system can be generally intelligent without having a physical body.
You're - of course - right. Though I'm always a bit unsure about exactly that. We also don't attribute intelligence to books. For example an encyclopedia, or Wikipedia... That has a lot of knowledge stored, yet it is not intelligent. That makes me believe being intelligent has something to do with being able to apply knowledge, and do something with it. And outputting text is just one very limited form of interacting with the world.
And since we're using humans as a benchmark for the "general" part in AGI... Humans have several senses, they're able to interact with their environment in lots of ways, and 90% of that isn't drawing and communicating with words. That makes me wonder: Where exactly is the boundary between an encyclopedia and an intelligent entity... Is intelligence a useful metric if we exclude being able to do anything useful with it? And how much do we exclude by not factoring in parts of the environment/world?
And is there a difference between being book-smart and intelligent? Because LLMs certainly get all of their information second-hand and filtered in some way. They can't really see the world itself, smell it, touch it and manipulate something and observe the consequences... They only get a textual description of what someone did and put into words in some book or text on the internet. Is that a minor or major limitation, and do we know for sure this doesn't matter?
(Plus, I think we need to get "hallucinations" under control. That's not 100% about "intelligence" either, but it cuts into actual use if that intelligence isn't reliably there.)