this post was submitted on 02 Aug 2023
116 points (92.0% liked)

Technology


Tech experts are starting to doubt that ChatGPT and A.I. ‘hallucinations’ will ever go away: ‘This isn’t fixable’

Experts are starting to doubt it, and even OpenAI CEO Sam Altman is a bit stumped.

top 50 comments
[–] [email protected] 22 points 1 year ago (6 children)

"AI" are just advanced versions of the next word function on your smartphone keyboard, and people expect coherent outputs from them smh

[–] [email protected] 5 points 1 year ago

It's just that everyone now refers to LLMs when talking about AI, even though the field has so many different aspects to it. Maybe at some point there will be an AI that actually understands the concepts and meanings of things. But that won't be learned by unsupervised web crawling.

[–] [email protected] 5 points 1 year ago (1 children)

It is possible to get coherent output from them, though. I've been using the ChatGPT API to successfully write ~20-page proposals. Basically, I give it a prior proposal, the new scope of work, and a paragraph with other info it should incorporate, and it works through the document one section at a time.

The numbers and graphics need to be put in after… but the result is better than I’d get from my interns.
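
The loop is basically this (a rough sketch, not my actual code; file names, section list, and prompt wording are hypothetical, assuming the OpenAI Python client, openai>=1.0):

```python
# Sketch of the section-at-a-time proposal loop described above.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prior_proposal = open("prior_proposal.txt").read()  # hypothetical input file
scope_of_work = open("new_scope.txt").read()        # hypothetical input file
extra_notes = "Incorporate the revised timeline and the new subcontractor."

sections = ["Executive Summary", "Technical Approach", "Schedule", "Budget"]

draft = []
for section in sections:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "You write formal project proposals, one section at a time."},
            {"role": "user",
             "content": (f"Prior proposal for reference:\n{prior_proposal}\n\n"
                         f"New scope of work:\n{scope_of_work}\n\n"
                         f"Other notes: {extra_notes}\n\n"
                         f"Write the '{section}' section of the new proposal.")},
        ],
    )
    draft.append(response.choices[0].message.content)

# Numbers and graphics still have to be dropped in by hand afterwards.
print("\n\n".join(draft))
```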

I’ve also been using it (google Bard mostly actually) to successfully solve coding problems.

I either need to increase the credit I give LLMs or admit that interns are mostly just LLMs.

[–] [email protected] 3 points 1 year ago

So is your brain.

Relative complexity matters a lot, even if the underlying mechanisms are similar.

[–] [email protected] 17 points 1 year ago (2 children)

In my limited experience the issue is often that the "chatbot" doesn't even check what it says now against what it said a few paragraphs above. It contradicts itself in very obvious ways. Shouldn't a different algorithm that adds some sort of separate logic check be able to help tremendously (see the sketch below)? Or a check to ensure recipes are edible (for this specific application)? A bit like those physics-informed NNs.
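
Something like this minimal sketch is what I mean (hypothetical prompt wording; assumes the OpenAI Python client, openai>=1.0):

```python
# A second model pass that compares the latest reply against the earlier
# conversation and flags contradictions before the reply is shown.
from openai import OpenAI

client = OpenAI()

def contradicts_earlier(history: str, latest_reply: str) -> bool:
    verdict = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "Answer only YES or NO: does the reply contradict "
                        "anything stated earlier in the conversation?"},
            {"role": "user",
             "content": f"Earlier conversation:\n{history}\n\nReply:\n{latest_reply}"},
        ],
    )
    return verdict.choices[0].message.content.strip().upper().startswith("YES")

# If the check fires, the app could regenerate the reply or ask the model
# to reconcile the contradiction before showing anything to the user.
```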

[–] [email protected] 13 points 1 year ago* (last edited 1 year ago)

That's called context. For ChatGPT it's a bit less than 4k tokens. Using the API it goes up to a bit less than 32k. Alternative models go up to a bit less than 64k.

The model wouldn't know anything you said before that window.

That is one of the biggest limitations of the current generation of LLMs.
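
A toy illustration of what that means in practice (the whitespace split is a crude stand-in for a real tokenizer like tiktoken):

```python
# Anything that falls outside the context window is dropped before each
# call, so the model never sees it.
CONTEXT_LIMIT = 4_000  # rough token budget, like early ChatGPT

def fit_to_window(messages: list[str], limit: int = CONTEXT_LIMIT) -> list[str]:
    kept, used = [], 0
    for message in reversed(messages):  # walk from newest to oldest
        tokens = len(message.split())   # crude stand-in for a tokenizer
        if used + tokens > limit:
            break                       # everything older is "forgotten"
        kept.append(message)
        used += tokens
    return list(reversed(kept))         # restore chronological order

# Whatever fit_to_window returns is all the model "remembers".
```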

[–] [email protected] 3 points 1 year ago* (last edited 1 year ago)

Shouldn't a different algorithm that adds some sort of separate logic check be able to help tremendously?

Maybe, but it might not be that simple. The issue is that one would have to design that logic in a manner that can be verified by a human. At that point the logic would be quite specific to a single task and not generally useful at all, and then the benefit of the AI is almost nil.

[–] [email protected] 15 points 1 year ago (4 children)

Yet I've still seen many people clamoring that we won't have jobs in a few years. People SEVERELY overestimate the ability of all things AI. From self-driving to taking jobs, this stuff is not going to take over the world anytime soon.

[–] [email protected] 4 points 1 year ago* (last edited 1 year ago)

The problem is that these things never hit a point of competition with humans; they're either worse than us, or they blow way past us. Humans might drive better than a computer right now, but as soon as the computer is better than us, it will always be better than us. People doubted that computers would ever beat the best humans at chess, or Go, but within a lifetime of computers being invented they blew past us in both. Now they can write articles and paint pictures. Sure, we're better at it for now, but they're a million times faster than us, and they're making massive improvements month over month. You and I can disagree on how long it'll take for them to pass us, but once they do, they'll replace us completely, and the world will never be the same.

[–] [email protected] 4 points 1 year ago* (last edited 1 year ago) (3 children)

Idk, an AI delivering low-quality results for free is a lot more cash money than paying someone an almost-living wage to perform a job with better results. I think corporations won't care, and the only barrier will be whether the job in question involves enough physical labor to be out of an AI's reach.

[–] [email protected] 3 points 1 year ago (1 children)

AI isn't free. Right now, an LLM takes a not-insignificant hardware investment to run and a lot of manual human labor to train. And there's a whole lot of unknown and untested legal liability.

Smaller more purpose-driven generative AIs are cheaper, but the total cost picture is still a bit hazy. It's not always going to be cheaper than hiring humans. Not at the moment, anyway.

[–] [email protected] 3 points 1 year ago (1 children)

Compared to human work, though, AI is basically free. I've been using the GPT-3.5-turbo API in a custom app, making calls dozens of times a day for a month now, and I've been charged like 10 cents. Even minimum-wage humans cost tens of thousands of dollars per year; that's a pretty high price that will be easy to undercut.

Yes, training is expensive and hardware is expensive, but those are one-time costs. Once trained, a model can be used trillions of times for pennies. The same can't be said of humans.
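
Rough back-of-the-envelope math (assumed prices and call sizes; only the order of magnitude matters):

```python
# 2023-era gpt-3.5-turbo pricing was roughly $0.002 per 1K tokens.
PRICE_PER_1K_TOKENS = 0.002  # USD, assumed
CALLS_PER_DAY = 36           # "dozens of times a day"
TOKENS_PER_CALL = 1_500      # prompt + completion, assumed
DAYS = 30

monthly = CALLS_PER_DAY * DAYS * TOKENS_PER_CALL / 1_000 * PRICE_PER_1K_TOKENS
print(f"~${monthly:.2f}/month")  # ~$3.24, versus ~$15,000/year for a
                                 # full-time minimum-wage worker at $7.25/hr
```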

[–] altima_neo 2 points 1 year ago

You can bet your ass ChatGPT won't be that cheap for long, though. They're still developing it and using people as cheap beta testers.

[–] [email protected] 12 points 1 year ago (1 children)

The models are also getting larger (and require even more insane amounts of resources to train) far faster than they are getting better.

[–] [email protected] 4 points 1 year ago (1 children)

But bigger models have new "emergent" capabilities. I heard that from a certain size onward they start to know what they know, and hallucinate less.

[–] [email protected] 4 points 1 year ago (1 children)

Wow you heard that crazy bro

[–] [email protected] 9 points 1 year ago (1 children)

People make a big deal out of this, but they forget that humans make shit up all the time.

[–] [email protected] 11 points 1 year ago (5 children)

Yeah but humans can use critical thinking, even on themselves when they make shit up. I've definitely said something and then thought to myself "wait that doesn't make sense for x reason, that can't be right" and then I research and correct myself.

AI is incapable of this.

[–] [email protected] 9 points 1 year ago (7 children)

Meanwhile everyone is terrified that ChatGPT is going to take their job. Ya, we are a looooooooooong way off from that.

[–] [email protected] 8 points 1 year ago (2 children)

This is trivially fixable. As is jailbreaking.

It's just that everyone is somehow still focused on trying to fix it in a single monolith model as opposed to in multiple passes of different models.

This is especially easy for jailbreaking, but for hallucinations: just run the output past a fact-checking discriminator hooked up to a vector-DB search index service (which sounds like a perfect fit for one of the players currently lagging in the SotA models), then feed that in as context, along with the original prompt and response, to a revisionist generative model that adjusts the response to be in keeping with reality. (Rough sketch below.)
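
A rough sketch of that multi-pass pipeline (the search_index interface and prompt wording are hypothetical, not any real library's API; assumes the OpenAI Python client, openai>=1.0):

```python
# Generate, fact-check the draft against a vector-DB search index,
# then revise the draft to agree with the retrieved evidence.
from openai import OpenAI

client = OpenAI()

def ask(system: str, user: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "system", "content": system},
                  {"role": "user", "content": user}],
    )
    return response.choices[0].message.content

def answer_with_fact_check(prompt: str, search_index) -> str:
    draft = ask("Answer the question.", prompt)                # pass 1: generate
    evidence = "\n".join(search_index.search(draft, top_k=5))  # hypothetical vector-DB lookup
    problems = ask(                                            # pass 2: discriminate
        "List any claims in the reply that the evidence contradicts. "
        "Say NONE if there are none.",
        f"Evidence:\n{evidence}\n\nReply:\n{draft}")
    if problems.strip().upper() == "NONE":
        return draft
    return ask(                                                # pass 3: revise
        "Rewrite the reply so it agrees with the evidence.",
        f"Evidence:\n{evidence}\n\nReply:\n{draft}\n\nProblems:\n{problems}")
```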

The human brain isn't a monolith model, but interlinked specialized structures that delegate and share information according to each specialty.

AGI isn't going to be a single model, and the faster the industry adjusts towards a focus on infrastructure of multiple models rather than trying to build a do everything single model, the faster we'll get to a better AI landscape.

But as can be seen with OpenAI gating and deprecating their pretrained models and only opening up access to fine-tuned chat models, even the biggest player in the space seems to misunderstand what's needed for the broader market to collaboratively build towards the future here.

Which ultimately may be a good thing as it creates greater opportunity for Llama 2 derivatives to capture market share in these kinds of specialized roles built on top of foundational models.

[–] [email protected] 7 points 1 year ago

We've likely already hit (or will soon hit) a peak with the current AI approach. Unless another breakthrough happens in AI research, ChatGPT probably won't improve much in the future. It might even regress due to OpenAI's efforts to reduce computational cost and make their AI "safe" enough for the general population.

[–] [email protected] 7 points 1 year ago (3 children)

I was excited by the recent advancements in AI, but it seems the area has hit another wall. It seems best used for automating very simple tasks, or at best as a guiding tool for professionals (e.g., medicine, SWE, …).

[–] [email protected] 16 points 1 year ago (4 children)

Hallucination is common in humans as well. It's just people believing they know stuff they really don't know.

We have alternative safeguards in place. It's true, however, that the current LLM generation has its limitations.

[–] [email protected] 8 points 1 year ago (1 children)

Not just common. If you look at kids, hallucinations come first in their development.

Later, they learn to filter what is real and what is not real. And as adults, we have weird thoughts that we suppress so quickly that we hardly remember them.

And those with less developed filters have more difficulty distinguishing fact from fiction.

Generative AI is good at generating. What needs to be improved is the filtering aspect of AI.

[–] [email protected] 4 points 1 year ago* (last edited 1 year ago)

Hell, just look at various public personalities, especially those with extreme views. Most of what some of them say is "hallucinated", far more so than what ChatGPT is doing.

[–] [email protected] 3 points 1 year ago (5 children)

Sure, but these things exist as fancy storytellers. They understand language patterns well enough to write convincing language, but they don't understand what they're saying at all.

The metaphorical human equivalent would be having someone write a song in a foreign language they barely understand. You can get something that sure sounds convincing, sounds good even, but to someone who actually speaks Spanish it's nonsense.

[–] [email protected] 3 points 1 year ago* (last edited 1 year ago) (6 children)

GPT can write and edit code that works. It simply can't be true that it's solely doing language patterns with no semantic understanding.

To fix your analogy: the Spanish speaker will happily sing along. They may notice the occasional odd turn of phrase, but the song as a whole is perfectly understandable.

[–] [email protected] 2 points 1 year ago (1 children)

Humans can recognize and account for their own hallucinations. LLMs can't and never will.

[–] [email protected] 3 points 1 year ago

It's pretty ironic that you say they "never will" in this context.

[–] [email protected] 4 points 1 year ago

Well, to be honest, that is the best way. I mean, I'm pretty sure their purpose was to be a tool to aid people, not to replace us... Right?

[–] [email protected] 3 points 1 year ago

Yeah, I fully expect to see genre-specific LLMs with a subscription fee attached, aimed squarely at hobbies and industries.

When I finally find my new project car, I would absolutely pay for a subscription to an LLM that has read every service manual and can explain to me in plain English what precise steps a job involves, and can also answer follow-up questions.

[–] [email protected] 6 points 1 year ago (3 children)

Not with our current tech. We'll need some breakthroughs, but I feel like it's certainly possible.

[–] [email protected] 5 points 1 year ago (4 children)

I don't understand why they don't use a second model to detect falsehoods instead of trying to fix it in the original LLM.

[–] [email protected] 11 points 1 year ago (2 children)

And then they can use a third model to detect falsehoods in the second model and a fourth model to detect falsehoods in the third model and... well, it's LLMs all the way down.

[–] [email protected] 3 points 1 year ago (2 children)

The way that one learns which of one's beliefs are "hallucinations" is to test them against reality — which is one thing that an LLM simply cannot do.

[–] [email protected] 2 points 1 year ago (1 children)

Sure they can, and they will, as over time they will collect data to determine fact from fiction, in the same way that we solve captchas by choosing all the images with bicycles in them. It will never be 100%, but it will approach it over time. Hallucination will always be something to consider in a response, but it will certainly decrease over time, to the point that it becomes rare for well-discussed things. At least, that is how I see it developing.

[–] [email protected] 4 points 1 year ago (2 children)

Why do you assume they will improve over time? You need good data for that.

Imagine a world where AI chatbots create a lot of the internet. Now that "data" is scraped and used to train other AIs. Hallucinations could easily persist in this way.

Or humans could just all post "the sky is green" everywhere. When that gets scraped, the resulting AI will know the word "green" follows "the sky is". Instant hallucination.

These bots are not thinking about what they type. They are copying the thoughts of others. That's why they can't check anything. They are not programmed to be correct, just to spit out words.

[–] [email protected] 2 points 1 year ago

I can only speak from my experience, which, over the past 4 months of daily use of ChatGPT-4 (Plus), has gone from many hallucinations per hour to only one a week now. I am using it to write C# code, and I am utterly blown away by how good it has gotten, not only at writing error-free code but, even more so, at understanding a complex environment it cannot even see beyond my attempts to explain it via prompts. Over the past couple of weeks in particular, it really feels like it has gotten more powerful, and for the first time it "feels" like I am working with an expert person. If you had asked me in May where it would be today, I would not have guessed it would be this good. I thought responses this intelligent were at least another 3-5 years away.

[–] [email protected] 2 points 1 year ago (4 children)

Disclaimer: I am not an AI researcher and just have an interest in AI. Everything I say is probably gibberish, just my amateur understanding of the AI models used today.

It seems these LLMs use a clever trick with probability to give words meaning via the statistics of their usage: any result is just a statistical bet that those words work well with each other. The number of indexes used to index "tokens" (in this case, words), along with the number of layers in the model used to correlate usage of those tokens, seems to drastically increase the "intelligence" of the responses. This doesn't overcome unknown circumstances; the model does what AI does and relies on probability to answer the question, so in those cases the next-closest thing from the training data is substituted and considered "good enough". I think some confidence variable is what current LLMs truly need, as they are capable of giving meaningful responses but produce a "hallucinated" one when not enough data is available to answer the question (toy example below).

Overall, I would guess this is a limitation in the LLM's ability to map words to meaning. Imagine reading everything ever written: you'd probably be able to give intelligent responses to most questions. Now imagine you were asked about something you had never read, but were still expected to answer. That is what I personally think these "hallucinations" are: the LLM's best approximations. You can only answer what you know reliably; otherwise you are just guessing.
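
Something like this toy example is the "confidence variable" I have in mind (made-up logits; a real system would read per-token logprobs from the model's API):

```python
# A flat next-token distribution can be flagged as a likely guess.
import math

def softmax(logits: list[float]) -> list[float]:
    exps = [math.exp(x - max(logits)) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def confidence(logits: list[float]) -> float:
    return max(softmax(logits))  # probability mass on the single best token

confident_step = [9.0, 2.0, 1.0, 0.5]  # one token clearly favored
uncertain_step = [2.1, 2.0, 1.9, 1.8]  # model is basically guessing

print(confidence(confident_step))  # ~0.999 -> likely grounded in training data
print(confidence(uncertain_step))  # ~0.29  -> flag as possible hallucination
```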

[–] [email protected] 2 points 1 year ago (5 children)

I have experience creating supervised learning networks (not large language models). I don't know what tokens are; I assume they are output nodes. In that case, I don't think increasing the output nodes makes the AI a lot more intelligent. You could measure confidence with the output nodes if they are designed accordingly (one node corresponds to one word; confidence can be measured by the output strength). AIs are popular because they can overcome unknown circumstances (in most cases), like when you phrase a question in a slightly different way.

I agree with you that AI has a problem understanding the meaning of words. The AI's correct answers happen to be correct because the order of the words (the output) happens to match the order of the correct answer's words. I think "hallucinations" happen when there is no sufficient answer to the given problem, and the AI gives an answer from a few random contexts pieced together in the most likely order. I think you have a mostly good understanding of how AIs work.
