this post was submitted on 13 Nov 2024
670 points (95.0% liked)
Not Copilot, but I run into a fourth problem:
4. The LLM gets hung up on insisting that a newer feature of the language I'm using is wrong and keeps focusing on "fixing" it, even though it has access to the newest correct specifications where the feature is explicitly defined and explained.
Oh god yes, ran into this asking for a shell.nix file with a handful of tricky dependencies. It kept trying to do this insanely complicated temporary pull and build from git instead of just writing a 6-line file asking for the right packages.
"This code is giving me a return value of X instead of Y"
"Ah, the reason you're having trouble is because you initialized this list with brackets instead of new()."
"How would a syntax error give me an incorrect return"
"You're right, thanks for correcting me!"
"Ok so like... The problem though."
Yeah, once you have to question its answer, it's all over. It got stuck and gave you the next best answer in its weights, which was absolutely wrong.
You can always restart the convo, re-insert the code and say what's wrong in a slightly different way and hope the random noise generator leads it down a better path :)
I'm doing some stuff with translation now, and I'm finding you can restart the session, run the same prompt and get better or worse versions of a translation. After a few runs, you can take all the output and ask it to rank each translation on correctness and critique them. I'm still not completely happy with the output, but it does seem that sometimes, if you MUST get AI to answer the question, there can be value in making it answer across more than one session.
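For anyone who wants to script that loop, here's a rough sketch of the idea, not anything tied to a particular tool. The `ask` argument is an assumption: a function you supply that opens a fresh session, sends one prompt, and returns the reply as text. The function names and the prompt wording are made up for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical type: any function that starts a FRESH session, sends one
# prompt, and returns the model's reply. Plug in whatever client you use.
Ask = Callable[[str], str]


def collect_candidates(ask: Ask, prompt: str, runs: int = 3) -> List[str]:
    """Run the same prompt in `runs` separate sessions, dropping exact duplicates."""
    candidates: List[str] = []
    for _ in range(runs):
        reply = ask(prompt).strip()
        if reply and reply not in candidates:
            candidates.append(reply)
    return candidates


def build_ranking_prompt(source_text: str, candidates: List[str]) -> str:
    """Build a prompt asking the model to rank and critique numbered candidates."""
    numbered = "\n".join(f"{i}. {c}" for i, c in enumerate(candidates, start=1))
    return (
        "Rank these translations of the source text from most to least "
        "correct, and critique each one.\n\n"
        f"Source:\n{source_text}\n\nTranslations:\n{numbered}"
    )


def translate_with_review(
    ask: Ask, instruction: str, source_text: str, runs: int = 3
) -> Tuple[List[str], str]:
    """Gather several translations, then ask for a ranking in one more session."""
    candidates = collect_candidates(ask, f"{instruction}\n{source_text}", runs)
    review = ask(build_ranking_prompt(source_text, candidates))
    return candidates, review
```

You'd still read the ranking yourself; the review session can be wrong in the same ways as the translation sessions.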