this post was submitted on 11 Sep 2023
Technology
What stupid bullshit. There is nothing remotely close to an artificial general intelligence in a large language model. This person is a crackpot fool. There is no way for an LLM to have persistent memory. Everything outside of the model that pre- and post-processes information is where the smoke and mirrors exist. That part is just databases and standard code.
The actual model is just a system of categorization and tensor math. It is complex vector math. That is it. There is nothing else going on inside the model. If you want to modify it, you need to recalculate a bunch of math as it relates to the existing vectors/tensor tables. All of this math is static once the model is trained. It can't change. It can't adapt. It can't plan. It has some surprising features that one might not expect to be embedded in human language alone, but that is all this is.
Try offline, open source AI. Use Oobabooga, get models from Hugging Face, and start with something like a Llama 2 7B. This is not hard. You do not need a graphics card. There are lots of models that work great on just a CPU. You will need a good amount of RAM for running a really good model. A 7B is like talking to a teenager prone to lying, a 13B is like a 20-year-old, and a 30B at 8-bit quantization is like an inexperienced late twenty-something. A 70B at 4-bit quantization is like a 30-year-old with a master's degree. A 70B at 4 bits will need around 14+ logical CPU cores and 64GB of system memory to generate around 2 tokens a second. That works out to around 1-2 words per second, which is about as slow as is practical.
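If you want to sanity check the RAM numbers yourself, here is a rough back-of-the-envelope sketch. It only counts the weights: parameters times bits per weight, divided by 8 to get bytes. The function name and the `overhead` multiplier are my own assumptions for illustration, not something from any particular tool, and real usage also depends on context length and the runtime you use.

```python
def estimate_weight_memory_gb(params_billions: float,
                              bits_per_weight: float,
                              overhead: float = 1.0) -> float:
    """Rough memory estimate for holding a model's weights, in GB (10^9 bytes).

    params_billions: parameter count in billions (e.g. 70 for a 70B model).
    bits_per_weight: quantization level (e.g. 4, 8, or 16).
    overhead: hypothetical fudge factor for everything that is not weights
              (context cache, runtime buffers). 1.0 means weights only.
    """
    # billions of params * (bits / 8) bytes each = billions of bytes = GB
    weights_gb = params_billions * bits_per_weight / 8
    return weights_gb * overhead


if __name__ == "__main__":
    # The model sizes and quantization levels mentioned in the comment above.
    for size, bits in [(7, 4), (13, 4), (30, 8), (70, 4)]:
        gb = estimate_weight_memory_gb(size, bits)
        print(f"{size}B at {bits}-bit: about {gb:.1f} GB for weights alone")
```

By this estimate a 70B at 4 bits is roughly 35 GB of weights before anything else is loaded, which is why 64GB of system memory is a comfortable target and 32GB is not enough.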
Don't believe anything you read in bullshit media about AI right now, and ignore the proprietary stalkerware garbage. The open source offline AI world is the future and it is yours to do as you please. Try it! It is fun.
Wow, that's one of the most concrete, down-to-earth explanations of what everyone is calling AI. Thanks.
I'm technical, but haven't found a good article explaining today's AI in a way I can grasp well enough to help my non-technical friends and family. Any recommendations? Maybe something you've written?
It would be funny if that comment was AI-generated.
I read once we shouldn't be worried when AI starts passing Turing tests, we should worry when they start failing them again 🤣
I read a physical book about using ChatGPT that I'm pretty sure was written by ChatGPT.
Sidenote: you don't need to read a book about using ChatGPT.
I’ve had the most success explaining LLM ‘fallibility’ to non-techies using the image gen examples. Google ‘AI hands’, and ask them if they see anything wrong. Now point out that we’re _extremely_ sensitive to anything wrong with our hands, and so these are very easy for us to spot. But the AI has no concept of what a hand is, it’s just seen a _lot_ of images from different angles, sometimes fingers are hidden, sometimes intertwined etc. So it will happily generate lots more of those kinds of images, with no regard to whether they could / should actually exist.
It’s a pretty similar idea with the LLMs. It’s seen a lot of text, and can put together words in a convincing-looking way. But it has no concept of what it’s writing, and the equivalent of the ‘hands’ will be there in the text. It’s just that we can’t see them at first glance like we can with the hands.
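If it helps to see the idea in code, here is a toy sketch. To be clear, this is a bigram word chain, which is vastly simpler than a real LLM and is not how one is built; the function names and the tiny corpus are made up for illustration. It only shows the basic point: text can be stitched together from "which word has followed this word before" with no concept of what any of it means.

```python
import random
from collections import defaultdict


def build_bigram_table(text: str) -> dict:
    """Map each word to the list of words that followed it in the text."""
    words = text.split()
    table = defaultdict(list)
    for current, following in zip(words, words[1:]):
        table[current].append(following)
    return table


def generate(table: dict, start: str, length: int, seed: int = 0) -> str:
    """Chain words together by repeatedly picking a previously seen follower."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    out = [start]
    for _ in range(length - 1):
        followers = table.get(out[-1])
        if not followers:  # this word was never followed by anything
            break
        out.append(rng.choice(followers))
    return " ".join(out)


if __name__ == "__main__":
    # Made-up toy corpus, purely for illustration.
    corpus = ("the hand has five fingers and the hand has a thumb "
              "and the thumb has a nail and the fingers have nails")
    table = build_bigram_table(corpus)
    print(generate(table, "the", 12))
```

Every neighbouring pair of words in the output has been seen before, so it looks locally plausible, but nothing stops it from producing a sentence that is wrong as a whole. That is the text version of the extra fingers.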
Nice comparisons. Will add that to my explanations.
Thanks!
This one helped me a bit - https://www.understandingai.org/p/large-language-models-explained-with
Thanks!
Yann LeCun is the main person behind open source offline AI, as far as putting the pieces in place and the events that led to where we are now. Maybe think of him as the Dennis Ritchie or Stallman of AI research. https://piped.video/watch?v=OgWaowYiBPM
I am not the brightest kid in the room. I'm just learning this stuff in practice and sharing some of what I have picked up thus far. I am at a wall when it comes to things like understanding rank 3 tensors or greater, and I still can't figure out exactly how the categorization network is implemented. I think that last one has to do with Transformers and has something to do with rotation of vectors in an efficient way, but I haven't figured it out intuitively yet. Thanks for the compliment though.
Oh crap, you already done lost me in the second half there, but I'll give the link a watch.
Thanks again!