this post was submitted on 20 Jul 2023
660 points (97.4% liked)

Technology


Over just a few months, ChatGPT went from correctly answering a simple math problem 98% of the time to just 2%, study finds. Researchers found wild fluctuations, called drift, in the technology’s abi...::ChatGPT went from answering a simple math problem correctly 98% of the time to just 2% over the course of a few months.

[–] [email protected] 8 points 1 year ago

I suspect that GPT-4 started with a crazy parameter count (rumored 1.8 trillion, as 8x200B expert "sub-models") and that OpenAI then distilled those experts down to something below 100B. We've seen with Orca that a 13B model can perform at 88% of the level of ChatGPT-3.5 (175B) when trained on high-quality data, so there's no reason to think OpenAI hasn't explored this on its own and applied the same distillation techniques. OpenAI is probably also using quantization and speculative sampling to further reduce the inference burden, though I expect these to have less impact on real-world performance.
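
For anyone unfamiliar with the terms, here is a rough, stdlib-only Python sketch of two of the ideas mentioned: a classic soft-label distillation loss (temperature-softened KL divergence between teacher and student logits) and symmetric int8 weight quantization. To be clear, this is illustrative only: nothing here is known to be what OpenAI does, Orca-style training on teacher-written explanations is a different recipe from logit matching, and all function names (`distillation_loss`, `quantize_int8`, etc.) are made up for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Multiplied by temperature**2, a common convention that keeps the
    gradient magnitude comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: returns (ints, scale)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(ints, scale):
    """Map int8 codes back to approximate float weights."""
    return [i * scale for i in ints]


if __name__ == "__main__":
    # Hypothetical logits over a 3-token vocabulary.
    teacher = [2.0, 1.0, 0.0]
    student = [0.5, 1.5, 0.0]
    print("distillation loss:", distillation_loss(teacher, student))

    # Hypothetical weight vector, quantized and restored.
    weights = [0.5, -1.0, 0.25, 0.031]
    ints, scale = quantize_int8(weights)
    print("int8 codes:", ints)
    print("restored:", dequantize(ints, scale))
```

The point of the sketch: the student is pushed to match the teacher's whole output distribution rather than just the top answer, and quantization trades a small rounding error (at most half of `scale` per weight here) for a 4x smaller footprint versus float32. Both make serving cheaper, and both can plausibly shift behavior on edge cases like the math problem in the article.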