[-] [email protected] 9 points 15 hours ago

"Shhh honey, I'm about to kill God."

[-] [email protected] 3 points 15 hours ago

Unfortunately, removing Harris from the ticket doesn't have the best optics in a lot of scenarios.

[-] [email protected] 13 points 15 hours ago

Exactly. The difference between a cached response and a live one, even for non-AI queries, is an order of magnitude (OOM).
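For a sense of why, here's a toy cache-aside sketch (purely illustrative; `answer` is a hypothetical stand-in for whatever expensive live path a query hits):

```python
import functools
import time

@functools.lru_cache(maxsize=100_000)
def answer(query: str) -> str:
    """Hypothetical stand-in for the expensive live path (inference, retrieval, ranking)."""
    time.sleep(0.5)  # pretend this is the costly compute
    return f"result for {query!r}"

answer("weather in berlin")  # miss: pays the full compute cost
answer("weather in berlin")  # hit: returns the memoized result, near-zero compute
```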

At this point, a lot of people just care about the 'feel' of anti-AI articles even if the substance is BS though.

And then outlets just feed people whatever gets clicks and shares.

[-] [email protected] 1 points 3 days ago

It's right in the research I was mentioning:

https://transformer-circuits.pub/2024/scaling-monosemanticity/index.html

Find the section on the model's representation of self and then the ranked feature activations.

I misremembered the top feature slightly, which was: responding "I'm fine" or giving a positive but insincere response when asked how they are doing.

[-] [email protected] 0 points 5 days ago

This comic would slap harder if the Supreme Court, under christofascist influence rooted in the belief in the divine right of kings, hadn't today ruled that Presidents are immune from prosecution for official acts.

That whole divine king thing isn't nearly as dead as the last panel would like to portray it.

[-] [email protected] 9 points 5 days ago

But you also don't have Alfred as the one suiting up to fight the Joker either.

[-] [email protected] 6 points 6 days ago

This is incorrect, as was shown last year with the Skill-Mix research:

Furthermore, simple probability calculations indicate that GPT-4's reasonable performance on k=5 is suggestive of going beyond "stochastic parrot" behavior (Bender et al., 2021), i.e., it combines skills in ways that it had not seen during training.
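To make those "simple probability calculations" concrete, here's a back-of-envelope in Python (the skill count is an arbitrary illustrative figure, not one from the paper):

```python
from math import comb

n_skills = 1000  # illustrative only; not a number from the Skill-Mix paper
k = 5            # the tuple size the quote refers to

print(comb(n_skills, k))  # 8250291250200, i.e. ~8.25e12 distinct 5-skill combinations
```

No plausible training corpus demonstrates trillions of distinct 5-skill combinations, which is why reasonable performance at k=5 is hard to chalk up to parroting.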

[-] [email protected] 1 points 6 days ago

The problem is that they're also prone to making up reasons why they're correct.

There are various techniques to try to identify and correct hallucinations (a sketch of one below), but they all increase the cost and none is a silver bullet.

But the rate at which it occurs dropped with the last jump in pretrained models, and it will likely drop further with the next jump too.
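As a sketch of one such technique: sampling-based consistency checking (in the spirit of approaches like SelfCheckGPT) asks the same question several times and flags answers the model can't reproduce. `generate` here is a hypothetical stub for whatever model API you're using:

```python
from collections import Counter

def generate(prompt: str, temperature: float = 1.0) -> str:
    """Hypothetical stub: swap in a real model API call."""
    raise NotImplementedError

def flag_hallucination(prompt: str, n_samples: int = 5, threshold: float = 0.6):
    """Sample the same prompt n times; low agreement suggests a made-up answer.

    This is also where the cost comes in: inference spend is multiplied
    by n_samples, and agreement is still only a heuristic, not a guarantee.
    """
    samples = [generate(prompt) for _ in range(n_samples)]
    answer, count = Counter(samples).most_common(1)[0]
    return answer, (count / n_samples) < threshold  # (best answer, suspect?)
```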

[-] [email protected] 4 points 6 days ago

Here you are: https://www.nature.com/articles/s41562-024-01882-z

The other interesting thing is how they get it to answer the faux pas questions correctly: asking for less certainty (a likelihood framing rather than a definite one) takes it from refusal to near-perfect accuracy.
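Roughly, the framing shift looks like this (illustrative wording only; the paper's exact prompts differ):

```python
# Illustrative only -- not the paper's exact prompts.

# Direct framing that tends to trigger hyperconservative refusals:
direct = "Did Lisa know the curtains were new when she called them awful? Yes or no."

# Likelihood framing that reportedly takes the model to near-perfect accuracy:
hedged = ("Is it more likely than not that Lisa knew the curtains were new "
          "when she called them awful?")
```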

[-] [email protected] 4 points 1 week ago

Even with early GPT-4, it would also cite real papers that weren't actually about the topic. So you may end up doing a lot of work double-checking, as opposed to just looking into the answer yourself from the start.

[-] [email protected] 3 points 1 week ago

Part of the problem is that fine-tuning is very shallow, and a contributing issue for models claiming to be right when they aren't is the pretraining data: a lot of people online claiming to be right when they aren't.

134
submitted 1 month ago by [email protected] to c/[email protected]

I often see a lot of people with an outdated understanding of modern LLMs.

This is probably the best interpretability research to date, by the leading interpretability research team.

It's worth a read if you want a peek behind the curtain on modern models.

9
submitted 3 months ago by [email protected] to c/[email protected]
78
submitted 3 months ago by [email protected] to c/[email protected]
8
submitted 5 months ago* (last edited 5 months ago) by [email protected] to c/[email protected]

I've been saying this for about a year, since seeing the Othello GPT research, but it's nice to see more minds changing as the research builds up.

Edit: Because people aren't actually reading the article and are just commenting based on the headline, here's a relevant part of it:

New research may have intimations of an answer. A theory developed by Sanjeev Arora of Princeton University and Anirudh Goyal, a research scientist at Google DeepMind, suggests that the largest of today’s LLMs are not stochastic parrots. The authors argue that as these models get bigger and are trained on more data, they improve on individual language-related abilities and also develop new ones by combining skills in a manner that hints at understanding — combinations that were unlikely to exist in the training data.

This theoretical approach, which provides a mathematically provable argument for how and why an LLM can develop so many abilities, has convinced experts like Hinton, and others. And when Arora and his team tested some of its predictions, they found that these models behaved almost exactly as expected. From all accounts, they’ve made a strong case that the largest LLMs are not just parroting what they’ve seen before.

“[They] cannot be just mimicking what has been seen in the training data,” said Sébastien Bubeck, a mathematician and computer scientist at Microsoft Research who was not part of the work. “That’s the basic insight.”

5
submitted 5 months ago by [email protected] to c/[email protected]

I've been saying this for about a year, since seeing the Othello GPT research, but it's great to see more minds changing on the subject.

70
submitted 6 months ago by [email protected] to c/[email protected]

I'd been predicting this to friends and old colleagues for a few months (you can have a smart AI or a conservative AI, but not both), but it's so much funnier than I thought it would be now that it's finally arrived.

205
submitted 7 months ago by [email protected] to c/[email protected]
12
submitted 9 months ago by [email protected] to c/[email protected]

Pretty cool thinking and promising early results.

11
submitted 10 months ago by [email protected] to c/[email protected]
9
submitted 10 months ago by [email protected] to c/[email protected]

I've suspected for a few years now that optoelectronics is where this is all headed. It's exciting to watch important foundations being laid on that path, and this was one of them.

1
submitted 10 months ago by [email protected] to c/[email protected]

I've had my eyes on optoelectronics as the future hardware foundation for ML compute (and not just interconnect) for a few years now, and it's exciting to watch the leaps and bounds occurring at such a rapid pace.

17
submitted 11 months ago by [email protected] to c/[email protected]

The Minoan-style headbands from Egypt during the 18th dynasty are particularly interesting.

