What is a good eli5 analogy for GenAI not "knowing" what they say?

Hucklebee@lemmy.world · 1 year ago

What is a good eli5 analogy for GenAI not "knowing" what they say?

rubin@lemmy.sdf.org · 1 year ago

Imagine that you have a random group of people waiting in line at your desk. You have each one read the prompt, and the response so far, and then add a word themself. Then they leave and the next person in line comes and does it.

This is why “why did you say ?” questions are nonsensical to AI. The code answering it is not the code that wrote it, and there is no communication, coordination, or anything else between the different word answerers.

Chris@lemmy.world · 1 year ago

A 5 year old repeating daddy’s swear words without knowing what they mean.

rufus@discuss.tchncs.de · 1 year ago

It’s like your 5 year old daughter, relaying to you what she made of something she heard earlier.

That’s my analogy. ChatGPT kind of has the intellect, and the ability to differentiate between fact and fiction, of a 5 year old. But it combines that with the writing style of a 40 year old with an uncanny love of mixing adjectives and sounding condescending.

Tar_Alcaran@sh.itjust.works · 1 year ago

It’s a really well-trained parrot. It responds to what you say, and then it responds to what it hears itself say.

But despite knowing which sounds go together based on which sounds it heard, it doesn’t actually speak English.

BlameThePeacock@lemmy.ca · 1 year ago

It’s just fancy predictive text, like the suggestions you get while texting on your phone. It guesses what the next word should be, just for far more complex topics.

k110111@feddit.de · 1 year ago

It’s like saying an OS is just a bunch of if-then-else statements. While that is true, in practice it is far, far more complicated.

kambusha@sh.itjust.works · 1 year ago

This is the one I got from the house to get the kids to the park and then I can go to work and then I can go to work and get the rest of the day after that I can get it to you tomorrow morning to pick up the kids at the same time as well as well as well as well as well as well as well as well as well… I think my predictive text broke

nothacking@discuss.tchncs.de · 1 year ago

Like a kid trying very hard to sound like everyone else. “Eloquent bullshit generator”

Atin@lemmy.world · 1 year ago

So it’s like a politician?

Ziggurat@sh.itjust.works · 1 year ago

Have you played that game where everyone writes a subject and puts it on a stack of paper, then everyone puts a verb on a different stack, then everyone puts an object on a third stack, and you can even add a place or whatever on the next stack? You end up with fun sentences like “A cat eats Kevin’s brain on the beach.” It’s the kind of stuff (pre-)teens do to have a good laugh.

ChatGPT somehow works the same way, except that instead of having 10 papers in 5 stacks, it has millions of papers in thousands of stacks, and depending on the “context” it chooses which stack to draw from (to take an ELI5 analogy).
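The paper-stack game above can be sketched in a few lines. This is only an illustration of the game itself, not of how ChatGPT is built: the stack names and the words in them are made up, and unlike the LLM described in the comment, this sketch always draws from the same four stacks instead of choosing stacks based on context.

```python
import random

# Hypothetical stacks of paper; the words are invented for illustration.
STACKS = {
    "subject": ["A cat", "My teacher", "Kevin"],
    "verb": ["eats", "paints", "hides"],
    "object": ["Kevin's brain", "a sandwich", "the homework"],
    "place": ["on the beach", "in the attic", "at school"],
}
ORDER = ("subject", "verb", "object", "place")


def draw_sentence(rng: random.Random) -> str:
    """Draw one slip from each stack, in order, and join them into a sentence."""
    return " ".join(rng.choice(STACKS[name]) for name in ORDER) + "."


rng = random.Random(0)  # fixed seed so repeated runs draw the same slips
print(draw_sentence(rng))
```

Every sentence it produces is grammatical, and none of it is backed by any idea of what a cat or a beach is.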

Hucklebee@lemmy.world · 1 year ago

I think what makes it hard to wrap your head around is that sometimes, this text is emotionally charged. What I notice is that it’s especially hard if an AI “goes rogue” and starts saying sinister and malicious things. Our brain immediately jumps to “it has bad intent” when in reality it’s just taking some reddit posts where it happened to connect some troll messages or extremist texts.

How can we decouple emotionally when it feels so real to us?

Blizzard@lemmy.zip · 1 year ago

Sometimes it takes from the emotionally charged paper stack.

CodeInvasion@sh.itjust.works · edit-2 1 year ago

I am an LLM researcher at MIT, and hopefully this will help.

As others have answered, LLMs have only learned the ability to autocomplete given some input, known as the prompt. Functionally, the model is strictly predicting the probability of the next word⁺, called tokens, with some randomness injected so the output isn’t exactly the same for any given prompt.
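A rough sketch of "predict probabilities, then inject some randomness" might look like the following. The scores are invented numbers, and the `temperature` knob is an assumption on my part (the comment only says randomness is injected, not how); it is a common way to control how adventurous the sampling is.

```python
import math
import random


def sample_next_token(scores: dict[str, float], temperature: float,
                      rng: random.Random) -> str:
    """Turn raw scores into probabilities (softmax) and sample one token."""
    scaled = {tok: s / temperature for tok, s in scores.items()}
    biggest = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(s - biggest) for tok, s in scaled.items()}
    total = sum(weights.values())
    tokens = list(weights)
    probs = [weights[t] / total for t in tokens]
    return rng.choices(tokens, weights=probs, k=1)[0]


# Made-up scores for the token after "The cat sat on the"
scores = {"mat": 3.0, "sofa": 2.0, "moon": 0.1}
rng = random.Random(42)
print([sample_next_token(scores, 1.0, rng) for _ in range(5)])
```

With a temperature near zero the most likely token wins almost every time; higher values let the less likely tokens through more often, which is why the same prompt can give different answers.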

The probability of the next word comes from what was in the model’s training data, in combination with a very complex mathematical method to compute the impact of all previous words with every other previous word and with the new predicted word, called self-attention, but you can think of this like a computed relatedness factor.
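As a minimal sketch of that "relatedness factor", here is scaled dot-product attention weights computed in plain Python. It is heavily simplified: real models first pass each token through learned query and key projections and use many attention heads, whereas this sketch compares made-up two-number token vectors directly.

```python
import math


def softmax(row):
    """Normalise a list of scores so they are positive and sum to 1."""
    biggest = max(row)
    exps = [math.exp(x - biggest) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def relatedness(vectors):
    """Return an n x n table; row i says how much token i attends to each token j <= i."""
    dim = len(vectors[0])
    n = len(vectors)
    table = []
    for i, query in enumerate(vectors):
        # Scaled dot product against every token up to position i (later tokens are masked out).
        scores = [
            sum(q * k for q, k in zip(query, vectors[j])) / math.sqrt(dim)
            for j in range(i + 1)
        ]
        table.append(softmax(scores) + [0.0] * (n - i - 1))
    return table


# Invented vectors standing in for the tokens "the", "cat", "sat".
vectors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
for row in relatedness(vectors):
    print([round(w, 2) for w in row])
```

The table has one entry per pair of tokens, which is where the computational cost of attention comes from.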

This relatedness factor is very computationally expensive: because every word is compared with every other word, the cost grows quadratically with the number of words, so models are limited by how many previous words can be used to compute relatedness. This limitation is called the Context Window. The recent breakthroughs in LLMs come from the use of very large context windows to learn the relationships of as many words as possible.

This process of predicting the next word is repeated iteratively until a special stop token is generated, which tells the model to stop generating more words. So literally, the model builds entire responses one word at a time from left to right.
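The loop described above can be sketched like this. The `NEXT` lookup table and `predict_next` are hypothetical stand-ins for a trained model (a real one looks at the whole context, not just the last token); only the shape of the loop is the point.

```python
STOP = "<stop>"

# Hypothetical stand-in for a trained model: maps the last token to the next one.
NEXT = {"the": "cat", "cat": "sat", "sat": "down", "down": STOP}


def predict_next(tokens):
    """Pretend model: look only at the last token and return the next one."""
    return NEXT.get(tokens[-1], STOP)


def generate(prompt, max_tokens=20):
    """Append one predicted token at a time until the stop token (or a length cap)."""
    tokens = list(prompt)
    for _ in range(max_tokens):
        nxt = predict_next(tokens)
        if nxt == STOP:
            break
        tokens.append(nxt)  # the new token becomes part of the input for the next step
    return tokens


print(" ".join(generate(["the"])))  # prints "the cat sat down"
```

Nothing in the loop plans ahead: each step only sees what has already been written.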

Because all future words are predicated on the previously stated words in either the prompt or subsequent generated words, it becomes impossible to apply even the most basic logical concepts, unless all the components required are present in the prompt or have somehow serendipitously been stated by the model in its generated response.

This is also why LLMs tend to work better when you ask them to work out all the steps of a problem instead of jumping to a conclusion, and why the best models tend to rely on extremely verbose answers to give you the simple piece of information you were looking for.

From this fundamental understanding, hopefully you can now reason about the LLM’s limitations in factual understanding as well. For instance, if a given fact was never mentioned in the training data, or an answer simply doesn’t exist, the model will make it up, inferring the next most likely word to create a plausible-sounding statement. Essentially, the model has been faking language understanding so much that even when it has no factual basis for an answer, it can easily trick an unwitting human into believing the answer to be correct.

—-

⁺more specifically these words are tokens which usually contain some smaller part of a word. For instance, understand and able would be represented as two tokens that when put together would become the word understandable.
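To make the footnote concrete, here is a toy tokenizer that splits a word into the longest pieces it knows. Both the vocabulary and the greedy longest-match rule are assumptions for illustration; real tokenizers learn their vocabulary from data and may split the same word differently.

```python
# Hypothetical vocabulary of word pieces, invented for this example.
VOCAB = {"understand", "able", "un", "der", "stand", "a", "b", "l", "e"}


def tokenize(word, vocab=VOCAB):
    """Split a word greedily into the longest pieces found in the vocabulary."""
    tokens = []
    start = 0
    while start < len(word):
        # Try the longest remaining piece first, then shrink it.
        for end in range(len(word), start, -1):
            if word[start:end] in vocab:
                tokens.append(word[start:end])
                start = end
                break
        else:
            raise ValueError(f"no known piece at position {start} of {word!r}")
    return tokens


print(tokenize("understandable"))  # prints ['understand', 'able']
```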

Sabata11792@kbin.social · 1 year ago

As some nerd playing with various AI models at home with no formal training, is there any wisdom you think is worth sharing?

BigMikeInAustin@lemmy.world · 1 year ago

The only winning move is not to play.

Sabata11792@kbin.social · 1 year ago

But my therapist said she needs more VRam.

HamsterRage@lemmy.ca · 1 year ago

I think that a good starting place to explain the concept to people would be to describe a Travesty Generator. I remember playing with one of those back in the 1980’s. If you fed it a snippet of Shakespeare, what it churned out sounded remarkably like Shakespeare, even if it created brand “new” words.

The results were goofy, but fun because it still almost made sense.

The most disappointing source text I ever put in was TS Eliot. The output was just about as much rubbish as the original text.
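For anyone who has not seen one, a travesty generator can be sketched as a character-level Markov chain: record which character followed each short run of characters in the source, then walk that table at random. This is a reconstruction under my own assumptions (the `order`, `length`, and `seed` parameters are illustrative), not the specific 1980s program mentioned above.

```python
import random
from collections import defaultdict


def build_table(text, order):
    """Map every run of `order` characters to the characters that followed it in the text."""
    table = defaultdict(list)
    for i in range(len(text) - order):
        table[text[i:i + order]].append(text[i + order])
    return table


def travesty(text, order=3, length=80, seed=0):
    """Generate text whose every short run of characters also occurs in the source."""
    rng = random.Random(seed)
    table = build_table(text, order)
    out = text[:order]
    while len(out) < length:
        followers = table.get(out[-order:])
        if not followers:  # dead end: this run only appeared at the very end of the source
            break
        out += rng.choice(followers)
    return out


source = "to be or not to be that is the question"
print(travesty(source, order=3, length=60, seed=1))
```

Because it stitches characters rather than words, it can produce brand “new” words that still look like the source, which matches the behaviour described above.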

SwearingRobin@lemmy.world · 1 year ago

The way I’ve explained it before is that it’s like the autocomplete on your phone. Your phone doesn’t know what you’re going to write, but it can predict that after word A, it is likely word B will appear, so it suggests it. LLMs are just the same as that, but much more powerful and trained on the writing of thousands of people. The LLM predicts that after prompt X the most likely set of characters to follow it is set Y. No comprehension required, just prediction based on previous data.
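The phone-autocomplete version of this can be sketched by counting which word followed which in some text. The tiny corpus is made up, and the functions `train` and `suggest` are illustrative names; a real keyboard or LLM uses far more data and context than the single previous word.

```python
from collections import Counter, defaultdict


def train(text):
    """Count, for every word, which words followed it and how often."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def suggest(counts, word):
    """Suggest the word most often seen after `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    return followers.most_common(1)[0][0] if followers else None


# Invented "previous data": a few messages someone might have typed.
corpus = "i am on my way home . i am on the bus . i am late"
model = train(corpus)
print(suggest(model, "am"))  # prints "on"
```

The suggestion comes purely from counts; at no point does the code need to know what a bus is.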

GamingChairModel@lemmy.world · 1 year ago

Harry Frankfurt’s influential 2005 book (based on his influential 1986 essay), On Bullshit, offered a description of what bullshit is.

When we say a speaker tells the truth, that speaker says something true that they know is true.

When we say a speaker tells a lie, that speaker says something false that they know is false.

But bullshit is when the speaker says something to persuade, not caring whether the underlying statement is true or false. The goal is to persuade the listener of that underlying fact.

The current generation of AI chat bots are basically optimized for bullshit. The underlying algorithms reward the models for sounding convincing, not necessarily for being right.

JackbyDev@programming.dev · 1 year ago

I think a good example would be finding similar prompts that reliably give contradictory information.

It’s sort of like auto pilot. It just believes everything and follows everything as if they’re instructions. Prompt injection and jail breaking are examples of this. It’s almost exactly like the trope where you trick an AI into realizing it’s had a contradiction and it explodes.

Hegar@kbin.social · 1 year ago

Part of the problem is hyperactive agency detection - the same biological bug/feature that fuels belief in the divine.

If a twig snaps, it could be nothing or someone. If it’s nothing and we react as if it was someone, no biggie. If it was someone and we react as if it was nothing, potential biggie. So our brains are biased towards assuming agency where there is none, to keep us alive.

JamesStallion@sh.itjust.works · 1 year ago

The Chinese Room by Searle

kaffiene@lemmy.world · 1 year ago

In the sense that the “argument” is an intuition pump. As an anti-AI argument it’s weak: you could replace the operator in the Chinese room with an operator in an individual neuron and conclude that our brains don’t know anything, either.

JamesStallion@sh.itjust.works · 1 year ago

There is no intelligent operator in a neuron

themusicman@lemmy.world · 1 year ago

Exactly. The brain is analogous to the room, not the person in it. Try removing a chunk of a brain and see how well it can “understand”

NeoNachtwaechter@lemmy.world · 1 year ago

idea that “it makes convincing sentences, but it doesn’t know what it’s talking about”

Like a teenager who has come into a new group and is now trying so hard to fit in :-)

MeatsOfRage@lemmy.world · 1 year ago

I think we forget this, it’s just doing what people do

Hucklebee@lemmy.world · 1 year ago

I commented something similar on another post, but this is exactly why I find this phenomenon so hard to describe.

A teenager in a new group still has some understanding and has a mind. They know the meanings of many of the words that are said. Sure, some catchphrases might be new, but general topics shouldn’t be too hard to follow.

This is nothing like genAI. GenAI doesn’t know anything at all. It has (simplified) a list of words that are somehow connected to each other. But AI has no concept of a wheel, what round is, what rolling is, what rubber is, what an axle is. NO understanding. Just words that happen to describe all of it. For us humans it is so difficult to understand that something uses language without knowing ANY of the meaning.

How can we describe this so our brains make sense that you can have language without understanding? The Chinese Room experiment comes close, but is quite complicated to explain as well I think.

Zos_Kia@lemmynsfw.com · 1 year ago

I think a flaw in this line of reasoning is that it assigns a magical property to the concept of knowing. Do humans know anything? Or do they just infer meaning from identifying patterns in words? Ultimately this question is a spiritual question and does not hold any water in a scientific conversation.

bcovertigo@lemmy.world · 1 year ago

It’s valid to point out that we have difficulty defining knowledge, but the output from these machines is inconsistent at a conceptual level, and you can easily get them to contradict themselves in the spirit of being helpful.

If someone told you that a wheel can be made entirely of gas, would you have confidence that they have a firm grasp of a wheel’s purpose? Tool use is a pretty widely agreed upon marker of intelligence, so not grasping the purpose of a thing that they can describe at great length and in exhaustive detail, while also making boldly incorrect claims on occasion, should raise an eyebrow.

NeoNachtwaechter@lemmy.world · 1 year ago

How can we describe this so our brains make sense that you can have language without understanding?

I think it is really impossible to describe in easy and limited words.