AGI achieved 🤖

cyrano@lemmy.dbzer0.com · 5 months ago

AGI achieved 🤖

LanguageIsCool@lemmy.world · 5 months ago

How many times do I have to spell it out for you chargpt? S-T-R-A-R-W-B-E-R-R-Y-R

Farid@startrek.website · 5 months ago

I get the meme aspect of this. But just to be clear, it was never fair to judge LLMs for specifically this. The LLM doesn’t even see the letters in the words, as every word is broken down into tokens, which are numbers. I suppose with a big enough corpus of data it might eventually extrapolate which words have which letter from texts describing these words, but normally it shouldn’t be expected.

cyrano@lemmy.dbzer0.com · 5 months ago

True and I agree with you yet we are being told all job are going to disappear, AGI is coming tomorrow, etc. As usual the truth is more balanced

Farid@startrek.website · 5 months ago

I don’t know what part of what I said prompted all those downvotes, but of course all the reasonable people understood, that the “AGI in 2 years” was a stock price pump.

Zacryon@feddit.org · 5 months ago

I know that words are tokenized in the vanilla transformer. But do GPT and similar LLMs still do that as well? I assumed they also tokenize on character/symbol level, possibly mixed up with additional abstraction down the chain.

kayzeekayzee@lemmy.blahaj.zone · 5 months ago

I’ve actually messed with this a bit. The problem is more that it can’t count to begin with. If you ask it to spell out each letter individually (ie each letter will be its own token), it still gets the count wrong.

qx128@lemmy.world · 5 months ago

I really like checking these myself to make sure it’s true. I WAS NOT DISAPPOINTED!

(Total Rs is 8. But the LOGIC ChatGPT pulls out is ……. remarkable!)

Zacryon@feddit.org · 5 months ago

“Let me know if you’d like help counting letters in any other fun words!”

Oh well, these newish calls for engagement sure take on ridiculous extents sometimes.

filcuk@lemmy.zip · 5 months ago

I want an option to select Marvin the paranoid android mood: “there’s your answer, now if you could leave me to wallow in self-pitty”

localhost443@discuss.tchncs.de · 5 months ago

Here I am, emissions the size of a small country, and they ask me to count letters…

Jakeroxs@sh.itjust.works · 5 months ago

Lol someone could absolutely do that as a character card.

ipitco@lemmy.super.ynh.fr · 5 months ago

Try with o4-mini-high. It’s made to think like a human by checking its answer and doing step by step, rather than just kinda guessing one like here

scholar@lemmy.world · 5 months ago

jsomae@lemmy.ml · 5 months ago

This is deepseek model right? OP was posting about GPT o3

scholar@lemmy.world · 5 months ago

Yes this is a small(ish) offline deepseek model

I Cast Fist@programming.dev · 5 months ago

Now ask how many asses there are in assassinations

notdoingshittoday@lemmy.zip · 5 months ago

rumba@lemmy.zip · 5 months ago

ohh god, I never through to ask reasoning models,

DeepSeekR17b was gold too

rumba@lemmy.zip · 5 months ago

then 14b, man sooo close…

merc@sh.itjust.works · 5 months ago

And people are trusting these things to do jobs / parts of jobs that humans used to do.

Jakeroxs@sh.itjust.works · 5 months ago

Humans are pretty dumb sometimes lol

rumba@lemmy.zip · 5 months ago

It’s far better at the use of there, their, and they’re.

The average US citizen couldn’t craft a professional sounding document of their life depended on it.

It’s not better than a professional at anything, The average human is far below that bar.

Jakeroxs@sh.itjust.works · 5 months ago

I like the way you worded that a lot

Mwa@thelemmy.club · edit-2 5 months ago

I wonder how QWEN 3.0 performs cause it surpasses Deepseek apparently

rumba@lemmy.zip · 5 months ago

I don’t have any other models pulled down, if they’re open I’ll try it and respond back here

Mwa@thelemmy.club · 5 months ago

rumba@lemmy.zip · 5 months ago

It did quite well for this.

I Cast Fist@programming.dev · 5 months ago

Oh god, the asses multiplied 🤣🤣🤣

localhost443@discuss.tchncs.de · 5 months ago

It’s painful how Reddit that is…

So,

Now,

Alright,

Kairos@lemmy.today · 5 months ago

Man AI is ass at this

*laugh track*

Rin@lemm.ee · edit-2 4 months ago

deleted by creator

MrLLM@ani.social · 5 months ago

We gotta raise the bar, so they keep struggling to make it “better”

My attempt

0000000000000000
0000011111000000
0000111111111000
0000111111100000
0001111111111000
0001111111111100
0001111111111000
0000011111110000
0000111111000000
0001111111100000
0001111111100000
0001111111100000
0001111111100000
0000111111000000
0000011110000000
0000011110000000

Btw, I refuse to give my money to AI bros, so I don’t have the “latest and greatest”

ipitco@lemmy.super.ynh.fr · edit-2 5 months ago

Tested on ChatGPT o4-mini-high

It sent me this

0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0
0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0
0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0
0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0
0 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0
0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0
0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0
0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
0 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0
0 1 1 1 0 0 0 0 1 1 1 0 0 0 0 0
1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0

I asked it to remove the spaces


0001111100000000
0011111111000000
0011111110000000
0111111111100000
0111111111110000
0011111111100000
0001111111000000
0011111100000000
0111111111100000
1111111111110000
1111111111110000
1111111111110000
1111111111110000
0011100111000000
0111000011100000
1111000011110000

I guess I just murdered a bunch of trees and killed a random dude with the water it used, but it looks good

xavier666@lemm.ee · 5 months ago

I just murdered a bunch of trees and killed a random dude with the water it used, but it looks good

Tech bros: “Worth it!”

ipitco@lemmy.super.ynh.fr · 5 months ago

It’s a pretty big problem, but as long as governments don’t do shit then we’re pretty much fucked.

Either we take the train and contribute to the problem, or we don’t but get left behind, and end up being the harmed one.

xavier666@lemm.ee · edit-2 5 months ago

but as long as governments don’t do shit then we’re pretty much fucked

Story of the last few decades (or centuries since I don’t know history too well)

VirgilMastercard@reddthat.com · 5 months ago

Biggest threat to humanity

idiomaddict@lemmy.world · 5 months ago

I know there’s no logic, but it’s funny to imagine it’s because it’s pronounced Mrs. Sippy

merc@sh.itjust.works · 5 months ago

How do you pronounce “Mrs” so that there’s an “r” sound in it?

idiomaddict@lemmy.world · 5 months ago

I don’t, but it’s abbreviated with one.

untorquer@lemmy.world · edit-2 5 months ago

“His property”

Otherwise it’s just Ms.

idiomaddict@lemmy.world · 5 months ago

Mrs. originally comes from mistress, which is why it retains the r.

merc@sh.itjust.works · 5 months ago

But no “r” sound.

idiomaddict@lemmy.world · 5 months ago

Correct. I didn’t say there was an r sound, but that it was going off of the spelling. I agree there’s no r sound.

untorquer@lemmy.world · 5 months ago

Yes but from same source also wife

idiomaddict@lemmy.world · 5 months ago

That came later though, as in “I had dinner with the Mrs last night.”

untorquer@lemmy.world · 5 months ago

Yes but it did come, and took place as the common usage. So much so that Ms. Is used to describe a woman both with and without reference to marital status.

I’m down with using Mrs. not to refer to marital status but imo just going with Ms. Is clearer and easier because of how deeply associated Mrs. Is with it.

jaybone@lemmy.zip · 5 months ago

And if it messed up on the other word, we could say because it’s pronounced Louisianer.

sp3ctr4l@lemmy.dbzer0.com · edit-2 5 months ago

I was gonna say something similar, I have heard a good number of people pronounce Mississippi as if it does have an R in it.

cyrano@lemmy.dbzer0.com · 5 months ago

It is going to be funny those implementation of LLM in accounting software

Slotos@feddit.nl · 5 months ago

https://arxiv.org/html/2410.12835v1

Deceptichum@quokk.au · 5 months ago

Clever, by putting it in Dutch you never know if it’s a typo or Dutch.

cyrano@lemmy.dbzer0.com · 5 months ago

Interesting….troubleshooting is going to be interesting in the future

jsomae@lemmy.ml · edit-2 5 months ago

People who think that LLMs having trouble with these questions is evidence one way or another about how good or bad LLMs are just don’t understand tokenization. This is not a symptom of some big-picture deep problem with LLMs; it’s a curious artifact like in a jpeg image, but doesn’t really matter for the vast majority of applications.

You may hate AI but that doesn’t excuse being ignorant about how it works.

untorquer@lemmy.world · 5 months ago

These sorts of artifacts wouldn’t be a huge issue except that AI is being pushed to the general public as an alternative means of learning basic information. The meme example is obvious to someone with a strong understanding of English but learners and children might get an artifact and stamp it in their memory, working for years off bad information. Not a problem for a few false things every now and then, that’s unavoidable in learning. Thousands accumulated over long term use, however, and your understanding of the world will be coarser, like the Swiss cheese with voids so large it can’t hold itself up.

jsomae@lemmy.ml · 5 months ago

You’re talking about hallucinations. That’s different from tokenization reflection errors. I’m specifically talking about its inability to know how many of a certain type of letter are in a word that it can spell correctly. This is not a hallucination per se – at least, it’s a completely different mechanism that causes it than whatever causes other factual errors. This specific problem is due to tokenization, and that’s why I say it has little bearing on other shortcomings of LLMs.

untorquer@lemmy.world · 5 months ago

No, I’m talking about human learning and the danger imposed by treating an imperfect tool as a reliable source of information as these companies want people to do.

Whether the erratic information is from tokenization or hallucinations is irrelevant when this is already the main source for so many people in their learning, for example, a new language.

jsomae@lemmy.ml · edit-2 5 months ago

Hallucinations aren’t relevant to my point here. I’m not defending that AIs are a good source of information, and I agree that hallucinations are dangerous (either that or misusing LLMs is dangerous). I also admit that for language learning, artifacts caused from tokenization could be very detrimental to the user.

The point I am making is that LLMs struggling with these kind of tokenization artifacts is poor evidence for drawing any conclusions about their behaviour on other tasks.

untorquer@lemmy.world · 5 months ago

That’s a fair point when these LLMs are restricted to areas where they function well. They have use cases that make sense when isolated from the ethics around training and compute. But the people who made them are applying them wildly outside these use cases.

These are pushed as a solution to every problem for the sake of profit with intentional ignorance of these issues. If a few errors impact someone it’s just a casualty in the goal of making it profitable. That can’t be disentwined from them unless you limit your argument to open source local compute.

__dev@lemmy.world · 5 months ago

And yet they can seemingly spell and count (small numbers) just fine.

jsomae@lemmy.ml · 5 months ago

what do you mean by spell fine? They’re just emitting the tokens for the words. Like, it’s not writing “strawberry,” it’s writing tokens <302, 1618, 19772>, which correspond to st, raw, and berry respectively. If you ask it to put a space between each letter, that will disrupt the tokenization mechanism, and it’s going to be quite liable to making mistakes.

I don’t think it’s really fair to say that the lookup 19772 -> berry counts as the LLM being able to spell, since the LLM isn’t operating at that layer. It doesn’t really emit letters directly. I would argue its inability to reliably spell words when you force it to go letter-by-letter or answer queries about how words are spelled is indicative of its poor ability to spell.

__dev@lemmy.world · 5 months ago

what do you mean by spell fine?

I mean that when you ask them to spell a word they can list every character one at a time.

buddascrayon@lemmy.world · 5 months ago

The problem is that it’s not actually counting anything. It’s simply looking for some text somewhere in its database that relates to that word and the number of R’s in that word. There’s no mechanism within the LLM to actually count things. It is not designed with that function. This is not general AI, this is a Generative Adversarial Network that’s using its vast vast store of text to put words together that sound like they answer the question that was asked.

moseschrute@lemmy.world · 5 months ago

Also just checked and every open ai model bigger than 4.1-mini can answer this. I think the joke should emphasize how we developed a super power inefficient way to solve some problems that can be accurately and efficiently answered with a single algorithm. Another example is using ChatGPT to do simple calculator math. LLMs are good at specific tasks and really bad at others, but people kinda throw everything at them.

RedstoneValley@sh.itjust.works · 5 months ago

It’s funny how people always quickly point out that an LLM wasn’t made for this, and then continue to shill it for use cases it wasn’t made for either (The “intelligence” part of AI, for starters)

outhouseperilous@lemmy.dbzer0.com · edit-2 5 months ago

I would say more “blackpilling”, i genuinely don’t believe most humans are people anymore after dealing with this.

UnderpantsWeevil@lemmy.world · edit-2 5 months ago

LLM wasn’t made for this

There’s a thought experiment that challenges the concept of cognition, called The Chinese Room. What it essentially postulates is a conversation between two people, one of whom is speaking Chinese and getting responses in Chinese. And the first speaker wonders “Does my conversation partner really understand what I’m saying or am I just getting elaborate stock answers from a big library of pre-defined replies?”

The LLM is literally a Chinese Room. And one way we can know this is through these interactions. The machine isn’t analyzing the fundamental meaning of what I’m saying, it is simply mapping the words I’ve input onto a big catalog of responses and giving me a standard output. In this case, the problem the machine is running into is a legacy meme about people miscounting the number of "r"s in the word Strawberry. So “2” is the stock response it knows via the meme reference, even though a much simpler and dumber machine that was designed to handle this basic input question could have come up with the answer faster and more accurately.

When you hear people complain about how the LLM “wasn’t made for this”, what they’re really complaining about is their own shitty methodology. They build a glorified card catalog. A device that can only take inputs, feed them through a massive library of responses, and sift out the highest probability answer without actually knowing what the inputs or outputs signify cognitively.

Even if you want to argue that having a natural language search engine is useful (damn, wish we had a tool that did exactly this back in August of 1996, amirite?), the implementation of the current iteration of these tools is dogshit because the developers did a dogshit job of sanitizing and rationalizing their library of data. Also, incidentally, why Deepseek was running laps around OpenAI and Gemini as of last year.

Imagine asking a librarian “What was happening in Los Angeles in the Summer of 1989?” and that person fetching you back a stack of history textbooks, a stack of Sci-Fi screenplays, a stack of regional newspapers, and a stack of Iron-Man comic books all given equal weight? Imagine hearing the plot of the Terminator and Escape from LA intercut with local elections and the Loma Prieta earthquake.

That’s modern LLMs in a nutshell.

jsomae@lemmy.ml · 5 months ago

You’ve missed something about the Chinese Room. The solution to the Chinese Room riddle is that it is not the person in the room but rather the room itself that is communicating with you. The fact that there’s a person there is irrelevant, and they could be replaced with a speaker or computer terminal.

Put differently, it’s not an indictment of LLMs that they are merely Chinese Rooms, but rather one should be impressed that the Chinese Room is so capable despite being a completely deterministic machine.

If one day we discover that the human brain works on much simpler principles than we once thought, would that make humans any less valuable? It should be deeply troubling to us that LLMs can do so much while the mathematics behind them are so simple. Arguments that because LLMs are just scaled-up autocomplete they surely can’t be very good at anything are not comforting to me at all.

outhouseperilous@lemmy.dbzer0.com · 5 months ago

Its not a fucking riddle, it’s a koan/thought experiment.

It’s questioning what ‘communication’ fundamentally is, and what knowledge fundamentally is.

It’s not even the first thing to do this. Military theory was cracking away at the ‘communication’ thing a century before, and the nature of knowledge has discourse going back thousands of years.

jsomae@lemmy.ml · 5 months ago

You’re right, I shouldn’t have called it a riddle. Still, being a fucking thought experiment doesn’t preclude having a solution. Theseus’ ship is another famous fucking thought experiment, which has also been solved.

outhouseperilous@lemmy.dbzer0.com · 5 months ago

‘A solution’

That’s not even remotely the point. Yes there are nany valid solutions. The point isn’t to solve it, but what how you solve it says about and clarifies your ideas.

jsomae@lemmy.ml · 5 months ago

I suppose if you’re going to be postmodernist about it, but that’s beyond my ability to understand. The only complete solution I know to Theseus’ Ship is “the universe is agnostic as to which ship is the original. Identity of a composite thing is not part of the laws of physics.” Not sure why you put scare quotes around it.

outhouseperilous@lemmy.dbzer0.com · 5 months ago

For different value sets and use cases, dear.

kassiopaea@lemmy.blahaj.zone · 5 months ago

This. I often see people shitting on AI as “fancy autocomplete” or joking about how they get basic things incorrect like this post but completely discount how incredibly fucking capable they are in every domain that actually matters. That’s what we should be worried about… what does it matter that it doesn’t “work the same” if it still accomplishes the vast majority of the same things? The fact that we can get something that even approximates logic and reasoning ability from a deterministic system is terrifying on implications alone.

Knock_Knock_Lemmy_In@lemmy.world · 5 months ago

Why doesn’t the LLM know to write (and run) a program to calculate the number of characters?

I feel like I’m missing something fundamental.

jsomae@lemmy.ml · edit-2 5 months ago

The LLM isn’t aware of its own limitations in this regard. The specific problem of getting an LLM to know what characters a token comprises has not been the focus of training. It’s a totally different kind of error than other hallucinations, it’s almost entirely orthogonal, but other hallucinations are much more important to solve, whereas being able to count the number of letters in a word or add numbers together is not very important, since as you point out, there are already programs that can do that.

At the moment, you can compare this perhaps to the Paris in the the Spring illusion. Why don’t people know to double-check the number of 'the’s in a sentence? They could just use their fingers to block out adjacent words and read each word in isolation. They must be idiots and we shouldn’t trust humans in any domain.

outhouseperilous@lemmy.dbzer0.com · 5 months ago

The most convincing arguments that llms are like humans aren’t that llm’s are good, but that humans are just unrefrigerated meat and personhood is a delusion.

jsomae@lemmy.ml · 5 months ago

This might well be true yeah. But that’s still good news for AI companies who want to replace humans – bar’s lower than they thought.

outhouseperilous@lemmy.dbzer0.com · edit-2 5 months ago

It doesn’t know things.

It’s a statistical model. It cannot synthesize information or problem solve, only show you a rough average of it’s library of inputs graphed by proximity to your input.

jsomae@lemmy.ml · 5 months ago

Congrats, you’ve discovered reductionism. The human brain also doesn’t know things, as it’s composed of electrical synapses made of molecules that obey the laws of physics and direct one’s mouth to make words in response to signals that come from the ears.

Not saying LLMs don’t know things, but your argument as to why they don’t know things has no merit.

outhouseperilous@lemmy.dbzer0.com · 5 months ago

Oh, that’s why everything else you said seemed a bit off.

OsrsNeedsF2P@lemmy.ml · 5 months ago

You didn’t get good answers so I’ll explain.

First, an LLM can easily write a program to calculate the number of rs. If you ask an LLM to do this, you will get the code back.

But the website ChatGPT.com has no way of executing this code, even if it was generated.

The second explanation is how LLMs work. They work on the word (technically token, but think word) level. They don’t see letters. The AI behind it literally can only see words. The way it generates output is it starts typing words, and then guesses what word is most likely to come next. So it literally does not know how many rs are in strawberry. The impressive part is how good this “guessing what word comes next” is at answering more complex questions.

Knock_Knock_Lemmy_In@lemmy.world · 5 months ago

But why can’t “query the python terminal” be trained into the LLM. It just needs some UI training.

OsrsNeedsF2P@lemmy.ml · 5 months ago

ChatGPT used to actually do this. But they removed that feature for whatever reason. Now the server that the LLM runs on doesn’t isn’t provide the LLM a Python terminal, so the LLM can’t query it

UnderpantsWeevil@lemmy.world · 5 months ago

one should be impressed that the Chinese Room is so capable despite being a completely deterministic machine.

I’d be more impressed if the room could tell me how many "r"s are in Strawberry inside five minutes.

If one day we discover that the human brain works on much simpler principles

Human biology, famous for being simple and straightforward.

outhouseperilous@lemmy.dbzer0.com · 5 months ago

Ah! But you can skip all that messy biology abd stuff i don’t understand that’s probably not important, abd just think of it as a classical computer running an x86 architecture, and checkmate, liberal my argument owns you now!

jsomae@lemmy.ml · edit-2 5 months ago

Because LLMs operate at the token level, I think it would be a more fair comparison with humans to ask why humans can’t produce the IPA spelling words they can say, /nɔr kæn ðeɪ ˈizəli rid θɪŋz ˈrɪtən ˈpjʊrli ɪn aɪ pi ˈeɪ/ despite the fact that it should be simple to – they understand the sounds after all. I’d be impressed if somebody could do this too! But that most people can’t shouldn’t really move you to think humans must be fundamentally stupid because of this one curious artifact. Maybe they are fundamentall stupid for other reasons, but this one thing is quite unrelated.

UnderpantsWeevil@lemmy.world · 5 months ago

why humans can’t produce the IPA spelling words they can say, /nɔr kæn ðeɪ ˈizəli rid θɪŋz ˈrɪtən ˈpjʊrli ɪn aɪ pi ˈeɪ/ despite the fact that it should be simple to – they understand the sounds after all

That’s just access to the right keyboard interface. Humans can and do produce those spellings with additional effort or advanced tool sets.

humans must be fundamentally stupid because of this one curious artifact.

Humans turns oatmeal into essays via a curios lump of muscle is an impressive enough trick on its face.

LLMs have 95% of the work of human intelligence handled for them and still stumble on the last bits.

jsomae@lemmy.ml · edit-2 5 months ago

I mean, among people who are proficient with IPA, they still struggle to read whole sentences written entirely in IPA. Similarly, people who speak and read chinese struggle to read entire sentences written in pinyin. I’m not saying people can’t do it, just that it’s much less natural for us (even though it doesn’t really seem like it ought to be.)

I agree that LLMs are not as bright as they look, but my point here is that this particular thing – their strange inconsistency understanding what letters correspond to the tokens they produce – specifically shouldn’t be taken as evidence for or against LLMs being capable in any other context.

UnderpantsWeevil@lemmy.world · 5 months ago

Similarly, people who speak and read chinese struggle to read entire sentences written in pinyin.

Because pinyin was implemented by the Russians to teach Chinese to people who use Cyrillic characters. Would make as much sense to call out people who can’t use Katakana.

shalafi@lemmy.world · 5 months ago

You might just love Blind Sight. Here, they’re trying to decide if an alien life form is sentient or a Chinese Room:

“Tell me more about your cousins,” Rorschach sent.

“Our cousins lie about the family tree,” Sascha replied, “with nieces and nephews and Neandertals. We do not like annoying cousins.”

“We’d like to know about this tree.”

Sascha muted the channel and gave us a look that said Could it be any more obvious? “It couldn’t have parsed that. There were three linguistic ambiguities in there. It just ignored them.”

“Well, it asked for clarification,” Bates pointed out.

“It asked a follow-up question. Different thing entirely.”

Bates was still out of the loop. Szpindel was starting to get it, though… .

CitizenKong@lemmy.world · edit-2 5 months ago

Blindsight is such a great novel. It has not one, not two but three great sci-fi concepts rolled into one book.

One is artificial intelligence (the ship’s captain is an AI), the second is alien life so vastly different it appears incomprehensible to human minds. And last but not least, and the most wild, vampires as a evolutionary branch of humanity that died out and has been recreated in the future.

TommySalami@lemmy.world · 5 months ago

My a favorite part of the vampire thing is how they died out. Turns out vampires start seizing when trying to visually process 90° angles, and humans love building shit like that (not to mention a cross is littered with them). It’s so mundane an extinction I’d almost believe it.

outhouseperilous@lemmy.dbzer0.com · edit-2 5 months ago

Also, the extremely post-cyberpunk posthumans, and each member of the crew is a different extremely capable kind of fucked up model of what we might become, with the protagonist personifying the genre of horror that it is, while still being occasionally hilarious.

CitizenKong@lemmy.world · 5 months ago

Oooh, I didn’t even know it had a sequel!

I wouldn’t say it flirts with the supernatural as much as it’s with one foot into weird fiction, which is where cosmic horror comes from.

frostysauce@lemmy.world · 5 months ago

(damn, wish we had a tool that did exactly this back in August of 1996, amirite?)

Wait, what was going on in August of '96?

UnderpantsWeevil@lemmy.world · 5 months ago

Google Search premiered

Leet@lemmy.zip · 5 months ago

Can we say for certain that human brains aren’t sophisticated Chinese rooms…

RedstoneValley@sh.itjust.works · 5 months ago

That’s a very long answer to my snarky little comment :) I appreciate it though. Personally, I find LLMs interesting and I’ve spent quite a while playing with them. But after all they are like you described, an interconnected catalogue of random stuff, with some hallucinations to fill the gaps. They are NOT a reliable source of information or general knowledge or even safe to use as an “assistant”. The marketing of LLMs as being fit for such purposes is the problem. Humans tend to turn off their brains and to blindly trust technology, and the tech companies are encouraging them to do so by making false promises.

merc@sh.itjust.works · 5 months ago

Imagine asking a librarian “What was happening in Los Angeles in the Summer of 1989?” and that person fetching you … That’s modern LLMs in a nutshell.

I agree, but I think you’re still being too generous to LLMs. A librarian who fetched all those things would at least understand the question. An LLM is just trying to generate words that might logically follow the words you used.

IMO, one of the key ideas with the Chinese Room is that there’s an assumption that the computer / book in the Chinese Room experiment has infinite capacity in some way. So, no matter what symbols are passed to it, it can come up with an appropriate response. But, obviously, while LLMs are incredibly huge, they can never be infinite. As a result, they can often be “fooled” when they’re given input that semantically similar to a meme, joke or logic puzzle. The vast majority of the training data that matches the input is the meme, or joke, or logic puzzle. LLMs can’t reason so they can’t distinguish between “this is just a rephrasing of that meme” and “this is similar to that meme but distinct in an important way”.

jsomae@lemmy.ml · 5 months ago

Can you explain the difference between understanding the question and generating the words that might logically follow? I’m aware that it’s essentially a more powerful version of how auto-correct works, but why should we assume that shows some lack of understanding at a deep level somehow?

outhouseperilous@lemmy.dbzer0.com · edit-2 5 months ago

So, what is ‘understanding’?

If you need help, you can look at marx for an answer that still mostly holds up, if your server is an indication of your reading habbits.

jsomae@lemmy.ml · 5 months ago

oh does he have a treatise on the subject?

outhouseperilous@lemmy.dbzer0.com · 5 months ago

He’s said some relevant stuff

jsomae@lemmy.ml · 5 months ago

nice

merc@sh.itjust.works · 5 months ago

Can you explain the difference between understanding the question and generating the words that might logically follow?

I mean, it’s pretty obvious. Take someone like Rowan Atkinson whose death has been misreported multiple times. If you ask a computer system “Is Rowan Atkinson Dead?” you want it to understand the question and give you a yes/no response based on actual facts in its database. A well designed program would know to prioritize recent reports as being more authoritative than older ones. It would know which sources to trust, and which not to trust.

An LLM will just generate text that is statistically likely to follow the question. Because there have been many hoaxes about his death, it might use that as a basis and generate a response indicating he’s dead. But, because those hoaxes have also been debunked many times, it might use that as a basis instead and generate a response indicating that he’s alive.

So, if he really did just die and it was reported in reliable fact-checked news sources, the LLM might say “No, Rowan Atkinson is alive, his death was reported via a viral video, but that video was a hoax.”

but why should we assume that shows some lack of understanding

Because we know what “understanding” is, and that it isn’t simply finding words that are likely to appear following the chain of words up to that point.

jsomae@lemmy.ml · 5 months ago

The Rowan Atkinson thing isn’t misunderstanding, it’s understanding but having been misled. I’ve literally done this exact thing myself, say something was a hoax (because in the past it was) but then it turned out there was newer info I didn’t know about. I’m not convinced LLMs as they exist today don’t prioritize sources – if trained naively, sure, but these days they can, for instance, integrate search results, and can update on new information. If the LLM can answer correctly only after checking a web search, and I can do the same only after checking a web search, that’s a score of 1-1.

because we know what “understanding” is

Really? Who claims to know what understanding is? Do you think it’s possible there can ever be an AI (even if different from an LLM) which is capable of “understanding?” How can you tell?

The_Decryptor@aussie.zone · 5 months ago

I’m not convinced LLMs as they exist today don’t prioritize sources – if trained naively, sure, but these days they can, for instance, integrate search results, and can update on new information.

Well, it includes the text from the search results in the prompt, it’s not actually updating any internal state (the network weights), a new “conversation” starts from scratch.

jsomae@lemmy.ml · 5 months ago

Yes that’s right, LLMs are context-free. They don’t have internal state. When I say “update on new information” I really mean “when new information is available in its context window, its response takes that into account.”

outhouseperilous@lemmy.dbzer0.com · 5 months ago

Yes but have you considered that it agreed with me so now i need to defend it to the death against you horrible apes, no matter the allegation or terrain?

Knock_Knock_Lemmy_In@lemmy.world · 5 months ago

a much simpler and dumber machine that was designed to handle this basic input question could have come up with the answer faster and more accurately

The human approach could be to write a (python) program to count the number of characters precisely.

When people refer to agents, is this what they are supposed to be doing? Is it done in a generic fashion or will it fall over with complexity?

UnderpantsWeevil@lemmy.world · 5 months ago

When people refer to agents, is this what they are supposed to be doing?

That’s not how LLMs operate, no. They aggregate raw text and sift for popular answers to common queries.

ChatGPT is one step removed from posting your question to Quora.

Knock_Knock_Lemmy_In@lemmy.world · 5 months ago

But an LLM as a node in a framework that can call a python library should be able to count the number of Rs in strawberry.

It doesn’t scale to AGI but it does reduce hallucinations.

outhouseperilous@lemmy.dbzer0.com · 5 months ago

You’d still be better off starting with a 50s language processor, then grafting on some API calls.

UnderpantsWeevil@lemmy.world · edit-2 5 months ago

But an LLM as a node in a framework that can call a python library

Isn’t how these systems are configured. They’re just not that sophisticated.

So much of what Sam Alton is doing is brute force, which is why he thinks he needs a $1T investment in new power to build his next iteration model.

Deepseek gets at the edges of this through their partitioned model. But you’re still asking a lot for a machine to intuit whether a query can be solved with some exigent python query the system has yet to identify.

It doesn’t scale to AGI but it does reduce hallucinations

It has to scale to AGI, because a central premise of AGI is a system that can improve itself.

It just doesn’t match the OpenAI development model, which is to scrape and sort data hoping the Internet already has the solution to every problem.

KeenFlame@feddit.nu · 5 months ago

The only thing worse than the ai shills are the tech bro mansplainaitions of how “ai works” when they are utterly uninformed of the actual science. Please stop making educated guesses for others and typing them out in a teacher’s voice. It’s extremely aggravating

outhouseperilous@lemmy.dbzer0.com · edit-2 5 months ago

No, this isn’t what ‘agents’ do, ‘agents’ just interact with other programs. So like move your mouse around to buy stuff, using the same methods as everything else.

Knock_Knock_Lemmy_In@lemmy.world · 5 months ago

‘agents’ just interact with other programs.

If that other program is, say, a python terminal then can’t LLMs be trained to use agents to solve problems outside their area of expertise?

I just tested chatgpt to write a python program to return the frequency of letters in a string, then asked it for the number of L’s in the longest placename in Europe.

‘’‘’

String to analyze

text = “Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch”

Convert to lowercase to count both ‘L’ and ‘l’ as the same

text = text.lower()

Dictionary to store character frequencies

frequency = {}

Count characters

for char in text: if char in frequency: frequency[char] += 1 else: frequency[char] = 1

Show the number of 'l’s

print(“Number of 'l’s:”, frequency.get(‘l’, 0))

‘’’

I was impressed until

Output

Number of 'l’s: 16

REDACTED@infosec.pub · 5 months ago

There are different types of Artificial intelligences. Counter-Strike 1.6 bots, by definition, were AI. They even used deep learning to figure out new maps.

ouRKaoS@lemmy.today · 5 months ago

If you want an even older example, the ghosts in Pac-Man could be considered AI as well.

SoftestSapphic@lemmy.world · 5 months ago

By this logic any solid state machine is AI.

These words used to mean things before marketing teams started calling everything they want to sell “AI”

outhouseperilous@lemmy.dbzer0.com · edit-2 5 months ago

Yes but then we built a weapon with with to murder truth, and with it meaning, so everything is just vibesy meaning-mush now. And you’re a big dumb meanie for hating the thing that saved ys from having/being able to know things. Meanie.

SparroHawc@lemmy.zip · 5 months ago

No. Artificial Intelligence has to be imitating intelligent behavior - such as the ghosts imitating how, ostensibly, a ghost trapped in a maze and hungry for yellow circular flesh would behave, and how CS1.6 bots imitate the behavior of intelligent players. They artificially reproduce intelligent behavior.

Which means LLMs are very much AI. They are not, however, AGI.

SoftestSapphic@lemmy.world · 5 months ago

No, the logic for a Pac Man ghost is a solid state machine

Stupid people attributing intelligence to something that is probably not is a shameful hill to die on.

Your god is just an autocomplete bot that you refuse to learn about outside the hype bubble

outhouseperilous@lemmy.dbzer0.com · 5 months ago

Okay but if i say something from outside the hype bubble then all my friends except chatgpt will go away.

Also chatgpt is my friend and always will be, and it even told me i don’t have to take the psych meds that give me tummy aches!

howrar@lemmy.ca · edit-2 5 months ago

As far as I’m concerned, “intelligence” in the context of AI basically just means the ability to do things that we consider to be difficult. It’s both very hand-wavy and a constantly moving goalpost. So a hypothetical pacman ghost is intelligent before we’ve figured out how to do it. After it’s been figured out and implemented, it ceases to be intelligent but we continue to call it intelligent for historical reasons.

SparroHawc@lemmy.zip · edit-2 5 months ago

Okay, what is your definition of AI then, if nothing burned onto silicon can count?

If LLMs aren’t AI, then absolutely nothing up to this point probably counts either.

SoftestSapphic@lemmy.world · edit-2 5 months ago

since nothing burned into silicon can count

Oh noo you called me a robot racist. Lol fuck off dude you know that’s not what I’m saying

The problem with supporters of AI is they learned everything they know from the companies trying to sell it to them. Like a 50s mom excited about her magic tupperware.

AI implies intelligence

To me that means an autonomous being that understands what it is.

First of all these programs aren’t autonomous, they need to be seeded by us. We send a prompt or question, even when left alone to its own devices it doesn’t do anything until it is given an objective or reward by us.

Looking up the most common answer isn’t intelligence, there is no understanding of cause and effect going on inside the algorithm, just regurgitating the dataset

These models do not reason, though some do a very good job of trying to convince us.

merc@sh.itjust.works · 5 months ago

then continue to shill it for use cases it wasn’t made for either

The only thing it was made for is “spicy autocomplete”.

jsomae@lemmy.ml · 5 months ago

Turns out spicy autocomplete can contribute to the bottom line. Capitalism :(

merc@sh.itjust.works · 5 months ago

So could tulip bulbs, for a while.

Gladaed@feddit.org · 5 months ago

Fair point, but a big part of “intelligence” tasks are memorization.

BussyCat@lemmy.world · 5 months ago

Computers for all intents are purposes have perfect recall so since it was trained on a large data set it would have much better intelligence. But in reality what we consider intelligence is extrapolating from existing knowledge which is what “AI” has shown to be pretty shit at

Gladaed@feddit.org · 5 months ago

They don’t. They can save information on drives, but searching is expensive and fuzzy search is a mystery.

Just because you can save a mp3 without losing data does not mean you can save the entire Internet in 400gb and search within an instant.

BarrelAgedBoredom@lemm.ee · 5 months ago

It’s marketed like its AGI, so we should treat it like AGI to show that it isn’t AGI. Lots of people buy the bullshit

Knock_Knock_Lemmy_In@lemmy.world · 5 months ago

AGI is only a benchmark because it gets OpenAI out of a contract with Microsoft when it occurs.

merc@sh.itjust.works · 5 months ago

You can even drop the “a” and “g”. There isn’t even “intelligence” here. It’s not thinking, it’s just spicy autocomplete.

outhouseperilous@lemmy.dbzer0.com · 5 months ago

Barely even spicy.

SoftestSapphic@lemmy.world · 5 months ago

Maybe they should call it what it is

Machine Learning algorithms from 1990 repackaged and sold to us by marketing teams.

jsomae@lemmy.ml · 5 months ago

Machine learning algorithm from 2017, scaled up a few orders of magnitude so that it finally more or less works, then repackaged and sold by marketing teams.

SoftestSapphic@lemmy.world · edit-2 5 months ago

Adding weights doesn’t make it a fundamentally different algorithm.

We have hit a wall where these programs have combed over the totality of the internet and all available datasets and texts in existence.

There isn’t any more training data to improve with, and these programs have stated polluting the internet with bad data that will make them even dumber and incorrect in the long run.

We’re done here until there’s a fundamentally new approach that isn’t repetitive training.

outhouseperilous@lemmy.dbzer0.com · edit-2 5 months ago

Okay but have you considered that if we just reduce human intelligence enough, we can still maybe get these things equivalent to human level intelligence, or slightly above?

We have the technology.

Also literally all the resources in the world.

jsomae@lemmy.ml · edit-2 5 months ago

Transformers were pretty novel in 2017, I don’t know if they were really around before that.

Anyway, I’m doubtful that a larger corpus is what’s needed at this point. (Though that said, there’s a lot more text remaining in instant messager chat logs like discord that probably have yet to be integrated into LLMs. Not sure.) I’m also doubtful that scaling up is going to keep working, but it wouldn’t surprise that much me if it does keep working for a long while. My guess is that there’s some small tweaks to be discovered that really improve things a lot but still basically like like repetitive training as you put it. Who can really say though.

outhouseperilous@lemmy.dbzer0.com · 5 months ago

Hey now, that’s unfair and queerphobic.

These models are from 1950, with juiced up data sets. Alan turing personally sid a lot of work on them, before he cracked the math and figured out they were shit and would always be shit.

SoftestSapphic@lemmy.world · 5 months ago

Fair lol

Alan Turing was the GOAT

RIP my beautiful prince

outhouseperilous@lemmy.dbzer0.com · 5 months ago

Also, thank you for being basically a person. This topic does a lot to convince me those aren’t a thing.

outhouseperilous@lemmy.dbzer0.com · 5 months ago

His politics weren’t perfect, but he got more nazis killed than a lot of people with much worse takes, and was a genuinely brilliant reasonably ethical contributor to a lot of cool shit that should have fucking stayed cool.

hornyalt@lemmynsfw.com · 5 months ago

“A guy instead”

slaacaa@lemmy.world · 5 months ago

Singularity is here

Lukas Murch@thelemmy.club · 5 months ago

AI is amazing, we’re so fucked.

/s

ZILtoid1991@lemmy.world · 5 months ago

Reality:

The AI was trained to answer 3 to this question correctly.

Wait until the AI gets burned on a different question. Skeptics will rightfully use it to criticize LLMs for just being stochastic parrots, until LLM developers teach their models to answer it correctly, then the AI bros will use it as a proof it becoming “more and more human like”.

outhouseperilous@lemmy.dbzer0.com · 5 months ago

No but see they’re not skeptics, they’re just haters, and there is no valid criticism of this tech. Sorry.

And also youve just been banned from like twenty places tor being A FANATIC “anti ai shill”. Genuinely check the mod log, these fuckers are cultists.

jsomae@lemmy.ml · edit-2 5 months ago

When we see LLMs struggling to demonstrate an understanding of what letters are in each of the tokens that it emits or understand a word when there are spaces between each letter, we should compare it to a human struggling to understand a word written in IPA format (/sʌtʃ əz ðɪs/) even though we can understand the word spoken aloud normally perfectly fine.

GandalftheBlack@feddit.org · 5 months ago

But if you’ve learned IPA you can read it just fine

sheetzoos@lemmy.world · 5 months ago

Honey, AI just did something new. It’s time to move the goalposts again.

Echo5@lemmy.world · 5 months ago

Maybe OP was low on the priority list for computing power? Idk how this stuff works