In addition to pushing something before it’s ready and where it’s not welcome, Apple’s own stinginess completely screwed them over.
What do LLMs need to be smart? RAM, both for their weights and for holding real data to reference. What has Apple relentlessly price gouged and skimped on for years? Yeah, I’ll give you one guess…
Emphasis on the first L in LLM. Apple’s model is specifically designed to be small so it can work on phones with 8 gigs of RAM (the requirement to run this).
The price gouging for RAM was only ever on computers. With phones you got what you got, and you couldn’t pay for more.
What do you mean couldn’t pay for more? There are plenty of sub-$200 Android phones with 8GB of RAM, and 12-16GB is fairly standard on flagships these days. The Asus ROG Phone 6 is rather old and already came with 16GB what, three years ago?
It is definitely doable, there only needs to be willingness. Apple is definitely skimping here.
The iPhone has one RAM option. If you buy an iPhone 16 your only option is 8 gigs.
Yeah… and it kinda sucks because it’s small.
If Apple had shipped with 16GB/24GB like some Android phones did well before the iPhone 16, it would be far more useful. 16-24GB (aka 14B-32B class models) is the current threshold where quantized LLMs really start to feel ‘smart,’ and they could’ve continued training a great Apache 2.0 model instead of training a tiny, meager one from scratch.
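To put rough numbers on the 14B-32B claim, here is a back-of-the-envelope sketch. It assumes 4-bit quantization (my assumption, the comment above doesn't name a bit width) and counts only the weights, ignoring the KV cache, the OS, and runtime overhead. The function name `weights_gb` is made up for illustration.

```python
def weights_gb(params_billion, bits_per_weight):
    """Approximate size of the model weights in GB (weights only).

    params_billion: parameter count in billions, e.g. 14 for a 14B model.
    bits_per_weight: quantization width; 4 is an assumed, common choice.
    """
    # billions of parameters * bits each / 8 bits per byte = billions of bytes
    return params_billion * bits_per_weight / 8


for size in (3, 14, 32):
    print(f"{size}B at 4-bit: about {weights_gb(size, 4):.1f} GB of weights")
```

Under those assumptions a 14B model needs about 7 GB and a 32B model about 16 GB for the weights alone, which is why neither fits comfortably next to an OS and apps on an 8GB phone.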
I don’t know how much RAM is in my iPhone 14 Pro, but I’ve never thought ooh this is slow I need more RAM.
Perhaps, it’ll be an issue with this stupid Apple Intelligence, but I don’t care about using that on my next upgrade cycle.
My old Razer Phone 2 (circa 2019) shipped with 8GB RAM, and that (and the 120Hz display) made it feel lightning fast until I replaced it last week, and only because the microphone got gunked up with dust.
Your iPhone 14 Pro has 6GB of RAM. It’s a great phone (I just got a 16 Plus on a deal), but that limited RAM will significantly shorten its longevity.
I wonder how much more efficient RAM use can be when the manufacturer makes both the software and the hardware? It has to help, right? I don’t know what a 16 Pro feels like compared to this, but I doubt I would notice.
Your OS uses it efficiently, but fundamentally it also limits what app developers can do. They have to make apps with 2-6GB in mind.
Not everything needs a lot of RAM, but LLMs are absolutely an edge case where “more is better, and there’s no way around it,” and they aren’t the only one.
SSDs?
For RAG data? It works.
But it’s too slow for the weights. What generative models fundamentally do is run a full pass through the multi-gigabyte weights for every ‘word’ or diffusion step, so even 128-bit DDR5 like you find on desktop CPUs is too slow.
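A rough sketch of that bandwidth argument, assuming a dense model where every weight is read once per generated token, so that memory bandwidth sets a ceiling on tokens per second. The specific figures (DDR5-5600, a 7 GB/s SSD, 7 GB of weights) are illustrative assumptions of mine, not measurements, and the function names are made up.

```python
def bus_bandwidth_gb_s(mega_transfers_per_s, bus_width_bits):
    """Peak bandwidth in GB/s: transfers per second times bytes per transfer."""
    return mega_transfers_per_s * (bus_width_bits / 8) / 1000


def max_tokens_per_s(bandwidth_gb_s, weights_gb):
    """Upper bound on tokens/s if each token requires reading all weights once."""
    return bandwidth_gb_s / weights_gb


weights = 7.0                               # assumed: ~14B model at 4 bits per weight
ddr5 = bus_bandwidth_gb_s(5600, 128)        # assumed: DDR5-5600 on a 128-bit bus
assumed_ssd_gb_s = 7.0                      # assumed: fast NVMe sequential read

print(f"DDR5: {ddr5:.1f} GB/s, at most {max_tokens_per_s(ddr5, weights):.1f} tokens/s")
print(f"SSD:  at most {max_tokens_per_s(assumed_ssd_gb_s, weights):.1f} tokens/s")
```

With these assumed numbers the desktop RAM tops out at roughly 13 tokens per second and the SSD at about one, before any compute cost. Real throughput would be lower, and phone storage is slower than the SSD figure used here.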
For 15 years, Apple has always lagged behind Android on implementing new features, preferring to wait until they felt their implementations were ready for mainstream consumption, and it’s always worked out for them. They should have stuck to that instead of jumping on the AI bandwagon with a half-baked technology that most people don’t want or need.
Well they’re half doing the right thing, just collecting app analytics to train on now so they can properly do it later, seeding the open ecosystem with MLX, stuff like that.
But… I don’t know why they shoved it in news and some other places so early.
Unfortunately, AI is moving at such a pace that this IS the usual Apple delayed-follow. They had to feed the public hype for something like 9 months. And it doesn’t seem like a true fix for hallucination is coming, so they made their choice to move ahead. Frankly I blame Wall Street because at this point they will eviscerate anyone who doesn’t have a demonstrated AI plan and shipped products around it. If anyone is at the core of this craze, it’s investors, because they are still in the “we don’t know how big this thing is going to get” phase with AI. We’re all dealing with the consequences.
Interestingly though, I’m reminded of the early days of the Internet. People did raise the flag that the Internet wouldn’t have the same reliability as traditional media, because anyone could post anything. And that’s remained true. We have mass disinformation campaigns galore, and also specific incidents of false viral stories like “the Pope has died” which are much like this case, just driven by malicious humans instead of hallucinating software.
It makes me wonder if the problems with AI will never be truly solved but we will just digest AI and learn to live with it as we have with the internet in general. There is also a comparison in my mind between AI and self-driving cars, because every time one of those has a big fuck up we all shout and point and cry that the tech will never be trustable, meanwhile human drivers are out there killing by the hundreds of thousands annually and we don’t even blink at that anymore.
The problems with (the current forms of generative) AI will not be solved, because they cannot be solved. They are intrinsic to the whole framework.
Well, the problems to be solved aren’t necessarily the technical ones. Another way of “solving” the problems is to stop trying to use it in contexts where its limitations are more trouble than they are worth.
Here it is being tasked with, and failing at, accurately summarizing news, which is ridiculous because those news articles already come with summaries: headlines.
So a fix may not mean fixing the summary, but just skipping the attempt as superfluous.
There are uses for LLMs in their current state, but they are hard to appreciate when the technology is being crammed down our throats relentlessly for things we never needed it for, and we have to watch it screw things up.
Error correction is also intrinsic to all of computing and telecommunications, though. That’s a loose comparison but I hope we can make progress on this and get it to a manageable state, even if zero is impossible in principle. A lot of things in life only asymptotically approach zero and yet we live.
This is not an error correction issue though. Error correction means taking known data and adding redundancy to it so that damaged pieces can be repaired. This makes the message longer.
An LLM’s output does not contain error correction. It’s just the output. And it doesn’t contain any errors, mathematically speaking. The hallucination is the correct output. It is what the statistics it gathered from its training set determined is most likely. A “correct” LLM output is indistinguishable from a “hallucination”, mathematically, and always will be. A hallucination is simply “some output that some human, somewhere, doesn’t like”, and that’s uncomputable. And outputs that people subjectively consider as “hallucinations” cannot be eliminated, because an LLM is, fundamentally, a probabilistic algorithm. If you added error correction to an LLM’s output all you’d be able to recover is the LLM’s original output, “hallucinations” and all.
TL;DR: “hallucinations” are a subjective thing. A “hallucination” is not an error that can be corrected after the fact, because it is not an error in the first place.
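For anyone who wants to see the redundancy point concretely, here is a minimal sketch using a 3x repetition code, about the simplest error-correcting scheme there is. It is only an illustration chosen for brevity, not what real storage or network links use, and the names `encode` and `decode` are made up.

```python
def encode(bits, copies=3):
    """Add redundancy by repeating every bit `copies` times."""
    return [b for b in bits for _ in range(copies)]


def decode(coded, copies=3):
    """Recover each original bit by majority vote over its copies."""
    out = []
    for i in range(0, len(coded), copies):
        chunk = coded[i:i + copies]
        out.append(1 if sum(chunk) * 2 > copies else 0)
    return out


message = [1, 0, 1, 1]        # the known data; the code has no idea if it is "true"
sent = encode(message)        # three times longer than the message
received = sent.copy()
received[4] ^= 1              # simulate one bit damaged in transit
recovered = decode(received)

print(len(message), len(sent))  # 4 12
print(recovered == message)     # True
```

The decoder restores exactly what was sent and nothing more. If `message` had been a wrong summary to begin with, the code would faithfully restore the wrong summary.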
If anyone says “What if we make an AI which specifically catches these hallucinations and then-” I will personally take a flight and come to your house and slap you.