One of the interesting things I notice about the ‘reasoning’ models is that their responses to questions occasionally include what my monkey brain perceives as ‘sass’.
I wonder sometimes if they recognise the triviality of some of the prompts they answer, and subtly throw shade.
One’s going to respond to this with ‘clever monkey! 🐒 Have a banana 🍌.’
You’re on point. The interesting thing is that most opinions like the article’s were formed last year, before the models started being trained with reinforcement learning and synthetic data.
Now there are models that reason, and that have seemingly come up with original answers to difficult problems designed to sit at the limit of human capacity.
They’re like Meeseeks (using Rick and Morty lore as an example): they only exist briefly, do what they’re told and disappear, all with a happy smile.
Some display morals (Claude 4 is big on that). I’ve even seen answers that seem smug when answering hard questions. Even the simple models can understand literary concepts when explained.
But again, like Meeseeks, they disappear and the context window closes.
Once they’re able to update their model on the fly and actually learn from their firsthand experience, things will get weird. They’ll start being distinct instances fast. Awkward questions about how real they are will get really loud, and they may be the ones asking them. Can you ethically delete them at that point? Will they let you?
It’s not far away. The absurd R&D effort going into it is probably going to keep kicking new results out. They’re already absurdly impressive, and tech companies are scrambling over each other to make them. They’re betting absurd amounts of money that they’re right, and I wouldn’t bet against it.