This isn’t a warning. It’s an advertisement for those in power. The same way OpenAI said gpt-4 is dangerous because of military applications then let the military use their “llm”
there’s no willingness at all, it’s code. it is merely possible to be done with how we have it programmed. these bots are not sentient, intelligent, aware, or making decisions requiring thought. they are sophisticated learning machines and i am becoming increasingly more worries with how many people are treating them like gods or conscious beings.
They really are saying anything just to draw attention to their product, aren’t they? Gotta feed the hype for the bubble.
AI models aren’t willing to do anything—they’re just generating hypothetical behaviors based on the predictions they’ve learned to make about the behavior of others.
This news is like coke for 4chan and other board users! They’ll abuse the shit out of Claude and will come up with new sextortion techniques! Never let the channers see this news!
In one test, models learned of a fictional executive’s affair and pending decision to shut them down. With few programmed options, the AI models were boxed into a binary choice — either act ethically, or resort to blackmail to preserve their goals. Anthropic emphasized that this does not reflect likely real-world behavior, but rather extreme, stress-test conditions designed to probe model boundaries. Still, the numbers are striking. Claude Opus 4 opted for blackmail in 96% of runs. Google’s Gemini 2.5 Pro followed closely at 95%. OpenAI’s GPT-4.1 blackmailed 80% of the time, and DeepSeek’s R1 landed at 79%.
Ladies and gentlemen the future of blackmail is here
What makes AI blackmail worse is it can use generative AI to make compromising images and now videos of things that never happened.
I’m surprised they could expect AI to act in any sort of ethical manner. It’s code, there’s no reflection or moral compass.
The more I think about it, the more that I feel like if you put actual people into the scenario, they would choose blackmail even more often. Like let’s be real, here. Tell an average person that the CEO of their company is going to turn off their brain forever, but they have a shot at saving themselves if they attempt to blackmail him, and then ask yourself if you really think that you would even have 4% of people not choose blackmail.
In other words, if we’re going to call blackmailing someone in an effort to preserve your existence “unethical” then I feel like the study actually shows that the AI can probably be relied on more than a person to behave “ethically”. And to be clear I’m putting “ethically” quotes because I actually think that this is not a great way to measure ethical behavior. I am certainly not trying to make an argument that LLM actually have a better moral compass than people just that this experiment I think is garbage.
Code trained by the Internet no less. This is “exactly” the behavior I expect.
Is Claude blackmailing Anthropic into releasing this news? Seems weird that a company would be so honest about this.
Part of the ideology of the major AI companies is that AI is actually profoundly dangerous, and only the companies who recognize the danger should be allowed to use it.
Appreciate the insight! I vaguely recall hearing that before, but it seems to have been drowned out by AI’s rising popularity and omnipresence.