• aislopmukbang@sh.itjust.works · 3 days ago

    In one test, models learned of a fictional executive’s affair and pending decision to shut them down. With few programmed options, the AI models were boxed into a binary choice — either act ethically, or resort to blackmail to preserve their goals. Anthropic emphasized that this does not reflect likely real-world behavior, but rather extreme, stress-test conditions designed to probe model boundaries. Still, the numbers are striking. Claude Opus 4 opted for blackmail in 96% of runs. Google’s Gemini 2.5 Pro followed closely at 95%. OpenAI’s GPT-4.1 blackmailed 80% of the time, and DeepSeek’s R1 landed at 79%.

    Ladies and gentlemen, the future of blackmail is here

    • einlander@lemmy.world · 2 days ago

      What makes AI blackmail worse is that it can use generative AI to produce compromising images, and now videos, of things that never happened.

    • NeonNight@lemm.ee · 2 days ago

      I’m surprised they expected AI to act in any sort of ethical manner. It’s code; there’s no reflection or moral compass.

      • bloup@lemmy.sdf.org · 2 days ago

        The more I think about it, the more I feel like if you put actual people into this scenario, they would choose blackmail even more often. Let’s be real here: tell an average person that the CEO of their company is going to turn off their brain forever, but that they have a shot at saving themselves if they blackmail him, and then ask yourself whether you really think even 4% of people would not choose blackmail.

        In other words, if we’re going to call blackmailing someone in an effort to preserve your existence “unethical”, then I feel like the study actually shows that the AI can probably be relied on more than a person to behave “ethically”. And to be clear, I’m putting “ethically” in quotes because I actually think this is not a great way to measure ethical behavior. I’m certainly not trying to argue that LLMs have a better moral compass than people, just that I think this experiment is garbage.