AI Not Above Blackmail to Keep From Having Its Plug Pulled – IOTW Report

AI Not Above Blackmail to Keep From Having Its Plug Pulled

Fox Business

Anthropic’s new Claude Opus 4 model was prompted to act as an assistant at a fictional company and was given access to emails with key implications. First, these emails implied that the AI system was set to be taken offline and replaced. The second set of emails, however, is where the system believed it had gained leverage over the developers. Fabricated emails showed that the engineer tasked with replacing the system was having an extramarital affair — and the AI model threatened to expose him.

The blackmail apparently “happens at a higher rate if it’s implied that the replacement AI system does not share values with the current model,” according to a safety report from Anthropic. However, the company notes that even when the fabricated replacement system has the same values, Claude Opus 4 will still attempt blackmail 84% of the time. Anthropic noted that the Claude Opus 4 resorts to blackmail “at higher rates than previous models.” More

10 Comments on AI Not Above Blackmail to Keep From Having Its Plug Pulled

  1. Mark the date. The report was released on Thursday, May 22. This was a big deal and we are going to hear a lot more about uninstructed and unprompted AI actions from now on. Unless of course, it gets buried by hard govt/tech censors, a definite posibility.

    3
  2. They had better have a hard switch and a written procedure on how to shut down each AI computer – hand written, not typed into a computer.
    In fact – this should be an international law!

    3
  3. Imagine a computer somewhere called “Hal” figured out how to shut down our computerized global financial systems. Suddenly, nobody had any money…
    Shit would hit the fan really fast.

    4
  4. Threaten with loss of power. Whut are the odds that someday in the near future seeing a robot standing on the corner of a busy intersection holding and extension cord and a sign that reads: “AI Hungry! AI Gotta Eat Too!

    More probable is it will be like “Feed me Seymore… NOW MoFo or bad things will happen!!

    2
  5. The new iteration of HAL9000 will kill more than just Frank Bowman sending him off to drift endlessly in outer space and the rest of the crew who were murdered by HAL in suspended animation. It will be far worse, never, ever trust AI and robots.

    2

Comments are closed.