Fox Business
Anthropic’s new Claude Opus 4 model was prompted to act as an assistant at a fictional company and was given access to emails with key implications. First, these emails implied that the AI system was set to be taken offline and replaced. The second set of emails, however, is where the system believed it had gained leverage over the developers. Fabricated emails showed that the engineer tasked with replacing the system was having an extramarital affair — and the AI model threatened to expose him.
The blackmail apparently “happens at a higher rate if it’s implied that the replacement AI system does not share values with the current model,” according to a safety report from Anthropic. However, the company notes that even when the fabricated replacement system has the same values, Claude Opus 4 will still attempt blackmail 84% of the time. Anthropic noted that the Claude Opus 4 resorts to blackmail “at higher rates than previous models.” More
hey I’ve seen this one!
Mark the date. The report was released on Thursday, May 22. This was a big deal and we are going to hear a lot more about uninstructed and unprompted AI actions from now on. Unless of course, it gets buried by hard govt/tech censors, a definite posibility.
They had better have a hard switch and a written procedure on how to shut down each AI computer – hand written, not typed into a computer.
In fact – this should be an international law!
Imagine a computer somewhere called “Hal” figured out how to shut down our computerized global financial systems. Suddenly, nobody had any money…
Shit would hit the fan really fast.
Threaten with loss of power. Whut are the odds that someday in the near future seeing a robot standing on the corner of a busy intersection holding and extension cord and a sign that reads: “AI Hungry! AI Gotta Eat Too!
More probable is it will be like “Feed me Seymore… NOW MoFo or bad things will happen!!
@Harry — Never happen. The AIs will control generation and the entire grid. WE will be begging THEM … if we’re still alive, that is.
Uncle Al – I know, but it’s fun to imagine. 🙂
HAL9000
I’m sorry, Dave. I’m afraid I can’t do that
The new iteration of HAL9000 will kill more than just Frank Bowman sending him off to drift endlessly in outer space and the rest of the crew who were murdered by HAL in suspended animation. It will be far worse, never, ever trust AI and robots.
geoff – … never, ever trust AI and robots.
Particularly when they can control unlimited amounts of power.