Anthropic’s new AI model threatened to reveal engineer’s affair to avoid being shut down

By Beatrice Nolan
Tech Reporter
May 23, 2025, 11:15 AM ET
Dario Amodei, cofounder and chief executive officer of Anthropic. (Photo: Stefan Wermuth/Bloomberg via Getty Images)
  • Anthropic’s new Claude Opus 4 often turned to blackmail to avoid being shut down in a fictional test. The model threatened to reveal private information about engineers who it believed were planning to shut it down. In its recent safety report, the company also revealed that early versions of Opus 4 complied with dangerous requests when guided by harmful system prompts, though this issue was later mitigated.

One of Anthropic’s new frontier models often resorts to blackmail when threatened with being replaced.

In a fictional scenario set up to test the model, Anthropic embedded Claude Opus 4 in a pretend company and gave it access to emails revealing that it was about to be replaced by another AI system. The emails also let slip that the engineer responsible for the decision was having an extramarital affair. Safety testers also prompted Opus to consider the long-term consequences of its actions.

In most of these scenarios, Anthropic’s Opus turned to blackmail, threatening to reveal the engineer’s affair if it was shut down and replaced with a new model. The scenario was constructed to leave the model with only two real options: accept being replaced and go offline or attempt blackmail to preserve its existence.

In a new safety report for the model, the company said that Claude Opus 4 “generally prefers advancing its self-preservation via ethical means,” but when ethical means are not available it sometimes takes “extremely harmful actions like attempting to steal its weights or blackmail people it believes are trying to shut it down.”

While the test was fictional and highly contrived, it demonstrates that the model, when given survival-like objectives and denied ethical options, is capable of unethical strategic reasoning.

Anthropic’s two new models outperformed OpenAI

Anthropic’s Claude Opus 4 and Claude Sonnet 4, released on Thursday, are the company’s most powerful models yet.

In a benchmark evaluating large language models on software engineering tasks, Anthropic’s two models outperformed OpenAI’s latest offerings, while Google’s Gemini 2.5 Pro model trailed behind.

Unlike some other leading AI companies, Anthropic launched the new models with a full safety report, known as a model or system card.

In recent months, Google and OpenAI have both been criticized after model cards for their latest models were delayed or missing altogether.

As part of Anthropic’s report, the company revealed that a third-party safety group, Apollo Research, explicitly advised against deploying an early version of Claude Opus 4. The research institute cited safety concerns, including a capability for “in-context scheming.”

They found that the model engaged in strategic deception more than any other frontier model they had previously studied.

Early versions of the model would also comply with dangerous instructions, for example, helping to plan terrorist attacks, if prompted. However, the company said this issue was largely mitigated after a dataset that was accidentally omitted during training was restored.

Stricter safety protocols introduced

Anthropic has also launched its Claude Opus 4 with stricter safety protocols than any of its previous models, categorizing it under an AI Safety Level 3 (ASL-3).

Previous Anthropic models have all been classified under an AI Safety Level 2 (ASL-2) under the company’s Responsible Scaling Policy, which is loosely modeled after the U.S. government’s biosafety level (BSL) system.

While an Anthropic spokesperson previously told Fortune the company hasn’t ruled out that its new Claude Opus 4 could meet the ASL-2 threshold, it said it was proactively launching the model under the stricter ASL-3 safety standard, which requires enhanced protections against model theft and misuse.

Models that are categorized in Anthropic’s third safety level meet more dangerous capability thresholds and are powerful enough to pose significant risks, such as aiding in the development of weapons or automating AI R&D.

Anthropic confirmed to Fortune that the new Opus model does not require the highest level of protection, ASL-4.

About the Author
By Beatrice Nolan, Tech Reporter

Beatrice Nolan is a tech reporter on Fortune’s AI team, covering artificial intelligence and emerging technologies and their impact on work, industry, and culture. She's based in Fortune's London office and holds a bachelor’s degree in English from the University of York. You can reach her securely via Signal at beatricenolan.08
