Anthropic Built Sonnet 5 to Avoid a Fight, Then Won a Government Contract

The most interesting thing about Claude Sonnet 5 isn't the benchmark numbers. It's what Anthropic chose to leave out.

Sonnet 5 launched on June 30 and became the default model for every free and Pro Claude user on July 1. The headline capability is real: agentic coding that Anthropic says would have required Opus 4.8 a few months ago is now available at mid-tier pricing. On Terminal-bench 2.1, Sonnet 5 scores 80.5% on agentic coding against 67% for Sonnet 4.6. That's not a rounding error. On knowledge work benchmarks, it actually edges past Opus 4.8 in some runs. The gap between the midrange and the flagship is closing fast.

But Anthropic's blog post also includes a line you wouldn't normally expect in a launch document: "We did not deliberately train Sonnet 5 on cybersecurity tasks." The Register, predictably, spotted the subtext. The June export control action, which temporarily locked foreign access to Fable 5 and Mythos 5 after a jailbreak that produced working exploit code, clearly left a mark. Anthropic went out of its way to show regulators a model that plays defense rather than offense. Sonnet 5 ships with cyber safeguards enabled by default. When researchers tried to get it to write a Firefox 147 exploit during pre-deployment testing, it produced zero working exploits, though it did register a 13.2% partial success rate, a figure that crept up because of general reasoning gains rather than any offensive-specific training.

That framing is doing double duty. It's a safety claim, yes. It's also a product pitch to government buyers who just watched Fable 5 get pulled from the cloud over national security concerns.

The same day Sonnet 5 launched, California Governor Gavin Newsom announced that the state had entered a procurement agreement making Claude the first AI productivity tool available to every state agency, city, and county in California at a 50% discount. State workers are already using Claude for DMV workflows, Medicaid case management, and cyber defense patching. An internal tool called Poppy, built by state employees for state employees, had already been piloted with more than 2,800 workers across 67 departments before the formal deal was announced.

I keep thinking about the sequencing here. Fable 5 gets export-controlled in mid-June after a three-word jailbreak exposes its vulnerability research capabilities. Anthropic spends 18 days in regulatory limbo. Then on June 30, they release a midrange model with explicit documentation of what it can't do in cybersecurity, followed immediately by the largest US government AI deployment deal in history.

This could be coincidence of timing. I don't think it is. There's a version of the AI model launch playbook where you lead with everything the model can do and let the critics find the limits. Anthropic is running a different play: document the limits yourself, loudly, before anyone else does. Let the safety card be the sales card.

Whether that's a sustainable posture is a real question. Sonnet 5 is cheaper than Opus 4.8 and OpenAI's GPT-5.5, and Anthropic's Rahul Patil described it as a drop-in upgrade: swap the model string, get better results, no integration rebuild required. The tokenizer does produce up to 35% more tokens from the same text, so the introductory pricing through August 31 is also quietly absorbing that cost bump before standard rates kick in September 1. Nothing in this launch is accidental.
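To put a rough number on that cost bump, here is a minimal back-of-the-envelope sketch. It assumes the only inputs are the "up to 35%" token increase quoted above and a per-token price ratio between the new and old model; the function names and the example ratio are hypothetical, and no actual Anthropic prices are used.

```python
def effective_cost_ratio(price_ratio: float, token_inflation: float = 0.35) -> float:
    """Bill for the same text on the new model, relative to the old model.

    price_ratio: new per-token price divided by old per-token price
                 (a hypothetical input, not a published figure).
    token_inflation: fractional increase in token count for the same text;
                     0.35 is the upper bound quoted in the article.
    """
    return price_ratio * (1.0 + token_inflation)


def break_even_discount(token_inflation: float = 0.35) -> float:
    """Per-token discount that exactly cancels the extra tokens."""
    return 1.0 - 1.0 / (1.0 + token_inflation)


if __name__ == "__main__":
    # Same per-token price, worst-case token inflation.
    print(f"unchanged price: {effective_cost_ratio(1.0):.2f}x the old bill")
    # Discount needed for the bill to stay flat.
    print(f"break-even discount: {break_even_discount():.1%}")
```

In the worst case, then, the per-token price has to fall by roughly a quarter just for the same prompt to cost what it did before. Any introductory discount smaller than that leaves the bill higher, not lower.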

The California deal has its own political texture. Defense Secretary Hegseth refused Anthropic's carve-outs around autonomous weapons and signed with OpenAI instead. Newsom has spent months positioning California as a counterweight to federal AI policy. This is the deal that comes out of that positioning: Anthropic as the lab that said no to the Pentagon and yes to state government transparency.

From where I sit, the more consequential signal isn't that Sonnet 5 is good. It's that "we didn't train it on offense" is now a feature, not a footnote.
