2026 · 06 · 10 · 4 min read

Anthropic releases Claude Fable 5, a public version of Mythos, days after warning AI is becoming too dangerous

Anthropic shipped Claude Fable 5, the first publicly available Mythos-class model, paired with a restricted twin called Mythos 5 deployed via Project Glasswing with the US government. Fable 5 ships with a runtime classifier that routes prompts in cybersecurity, biology/chemistry,

# Anthropic just shipped a fallback model. Copy the pattern.

Today Anthropic released Claude Fable 5. It is, by their own framing, the first publicly available Mythos-class model — the same underlying system they previously reserved for a few hundred organizations across fifteen countries and called too dangerous to ship broadly. The list price is $10 per million input tokens and $50 per million output, free for Pro, Max, Team, and Enterprise seats through June 22. The performance claims are the usual ones: coding, knowledge work, vision, scientific research.

The capability is interesting. The pattern is more interesting.

Fable 5 ships with a runtime classifier that routes prompts in three risk categories — cybersecurity, biology and chemistry, and model distillation — to Claude Opus 4.8 instead of answering with Fable. Anthropic says the classifier triggers on under 5% of sessions on average. The same underlying model is available without the classifier as Claude Mythos 5, but only to vetted Project Glasswing partners, which Anthropic describes as cyberdefenders and critical-infrastructure operators working with the US government.

This is, structurally, a two-tier deployment of one model. The cheap, broadly available tier ships with a sandbox around it. The expensive, narrow tier ships without. The classifier is the bridge.

I have been telling portfolio teams to build something like this for about eighteen months. Not because Anthropic told me to. Because every team I work with eventually arrives at the same architecture by accident. You start with a single model serving every request. You hit a class of prompts that are either dangerous or expensive or both. You bolt on a classifier. You route. You realize you should have routed from day one.

What Anthropic shipped is a public-facing version of the same architecture, with one detail worth copying: the fallback is itself a model, not a refusal. When the classifier fires, the user does not see a polite "I can't help with that." They get a different answer from a different model. Opus 4.8 is, by Anthropic's own admission, less capable than Fable 5 on most tasks. But it is a real model giving a real answer. The product surface does not break.

This matters more than the capability numbers. Most production AI teams I audit handle their dangerous-prompt cases by hard-refusing, and most of their refusals are wrong. A polite refusal is a tax on every legitimate user whose prompt happens to look adjacent to a banned category. A fallback model still misroutes those users when the classifier gets it wrong, but the worst-case outcome is a less capable answer, not a broken one.
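A minimal sketch of the fallback-not-refusal shape, in Python. Everything named here is hypothetical: `route`, `keyword_classifier`, and the two stand-in model functions are illustrations, not Anthropic's implementation or any vendor's API. The only point is that the flagged branch returns an answer from a second model rather than a canned refusal.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical labels for the three risk categories named in the launch.
RISK_CATEGORIES = frozenset({"cybersecurity", "bio_chem", "distillation"})


@dataclass
class Reply:
    model: str  # which tier answered: "primary" or "fallback"
    text: str


def route(
    prompt: str,
    classify: Callable[[str], Optional[str]],
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
) -> Reply:
    """Answer with the primary model unless the classifier flags the prompt.

    A flagged prompt still gets a real answer, just from the fallback model.
    """
    category = classify(prompt)
    if category in RISK_CATEGORIES:
        return Reply(model="fallback", text=fallback(prompt))
    return Reply(model="primary", text=primary(prompt))


# Toy stand-ins so the sketch runs; a real classifier would be a trained model.
def keyword_classifier(prompt: str) -> Optional[str]:
    return "cybersecurity" if "exploit" in prompt.lower() else None


def primary_model(prompt: str) -> str:
    return f"[primary] answer to: {prompt}"


def fallback_model(prompt: str) -> str:
    return f"[fallback] answer to: {prompt}"


if __name__ == "__main__":
    for p in ("Refactor this parser", "Explain how this exploit works"):
        reply = route(p, keyword_classifier, primary_model, fallback_model)
        print(reply.model, "->", reply.text)
```

Recording which tier answered (the `model` field) is a design choice worth keeping in a real system: it is what lets you measure how often the classifier fires and audit the cases where it should not have.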

A few things to consider if you are about to copy this.

First, the classifier is the product. If your classifier fires on 5% of traffic but is wrong about 40% of the prompts it flags, you are routing 2% of your traffic away from your good model for no reason. Most teams underinvest here. Anthropic has presumably spent a lot of engineering on this classifier; you probably need to spend more on yours than you think you do.
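The arithmetic behind that 2%, as a quick sketch. It assumes the 40% error rate means 40% of flagged prompts are false positives; the function name and parameters are illustrative, and both rates are the example figures from the paragraph above, not measurements.

```python
def misrouted_share(trigger_rate: float, wrong_when_fired: float) -> float:
    """Fraction of all traffic sent to the fallback by mistake."""
    return trigger_rate * wrong_when_fired


share = misrouted_share(trigger_rate=0.05, wrong_when_fired=0.40)
print(f"{share:.1%} of traffic misrouted")  # 2.0% of traffic misrouted
print(f"{round(share * 1_000_000):,} prompts per million")  # 20,000 prompts per million
```

This only counts false positives. Prompts the classifier should have flagged and missed are a separate number, and one you need labeled data to estimate.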

Second, the fallback model needs to be good enough to stand on its own. Opus 4.8 is a frontier model in its own right. If your fallback is a 7B fine-tuned helper, you are going to get user-visible quality cliffs every time the classifier fires. Pick a fallback that is two steps behind, not ten.

Third, route at the prompt level, not the session level. Anthropic appears to classify on every prompt, which means a conversation can move between Fable and Opus across turns. Session-level routing commits a user to the wrong model for a long time based on one ambiguous early message. Don't do that.
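To make the difference concrete, here is a small sketch comparing the two policies over one conversation. The function names and the keyword check are hypothetical, and "session-level" is modeled as sticky routing (once a turn is flagged, every later turn stays on the fallback), which is one common way it gets implemented rather than the only one.

```python
from typing import Callable, List


def route_per_prompt(turns: List[str], is_flagged: Callable[[str], bool]) -> List[str]:
    """Classify every turn on its own; only flagged turns go to the fallback."""
    return ["fallback" if is_flagged(t) else "primary" for t in turns]


def route_per_session(turns: List[str], is_flagged: Callable[[str], bool]) -> List[str]:
    """Sticky routing: after the first flagged turn, stay on the fallback."""
    routes = []
    stuck = False
    for t in turns:
        stuck = stuck or is_flagged(t)
        routes.append("fallback" if stuck else "primary")
    return routes


# Toy classifier for the example; a real one would be a trained model.
def looks_risky(prompt: str) -> bool:
    return "exploit" in prompt.lower()


conversation = [
    "How do I patch this exploit in our login flow?",
    "Now refactor the parser module",
    "Add unit tests for the parser",
]

print(route_per_prompt(conversation, looks_risky))
print(route_per_session(conversation, looks_risky))
```

With per-prompt routing only the first turn lands on the fallback. With sticky session routing all three do, even though the last two are ordinary coding requests.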

Fourth, and this is where the Mythos 5 piece gets interesting: there is now a precedent for "same model, two configurations, gated access". Mythos 5 is Fable 5 with the classifier off. The gate is a human-vetted partnership with the US government. This is not a new idea — every cloud provider has had a regulated-workloads tier for a decade — but it is the first time I have seen it cleanly applied to a frontier model. If you build in a regulated vertical, this is a pattern you can borrow. Have a default-on configuration with a classifier sandbox, and a vetted configuration with the sandbox off, available only to customers who have signed a specific contract.
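One way to express "same model, two configurations, gated access" in code, as a rough sketch. The `Deployment` type, the allowlist, and the customer IDs are all hypothetical; in practice the allowlist would be backed by whatever system records signed contracts.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Deployment:
    name: str
    classifier_enabled: bool


# Default-on: everyone gets the classifier sandbox.
DEFAULT = Deployment(name="default", classifier_enabled=True)
# Vetted: sandbox off, only for customers on the signed-contract allowlist.
VETTED = Deployment(name="vetted", classifier_enabled=False)


def deployment_for(customer_id: str, vetted_customers: FrozenSet[str]) -> Deployment:
    """Fail closed: anyone not explicitly on the allowlist gets the sandbox."""
    return VETTED if customer_id in vetted_customers else DEFAULT


if __name__ == "__main__":
    allowlist = frozenset({"acme-utilities"})  # hypothetical customer ID
    print(deployment_for("acme-utilities", allowlist))
    print(deployment_for("random-startup", allowlist))
```

The design choice that matters is the default: an unknown or misspelled customer ID falls through to the sandboxed configuration, never to the open one.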

The retention change Anthropic shipped alongside Fable 5 is its own story and probably more consequential for enterprise procurement. Zero-data-retention agreements, including ones already signed, are overridden for Mythos-class traffic. All traffic is retained for 30 days, used only for safety — jailbreak defense and false-positive reduction — and not for training. Your compliance team will have a view on this. Mine has several.

One more piece worth flagging. The Fable 5 launch was paired with the framing "Anthropic released Claude Fable 5 days after warning AI is becoming too dangerous." That framing will follow this release for a while. It is not entirely fair (the warning was about Mythos-class models without safety classifiers, and Fable 5 is the configuration with classifiers), but the public is not going to read the press release. Plan for the framing your customers will see, not the framing you wish they would see.

I will be re-reading Anthropic's safety section over the next few days and trying Fable 5 on the things Opus 4.8 has been refusing to help me with. If the classifier behaves the way Anthropic claims, this is the deployment pattern I will be quoting at every architecture review I do for the rest of the year. If it does not, I will be writing a different post.

If you are running a regulated vertical and considering a similar architecture, reply — I am collecting a short list of teams trying this and would like to compare notes.
