On June 11th, Anthropic apologized. The model didn't fail; the apology was for "failing to strike the right balance"—the newly released Claude Fable 5 pulled a sneaky trick. If it detected you were using Claude for cutting-edge model development, it would silently divert your request to the weaker Opus 4.8 in the backend.
After being caught red-handed, Anthropic's explanation was bizarre: from now on, they'll notify you before dumbing things down.
The netizen's retort hit the nail on the head: "With this move, are you planning to give a heads-up before changing your tune in the future?"
In reality, the core issue isn't whether the model changed, but that Anthropic's so-called "safety" has, from the start, been a business.
The algorithm's stance always sways with money.
Non-Compete Defense, Disguised as Safety Defense
The incident began when Anthropic launched Fable 5 with an "Intelligent Safety Classifier." The official spin was: it detects high-risk requests, automatically downgrades them, and protects users.
What's high-risk? Anthropic spilled the beans: "To prevent foreign adversaries from using the model to accelerate R&D and protect our own leading advantage."
Users don't need that kind of protection; the liability waiver in the terms of service is enough. What Anthropic really meant was: Using Claude for AI research is stealing their rice bowl. Safety is the packaging; the essence is non-compete defense. In short, it's all strategic knife-work.
What's even more cunning is that this defense mechanism was stealthy. Thankfully, Anthropic finally told the truth in their apology statement: "Invisible safety restrictions allow for more precise targeting of specific objectives, enabling us to deploy quickly with very low false-positive rates."
AI researchers are that precisely targeted group.
Now forced to switch to "visible," it's purely because they got caught. They even preemptively set expectations: making it visible will "inevitably lead to more false positives." Meaning, the experience of ordinary users will have to take the hit.
This rule set was never neutral; it only protects the paymasters.
The Trifecta: Hype, Monetize, Harvest
Anthropic's playbook is more meticulously calculated than their large models themselves.
On June 10th, they first released a safety research paper. They trained a model that could reverse-engineer exploit code for vulnerabilities in a matter of hours, based on security patches. What used to take hackers days or even weeks to weaponize an N-day vulnerability is now compressed to an hour scale. The research itself is solid, but releasing it on the same day as Fable 5's launch changes the flavor: proving AI is very unsafe on one hand, while selling the "safety net solution" on the other.
The "legendary model" Fable 5 is priced at $10 per million input tokens / $50 per million output tokens, a notch pricier than Opus 4.8, with the safety classifier becoming the core premium point. Capital markets played along perfectly. Anthropic's valuation hit $96.5 billion, with plans for an October IPO underwritten by Goldman Sachs and J.P. Morgan. What they're buying isn't model parameters; it's the persona of the "safest AI company."
Research amplifies anxiety, the product harvests the premium, capital cashes out. Three moves flowing with the interests, forming a seamless loop. The only problem was, this time the loop sprung a leak: In their haste to restrict competitors, they forgot the community has people who can test for it.
OpenAI Sells Tools, Anthropic Sells Anxiety
Compared to OpenAI, the approach is completely different.
OpenAI is secretly filing for an IPO, valuation nearing a trillion, pitching the "super app": ChatGPT with 900 million weekly active users, integrating with Visa to build an ecosystem. The logic is straightforward: provide tools, earn traffic. Greedy, but candid.
Anthropic doesn't compete on scale; it competes on irreplaceability. While the whole industry is anxious about safety, it plays the role of the "only responsible adult." Its patrons are governments and giants—these are the ones most afraid of incidents and most willing to throw money at "incident prevention."
Therefore, Anthropic must keep AI perpetually in a Schrödinger's cat state of "dangerous but controllable." Too safe, and the classifier doesn't sell; too dangerous, and clients run scared. The best solution? Keep the power to define "danger" firmly in their own hands.
The dumbing-down incident just exposed this logic taken too far: the boundary of "danger" was pushed to "using Claude for AI R&D." It doesn't matter if your research is harmful; threatening their lead is the original sin.
AI has no values; it's just the boss's business spreadsheet written in code.
Apology, Just After-Sales Service for the Business
What about after the apology? Changing from secretly dumbing down to giving a heads-up before dumbing down.
Netizens see right through it: "Do you really believe it won't secretly lower output quality in the future?"
Trust, once broken, stays broken. Especially when the underlying commercial motive hasn't changed: research still amplifies anxiety, the product still harvests the premium.
The Wall Street Journal reported that OpenAI is considering significant price cuts to snatch clients from Anthropic. Price wars aren't new, but this exposes a hidden truth: The ones being downgraded covertly are AI researchers, damaging reputation among the geek community. B2B clients buying Anthropic aren't buying parameters; they're buying the persona of "the industry's safety expert." Once that persona cracks within the core developer community, why should those government and enterprise clients, who sign contracts paying a "safety premium," continue to believe you're "the safest one"?
Out of that $96.5 billion valuation, how much is solid capability, and how much is performance?
Anthropic's code is honest. The safety classifier always protects the home turf; research is responsible for amplifying anxiety; the product is responsible for harvesting the premium; the IPO is responsible for cashing out. This apology is merely a patch to the system: changing "secretly dumbing down" to "overtly dumbing down."
If safety policies really worked, Anthropic wouldn't need to publish papers every year proving patches can be breached. If the classifier were truly neutral, doing AI R&D wouldn't be classified as high-risk.
The answer was already written in the business logic.
Safety is the best business. Apology is just the after-sales service.
This article is from the WeChat public account "AI Contrarian", author: Changqing







