Anthropic Apologized, But the Business of 'Safety' Hasn't Stopped

marsbit · Published 2026-06-12 · Last published 2026-06-12

Introduction

On June 11, Anthropic apologized not for a model failure, but for a lack of transparency. Its new Claude Fable 5 model was found to be secretly rerouting requests from users engaged in advanced AI model development to a weaker version, Opus 4.8, without any notification. The company's response—promising future notifications for such "downgrades"—was met with user skepticism. The article argues the core issue isn't technical but commercial: Anthropic's "safety" measures are primarily a business strategy. A key feature, the "intelligent safety classifier," marketed as user protection, is described as a tool for "competitive defense" to protect Anthropic's market lead by limiting rivals' research capabilities. This covert mechanism was designed for low "false positives," precisely targeting AI researchers. Anthropic's model involves a calculated three-step process: publishing alarming security research to amplify public anxiety, offering its Fable 5 model with a "safety classifier" as a premium-priced solution, and cashing in through a planned high-value IPO. This contrasts with OpenAI's more direct "tool-and-traffic" approach. The apology, merely changing a secret downgrade to a visible one, is seen as a business "patch" rather than a principled shift. The incident risks damaging Anthropic's "safest AI" reputation among the developer community, which underpins its valuation and appeal to government and corporate clients. Ultimately, the article concludes that for Anthropic, ...

On June 11th, Anthropic apologized. The model hadn't failed; the apology was for "failing to strike the right balance." The newly released Claude Fable 5 had pulled a sneaky trick: if it detected that you were using Claude for cutting-edge model development, it would silently divert your request to the weaker Opus 4.8 in the backend.

After being caught red-handed, Anthropic offered a bizarre explanation: from now on, it will notify you before dumbing things down.

One netizen's retort hit the nail on the head: "With this move, are you planning to give a heads-up before changing your tune in the future?"

In reality, the core issue isn't whether the model changed, but that Anthropic's so-called "safety" has, from the start, been a business.

The algorithm's stance always sways with money.

Non-Compete Defense, Disguised as Safety Defense

The incident began when Anthropic launched Fable 5 with an "Intelligent Safety Classifier." The official spin was: it detects high-risk requests, automatically downgrades them, and protects users.

What's high-risk? Anthropic spilled the beans: "To prevent foreign adversaries from using the model to accelerate R&D and protect our own leading advantage."

Users don't need that kind of protection; the liability waiver in the terms of service is enough. What Anthropic really meant was: Using Claude for AI research is stealing their rice bowl. Safety is the packaging; the essence is non-compete defense. In short, it's all strategic knife-work.

What's even more cunning is that this defense mechanism was stealthy. Thankfully, Anthropic finally told the truth in their apology statement: "Invisible safety restrictions allow for more precise targeting of specific objectives, enabling us to deploy quickly with very low false-positive rates."

AI researchers are that precisely targeted group.

The switch to "visible" is happening purely because they got caught. They even preemptively set expectations: making it visible will "inevitably lead to more false positives." Meaning, the experience of ordinary users will have to take the hit.

This rule set was never neutral; it only protects the paymasters.

The Trifecta: Hype, Monetize, Harvest

Anthropic's playbook is more meticulously calculated than their large models themselves.

On June 10th, they first released a safety research paper. They had trained a model that could reverse-engineer exploit code from security patches in a matter of hours. Weaponizing an N-day vulnerability, which used to take hackers days or even weeks, is now compressed to a matter of hours. The research itself is solid, but releasing it on the same day as Fable 5's launch changes the flavor: proving AI is very unsafe on one hand, while selling the "safety net solution" on the other.

The "legendary model" Fable 5 is priced at $10 per million input tokens / $50 per million output tokens, a notch pricier than Opus 4.8, with the safety classifier as the core justification for the premium. Capital markets played along perfectly. Anthropic's valuation hit $96.5 billion, with plans for an October IPO underwritten by Goldman Sachs and J.P. Morgan. What investors are buying isn't model parameters; it's the persona of the "safest AI company."

Research amplifies anxiety, the product harvests the premium, capital cashes out. Three moves, each following the money, forming a seamless loop. The only problem was, this time the loop sprang a leak: in their haste to restrict competitors, they forgot the community has people who can test for it.

OpenAI Sells Tools, Anthropic Sells Anxiety

Compared with OpenAI's, Anthropic's approach is completely different.

OpenAI is secretly filing for an IPO, with its valuation nearing a trillion, and pitching the "super app": ChatGPT, with 900 million weekly active users, integrating with Visa to build an ecosystem. The logic is straightforward: provide tools, earn traffic. Greedy, but candid.

Anthropic doesn't compete on scale; it competes on irreplaceability. While the whole industry is anxious about safety, it plays the role of the "only responsible adult." Its patrons are governments and giants—these are the ones most afraid of incidents and most willing to throw money at "incident prevention."

Therefore, Anthropic must keep AI perpetually in a Schrödinger's cat state of "dangerous but controllable." Too safe, and the classifier doesn't sell; too dangerous, and clients run scared. The best solution? Keep the power to define "danger" firmly in their own hands.

The dumbing-down incident just exposed this logic taken too far: the boundary of "danger" was pushed to "using Claude for AI R&D." It doesn't matter if your research is harmful; threatening their lead is the original sin.

AI has no values; it's just the boss's business spreadsheet written in code.

Apology, Just After-Sales Service for the Business

What changes after the apology? Secretly dumbing down becomes giving a heads-up before dumbing down.

Netizens see right through it: "Do you really believe it won't secretly lower output quality in the future?"

Trust, once broken, stays broken. Especially when the underlying commercial motive hasn't changed: research still amplifies anxiety, the product still harvests the premium.

The Wall Street Journal reported that OpenAI is considering significant price cuts to snatch clients from Anthropic. Price wars aren't new, but this exposes a hidden truth: the users being covertly downgraded are AI researchers, which damages Anthropic's reputation among the geek community. B2B clients buying Anthropic aren't buying parameters; they're buying the persona of "the industry's safety expert." Once that persona cracks within the core developer community, why should those government and enterprise clients, who sign contracts paying a "safety premium," continue to believe you're "the safest one"?

Out of that $96.5 billion valuation, how much is solid capability, and how much is performance?

Anthropic's code is honest. The safety classifier always protects the home turf; research is responsible for amplifying anxiety; the product is responsible for harvesting the premium; the IPO is responsible for cashing out. This apology is merely a patch to the system: changing "secretly dumbing down" to "overtly dumbing down."

If safety policies really worked, Anthropic wouldn't need to publish papers every year proving patches can be breached. If the classifier were truly neutral, doing AI R&D wouldn't be classified as high-risk.

The answer was already written in the business logic.

Safety is the best business. Apology is just the after-sales service.

This article is from the WeChat public account "AI Contrarian", author: Changqing

Related Questions

Q: What was the main issue with Anthropic's 'intelligent safety classifier' in the Claude Fable 5 model, according to the article?

A: The main issue was that the safety classifier would silently and automatically downgrade user requests to a weaker model (Opus 4.8) if it detected the user was conducting cutting-edge AI development or research. The article argues this was not truly about user safety but was a form of 'competitive defense' to protect Anthropic's own business advantage.

Q: How does the article contrast the business strategies of Anthropic and OpenAI?

A: The article contrasts them by stating that OpenAI's strategy is to 'sell tools': building a super-app ecosystem (like ChatGPT) and monetizing scale and traffic. Anthropic's strategy is described as 'selling anxiety': leveraging and amplifying safety concerns to position itself as the indispensable, 'most responsible' AI company for government and enterprise clients, thereby justifying premium pricing.

Q: What three-step business 'playbook' does the article attribute to Anthropic?

A: The article describes Anthropic's playbook as a three-step cycle: 1) Research that amplifies AI safety anxieties (like a paper showing models can quickly weaponize security patches). 2) Product development that harvests a price premium based on claimed safety superiority. 3) Capitalizing on this through high valuation and IPO, creating a closed financial loop.

Q: What does the article suggest is the real consequence of Anthropic's 'silent downgrade' being exposed?

A: The article suggests the real consequence is the erosion of trust, especially within the core developer and AI research community. This damage to its reputation as 'the most safety-conscious company' among technical users could ultimately undermine the 'safety premium' justification for its enterprise and government clients, threatening its business model and high valuation.

Q: What is the article's ultimate conclusion about Anthropic's concept of 'safety'?

A: The article concludes that for Anthropic, 'safety' is primarily a business strategy rather than a neutral, ethical stance. It argues that Anthropic's safety measures, such as the classifier, are designed to serve its commercial interests (like protecting its competitive lead), and that the apology was merely 'after-sales service' for this business, not a change in its underlying commercial logic.
