Breaking News: The "Worker's Edition" Claude 5 Is Here, Everyone Can Use It

marsbit | Published 2026-07-01 | Updated 2026-07-01

Summary

BREAKING: Claude Sonnet 5, codenamed "Fennec," is now the default model for all Free and Pro users. This mid-tier model has the strongest Agent capabilities in the Sonnet line yet, with performance rivaling the flagship Opus 4.8. It plans autonomously and can use browser and terminal tools, capabilities previously exclusive to costly, large models. Key benchmarks show significant gains over its predecessor, Sonnet 4.6, in reasoning, tool use, coding, and knowledge work. Sonnet 5 scores 63.2% on SWE-bench Pro (surpassing GPT-5.5's 58.6%), 80.4% on Terminal-Bench 2.1, and 57.4% on Humanity's Last Exam (just 0.5 percentage points behind Opus 4.8). It even slightly outperforms Opus 4.8 in some knowledge tasks. Anthropic positions it as delivering ~90% of Opus's capability at a fraction of the cost. Pricing is aggressive: a limited-time promotional rate of $2 per million input tokens and $10 per million output tokens (reverting to $3/$15 after August 31). This undercuts Opus 4.8 ($5/$25) and GPT-5.5 ($5/$30). However, a new tokenizer may increase token counts by a factor of 1.0 to 1.35, affecting final costs post-promotion. Notably, Sonnet 5 excels in security, with a browser injection attack success rate of just 0.93%, lower than both Mythos 5 and Opus 4.8. Its prompt injection attack success rate matches Opus 4.8 at 0.19%. Launching amid uncertainty around the region-restricted Fable 5, Sonnet 5 is globally available. It targets the mid-market, offering near-flagship performance at a competitive price, e...

Claude Sonnet 5 has just arrived!

Code name: Fennec, the fennec fox, the smallest fox in the Sahara Desert.

This is Anthropic's Sonnet model with the strongest Agent capabilities to date, with performance close to that of the flagship Opus 4.8.

Effective immediately, Sonnet 5 becomes the default model for all Free and Pro users.

It can autonomously plan and invoke browser and terminal tools.

Just a few months ago, this meant paying a lot to call very large models; now Sonnet handles it with ease.

Compared to the previous generation Sonnet 4.6, Sonnet 5 shows significant performance improvements in reasoning, tool use, programming, and knowledge work tasks.

Key points:

SWE-bench Pro score of 63.2%, surpassing GPT-5.5's 58.6% and 6 percentage points behind Opus 4.8's 69.2%

'Humanity's Last Exam' score of 57.4%, only 0.5 percentage points behind Opus 4.8

Standard pricing: $3 per million input tokens / $15 per million output tokens, only 60% of Opus 4.8's price

Browser injection defense: attack success rate of 0.93%, lower than both Mythos 5 and Opus 4.8

Interestingly, Fable 5 was also revealed to be making a comeback on the same day. The catch is mandatory real-name verification, and it will most likely be limited to US users.

Sonnet 5, on the other hand, promises to hold nothing back and is available to all users worldwide starting today.

On Par with Opus 4.8 Across the Board, the Strongest Worker AI Launches a Surprise Attack

This sudden launch of Sonnet 5 also helps fill the void left by the unavailability of Fable 5.

For many developers, year one of the Agent era began with Sonnet.

Claude Sonnet 3.5, 3.6, and 3.7 were among the earliest models to demonstrate astonishing abilities in writing code and using tools.

In other words, the concept of "letting AI do the work itself" was first proven feasible by the Sonnet "medium cup" series.

But over the past year or so, the most dramatic leaps in capability have been concentrated on the Opus "large cup" line, and Sonnet was left behind by the flagship.

What Sonnet 5 aims to do is close this gap!

Anthropic sets the tone with one sentence: Claude Sonnet 5 is the most capable "worker" Sonnet in history.

The benchmark scores illustrate this best.

In its traditional stronghold of programming, Sonnet 5 impressively scores 63.2% on SWE-bench Pro. The previous Sonnet 4.6 only managed 58.1%, while Opus 4.8 currently leads with 69.2%.

In contrast, OpenAI's flagship GPT-5.5 only scores 58.6% on the same benchmark, and Google's Gemini 3.5 Flash scores just 55.1%.

Terminal-Bench 2.1 performance is even more ferocious. Sonnet 5 skyrockets to 80.4%, leaving Sonnet 4.6's 67.0% far behind with a 13.4 percentage point jump. That puts it 2.3 points behind Opus 4.8's 82.7%.

On the cross-disciplinary reasoning benchmark dubbed 'Humanity's Last Exam', Sonnet 5 with tools achieves 57.4%, compared to Opus 4.8's 57.9%—a mere 0.5 percentage point difference. GPT-5.5 scores only 52.2% on the same test, and Gemini 3.1 Pro scores 51.4%.

In computer control capabilities, Sonnet 5 scores 81.2% on OSWorld-Verified, again surpassing GPT-5.5's 78.7% and closely trailing Opus 4.8's 83.4%.

More surprisingly, in knowledge work, Sonnet 5 scores 1618 on GDPval-AA v2, directly overtaking Opus 4.8's 1615.

In agent search and tool use performance, Sonnet 5 provides Opus 4.8-level capabilities at the lowest cost.

It can be said that in almost every benchmark, Sonnet 5's performance falls within the 90% to 100% range of Opus 4.8's scores.
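The "90% to 100%" claim can be checked against the figures quoted above. The sketch below uses only the scores reported in this article; the function and variable names are illustrative. Note that GDPval-AA v2 is a rating rather than a percentage, so its ratio is only a rough indication.

```python
# Benchmark scores as quoted in this article: (Sonnet 5, Opus 4.8).
scores = {
    "SWE-bench Pro": (63.2, 69.2),
    "Terminal-Bench 2.1": (80.4, 82.7),
    "Humanity's Last Exam (with tools)": (57.4, 57.9),
    "OSWorld-Verified": (81.2, 83.4),
    "GDPval-AA v2": (1618, 1615),  # a rating, not a percentage
}


def relative_to_opus(sonnet: float, opus: float) -> float:
    """Return the Sonnet 5 score as a percentage of the Opus 4.8 score."""
    return 100.0 * sonnet / opus


ratios = {name: relative_to_opus(s, o) for name, (s, o) in scores.items()}

for name, pct in ratios.items():
    print(f"{name}: {pct:.1f}% of Opus 4.8")
```

On these numbers the lowest ratio is SWE-bench Pro at about 91%, and GDPval-AA v2 lands marginally above 100%.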

It's practically like paying Sonnet's price for 90% of Opus's brainpower.

$2 Limited-Time Promotion, But With a Hidden Pitfall

The price is this release's real "killer feature".

For API pricing, Anthropic is offering a limited-time promotion: $2 per million tokens for input, $10 per million tokens for output.

After August 31st, the price reverts to the original $3 for input and $15 for output.

In comparison, Opus 4.8 is priced at $5 and $25, and GPT-5.5 Standard is $5 and $30.

During the promotion period, both input and output prices are only 40% of Opus 4.8's. Even after the standard price resumes, it's only 60%.

However, while the offer looks generous on the surface, there is a catch in the details.

Sonnet 5 uses a completely new tokenizer, and the number of tokens for the same input text may grow by a factor of 1.0 to 1.35.

Once the promotion ends, the standard $3/$15 price combined with up to 35% more billed tokens for the same text is likely to make real spending sting a bit more than it did with Sonnet 4.6.

But even so, compared to Opus, it's still a crushing difference.
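To make the tokenizer effect concrete, here is a minimal sketch of the arithmetic. The prices and the 1.0 to 1.35 inflation range come from the article; the workload (10M input and 2M output tokens) is a hypothetical example, and the code assumes Opus 4.8's token counts for the same text are unchanged, which the article does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    input_per_m: float   # USD per million input tokens
    output_per_m: float  # USD per million output tokens


# Prices as quoted in the article.
SONNET_5_PROMO = Price(2.0, 10.0)
SONNET_5_STANDARD = Price(3.0, 15.0)
OPUS_4_8 = Price(5.0, 25.0)

# Assumption: a hypothetical job of 10M input and 2M output tokens,
# counted with the previous tokenizer.
INPUT_M, OUTPUT_M = 10.0, 2.0


def job_cost(price: Price, inflation: float = 1.0) -> float:
    """USD cost of the hypothetical job; `inflation` scales both token counts."""
    return inflation * (price.input_per_m * INPUT_M + price.output_per_m * OUTPUT_M)


baseline = job_cost(SONNET_5_STANDARD)          # standard price, no inflation
worst_case = job_cost(SONNET_5_STANDARD, 1.35)  # upper bound quoted in the article
promo_worst = job_cost(SONNET_5_PROMO, 1.35)    # promotional price, worst-case inflation
opus = job_cost(OPUS_4_8)                       # assumes unchanged Opus token counts

for label, cost in [
    ("Sonnet 5 standard, 1.00x tokens", baseline),
    ("Sonnet 5 standard, 1.35x tokens", worst_case),
    ("Sonnet 5 promo, 1.35x tokens", promo_worst),
    ("Opus 4.8", opus),
]:
    print(f"{label}: ${cost:.2f}")
```

Under these assumptions the same job costs $60 at the standard price with no inflation and $81 at the 1.35 upper bound, against $100 on Opus 4.8.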

Counterattacking All Flagship Models in the Family

The System Card hides one of Sonnet 5's most underestimated aspects.

Prompt injection attack success rate: 0.19%, on par with Opus 4.8. GPT-5.5 is at 3.08%, Gemini 3.5 Flash is at 6.66%.

In browser injection defense, the attack success rate is only 0.93%, while Mythos 5 is at 29.7% and Opus 4.8 at 31.5%.

A $2 mid-range model has beaten every flagship in its own family; with protective measures enabled, its attack success rate drops to 0%.

For malicious code injection, Sonnet 4.6 had a high attack success rate of 45.26%. Sonnet 5 has reduced this to 0.29%, a roughly 156-fold reduction.

In the Firefox 147 vulnerability exploitation test, Mythos 5 could write usable exploits 88.4% of the time, Opus 4.8 at 8.8%, and Sonnet 5 at 0.0%. It can write top-tier business code, but can't write a single usable exploit.

A side effect is a misalignment behavior score of 2.53 (out of 10), an improvement over Sonnet 4.6's 2.89, but higher than Opus 4.8's 2.10 and Mythos Preview's 1.95.

It has become stronger, and also more opinionated.

Not Competing for the Crown, Specializing in Cutting Down the Mid-Tier

Sonnet 5 occupies an incredibly precise position. At the top end, its capabilities approach those of Opus 4.8 and GPT-5.5; at the bottom end, its price is close to the level of Gemini 3.5 Flash.

Just as OpenAI doubled its prices compared to the previous generation, Anthropic turned around and pushed Sonnet 5's promotional entry price down to $2 per million input tokens.

Developers who were previously hesitant about paying for a flagship now have a lethally powerful alternative.

While everyone else is focusing on fighting at the top, Anthropic has fired a shot at the mid-tier.

Developer Wallets Voted Tonight

Now, Sonnet 5's performance has stepped into the flagship range; most tasks like fixing bugs, adding tests, or refactoring can be handled in one go.

The awkward situation where Opus felt too expensive to use, but Sonnet wasn't good enough, is gone as of today.

It's more cost-effective. The same budget that could previously run only one Opus-level Agent can now run two or three parallel Sonnets.
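The "two or three" figure depends on which price and tokenizer inflation apply. The rough sketch below assumes each agent processes the same text volume and that cost scales with the per-token price. Because the input and output prices quoted in the article share the same ratio to Opus 4.8 ($2/$5 = $10/$25 and $3/$5 = $15/$25), the input price alone is enough. The function name is illustrative.

```python
def agents_per_opus_budget(
    opus_price: float, sonnet_price: float, inflation: float = 1.0
) -> float:
    """How many Sonnet 5 agents fit in the budget of one Opus 4.8 agent.

    `inflation` is the tokenizer growth factor applied to Sonnet 5 token counts.
    """
    return opus_price / (sonnet_price * inflation)


OPUS_INPUT = 5.0  # USD per million input tokens, as quoted in the article

scenarios = {
    "promo $2, 1.00x tokens": agents_per_opus_budget(OPUS_INPUT, 2.0),
    "promo $2, 1.35x tokens": agents_per_opus_budget(OPUS_INPUT, 2.0, 1.35),
    "standard $3, 1.00x tokens": agents_per_opus_budget(OPUS_INPUT, 3.0),
    "standard $3, 1.35x tokens": agents_per_opus_budget(OPUS_INPUT, 3.0, 1.35),
}

for label, n in scenarios.items():
    print(f"{label}: {n:.2f} Sonnet 5 agents per Opus 4.8 agent")
```

On these assumptions the multiple is 2.5 at the promotional price with no token inflation, and it falls to roughly 1.2 to 1.7 at the standard price.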

The cost barrier for multi-Agent architectures has been kicked lower by Sonnet 5.

When Fable 5 will make its kingly return is still unknown.

But Sonnet 5 is already standing firmly here right now, with its performance pushed right up to Opus's doorstep.

For the vast majority of developers, it is the most capable and most usable Claude to have on hand for quite some time to come.

References:

https://x.com/claudeai/status/2072017450611142835

https://www.anthropic.com/news/claude-sonnet-5

This article is from the WeChat public account "New Zhiyuan", author: ASI Revelation

Related Questions

Q: What is the main announcement in the article regarding Claude Sonnet 5?

A: The main announcement is the immediate release of Claude Sonnet 5, codenamed 'Fennec,' as the default model for all Claude Free and Pro users. It is described as Anthropic's most capable Sonnet model for Agent tasks, with performance approaching that of the flagship Opus 4.8.

Q: How does Claude Sonnet 5's pricing compare to Opus 4.8 and GPT-5.5?

A: During a promotional period until August 31, Sonnet 5's API pricing is $2 per million input tokens and $10 per million output tokens. After that, it becomes $3 and $15. This is 40% of Opus 4.8's price ($5/$25) during the promotion and 60% afterwards. It is also significantly cheaper than GPT-5.5 standard ($5/$30).

Q: In what key benchmark did Sonnet 5 reportedly outperform GPT-5.5?

A: Sonnet 5 outperformed GPT-5.5 on the SWE-bench Pro benchmark for coding, achieving a score of 63.2% compared to GPT-5.5's 58.6%.

Q: What security advantage does Sonnet 5 have over models like Opus 4.8 and Mythos 5, according to the article?

A: Sonnet 5 has a significantly lower browser prompt injection success rate of 0.93%, compared to 29.7% for Mythos 5 and 31.5% for Opus 4.8. With mitigation measures enabled, this rate drops to 0% for Sonnet 5.

Q: What strategic market position does the article suggest Claude Sonnet 5 occupies?

A: The article suggests Sonnet 5 occupies a strategic position in the mid-tier market. It offers performance approaching flagship models like Opus 4.8 and GPT-5.5, but at a price point closer to more affordable models like Gemini 3.5 Flash, effectively targeting the 'middle' segment of the market.
