The More Frequently They Are Updated, the More Similar Claude Code and Codex Become

marsbit | Published on 2026-04-19 | Last updated on 2026-04-19

Abstract

OpenAI's recent release of GPT-5.4-Cyber demonstrates a striking convergence with Anthropic's Claude Mythos, reflecting a broader trend of product and strategic alignment between the two AI giants. This is particularly evident in their flagship coding assistants, Codex and Claude Code, which have evolved from distinct philosophies into increasingly similar tools. Initially, Codex emphasized speed and real-time interaction, acting like a fast, junior developer, while Claude Code focused on handling extreme complexity with methodical, large-context analysis. However, both have adopted near-identical solutions to core challenges, such as using isolated sub-tasks or agent teams to prevent context pollution during large-scale code modifications. Benchmark results show a tight race: Codex leads in terminal tasks, while Claude Code excels in complex software engineering benchmarks. Community feedback highlights nuanced differences; Claude Code is faster but can accumulate technical debt, whereas Codex is slower but more deliberate and autonomous. The open-source framework OpenClaw has accelerated this homogenization by standardizing workflows, eroding proprietary advantages. Ultimately, the competition has shifted from pure capability to ecosystem strategy, pricing, and user experience. As these tools become ubiquitous, the developer's role evolves toward higher-level problem definition and architectural thinking, beyond automated code generation.

A few days ago, OpenAI officially released the new large model GPT-5.4-Cyber. Like many netizens, we felt an extremely strong sense of déjà vu when we saw it.

This new model, in terms of target user base, application scenarios, and even promotional strategy, almost completely mirrors Anthropic's recently released Claude Mythos. This "close-quarters combat" posture has reached a point of being completely unabashed. Even The New York Times pointed out sharply in the headline of its latest report: "Like Anthropic, OpenAI...".

This trend of homogenization is by no means limited to the underlying base models. If you look at the series of products recently released by these two companies, you will find that they are becoming mirror images of each other!

Under the shadowless lamp of the capital market, this convergence is even more obvious. Currently, the valuations of the two companies in the secondary market are very close, with Anthropic's even being slightly higher than OpenAI's recently, thanks to its rapid advance in the enterprise market. Capital has the most sensitive nose; in their eyes, these two unicorns are growing the same horns.

It seems that the homogenization of the underlying large models will inevitably lead to the convergence of upper-layer applications.

Today, what I want to discuss with you are the two benchmark tools representing the highest level of AI-assisted programming today: OpenAI's Codex and Anthropic's Claude Code. From once going their separate ways to now converging on the same path, how did they gradually grow to look the same?

From Divergence to Convergence: The Evolution History of the Two Titans

Rewind the clock a few years, and Codex and Claude Code were products of completely different technological philosophies.

Codex's underlying logic is "the ultimate martial art is unsurpassable speed." It is like a senior developer with 5 years of experience following behind you, ready to complete your code at any time.

In OpenAI's conception, Codex is a lightweight, highly interactive terminal agent focused on rapid iteration and interactive programming. Its execution speed is extremely fast; with the support of Cerebras WSE-3 hardware, it can achieve a throughput of 1000 tokens per second. In specific workflows, Codex offers three clear approval modes: suggestion, auto-edit, and full-auto, keeping the developer always in the loop. This design philosophy fits perfectly with geek developers who need to quickly build prototypes and handle high-frequency interactions.
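The three approval modes can be pictured as a small policy table. The sketch below is illustrative only: the `ApprovalMode` enum, the `needs_approval` helper, the action names, and the mapping of modes to actions are all assumptions made for this example, not Codex's actual implementation.

```python
from enum import Enum


class ApprovalMode(Enum):
    SUGGEST = "suggestion"
    AUTO_EDIT = "auto-edit"
    FULL_AUTO = "full-auto"


# Hypothetical mapping: the kinds of action each mode lets the agent
# perform without first asking the developer.
AUTO_ALLOWED = {
    ApprovalMode.SUGGEST: set(),
    ApprovalMode.AUTO_EDIT: {"edit_file"},
    ApprovalMode.FULL_AUTO: {"edit_file", "run_command"},
}


def needs_approval(mode: ApprovalMode, action: str) -> bool:
    """Return True when the developer must confirm the action."""
    return action not in AUTO_ALLOWED[mode]
```

Under these assumptions, `needs_approval(ApprovalMode.AUTO_EDIT, "run_command")` returns `True`, while the same call with `ApprovalMode.FULL_AUTO` returns `False`.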

In contrast, Claude Code, from its birth, carried a cold and restrained "architect" attribute.

Anthropic infused it with the genes to handle extremely complex tasks. It relies on a massive context window of up to 1 million tokens and unique "compression" technology to achieve infinite conversation. Claude Code's creed is "global control, plan before acting." Before performing any action, it uses agent search technology to thoroughly understand the context of the entire codebase, then coordinates multi-file consistency modifications. For enterprise-level refactoring tasks involving tens of thousands of lines of code migration, Claude Code has shown astonishing dominance.

However, as time passed and application scenarios continued to expand, these two tools, which originally had very different personalities, began to copy each other's homework.

Image source: MorphLLM

The biggest bottleneck a monolithic AI model faces when handling complex projects is context pollution. You ask the AI to refactor an authentication module; after it reads 40 files, it often forgets the design pattern of the first file. To solve this pain point, the two companies came up with almost identical answers: assign an independent context window for each subtask.

OpenAI quickly launched a new macOS desktop application, isolating tasks into different threads by project and running them independently in a cloud sandbox. Anthropic introduced an agent team architecture, allowing developers to spawn multiple sub-agents that share task lists and dependencies and work in parallel in their own independent windows. You'll find that whether it's called a "cloud sandbox" or an "agent team," their core engineering concepts have completely converged.
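The shared idea, one fresh context per subtask plus a shared task list with dependencies, can be sketched in a few lines. This is a toy model under stated assumptions: `Task`, `SubAgent`, and `run_team` are hypothetical names, no model is called, and tasks run one after another rather than in parallel to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    files: list[str]
    depends_on: list[str] = field(default_factory=list)


@dataclass
class SubAgent:
    """Hypothetical worker that keeps its own private context."""
    task: Task
    context: list[str] = field(default_factory=list)

    def run(self) -> str:
        # Only this task's files are loaded, so unrelated files cannot
        # crowd out what the agent has already read.
        self.context.extend(self.task.files)
        return f"{self.task.name}: touched {len(self.context)} file(s)"


def run_team(tasks: list[Task]) -> list[str]:
    """Run tasks in dependency order, one fresh SubAgent per task."""
    done: set[str] = set()
    results: list[str] = []
    pending = list(tasks)
    while pending:
        ready = [t for t in pending if all(d in done for d in t.depends_on)]
        if not ready:
            raise ValueError("circular or missing dependency")
        for task in ready:
            results.append(SubAgent(task).run())
            done.add(task.name)
        pending = [t for t in pending if t.name not in done]
    return results
```

Given a `refactor-auth` task over two files and a `migrate-tests` task that depends on it, `run_team` runs `refactor-auth` first, and each agent sees only its own files.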

On benchmark scorecards, the two also show a delicate balance. GPT-5.3-Codex leads on the terminal-task benchmark Terminal-Bench 2.0 with a score of 77.3%, while Claude Code scored 80.8% on the more complex SWE-bench Verified leaderboard. Each has pushed its area of strength to the extreme while working hard to compensate for its own shortcomings.

The OpenClaw Effect: The Invisible Hand Toppling the Walls

If the internal strategies of the two companies determine the internal cause of their homogenization, then the pressure from the entire open-source ecosystem is an external force that cannot be ignored. Here, we must mention the profound impact OpenClaw has had on the entire AI programming tools track.

As a workflow framework launched by the open-source community, the emergence of OpenClaw can be said to have toppled the ecological walls painstakingly built by the giants. It standardized the interaction process between large models and local terminal toolchains. In the past, how to elegantly allow large models to call local Git commits, how to safely run test scripts in a sandbox, how to perform multi-step reasoning verification—these were all proprietary "black technologies" that Codex and Claude Code were proud of.

But OpenClaw abstracted these processes into a universal protocol. This means that developers no longer need to be locked into a specific platform for a particular collaboration mode. The open-source community's enthusiasm made standardization an irreversible tide. Faced with this situation, both OpenAI and Anthropic had to lower their posture and make their tools compatible with this open standard.

When the underlying technical barriers were leveled by open-source forces like OpenClaw, and all advanced features became standard industry configurations, the only way out for Codex and Claude Code was to compete endlessly at the more subtle level of user experience. This is also why they feel more and more similar: under a standardized framework, there is often only one optimal solution, just like convergent evolution in biology.

Codex is Catching Up to Claude Code

Although Claude Code and Codex are on the path of convergent evolution, differences between the two still exist, and Codex is even preferred by developers in some aspects.

The other day, on the r/ClaudeCode community, a senior engineer with 14 years of experience who had worked at tech giants, u/Canamerican726, shared an extremely hardcore evaluation.

Specifically, he invested 100 hours using Claude Code and 20 hours using Codex in a complex project containing 80,000 lines of code.

From his perspective, using Claude Code was like instructing an engineer chased by a deadline; it sprinted extremely fast but often ignored the specifications written by the developer in CLAUDE.md, and liked to continuously pile code into existing files to complete tasks, lacking refactoring thinking.

In contrast, Codex felt more like a steady veteran with 5 to 6 years of experience. Its processing speed was 3 to 4 times slower, but it would proactively stop midway to think and refactor code, and it strictly adhered to instruction boundaries. This high degree of autonomy gave the engineer the confidence to hand tasks directly to it and then go do other things with peace of mind.

The same voices appear on social networks like X. Researcher Aran Komatsuzaki mentioned, based on his own experience, that Claude Code still has the advantage in front-end work, but in back-end planning and keeping information up to date, Codex, which frequently calls web search, is clearly more solid.

The comment section is filled with hard-won lessons from real business scenarios. Some developers pointed out very sharply that models based on Opus, although fast, often accumulate a large amount of "code cleaning debt" for projects; Codex is slow, but it cleans up as it goes. I even saw users summarizing a survival rule: start a new session immediately when context window usage reaches 70%, otherwise it is extremely easy to end up with hidden bugs thrown in by the system.
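The 70% rule of thumb from the comments is easy to express as a check. A minimal sketch, assuming the caller already knows how many tokens the session has used; `should_start_new_session` and its parameters are hypothetical names, not part of either tool.

```python
def should_start_new_session(tokens_used: int, window_size: int,
                             threshold: float = 0.70) -> bool:
    """Return True once context usage reaches the threshold."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    return tokens_used / window_size >= threshold
```

With a 1,000,000-token window, 650,000 tokens used stays under the threshold, while 900,000 tokens used crosses it.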

These real complaints from the front line clearly show that when the capability profiles of the two tools increasingly overlap, what ultimately determines which camp developers join is often these tiny experience gaps related to clean-up costs and the mental load of maintenance. Of course, there are some special difficulties for Chinese users, such as:

Cold Thinking: The Ecosystem Battle Behind Homogenization

Of course, the pros and cons of Codex and Claude Code also depend on the developers themselves and their own abilities. As summarized in the evaluation report by u/Canamerican726 mentioned above: If you don't understand software engineering, both tools will output poor results; tools are not equivalent to skills.

This sentence punctures a certain illusion long fostered by AI programming tools. We once thought that with a powerful enough AI assistant, even a vibe coder with no foundation could single-handedly create enterprise-level applications. But the reality is that Claude Code needs an extremely focused and highly skilled "pilot," otherwise it can easily get lost in a huge codebase. Codex, although more independent, also requires developers to provide accurate system context to achieve maximum utility.

So, in today's world of highly homogenized tool capabilities, where have the moats of these two companies moved to?

The answer lies in those boring financial statements and pricing strategies. For the same task, the number of tokens consumed by Claude Code is often 3 to 4 times that of Codex. The usage cost is higher. For enterprise teams, using Claude Code costs $100 to $200 per developer per month. Codex, on the other hand, bundles its capabilities into more affordable subscription plans and has accumulated a large number of basic users through the vast GitHub community.
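To see what these figures mean for a team budget, the arithmetic can be sketched as follows. The helper names are hypothetical, and `relative_token_cost` assumes by default that both tools charge the same per-token price, which the article does not state.

```python
def team_monthly_cost(developers: int, low: int = 100,
                      high: int = 200) -> tuple[int, int]:
    """Monthly cost range in dollars at $100 to $200 per developer."""
    return developers * low, developers * high


def relative_token_cost(token_multiplier: float,
                        price_ratio: float = 1.0) -> float:
    """Cost relative to the cheaper tool for the same task.

    price_ratio is an assumption: the per-token price of one tool divided
    by the other's. It defaults to 1.0 (equal prices).
    """
    return token_multiplier * price_ratio
```

For a team of 50 developers, `team_monthly_cost(50)` gives a range of $5,000 to $10,000 per month.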

Image source: MorphLLM

Anthropic's ambition is to deeply embed Claude Code into the workflows of tech giants who are not short of money. For example, Stripe had 1370 engineers use Claude Code to complete a cross-language code migration in 4 days that would have taken 10 people weeks. Ramp relied on it to reduce incident response time by 80%. OpenAI, relying on its ubiquitous ecosystem reach, has made Codex the default choice for many ordinary developers.

This is no longer a purely technical competition, but a war of attrition over ecosystem lock-in, pricing strategies, and the reshaping of user habits.

The Developer's Crossroads

Looking back at the technological evolution of the past year, the release of GPT-5.4-Cyber is just a small footnote in this long battle. Codex and Claude Code moving towards "the same face" marks the official entry of AI programming tools from an early testing phase full of variables and novelty into a mature and boring industrialized production phase.

Now, Claude Code automatically generates 135,000 GitHub commits daily, a number that already accounts for 4% of all public commits. We can foresee that in the near future, most boilerplate code, basic test cases, and routine code refactoring will be silently completed in the background by these AI agents that look more and more alike.
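Taken together, the two figures imply a total daily volume of public commits. A quick check of the arithmetic, using a hypothetical helper and assuming the 4% share is exact:

```python
def implied_total(part: int, share_percent: int) -> int:
    """Total implied by a part that makes up share_percent of the whole."""
    return part * 100 // share_percent


# 135,000 commits at a 4% share implies about 3.375 million public commits per day.
daily_public_commits = implied_total(135_000, 4)
```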

Image source: MorphLLM & SemiAnalysis / GitHub Search API

Facing two super tools that are drawing ever closer in capability and imitating each other in experience, what core value do we, as human developers, have left? Perhaps the period of easy gains from better tools is about to end completely. When everyone holds equally sharp weapons, what truly determines victory will no longer be who has better code completion speed, but who can better define problems, who has a broader system architecture vision, and who can find that unique irreplaceability belonging to humans in this code world filled with AI.

By the way, which one do you choose?

Reference Links

https://www.morphllm.com/comparisons/codex-vs-claude-code

https://www.reddit.com/r/ClaudeCode/comments/1sk7e2k/claude_code_100_hours_vs_codex_20_hours/

https://x.com/arankomatsuzaki/status/2044270102003196007

https://www.nytimes.com/2026/04/14/technology/openai-cybersecurity-gpt54-cyber.html

This article is from the WeChat public account "机器之心" (ID: almosthuman2014), author: 机器之心 (Machine Heart)

