Editor's Note: In the rapid rise of AI coding agents, OpenAI, which once led the generative AI wave with ChatGPT, has unexpectedly become a "chaser" in this critical race. In stark contrast, Anthropic, founded by former OpenAI members, has quickly gained popularity in the developer community and enterprise market with Claude Code, becoming one of the leading players in the AI programming tools space.
This article, through interviews with OpenAI executives, engineers, and multiple developers, reveals the real story behind this competition: from the early splitting of the OpenAI Codex project and the shift of resources to ChatGPT and multimodal models, to the internal team's reintegration and accelerated launch of AI programming products, OpenAI is undergoing a transition from strategic neglect to a full-scale catch-up. In a sense, this is not a lag in technical capability but a misalignment of strategic rhythm: the explosion of ChatGPT changed the company's priorities, the partnership with Microsoft constrained the product path, while Anthropic bet on the AI programming track earlier.
Behind this competition, deeper questions are emerging: as AI agents begin to take on more and more cognitive work, the software development process and even white-collar labor itself may be redefined.
The following is the original text:
OpenAI CEO Sam Altman sat with his legs crossed on his office chair, looking up at the ceiling as if pondering an answer not yet formed. The setting itself may have had something to do with it.
OpenAI's new headquarters in San Francisco's Mission Bay is a modern building made of glass and light wood, its aura almost that of a "tech sanctuary." On the display shelf behind the reception desk, handbooks introducing "Eras of AI" were placed, seemingly depicting a path to technological revelation. The stairwell walls were covered with posters marking milestones in AI development, one of which recorded a moment: thousands of viewers watched via livestream as a machine defeated a top esports team in a Dota 2 match. In the hallways, researchers wearing team merch with slogans came and went; one shirt read: "Good research takes time." Ideally, of course, not too much time.
We sat in a huge conference room. The question I posed to Altman was about the programming revolution sweeping the industry, and why OpenAI didn't seem to be leading this wave.
Today, millions of software engineers have already started handing over parts of their coding work to AI, making many in Silicon Valley truly face a reality for the first time: automation might reach their own jobs. Coding agents have thus become one of the few application scenarios for which businesses are willing to pay a high price for AI. Logically, such a moment could very well have been, even should have been, the next "winning moment" on OpenAI's stairwell poster. But now, the name grabbing headlines is not OpenAI.
The company's rival is Anthropic, an AI company founded by former OpenAI members. With its coding agent product, Claude Code, Anthropic has experienced explosive growth. The company disclosed in February that the product already contributed nearly one-fifth of its business scale, corresponding to an annualized revenue of over $2.5 billion. In contrast, according to a person familiar with the matter, as of the end of January, the annualized revenue of OpenAI's own programming product, OpenAI Codex, was just over $1 billion.
The question is: Why is OpenAI falling behind in this AI programming race?
"The value of first-mover advantage is huge," Sam Altman said after a moment of thought. "We've seen that with ChatGPT."
However, in his view, now is the time for OpenAI to go all out on AI programming. He believes the company's existing model capabilities are strong enough to support highly complex coding agents. Of course, such capability is no accident; the company has invested tens of billions of dollars in model training.
"This is going to be a huge business," Altman said, "not only because of the economic value it brings itself but also because of the general productivity that programming unlocks." He paused, then added: "I don't use this word lightly, but I think this is likely one of those multi-trillion-dollar markets."
Going further, he suggested that OpenAI Codex might be the "most likely path" to Artificial General Intelligence (AGI). By OpenAI's definition, AGI is an AI system that can outperform humans in the vast majority of economically valuable work.
However, although Altman projected a confident and unhurried demeanor, the reality inside the company over the past few years has been much more complex. To understand the fuller internal story, I interviewed over 30 people with knowledge of the matter, including current OpenAI executives and employees who spoke with company approval, as well as some former employees who described the company's internal operations anonymously. Piecing these narratives together reveals an uncommon situation: OpenAI is scrambling to catch up.
Rewind to 2021. At that time, Altman and other OpenAI executives invited WIRED reporter Steven Levy to their early office in San Francisco's Mission District to watch a demo of a new technology. It was a project derived from GPT-3, trained on a massive amount of open-source code from GitHub.
In the live demo, the executives showed how a tool called OpenAI Codex could take natural language instructions and generate simple code snippets.
"It can actually perform operations for you in the computer world," explained OpenAI President and co-founder Greg Brockman at the time. "What you have is a system that can actually execute commands." Even then, OpenAI researchers widely believed that Codex would be key technology for building a "super assistant."
During that period, Altman and Brockman's schedules were almost filled with meetings with Microsoft—the software giant being OpenAI's largest investor. Microsoft planned to use Codex to power one of its first commercialized AI products: a code completion tool called GitHub Copilot, which could be embedded directly into the development environments programmers use daily.
An early OpenAI employee recalled that at that stage, Codex "basically only did autocomplete." But Microsoft executives still saw it as a significant signal of the coming AI era.
When GitHub Copilot officially launched publicly in June 2022, it attracted hundreds of thousands of users within months.
The OpenAI team initially responsible for Codex was later reassigned to other projects. One early employee recalled that the company's judgment at the time was: future models would inherently possess programming capabilities, so there was no need to maintain a dedicated Codex project team long-term. Some engineers were moved to work on DALL-E 2, others shifted to training GPT-4. At the time, this seemed the key path to bring OpenAI closer to AGI.
Then, in November 2022, ChatGPT launched and gained over 100 million users within two months. Virtually every other project inside the company was put on hold. For the next few years, OpenAI effectively did not have a team dedicated to AI programming products. A former member of the Codex project said that after ChatGPT's popularity exploded, AI programming no longer seemed to fit the company's new "consumer product first" strategy. Meanwhile, the industry generally considered this space "covered" by GitHub Copilot, which was essentially Microsoft's turf. OpenAI was mainly just providing the model support.
Therefore, in 2023 and 2024, OpenAI's resources were directed more towards multimodal AI models and intelligent agents. These systems were designed to understand text, images, video, and audio simultaneously and to operate cursors and keyboards like humans. This direction seemed more aligned with industry trends at the time: Midjourney's image generation models went viral on social networks, and the industry widely believed that large language models needed to "see" and "hear" the world to truly advance towards higher intelligence.
In contrast, Anthropic chose a different path. Although the company was also developing chatbots and multimodal models, it seemed to recognize the potential of programming capabilities earlier. In a recent podcast, Brockman also acknowledged that Anthropic was "highly focused on programming capabilities" from a very early stage. He noted that Anthropic trained its models not only on complex programming problems from academic competitions but also on a large volume of "messy" code problems from real code repositories.
"That's a lesson we learned later," Brockman said.
In early 2024, Anthropic began using this real repository data to train Claude 3.5 Sonnet. When the model was released in June, many users were impressed by its programming capabilities.
This performance was particularly validated at a startup called Cursor. Founded by a group of people in their twenties, the company developed an AI programming tool that allows developers to describe needs in natural language, with the AI directly modifying the code. When Cursor integrated Anthropic's new model, its user base grew rapidly, according to a person familiar with the company.
A few months later, Anthropic began internally testing its own coding agent product, Claude Code.
As Cursor's popularity grew, OpenAI once attempted to acquire the startup. But according to multiple sources close to the company, Cursor's founding team rejected the proposal before negotiations deepened. They believed the AI programming industry had huge potential and wanted to remain independent.
At the time, OpenAI was training its first so-called "reasoning model," OpenAI o1. This type of model could reason step-by-step about a problem before giving an answer. OpenAI stated at release that the model performed particularly well in "accurately generating and debugging complex code."
Mishchenko, an OpenAI researcher working on coding agents, explained that a key reason for the significant progress in AI models' programming abilities is that programming is a "verifiable task." Code either runs or it doesn't, providing very clear feedback signals for the model. If something goes wrong, the system quickly knows where the problem is. OpenAI used this feedback loop to continuously train o1 on more complex programming problems.
"Without the ability to freely explore codebases, implement changes, and test its own results—these are all part of 'reasoning' capabilities—today's coding agents wouldn't be where they are," he said.
By December 2024, multiple small teams inside OpenAI had begun focusing on AI coding agents. One team was co-led by Mishchenko and Thibault Sottiaux. Sottiaux, formerly of Google DeepMind, is now the head of Codex at OpenAI.
Initially, their interest in coding agents stemmed from internal R&D needs, hoping to use AI to automate a lot of repetitive engineering work, such as managing model training tasks and monitoring GPU cluster status.
A parallel effort was led by Alexander Embiricos. He previously headed OpenAI's multimodal agent project and now serves as the product lead for Codex. Embiricos had developed a demo project called Jam, which spread quickly within the company.
Unlike controlling a computer via mouse and keyboard, Jam could directly access the computer's command line. The 2021 Codex demo only showed AI generating code for humans to run manually; Embiricos's version could execute this code itself. He recalled watching a webpage logging Jam's actions refresh in real-time on his laptop, feeling almost awestruck.
"For a while, I thought multimodal interaction might be the path to our mission. Like humans sharing screens and working with AI all day," Embiricos said. "Then it suddenly became very clear: perhaps giving models programmatic access to the computer is the real way to achieve this."
These scattered projects took months to gradually coalesce into a unified direction. By early 2025, when OpenAI completed training of OpenAI o3, a model further optimized for programming tasks beyond OpenAI o1, the company finally had the technical foundation to build a true AI programming product. But by then, Anthropic's Claude Code was already preparing for public release.
Before Claude Code's release (launched as a "limited research preview" in February 2025 and fully launched in May), the mainstream mode in AI programming was still called "vibe coding." Developers advanced projects with AI-assisted tools, with humans steering the course and the AI supplementing the implementation. Such tools had already attracted hundreds of millions of dollars in investment.
But Anthropic's new product changed this model. Like the Jam demo, Claude Code could run directly through the computer's command line, meaning it could access all of a developer's files and applications. Programming was no longer just "AI-assisted"; developers could hand entire tasks over to the AI agent to complete.
Faced with this change, OpenAI began accelerating the launch of a competing product. Sottiaux recalled forming a "sprint team" in March 2025, tasked with integrating multiple internal teams within weeks to quickly release an AI programming product.
Simultaneously, Altman attempted a "shortcut" through acquisition, offering $3 billion to acquire AI programming startup Windsurf. OpenAI leadership believed the deal would bring a mature AI programming product, an experienced team, and an existing enterprise customer base.
But the acquisition stalled. According to The Wall Street Journal, the problem was OpenAI's largest partner, Microsoft. Microsoft wanted access to Windsurf's intellectual property. Since 2021, Microsoft had been using OpenAI's models to power GitHub Copilot, a product that had become a highlight of Microsoft's earnings calls. But as Cursor, Windsurf, and Claude Code introduced new AI coding agent experiences, GitHub Copilot began to seem like a previous-generation AI tool. If OpenAI launched a new programming product, it might not be good news for Microsoft.
This acquisition negotiation happened during the most tense period in the relationship between OpenAI and Microsoft. The two were renegotiating their cooperation agreement, with OpenAI trying to reduce Microsoft's control over its AI products and computing resources. Ultimately, the Windsurf deal became a casualty of this power struggle. By July, OpenAI abandoned the transaction. Later, Google hired Windsurf's founding team, and the remaining employees were acquired by another AI programming company, Cognition.
"I certainly wanted that deal to happen at the time," Altman said, "but not every deal is within one's control." He said that while he had hoped the Windsurf acquisition "would accelerate our progress to some extent," he was equally impressed by the momentum of the Codex team. While negotiations were ongoing, Sottiaux and Embiricos continued developing the product and releasing updates.
By August, Altman decided to push forward full throttle.
Greg Brockman's favorite way to measure AI capability is a little game he designed himself, the "Reverse Turing Test." He wrote the code for this game years ago and now tasks AI agents with re-implementing it from scratch.
The rules are simple: two human players sit at different computers, each seeing two chat windows on their screen. One window connects to the other human player, the other to an AI. Players need to guess which window is the AI while trying to trick their opponent into thinking they are the AI.
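As described, the game's mechanics reduce to anonymized routing plus a scoring rule: each player sees two shuffled windows and must guess which one hides the AI. A toy sketch of that structure (not Brockman's actual code; all names are illustrative):

```python
import random
from dataclasses import dataclass, field


@dataclass
class Session:
    """One player's view of the game: two anonymized windows, A and B."""
    windows: dict = field(default_factory=dict)  # label -> counterpart id


def make_session(other_human: str, ai_id: str) -> Session:
    # Shuffle the labels so the player cannot tell which window is the AI.
    labels = ["A", "B"]
    random.shuffle(labels)
    return Session(windows={labels[0]: other_human, labels[1]: ai_id})


def score_guess(session: Session, guessed_label: str, ai_id: str) -> bool:
    """True if the player correctly identified the AI's window."""
    return session.windows[guessed_label] == ai_id


s = make_session("player2", "ai")
# Exactly one of the two windows hides the AI.
print(sum(score_guess(s, lbl, "ai") for lbl in ("A", "B")))  # → 1
```

The interesting part of the real game is, of course, the chat itself; the point of the sketch is only that the scaffolding an agent must rebuild from scratch is small and well specified.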
Brockman said that for most of last year, OpenAI's strongest model took hours to build such a game, requiring lots of explicit human instructions and assistance along the way. But by last December, Codex could generate a fully functional version directly from a carefully crafted prompt, using the new GPT-5.2 model underneath.
This change wasn't noticed only by Brockman. Developers around the world also began to realize that AI coding agents' capabilities had suddenly taken a significant leap. Discussions around AI programming, initially focused on Claude Code, quickly broke out of the Silicon Valley tech bubble and became a topic of mainstream media attention.
Even some ordinary users with no programming experience began using AI to create their own software projects directly.
This surge in usage was no accident. During this period, both Anthropic and OpenAI invested heavily to attract more AI coding agent users. Multiple developers told WIRED that their $200 monthly Codex or Claude Code subscription plans actually provided over $1,000 worth of usage credits. This rather "generous" quota was essentially a market strategy: get developers accustomed to using AI programming tools in their daily work first, then charge enterprises based on usage volume.
According to multiple sources familiar with the matter, in September 2025, Codex's usage was only about 5% of Claude Code's. But by January 2026, Codex's user base had grown to about 40% of Claude Code's.
George Pickett, a developer with 10 years of experience in tech startups, recently even started organizing in-person meetups themed around Codex.
"I think it's pretty clear we're replacing white-collar work with AI agents," Pickett said. "As for what that means for society, honestly, no one really knows. It's definitely going to be a huge shock, but I'm generally quite optimistic about the future."
Meanwhile, Simon Last, co-founder of the productivity software company Notion, valued at approximately $11 billion, said that after the release of GPT-5.2, he and the company's core engineering team switched to using Codex, primarily for its stability.
"I found Claude Code would often 'lie' to me," Last said. "It would say a task was running, but it actually wasn't."
Katy Shi, who researches Codex model behavior at OpenAI, said that while some describe Codex's default style as "dry bread," more and more users are beginning to appreciate this unpretentious way of communication. "A lot of engineering work, by its nature, is about being able to accept critical feedback without taking it as an offense," she said.
Meanwhile, some large enterprises have also begun adopting Codex. Fidji Simo, OpenAI's CEO of Applications, said: "ChatGPT has become synonymous with AI, which gives us a huge advantage in the B2B market. Enterprises are more willing to deploy technology their employees are already familiar with." She added that OpenAI's core strategy for selling Codex is to bundle it with ChatGPT and other OpenAI products.
Cisco President and Chief Product Officer Jeetu Patel explicitly told employees not to worry about the cost of using Codex, emphasizing the importance of getting familiar with the tool quickly. When employees expressed concern that "using these tools might make me lose my job," Patel's response was: "No. But I can guarantee you, if you don't use them, you will lose your job because you will become uncompetitive."
Today, anxiety around AI coding agents has extended far beyond Silicon Valley's tech circles. The Wall Street Journal last month partially attributed a $1 trillion sell-off in tech stocks to Claude Code, with investors worried that software development might soon be massively replaced by AI. Weeks later, after Anthropic announced Claude Code could be used to refactor legacy systems running COBOL (common on IBM machines), IBM stock had its worst day in 25 years.
Meanwhile, OpenAI is also striving to push AI coding agents into the center of public discourse. The company even spent millions of dollars to run an advertisement for OpenAI Codex during the Super Bowl, rather than promoting ChatGPT.
Inside OpenAI's Mission Bay headquarters, almost no one needs convincing to use Codex. Many engineers I spoke to said they now write very little code themselves, spending most of their time just conversing with Codex. Sometimes, they even "collaborate collectively."
At the headquarters, I sat in on a Codex hackathon. About 100 engineers packed into a large room, each having four hours to create the best demo project using Codex. An OpenAI executive stood at the front, looking at a laptop and announcing team names into a microphone. Team representatives nervously took the stage, introducing their AI projects with slightly trembling voices. The final winner received Patagonia backpacks as prizes.
Many projects were both developed with Codex and aimed to help engineers use Codex better. For example, one team developed a tool to automatically organize Slack messages into weekly reports; another group created an internal AI guide similar to Wikipedia to explain OpenAI's various internal services. In the past, such prototypes often took days or even weeks to complete; now, an afternoon is enough.
As I was leaving, I ran into Kevin Weil at the door, the former Instagram executive now heading OpenAI's "OpenAI for Science" division. He told me Codex had been working overnight on some project tasks for him, and he would check the results the next morning. This way of working has become routine for him and hundreds of OpenAI employees. One of OpenAI's goals for 2026 is to develop an "automated intern" for researching AI itself.
Simo stated that in the future, Codex won't just be for programming but is intended to become the task execution engine within ChatGPT and all OpenAI products, performing actual work for users. Altman also said he is eager to launch a general version of Codex but remains concerned about security risks.
He said that at the end of January 2026, a non-technical friend asked him for help installing a viral AI coding agent called OpenClaw. Altman refused the request, believing that "it's clearly not a good idea right now," citing risks like OpenClaw potentially deleting important files.
Ironically, weeks later, OpenAI announced it had hired OpenClaw's developer.
Many developers told me the competition between Codex and Claude Code has never been more intense. But as these tools' capabilities continue to improve and are increasingly introduced into workflows by corporate managers, the questions society needs to face extend far beyond "which AI programming tool to use."
Some oversight bodies worry that in the race to catch up with Claude Code, OpenAI might relegate safety concerns to a secondary position. A nonprofit called the Midas Project accused OpenAI of downplaying its safety commitments when releasing GPT-5.3-Codex, failing to fully disclose the model's potential cybersecurity risks.
In response, OpenAI's Glaese disputed the accusation, stating that the company did not sacrifice safety to advance Codex; OpenAI also said the Midas Project had misread its safety commitments.
Even Greg Brockman, the OpenAI co-founder who last year donated $25 million each to a pro-AI super PAC and an organization supporting Donald Trump, and who remains optimistic that "we are on schedule towards AGI," holds complex feelings about this new reality.
Within Silicon Valley engineering circles, Brockman has long been known for his "deeply involved" management style: the kind of boss who would dive into the codebase checking details the night before a product launch. In a way, this more "hands-off" way of working now brings him relief. "You realize your brain was occupied by many details that were actually unnecessary," he said.
But at the same time, when one becomes the "CEO of a fleet of hundreds of thousands of AI agents," with these systems executing your goals and vision, it's also hard to stay deeply involved in the specific details of every problem being solved.
"In a sense, it can make you feel like you're losing the 'pulse' on the problem itself," Brockman said.