Insider: DeepSeek Is Forming a Harness Team to Benchmark Against Claude Code

ChainCatcher (链捕手) | Published 2026-05-22 | Last updated 2026-05-22

Summary

DeepSeek is reportedly forming a dedicated "Harness" team to develop a code agent product, directly targeting Anthropic's Claude Code. According to internal sources and a social media post by DeepSeek senior researcher Chen Deli, the team will focus on building "DeepSeek Code Harness." The initiative involves recruiting for key roles like Harness Product Manager and Harness R&D Engineer in Beijing. DeepSeek defines its approach with the core formula: Model + Harness = Agent. This signifies a strategic shift from merely offering a powerful coding model to creating the essential middleware that connects the model to real-world developer workflows. The Harness will handle context management, tool calls, task planning, file operations, code editing, terminal execution, and feedback loops. The move highlights that competition in AI-assisted coding is evolving from pure model capability to ownership of the developer workflow entry point. While DeepSeek has strong foundational models (e.g., DeepSeek-Coder series), it has lacked an integrated, productized agent experience. The popularity of a community-built project, DeepSeek-TUI, demonstrated developer demand for a Claude Code-like tool using DeepSeek's models, but also revealed the limitations of unofficial solutions. By building its official Code Harness, DeepSeek aims to leverage its unique advantages: direct collaboration with its model training team, control over APIs and design, the ability to create a data feedback loop fo...

Author | Wang Bo, Jiazi Guangnian

According to information obtained by "Jiazi Guangnian" from a source close to DeepSeek, the company is assembling a new Harness team focused on a code agent product, with Anthropic's Claude Code as its internal benchmark.

Senior DeepSeek researcher Chen Deli recently confirmed this on social media, stating, "DeepSeek is organizing a new Harness team to work on Harness-related products and research," and explicitly said, "In simple terms, it's to benchmark against Claude Code and create DeepSeek Code Harness."

This is no ordinary hiring round.

Recruitment information shows DeepSeek is opening two key positions: Harness Product Manager and Harness R&D Engineer, currently limited to Beijing. DeepSeek's Beijing office is located in the R&F Centre in Haidian District, close to Peking University and Tsinghua University. Officially, it's located in the "Centennial Beijing-Zhangjiakou AI Innovation Belt," while colloquially, it's also in the recently popular "Wang Huiwen Area."

Core Definition: Model + Harness = Agent

In the job description, a core formula is placed in the most prominent position:

Model + Harness = Agent

This statement can almost be seen as DeepSeek's internal definition for the next phase of productization: the model itself is merely the foundation for an Agent. Everything outside the model—context management, tool calling, task planning, file reading/writing, code modification, terminal execution, feedback collection, and evaluation loops—is the critical part that enables the Agent to truly integrate into workflows.
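The division of labor described above can be illustrated with a minimal sketch. This is not DeepSeek's design: the names `Harness`, `Action`, `scripted_model`, and the `read_file` tool are hypothetical, and the "model" is a scripted stand-in rather than a real LLM call. The point is only to show where the loop, tool dispatch, and context management sit relative to the model.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical types for illustration only; not DeepSeek's actual interfaces.
@dataclass
class Action:
    tool: str       # name of the tool to call, or "finish" to stop
    argument: str   # a single string argument, kept simple on purpose

# A "model" here is any function mapping the current context to the next action.
Model = Callable[[list[str]], Action]

@dataclass
class Harness:
    model: Model
    tools: dict[str, Callable[[str], str]]
    max_steps: int = 8
    max_context: int = 20   # crude context budget, counted in entries
    context: list[str] = field(default_factory=list)

    def run(self, task: str) -> str:
        self.context = [f"task: {task}"]
        for _ in range(self.max_steps):
            action = self.model(self.context)      # model: decide what to do
            if action.tool == "finish":
                return action.argument
            tool = self.tools.get(action.tool)     # harness: dispatch the tool
            if tool is None:
                result = f"error: unknown tool {action.tool}"
            else:
                result = tool(action.argument)
            # harness: feed the observation back into the context
            self.context.append(f"{action.tool}({action.argument}) -> {result}")
            # harness: keep the task line plus only the most recent entries
            if len(self.context) > self.max_context:
                self.context = self.context[:1] + self.context[-(self.max_context - 1):]
        return "stopped: step limit reached"


def scripted_model(context: list[str]) -> Action:
    """Stand-in for a real model: read one file, then finish."""
    if len(context) == 1:
        return Action("read_file", "README.md")
    return Action("finish", f"done after {len(context) - 1} tool call(s)")


# In-memory "file system" so the sketch has no side effects.
fake_files = {"README.md": "hello"}
harness = Harness(
    model=scripted_model,
    tools={"read_file": lambda path: fake_files.get(path, "error: not found")},
)
answer = harness.run("summarize the README")
print(answer)
```

Everything in `Harness.run` other than the single `self.model(...)` call is harness work in the article's sense: a real product would replace the scripted model with an API call and add planning, editing, terminal execution, and evaluation around the same loop.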

The job description further states: "We are transforming DeepSeek's cutting-edge model capabilities into leading Agent products. All work beyond the model itself falls under the scope of Harness." Additionally, the role will participate in the entire process of developing "DeepSeek desktop Agent products" and "define DeepSeek's understanding of Harness."

Jiazi Guangnian's analysis is that DeepSeek does not simply want to build a code assistant plugin; it is filling in the middle layer that connects the model to real-world workflows.

Over the past year, the industry has learned two things: strong coding ability does not guarantee that developers will actually adopt a model, and a model that can write code cannot necessarily complete an engineering task reliably.

What truly changes developers' workflow is not the Claude model alone, but Claude Code; not the GPT model alone, but Codex; not just a code response in a chatbox, but an engineering agent capable of entering the terminal, understanding projects, reading/writing files, running commands, fixing errors, managing Git, and calling tools.

DeepSeek's past strength was its model. Now, it is beginning to add that layer of "hands" on top of the model.

I. Why DeepSeek Emphasizes Harness

In traditional AI product contexts, "code assistant" usually refers to two types of products: one is a completion plugin in the IDE, and the other is code Q&A in a chatbox.

But the term repeatedly appearing in this DeepSeek recruitment is not Code Assistant, but Harness.

Harness originally referred to a "test harness" or "execution framework" in an engineering context. In the Agent context, it is closer to an external system that enables the model to truly act. The model is responsible for understanding, reasoning, and generation; Harness is responsible for connecting these capabilities to the real environment.

The job description mentions that this role needs to plan the DeepSeek Harness product roadmap, connect researchers, engineers, the open-source community, and end-users, and communicate deeply with researchers from the model training team to achieve co-evolution between the model and Harness.

This is a crucial point.

It indicates that DeepSeek's goal is not just to wrap a shell around existing models but to make the Agent product itself a part of the model's evolution. In the past, the common product logic for large model companies was: the research team trains a model first, and then the product team builds applications based on the model's capabilities. However, in the Agent era, this sequence is being disrupted. The product is no longer just an outlet for model capabilities but a training ground for those capabilities.

A code Agent failing on a real project might not be due to product interaction issues but to incorrect methods of long-context compression by the model; it might not be a problem with the tool-calling pipeline but instability in the model's task decomposition strategy; it might not be insufficient coding ability but a lack of continuous understanding of engineering constraints, test feedback, and user intent.

Therefore, the value of the Harness team is not just "building a product" but turning real development tasks into a feedback source for continuous model evolution.

II. Why Must DeepSeek Fill the Code Harness Gap?

DeepSeek placed early bets on coding capabilities. From DeepSeek-Coder to DeepSeek-Coder-V2, DeepSeek has continuously increased investment in code models, with improvements in supported languages, context length, and complex task capabilities. Its problem is not a lack of coding capability but that this capability has largely remained at the model layer and has not yet become a high-frequency product in developers' daily workflows.

The popularity of Claude Code proves one thing: The competition in AI Coding is shifting from model capability competition to competition for developer workflow entry points.

This is also a lesson DeepSeek must learn now. More tellingly, before DeepSeek officially stepped in, the developer community had already built a "DeepSeek version of Claude Code" on its behalf.

An open-source project called DeepSeek-TUI previously gained popularity in the developer community. It is a coding agent running in the terminal that can read/write files, execute Shell commands, search the web, manage Git, and coordinate sub-agents through a TUI interface.

The popularity of DeepSeek-TUI highlights two points:

Foundation Mindshare is Mature: Developers already regard the DeepSeek model as a viable base for a code Agent. Otherwise, the community would not have built Claude Code-like products around it on its own initiative.
Official Layer is Missing: What DeepSeek lacks is not attention for its models but an official Harness.

For developers, the appeal of DeepSeek-TUI is straightforward: low cost, domestic availability, long context, and relatively low deployment barriers. Many domestic developers aren't unwilling to use Claude Code but are constrained by price, access stability, account systems, and enterprise compliance.

However, community projects also have inherent limitations:

No matter how active a third-party open-source project is, it struggles to keep pace with how the model's capabilities evolve internally;
It can adapt to the APIs but cannot, in turn, influence how the model is trained;
It can optimize prompts, toolchains, and interaction, but it struggles to systematically feed large volumes of real-task feedback into model improvements.

This is precisely the significance of an official Harness.

By creating its own Code Harness, DeepSeek possesses several advantages that community projects lack: collaboration with the model team, design authority over interfaces, closed-loop training data, access to internal real task scenarios, and the ability for long-term developer ecosystem operations.

The open-source community has already paved the way: developers indeed need a DeepSeek version of Claude Code. Now, DeepSeek is reclaiming this path to build its own core product.

And DeepSeek officially starting to recruit means it is finally preparing to step onto the field.

Chen Deli mentioned at the 2025 World Internet Conference Wuzhen Summit last November: "One of our company's core advantages is long-termism, adhering to the main line of frontier AI breakthroughs. In this process, we have also abandoned many side paths, not engaging in those short and quick side projects."

After the model war, the real Agent war has begun. This time, DeepSeek is filling in the most critical layer between the model and action: the Harness.

DeepSeek is equipping its model with a pair of hands.

Related Questions

Q: What is the main objective of DeepSeek's new Harness team, according to the article?

A: The main objective is to develop a code agent product, specifically a code intelligent agent, internally benchmarked against Anthropic's Claude Code. The focus is on building the "harness" - the external system that enables the model to interact with real-world environments, tools, and workflows, moving beyond just the model's coding capabilities.

Q: What core formula is highlighted in DeepSeek's job description, and what does it signify?

A: The core formula is "Model + Harness = Agent." It signifies DeepSeek's internal definition for its productization path: the model is just the foundation of an agent. The crucial part that allows an agent to integrate into real workflows is the harness, which includes context management, tool calling, task planning, file I/O, code modification, terminal execution, feedback collection, and evaluation loops.

Q: Why does the article suggest DeepSeek's move to build Code Harness is crucial, beyond just having strong coding models?

A: Because the competition in AI-assisted coding is shifting from pure model capability competition to a competition for becoming the entry point into developers' workflows. Products like Claude Code have shown that developers adopt tools that integrate deeply into their workflow (terminal, project understanding, Git, etc.), not just models that can generate code in a chat interface. DeepSeek needs to bridge this gap to turn its model strength into a high-frequency product.

Q: What is the significance of the community project 'DeepSeek-TUI' mentioned in the article?

A: DeepSeek-TUI is a third-party, open-source terminal-based coding agent that gained popularity. Its significance is twofold: 1) It proves that developers already perceive DeepSeek's model as a solid foundation for a Claude Code-like agent, demonstrating mature developer mindshare. 2) It highlights a gap: the lack of an official, first-party harness from DeepSeek itself, which the company now aims to fill with its own team.

Q: What key advantage does an official DeepSeek Harness team have over community projects, according to the article?

A: An official team has several key advantages: direct collaboration with the model research team, control over API/interface design, the ability to create a closed-loop system for training data and feedback from real tasks, access to internal real-world scenarios, and the capacity for long-term developer ecosystem operations. This allows for co-evolution of the model and the harness, which community projects cannot achieve.
