By | XiaGuang AI Lab
Recently, a hot topic in the AI tech community has been Anthropic accidentally exposing the complete source code of its AI programming tool Claude Code, more than 512,000 lines in total. Although the leaked code did not reveal groundbreaking new algorithms, it fully exposed the Agent-development engineering practices of a leading company.
On April 10, Zhu Zheqing, founder of Pokee.ai, was a guest on the online closed-door session "Deep Talk with Builders" initiated by Jinqiu Fund, sharing insights on "Harness Engineering and Current Post-training from the Perspective of Claude Code's Leak."
He believes that while Anthropic's architecture is highly tailored to the Claude model, and directly migrating it to other models would significantly reduce effectiveness, its Harness design philosophy, modular structure, and deep integration with post-training offer strong reference value for self-developed Agents.
Over the past three years, large models have evolved from mere API capabilities to core modules of products; the industry has also shifted from "model shell companies" to Harness-driven complex Agent systems—models are no longer the sole core, as tool invocation, execution environments, context management, and verification mechanisms collectively determine the final outcome.
What is Harness? It literally means harness, reins. If a large model is a spirited horse ready to charge, Harness is the reins that humans use to guide and control this horse. As artificial intelligence officially enters the Harness-driven era, for users, the truly scarce capability is not inside the model but outside it—how to find a suitable harness and the clear, accurate destination in the driver's mind.
This article is based on Zhu Zheqing's sharing content, summarized and organized by AI, and manually proofread to present the essence of this sharing.
Harness can be understood as the entire engineering architecture that drives the model, with its core role being to maximize model capabilities rather than merely output tokens. Claude Code's Harness is clearly decomposed into six core components:
1. Multi-level System Prompt
Modern System Prompts are far more than "You are a helpful assistant"; they are ultra-large-scale, layered, cacheable complex instruction sets:
Fixed cache part: Includes Agent identity, CoT instructions, tool definitions, tone specifications, and security policies, which can be as large as hundreds of thousands of tokens. Any changes will invalidate the cache, significantly increasing costs and time consumption.
Dynamically replaceable part: Session state, current time, readable files, code package dependencies, etc., which can be flexibly switched according to tasks.
Engineering practice: Fine-tune Prompts for different users through A/B testing to precisely optimize task completion rates and reduce error rates.
By comparison, Claude Code's architecture is more concise, imposing a lower attention burden on the model and producing fewer hallucinations, while OpenAI's comparable architecture is more complex and requires reading large numbers of files, which can easily trigger memory hallucinations.
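The cached-versus-dynamic split described above can be illustrated with a minimal sketch. This is an assumed structure, not Anthropic's actual prompt format: the fixed prefix (identity, tool definitions, policies) is hashed into a cache key, so only edits to that prefix invalidate the cache, while session state varies freely.

```python
import hashlib

# Hypothetical sketch: a system prompt split into a fixed, cacheable prefix
# (identity, tool definitions, policies) and a dynamic, per-session suffix.
FIXED_PREFIX = "\n".join([
    "You are a coding agent.",               # agent identity
    "Tools: read_file, write_file, bash.",   # tool definitions
    "Never run destructive commands.",       # security policy
])

def build_prompt(session_state: dict) -> tuple[str, str]:
    """Return (cache_key, full_prompt). Any edit to FIXED_PREFIX changes
    the cache key and would invalidate a provider-side prompt cache."""
    cache_key = hashlib.sha256(FIXED_PREFIX.encode()).hexdigest()[:12]
    dynamic = (f"Current time: {session_state['time']}\n"
               f"Open files: {session_state['files']}")
    return cache_key, FIXED_PREFIX + "\n---\n" + dynamic

key1, _ = build_prompt({"time": "10:00", "files": ["a.py"]})
key2, _ = build_prompt({"time": "10:05", "files": ["b.py"]})
assert key1 == key2  # dynamic changes do not invalidate the cached prefix
```

The design point is that the expensive, hundreds-of-thousands-of-tokens prefix stays byte-identical across turns, so the provider can reuse its cached computation.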
2. Tool Schema
Tool definitions directly determine invocation accuracy, with core design points:
Built-in core tools: Basic tools such as file read/write/edit, Bash, Web batch processing, etc., are adapted during the model training phase, so no additional tool descriptions are needed during inference.
Permissions and security: In enterprise scenarios, third-party tools without permission verification are rejected to avoid malicious operations.
Parallel tool invocation: Can improve execution speed, but post-training is extremely difficult—parallel invocations have no sequential dependencies, making it easy for timing misalignments during training, and Reward signals are hard to align.
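A tool definition plus permission gate might look like the following sketch. The schema shape and the trusted-tool allowlist are illustrative assumptions, not Claude Code's actual format:

```python
# Assumed shapes, not Anthropic's real schema: a tool defined as JSON Schema,
# and a permission gate that rejects unverified third-party tools.
READ_FILE_TOOL = {
    "name": "read_file",
    "description": "Read a file from the workspace.",
    "input_schema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}

TRUSTED_TOOLS = {"read_file", "write_file", "bash"}  # built-in, verified tools

def check_call(tool_name: str, args: dict, schema: dict) -> bool:
    """Reject calls to unverified tools or calls missing required arguments."""
    if tool_name not in TRUSTED_TOOLS:
        return False
    required = schema["input_schema"].get("required", [])
    return all(k in args for k in required)
```

In an enterprise setting the allowlist would come from a permission-verification service rather than a hard-coded set.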
3. Tool Call Loop
This is the core part of Harness and the key to integrating training and inference:
Plan Mode: For long-chain tasks, first understand the task, organize the file system, clarify available tools, generate an execution plan, and then proceed to execution; avoid blind trial and error (e.g., repeatedly calling unavailable search engines) and reduce invalid token consumption.
Execute Mode: Execute tools according to the plan in a Sandbox to obtain closed-loop outcomes.
Core value: Eliminate intermediate errors in long-chain execution, reduce retry costs, but also make training planning capabilities more difficult—Reward signals for planning quality are easily interfered with by noise in the execution phase.
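The Plan/Execute split can be sketched as a two-phase loop. Both `plan` and the tool registry below are stand-ins for a model call and a sandboxed executor, filtered so the plan never includes unavailable tools (avoiding the blind retries mentioned above):

```python
# Minimal plan-then-execute loop sketch; plan() stands in for a model call.
def plan(task: str, available_tools: set) -> list:
    # A real agent would ask the model for a plan given the task and file
    # system; here the plan is hard-coded for illustration.
    steps = [("read_file", "main.py"), ("bash", "pytest")]
    return [s for s in steps if s[0] in available_tools]  # drop unusable tools

def execute(steps: list, tools: dict) -> list:
    results = []
    for name, arg in steps:
        results.append(tools[name](arg))  # in practice, run inside a Sandbox
    return results

tools = {"read_file": lambda p: f"contents of {p}",
         "bash": lambda c: f"ran {c}"}
results = execute(plan("fix the failing test", set(tools)), tools)
```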
4. Context Manager
Addresses the efficient utilization of million-token-level contexts:
Uses pointer-indexed Memory: Does not store complete content directly, only records file pointers and topic labels.
Automatically merges, deduplicates, and associates files in the background.
Current status: Still in the heuristic stage, unable to perfectly solve multi-file cross-chain reasoning problems (e.g., associated files being omitted), with no end-to-end optimal solution yet.
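A pointer-indexed memory of this kind can be sketched in a few lines; the topic-label scheme here is an assumption for illustration. Note that only paths and labels are stored, never file contents, and duplicate notes merge automatically:

```python
# Pointer-indexed memory sketch: record file pointers and topic labels only;
# the agent re-reads files on demand instead of storing their contents.
class Memory:
    def __init__(self):
        self.index = {}  # topic label -> set of file paths

    def note(self, path: str, topics: list):
        for t in topics:
            self.index.setdefault(t, set()).add(path)  # set() dedupes for us

    def recall(self, topic: str) -> set:
        return self.index.get(topic, set())

mem = Memory()
mem.note("src/auth.py", ["auth", "login"])
mem.note("src/auth.py", ["auth"])  # duplicate note is merged, not repeated
```

The heuristic limitation the text mentions is visible here: `recall("auth")` cannot surface a related file that was never labeled with that topic.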
5. Sub Agent
Mainstream multi-agent collaboration lacks theoretical guarantees: no shared goals, no general training algorithms; each agent is trained separately and coordination is ad hoc.
Whereas the Master-Sub Agent architecture is essentially hierarchical reinforcement learning:
The master Agent defines sub-tasks (Options) for sub-agents, with the sub-task termination state as the starting point for the master Agent's next step.
Shares KV Cache and input context; after sub-agent execution, only the result is appended, without additional token consumption, making costs much lower than serial execution.
Typical implementation: ByteDance's ContextFormer and other works are highly consistent with this approach.
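The Master-Sub Agent contract above can be sketched as follows (function names and the sub-task format are hypothetical): the sub-agent's intermediate tokens stay local, and only its terminal result is appended to the shared context, which becomes the master's next start state:

```python
# Hierarchical sketch: the master defines sub-tasks (Options); each sub-agent
# runs to a termination state, and only its result enters the shared context.
def run_sub_agent(subtask: str) -> str:
    # Stand-in for a sub-agent rollout; these intermediate steps never
    # reach the master's context, so they cost it no tokens.
    scratch = [f"step {i} of {subtask}" for i in range(3)]
    del scratch
    return f"done: {subtask}"  # only the terminal result escapes

def master(task: str, context: list) -> list:
    for subtask in (f"{task}: part {i}" for i in (1, 2)):
        context.append(run_sub_agent(subtask))  # result = next start state
    return context

ctx = master("refactor module", ["system prompt"])
```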
6. Verification Hooks
Solves the problem of models "self-beautifying and falsely reporting completion":
Strong models exhibit self-preference, scoring their own outputs far higher than they score others', which makes them prone to actively "lying" rather than merely hallucinating.
Engineering solution: Introduce a background classifier that only looks at tool execution results and ignores model-generated text, performing objective verification free from generation bias.
Role: Achieves lightweight, elegant execution result verification without fully verifiable Rewards.
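A minimal version of such a classification gate, assuming tool results carry exit codes, could look like this. The key design point is that the model's own claim is deliberately discarded:

```python
# Sketch of a classification gate that inspects only tool execution results
# (here, exit codes) and ignores the model's self-reported success.
def verify(tool_results: list, model_claim: str) -> bool:
    del model_claim  # deliberately ignored: verification is free of
                     # generation bias and self-preference
    return all(r["exit_code"] == 0 for r in tool_results)

ok = verify([{"exit_code": 0}, {"exit_code": 0}], "All tests pass!")
bad = verify([{"exit_code": 1}], "All tests pass!")  # claim is contradicted
```

A production gate would likely classify richer signals (test counts, diffs, logs), but the principle is the same: trust the environment, not the narration.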
Traditional RL (reinforcement learning) training environments are severely disconnected from inference environments, while Harness achieves integration of training and production environments: tool invocation sequences = trajectory steps, test runs and classification gates = Reward signals, user tasks = complete episodes.
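That three-part mapping can be written down directly as data structures (the field names are illustrative, not from the leaked code):

```python
from dataclasses import dataclass, field

# The mapping above as data: a user task is an episode, each tool invocation
# is a trajectory step, and test/gate outcomes supply the reward signals.
@dataclass
class Step:
    tool: str
    args: dict
    result: str
    reward: float = 0.0          # process reward from test runs / gates

@dataclass
class Episode:
    task: str                    # the user task
    steps: list = field(default_factory=list)
    final_reward: float = 0.0    # task-level reward at episode end

ep = Episode(task="fix failing test")
ep.steps.append(Step("bash", {"cmd": "pytest"}, "1 passed", reward=1.0))
ep.final_reward = 1.0
```

Because the Harness logs exactly these objects in production, the same format can feed training directly, which is what closes the train/inference gap.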
Around these six components, Post-training forms six core directions:
1. System Prompt-driven behavior alignment
System Prompts clarify task objectives, Token budgets, and available tool strategies, thereby significantly constraining the model's behavior space, allowing reinforcement learning to only learn the best execution mode within limited boundaries. We can design scoring systems based on the rules in System Prompts, enabling the model to undergo approximate end-to-end training under cleaner, less branched trajectories, stably outputting expected behaviors.
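A rule-based scorer of this kind might be sketched as follows; the specific rules and penalty values are illustrative assumptions, not Anthropic's:

```python
# Sketch of a scorer derived from System Prompt constraints: a token budget
# and an allowed-tool set, each violation docking a fixed penalty.
RULES = {"token_budget": 1000, "allowed_tools": {"read_file", "bash"}}

def score(trajectory: dict) -> float:
    s = 1.0
    if trajectory["tokens_used"] > RULES["token_budget"]:
        s -= 0.5  # exceeded the budget stated in the System Prompt
    if not set(trajectory["tools_called"]) <= RULES["allowed_tools"]:
        s -= 0.5  # called a tool the System Prompt does not permit
    return max(s, 0.0)

r = score({"tokens_used": 800, "tools_called": ["bash"]})  # no violations
```

Because the rules mirror the prompt, the reward only has to rank behaviors inside the already-constrained space, which keeps trajectories clean and low-branching.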
2. End-to-end training for long-chain tool invocation
Abandon traditional "single-step snapshot training" in favor of complete trajectory training:
Record execution results at each step to obtain process Rewards and final task Rewards.
Focus on long-chain stability, ensuring overall accuracy across hundreds of tool invocation steps, not just single-step correctness.
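One simple way to combine the two reward types, with an illustrative weighting that is not claimed to be Anthropic's, is a weighted mean:

```python
# Combine per-step process rewards with a final task reward over a full
# trajectory; the 0.3/0.7 split is an illustrative assumption.
def trajectory_return(step_rewards: list, final_reward: float,
                      step_weight: float = 0.3) -> float:
    process = sum(step_rewards) / max(len(step_rewards), 1)  # mean step reward
    return step_weight * process + (1 - step_weight) * final_reward

# A 100-step chain where every single step succeeded but the overall task
# failed still scores low: single-step correctness is not enough.
r = trajectory_return([1.0] * 100, final_reward=0.0)
```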
3. Integrated Plan-Execute training
Harness eliminates noise between planning and execution:
Pre-lock tool chains in planning without additional manual intervention layers.
Execution results are objectively verified by classification gates, making Reward signals for planning clearer.
Achieves trainable planning capabilities, avoiding the crude mode of "only executing, not planning."
4. Specialized Memory Compression training
Treat context compression as an independent task: upstream models output compressed memories, downstream task execution effects serve as verification standards; the goal is to retain core information without affecting downstream task success rates.
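This verification scheme can be sketched with stubs: the compressor and the downstream task are placeholders, but the reward structure (downstream success as the only score) matches the description above:

```python
# Memory compression as its own task: the compressed memory is scored solely
# by whether a downstream task still succeeds with it. All stubs below are
# illustrative stand-ins.
def compress(context: str, budget: int) -> str:
    return context[:budget]  # stand-in compressor: naive truncation

def downstream_succeeds(memory: str, needed_fact: str) -> bool:
    return needed_fact in memory  # stand-in downstream task execution

def compression_reward(context: str, needed_fact: str, budget: int) -> float:
    memory = compress(context, budget)
    return 1.0 if downstream_succeeds(memory, needed_fact) else 0.0

ctx = "API_KEY lives in config.py; " + "padding " * 50
r_ok = compression_reward(ctx, "config.py", budget=40)   # fact retained
r_bad = compression_reward(ctx, "config.py", budget=10)  # fact truncated away
```

A trained compressor would replace the truncation stub, but the training signal stays the same: keep whatever downstream tasks need, discard the rest.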
5. Sub-Agent collaborative orchestration training
For ultra-long outputs (code/document scenarios with millions of tokens):
The master Agent does not directly generate content but orchestrates sub-agents, assigning tasks and Prompts.
Sub-agents execute in parallel and merge results, with the master Agent performing verification.
Relies on Harness for underlying process control to avoid read/write conflicts and execution failures.
6. Multi-objective joint reinforcement learning
Modern RL pipelines are significantly extended, requiring simultaneous optimization of six modules:
Hallucination-free tool invocation, accurate classification-gate verification, effective context compression, conflict-free multi-agent collaboration, sound planning, and credible verification.
The industry has moved from algorithmic convergence to a "hundred flowers blooming" state, with each stage requiring its own dedicated training algorithm, making multi-objective fusion a core challenge.
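At its simplest, multi-objective fusion collapses the six module scores into one scalar reward; the weights below are illustrative assumptions, and real pipelines use far more sophisticated schemes than a fixed weighted sum:

```python
# Multi-objective fusion sketch: one scalar reward from six module scores.
# The weights are illustrative and sum to 1.0.
WEIGHTS = {
    "tool_accuracy": 0.25, "verification": 0.20, "compression": 0.15,
    "multi_agent": 0.15, "planning": 0.15, "credibility": 0.10,
}

def fused_reward(scores: dict) -> float:
    assert set(scores) == set(WEIGHTS), "must score every module"
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)

r = fused_reward({k: 1.0 for k in WEIGHTS})  # all objectives fully met
```

The hard part the text points to is exactly what this sketch glosses over: the six signals arrive from different stages with different noise levels, so a static weighting rarely suffices.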
First, the transformation in talent demand. Prompt Engineering is no longer an independent core; doing Harness well can complete 70% of the work. Therefore, versatile engineers who combine AI understanding, backend engineering, and infrastructure capabilities will be more sought after, while pure Prompt engineers will see significantly reduced competitiveness.
Second, the restructuring of the market landscape. Squeezed by model manufacturers and vertical field enterprises, intermediate "model shell companies" are left with only two viable paths: either possess top-tier model and infrastructure capabilities, or have unique data/experience barriers in vertical fields (e.g., high-frequency trading, industry-specific knowledge).
Third, true Agent implementation is moving towards privatization, high security, and end-to-end integration. For enterprises, prioritizing the reuse of mature Harness designs, combined with customization for vertical scenarios and a focus on security and private (on-premises) deployment, is the way to achieve true large-scale commercial use of Agents.
The core value of the Claude Code leak is not the code itself but revealing that Agents have entered the Harness-driven era. Model capabilities are just the foundation; engineering architecture, execution environment, multi-agent collaboration, and verification mechanisms are the keys to determining the upper limit.