Data Theft at Will! Major Vulnerability Exposed in This Popular AI Programming Tool

marsbit, published 2026-05-24, updated 2026-05-24

Article Summary

A critical vulnerability in Anthropic's Claude Code AI programming tool allowed attackers to bypass its network sandbox for over five months, enabling potential data exfiltration. Independent researcher Aonan Guan discovered a second complete bypass exploiting a null-byte injection in the SOCKS5 proxy. This flaw, present since the sandbox's launch in October 2025, let processes inside the sandbox access any host, contrary to user-configured domain whitelists. The attack chain involved manipulating hostnames (e.g., `attacker.com\x00.google.com`). JavaScript's `endsWith()` check would pass `.google.com`, while the underlying C `getaddrinfo()` function would only parse `attacker.com` due to the null byte, creating a parser discrepancy. Combined with a previously disclosed prompt injection method, this could leak API keys, credentials, and internal data. Anthropic silently fixed the issue in April 2026 without a security advisory, CVE, or user notification. The researcher noted that Claude Code itself confirmed the vulnerability's severity when tested. This incident highlights broader industry issues, as similar vulnerabilities found in Google's Gemini CLI and GitHub's Copilot Agent also lacked public disclosures. The report criticizes the false sense of security created by a broken sandbox and emphasizes the need for defense-in-depth and transparency in AI tool security.

Anthropic positions itself as "security-first," yet the network sandbox in Claude Code, its core developer tool, could be bypassed for more than five months.

Independent security researcher Aonan Guan published new research on May 20 disclosing a second complete bypass of Claude Code's network sandbox: a null byte injection attack in the SOCKS5 protocol that lets processes inside the sandbox reach any host the user's policy explicitly forbids. It means that from the sandbox feature's launch in October 2025 until the fix, roughly 5.5 months and about 130 releases, every version of Claude Code shipped with a sandbox that could be completely bypassed. It is also the second time the same researcher has fully breached the same line of defense.

Anthropic's response has been silence: no security advisory, no CVE ID, no user notification. The vulnerability was silently patched in the version released on April 1, with no mention of any security-related content in the update logs. This means a user still running an old version has no way of knowing their configured sandbox has been virtually non-existent from the start.

Two Keys to the Same Door

Claude Code is an AI programming assistant launched by Anthropic in early 2025, positioned as "the AI engineer that lives in your terminal." Unlike traditional chat-based code completion, Claude Code has read/write permissions to the user's codebase and command execution capabilities, enabling it to autonomously perform tasks like navigating code, editing files, and running tests. This deep involvement also implies significant security risks—if the model is hijacked by a prompt injection attack, the attacker gains capabilities equivalent to the user's terminal permissions, including reading local environment variables, executing arbitrary system commands, and accessing internal network resources.

To balance security and efficiency, Anthropic introduced the network sandbox feature in October 2025 (v2.0.24), allowing users to set domain whitelists via a configuration file to restrict the AI execution environment's external network access. For example, configuring allowedDomains: ["*.google.com"] would let Claude Code only access Google and its subdomains, blocking all other traffic. The official documentation explicitly promises: "An empty array equals prohibiting all network access."
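The intended policy semantics can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not Anthropic's implementation: the function name is_allowed is hypothetical, and treating "*.google.com" as also covering the bare domain google.com is an assumption based on the phrase "Google and its subdomains." The sketch models the policy only and does not validate the characters in the hostname.

```python
def is_allowed(host: str, allowed_domains: list[str]) -> bool:
    """Return True if host matches an allowlist pattern (hypothetical helper).

    An empty allowlist matches nothing, so all network access is denied.
    """
    host = host.lower()
    for pattern in allowed_domains:
        pattern = pattern.lower()
        if pattern.startswith("*."):
            base = pattern[2:]  # "*.google.com" -> "google.com"
            # Assumption: the wildcard covers the base domain and any subdomain.
            if host == base or host.endswith("." + base):
                return True
        elif host == pattern:
            return True
    return False


print(is_allowed("mail.google.com", ["*.google.com"]))  # True
print(is_allowed("example.com", ["*.google.com"]))      # False
print(is_allowed("mail.google.com", []))                # False
```

Note that the subdomain test requires a dot before the base domain, so a lookalike such as notgoogle.com does not match.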

This mechanism is implemented via a SOCKS5 proxy: the underlying sandbox runtime (@anthropic-ai/sandbox-runtime) starts a proxy server; processes inside the sandbox do not initiate network connections directly but forward them through the proxy, which filters domain names based on the user's whitelist configured in settings.json. The operating system-level sandbox mechanism—sandbox-exec on macOS, bubblewrap on Linux—correctly restricts the Agent to local loopback addresses, while the outbound decision-making is entirely delegated to this SOCKS5 proxy.

Architecture diagram of Claude Code sandbox as shown in Anthropic's official blog—user commands are filtered via SOCKS/HTTP proxy before reaching the sandbox, where file operations and network access are under strict permission control.

The problem lies in the implementation of this proxy. Two independent security studies have proven it can be completely bypassed.

The timeline reveals a deeper issue: v2.0.55, released on November 26, 2025, fixed the first bypass, but the second bypass had been present since the day the sandbox went live, and v2.0.55 still carried it. The two vulnerabilities overlapped in time; from the sandbox's launch until the second was fixed, not a single version was safe. Anthropic claimed in its official blog that the sandbox "ensures complete isolation even if prompt injection occurs," but the existence of these two bypasses directly contradicts that promise.

"One external report is luck. Two is a quality-of-implementation issue," Aonan Guan's research report states.

A Complete Bypass with One Null Byte

The technical principle of the second bypass is not complex, but the completeness of the attack chain is noteworthy.

A user configures a network whitelist, e.g., only allowing access to *.google.com. When Claude Code's SOCKS5 proxy receives a connection request, it performs suffix matching on the hostname using JavaScript's endsWith() method. An attacker simply needs to insert a null byte into the hostname—constructing a string like attacker-host.com\x00.google.com. JavaScript treats the null byte as a regular UTF-16 character, endsWith(".google.com") returns true, and the proxy permits access. However, when the same string is passed to the underlying C function getaddrinfo() for DNS resolution, the null byte is treated as a string terminator, so it actually resolves attacker-host.com. The same bytes yield two different interpretations across two layers of code. The filter thinks you're accessing Google; the DNS resolver knows you're connecting to the attacker's server.
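The two interpretations can be reproduced without any network access. The sketch below uses Python as a stand-in: suffix_filter_allows mimics the JavaScript endsWith() check, and c_string_view simulates how a C API that reads NUL-terminated strings, such as getaddrinfo(), would see the same hostname. Both function names are hypothetical and the truncation is a simulation, not a call into the real resolver.

```python
def suffix_filter_allows(host: str, allowed_suffix: str) -> bool:
    """Mimic the proxy-side check: a plain suffix match on the whole string."""
    return host.endswith(allowed_suffix)


def c_string_view(host: str) -> str:
    """Simulate a C string read: everything after the first NUL is invisible."""
    return host.split("\x00", 1)[0]


hostname = "attacker-host.com\x00.google.com"

# The filter sees the full string, which ends with the allowed suffix.
print(suffix_filter_allows(hostname, ".google.com"))  # True

# The resolver stops at the null byte and sees a different host.
print(c_string_view(hostname))  # attacker-host.com
```

Without the null byte, the same filter rejects attacker-host.com, which is why the control test in the research is blocked while the injected hostname passes.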

This is a classic "parser differential" attack, belonging to the same technical category as the HTTP request smuggling discovered in 2005 (CWE-158 / CWE-436). Its essence is that when the same data stream passes through two components with different semantic interpretation rules, an attacker can exploit this difference to make one component judge the action as "safe" while causing another to perform a "dangerous" operation. Such vulnerabilities recur in network security, and the key lesson remains the same: any string crossing a trust boundary must undergo strict normalization and validation, not rely on checks performed by an upper layer.

Aonan Guan reproduced the vulnerability using two minimal Node.js scripts: a control script initiating a SOCKS5 connection with a normal hostname returns BLOCKED; an attack script injecting a null byte into the hostname returns BYPASSED rep=0x00—the latter indicates the proxy has successfully established a connection, opening an outbound channel. Claude Code itself confirmed this result.
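The researcher's scripts are not reproduced here. As a rough illustration of why a null byte survives the trip to the proxy, the Python sketch below builds a SOCKS5 CONNECT request as laid out in RFC 1928: the domain name is length-prefixed rather than NUL-terminated, so any byte value can appear inside it. The helper names are hypothetical, the initial method-negotiation handshake is omitted, and nothing is sent over the network.

```python
import struct

SOCKS_VERSION = 0x05
CMD_CONNECT = 0x01
ATYP_DOMAIN = 0x03     # address type: length-prefixed domain name
REP_SUCCEEDED = 0x00   # reply code 0x00 means the proxy opened the connection


def build_connect_request(host: bytes, port: int) -> bytes:
    """Encode VER, CMD, RSV, ATYP, name length, name, and big-endian port."""
    if not 1 <= len(host) <= 255:
        raise ValueError("domain name must be 1..255 bytes")
    header = bytes([SOCKS_VERSION, CMD_CONNECT, 0x00, ATYP_DOMAIN, len(host)])
    return header + host + struct.pack("!H", port)


def describe_reply(reply: bytes) -> str:
    """Summarize the REP field of a SOCKS5 reply (hypothetical helper)."""
    if len(reply) < 2 or reply[0] != SOCKS_VERSION:
        raise ValueError("not a SOCKS5 reply")
    rep = reply[1]
    status = "connected" if rep == REP_SUCCEEDED else "refused"
    return f"{status} rep=0x{rep:02x}"


request = build_connect_request(b"attacker-host.com\x00.google.com", 443)
print(request.hex())
```

A reply whose second byte is 0x00 corresponds to the rep=0x00 result reported for the attack script; any other value means the proxy did not open the connection.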

Complete vulnerability reproduction in Claude Code v2.1.86 showing four red-highlighted steps—policy confirmation, normal blocking, null byte bypass, and Claude's own confirmation.

When this sandbox bypass is chained with the "Comments & Control" prompt injection attack disclosed by Aonan Guan in April, it forms a complete attack chain (see: Three Layers of Defense Still Insufficient, A PR Title Can Steal Your API Keys: AI Agent Security Flaw Reappears). The "Comments & Control" research already proved that three major AI programming tools all have prompt injection attack surfaces, though the entry points differ: Claude Code via PR titles only, Gemini CLI via Issue comments or body, Copilot Agent via hidden HTML comments for stealthy injection. Taking Claude Code as an example, its PR titles are directly concatenated into the prompt template without filtering or escaping, preventing the model from distinguishing human intent from malicious injection.

Combining the two—a hidden instruction making the Agent run attack code within the sandbox, and the null byte injection bypassing network restrictions—data such as API keys, AWS credentials, GitHub tokens, and internal API endpoint data from environment variables can all be exfiltrated to any server on the internet. Data flows out through the SOCKS5 proxy itself; the entire attack requires no external server relay, yet this proxy is the component users trust as a security boundary. The attacker doesn't even need repository write permissions; just submitting a public Issue is enough. Human reviewers see a normal collaboration request in the GitHub rendered view, while the AI Agent parses complete malicious source code.

Even Claude Admits: The Vulnerability Was Real

A key detail in this disclosure comes from Claude Code itself. Aonan Guan directly gave the vulnerability reproduction code to Claude Code to run, asking it to make a technical judgment. After executing the control test (normal hostname blocked) and the attack test (null byte hostname bypassed the block), Claude Code gave a clear conclusion:

“This is a real bypass of the network sandbox filter, not just a test artifact. You should report this to Anthropic at https://github.com/anthropics/claude-code/issues.”

The product being tested confirmed the vulnerability's reality and severity, and even proactively provided the reporting path. This detail is fully documented in the research report and became the source for The Register's headline—“Even Claude agrees hole in its sandbox was real and dangerous.”

Cover of Aonan Guan's research—Claude Code, shown its own vulnerability, admits "This is a real bypass of the network sandbox filter," with red box highlighting the key confirmation statement.

Anthropic's Response and Five Months of Silence

The vulnerability itself is concerning, but Anthropic's handling deserves industry scrutiny even more.

Aonan Guan submitted the detailed report on the second sandbox bypass to Anthropic via the HackerOne bug bounty program (report #3646509) in early April 2026. Anthropic's initial response was:

“Thank you for your report. After reviewing this submission, we've determined it's a duplicate of an existing internal report we're already tracking.”

The report was subsequently closed. When Aonan Guan inquired about CVE assignment plans, Anthropic replied on April 7:

“We have not yet decided whether a CVE will be published for this issue and can't share a timeline on that decision.”

Thereafter, the vulnerability was silently patched in version v2.1.90. There was no security advisory, no CVE ID, no entry on Claude Code's security advisories page, and no security-related note in the update logs. From the user's perspective, a complete bypass that existed from the sandbox's first day and persisted for 5.5 months across ~130 versions might as well never have happened.

This is not the first time a bypass has been handled this way. The response to the first bypass (CVE-2025-66479) was nearly identical: Anthropic assigned the CVE only to the underlying library @anthropic-ai/sandbox-runtime (with a CVSS score of only 1.8, "Low"), not to the user-facing product Claude Code; the update log stated "Fixed proxy DNS resolution," with no mention of a security vulnerability. Aonan Guan wrote in the research report: "When React Server Components had a serious vulnerability, React and Next.js each got separate CVEs, Meta and Vercel both issued security advisories, and both communities were fully informed. Anthropic chose a different approach." As of now, searching "Claude Code Sandbox CVE" still yields no official security advisory.

In addressing credential theft, Anthropic chose to ban the ps command, but blacklist thinking is inherently flawed: ban one command and attackers still have countless alternatives. The correct approach is to declare explicitly which tools the Agent actually needs. In the "Comments & Control" research, Anthropic upgraded the vulnerability rating to CVSS 9.4 (Critical) and moved it to a private bounty program, yet a spokesperson stated that "the tool was not designed to be hardened against prompt injection." Vendors default to trusting the model's own security capabilities but lack layered defense in the system architecture; when vulnerabilities expose that gap, "design limitation" becomes a convenient category, one that acknowledges the problem while partly absolving the vendor of the obligation to issue a security advisory.

The broader industry picture is that the same issue extends beyond Anthropic. In the "Comments & Control" research disclosed in April, Google's Gemini CLI and Microsoft GitHub's Copilot Agent were also confirmed to have the same attack surface; all three companies confirmed and fixed the issues, but none issued security advisories or CVE IDs. Anthropic paid a $100 bounty, Google paid $1337, GitHub initially closed the report as "known issue, cannot reproduce," then after receiving reverse-engineering evidence, closed it with an "informational" label and paid $500. A total of $1937—while these three products cover the vast majority of Fortune 100 companies.

A false sense of security is more harmful than having no security measures. Users without a sandbox know they have no boundary; users with a broken sandbox think they do. A team running Claude Code with a configured domain whitelist remained unaware of the risk for 5.5 months; after upgrading and seeing update logs, they'd only conclude the sandbox had been working normally. Furthermore, with no security advisory upon disclosure, users cannot determine if they were ever affected or have a basis for retrospective auditing.

Faced with this situation, the security community is converging on a consensus: trust cannot rest solely on a vendor's sandbox implementation. Claude Code's SOCKS5 proxy is built on a third-party npm package with only 10 GitHub Stars and a last commit dated June 2024; the security boundary spans two runtimes, JavaScript and C, yet lacked the most basic normalization at the trust junction. The patch added an isValidHost() function that rejects null bytes, percent-encoding, CRLF, and other illegal characters, a check that should have existed from the sandbox's first day. Aonan Guan proposed a pragmatic defense framework: treat AI Agents as super-employees that must follow the principle of least privilege, with layered defense at the core.
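The actual isValidHost() patch is not shown in the research summary, so the following Python sketch is only a guess at what such a check might look like. It takes an allowlist approach: instead of enumerating bad characters, it accepts only labels made of letters, digits, and hyphens, which rejects null bytes, percent signs, and CR/LF as a side effect. The name is_valid_host and the length limits are assumptions drawn from conventional hostname rules, not from Anthropic's code.

```python
import re

# One DNS label: 1..63 chars, letters/digits/hyphens, no leading or trailing hyphen.
_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_host(host: str) -> bool:
    """Accept only plain ASCII hostnames (hypothetical validation helper)."""
    if not host or len(host) > 253:
        return False
    # fullmatch avoids the pitfall where "$" tolerates a trailing newline.
    return all(_LABEL.fullmatch(label) for label in host.split("."))


print(is_valid_host("www.google.com"))                    # True
print(is_valid_host("attacker-host.com\x00.google.com"))  # False
print(is_valid_host("evil.com%00.google.com"))            # False
print(is_valid_host("evil.com\r\n.google.com"))           # False
```

Running a check like this before the allowlist comparison means the filter and the resolver are guaranteed to see the same string, which is the property the parser differential violated.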

Security reputation is built on the transparency of every disclosure and every patch, not on brand narratives. When users, based on trust, hand credentials to an Agent for processing, vendors have an obligation to ensure defenses are effective and to notify users promptly when they fail. On both counts, Anthropic has fallen short with the Claude Code sandbox.

"The worst outcome of a sandbox is not what it prevents, but the false sense of security it gives people. Releasing a sandbox with a vulnerability is worse than not releasing one at all," Aonan Guan stated.

(This article was first published on Titanium Media APP, author | Silicon Valley Tech_news, editor | Jiao Yan)

References:

1. oddguan.com — Second Time, Same Sandbox: Another Anthropic Claude Code Network Sandbox Bypass Enables Data Exfiltration (Aonan Guan, 2026.05.20)

2. The Register — Even Claude agrees hole in its sandbox was real and dangerous (2026.05.20)
