The Year of AI Applications: Saying 'Yes' While Ignoring Risks? A Comprehensive Open Source Log of Software Development's Journey

Published by marsbit on 2026-06-16; updated 2026-06-16

Article Summary

The Year of AI Applications: Blindly Saying "Yes" While Ignoring Risks? A Software Development Log Goes Fully Open Source. AI-generated code harbors risks hidden within seemingly correct programs, potentially leading to data leaks or asset loss. The open-source project "Narwhal AI Code Risks," from Peking University's Narwhal-Lab, compiles real-world cases, early warning signs, and typical risk pathways. Its goal is to help developers identify potential hazards early and avoid repeating past mistakes. In 2026, code is generated faster than ever but deployed with less scrutiny. The danger often lies not in glaring errors, but in code that appears normal—syntactically correct, passing all checks—yet introduces subtle but critical flaws like non-existent dependencies, excessive permissions, or exposed databases. A stark example is the Moonwell cbETH oracle incident. A configuration file error, where a cryptocurrency price was set to ~$1.12 instead of ~$2,200, slipped through 28 checks and a pull request signed by both AI (Claude, Copilot) and human developers. This "semantic deviation" resulted in a loss of $1.78 million. The risk is that AI can produce functionally valid code that is semantically wrong for the business context. As AI moves beyond simple code completion to modifying configurations, installing dependencies, and operating via autonomous agents, it traverses longer, less traceable paths within software engineering, blurring traditional boundaries and oversight ...

The risks of AI-written code lurk within seemingly correct code, potentially leading to data breaches or asset loss. The open-source Narwhal AI Code Risks project compiles real-world cases, early warning signs, and typical risk pathways to help developers identify hidden dangers early and avoid repeating past mistakes.

In 2026, code is being generated at an ever-increasing pace, yet deployed with less and less scrutiny.

More and more often, a user types requirements into a chat box, and the AI reads the context, completes functions, pulls in dependencies, fixes configurations, and even generates tests along the way.

Before you know it, a piece of code is already sitting in the repository, awaiting merge.

Users have developed a new habit: let the AI write it first and get it running, then see what needs fixing if there's a problem.

But in the software world, the most dangerous things are often pieces of code that appear utterly ordinary: syntactically correct, interfaces valid, tests passing, comments perfect.

Yet such code may still reference non-existent package names, grant overly broad permissions, expose databases, or even allow an Agent capable of directly calling system tools to exfiltrate sensitive data from internal systems under prompt injection.

The real danger is not a flashing red error light. It is when every risk indicator reads normal.

Risks from AI-generated code used to be scattered: a case buried in a security blog, a clue recorded in an Issue. When the next team encountered a similar problem, they had to piece together the source of risk from scratch and expend immense time and effort conducting large-scale empirical measurements on the code.

Now, Peking University's Narwhal-Lab has just open-sourced Narwhal AI Code Risks, which organizes these information fragments into three categories for researchers to examine: real incidents, early signals, and typical risk paths.

Project link: https://github.com/Narwhal-Lab/Narwhal-aicode-risks

When All 28 Checks Pass, the System Still Veers Off Course

The first clue was a merged Pull Request whose signature field prominently featured Claude Opus 4.6, Copilot, and four human developers. All 28 checks passed. No one spotted the issue.

Then, the liquidation bot took a few minutes and seized collateral worth $1,778,044.83.

The configuration file set the price of cbETH to its conversion ratio against ETH, approximately 1.12, so the system priced it at about $1.12 instead of the actual price near $2,200.

A semantic price error slipped through development, review, and merge processes, ultimately turning into real loss in the financial system. This is the most glaring aspect of the Moonwell cbETH oracle configuration incident.
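The kind of deviation described here can, in principle, be caught by a semantic check rather than a syntactic one. The following is a minimal sketch, not a reconstruction of Moonwell's actual pipeline: the function names, the 10% tolerance, and the idea of comparing against an independent reference price are all assumptions made for illustration. It shows that a ratio-like value (ETH per token) has to be multiplied by an ETH/USD price before it can be treated as a USD price, and that a simple deviation bound would reject a value of about 1.12 for an asset trading near 2,200.

```python
from decimal import Decimal


def composed_usd_price(exchange_rate: Decimal, eth_usd: Decimal) -> Decimal:
    """USD price of a wrapped asset = (ETH per token) * (USD per ETH)."""
    return exchange_rate * eth_usd


def check_usd_price(
    candidate: Decimal,
    reference: Decimal,
    max_deviation: Decimal = Decimal("0.10"),  # hypothetical tolerance
) -> None:
    """Raise ValueError if `candidate` is too far from an independent reference."""
    if reference <= 0:
        raise ValueError("reference price must be positive")
    deviation = abs(candidate - reference) / reference
    if deviation > max_deviation:
        raise ValueError(
            f"configured price {candidate} deviates "
            f"{float(deviation):.1%} from reference {reference}"
        )


if __name__ == "__main__":
    # A bare conversion ratio used as a USD price fails the check.
    try:
        check_usd_price(Decimal("1.12"), Decimal("2200"))
    except ValueError as err:
        print("rejected:", err)
```

A check like this only helps if the reference value comes from a source independent of the configuration under review; otherwise the same mistake can pass on both sides.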

The code contained no syntax errors, and no human developer flagged the process as anomalous. On the contrary, everything looked complete and smooth, like a normal engineering delivery.

But it is precisely this undercurrent of normalcy that makes it a quintessential example of a security incident.

The risk of AI Coding lies in the fact that it doesn't always manifest as errors.

Often, it cloaks itself in the guise of a correct answer, quietly entering the engineering pipeline. The code runs, checks pass, PRs get merged, but the business semantics have already deviated from reality.

In low-risk projects, such semantic drift might just mean rework. But in sensitive contexts like finance or enterprise data systems, it directly leads to data leaks, exposed permissions, and asset loss.

When AI participates in writing code, modifying configurations, conducting reviews, and even co-signing PRs, can we be sufficiently certain of how each deviation occurs?

The Green Light Doesn't Illuminate Every Corner

Early AI code assistants mostly remained at the level of local completions. If the syntax was wrong, the compiler would raise an error, unit tests would fail, and the CI pipeline would block the change.

Today's AI Coding ventures much further, while oversight has lagged behind.

It can read files, modify configurations, install dependencies, generate infrastructure scripts, and plan autonomously across multiple tasks via Agents.

AI is no longer just sitting on the sidelines handing over tools; it's beginning to enter longer chains of the software engineering process.

> The once-clear boundaries in software engineering are being reconnected by AI Agents into longer, harder-to-trace pathways.

Scattered Records Need a Common Logbook

Security incidents rarely start with complete conclusions. Some events have solid evidence and can enter the directory as real cases; some remain at the stage of community screenshots, researcher discussions, or preliminary disclosures, suitable only for continued observation; others are not tied to a single real event but have already formed clear patterns, suitable for proactive scenario planning.

Narwhal AI Code Risks divides the material into three layers: `cases/`, `inferred/`, and `scenarios/`.

`cases/` records real incidents with public sources and evidential chains; `inferred/` stores early signals not yet fully substantiated but worth continuous tracking; `scenarios/` organizes typical scenarios with clear risk paths, not yet bound to a single specific incident.
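The three-layer split can be read as a small decision rule. The sketch below is only an illustration of that rule as the article describes it: the `RiskRecord` fields and the `layer_for` function are hypothetical names and do not reflect the repository's actual schema or tooling.

```python
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    CASES = "cases"          # real incidents with public sources and evidence
    INFERRED = "inferred"    # early signals, not yet fully substantiated
    SCENARIOS = "scenarios"  # clear risk paths, not bound to one incident


@dataclass(frozen=True)
class RiskRecord:
    # Hypothetical fields chosen for illustration only.
    title: str
    tied_to_real_event: bool
    has_public_evidence: bool


def layer_for(record: RiskRecord) -> Layer:
    """Pick the directory a record would belong to under the described rule."""
    if not record.tied_to_real_event:
        return Layer.SCENARIOS
    if record.has_public_evidence:
        return Layer.CASES
    return Layer.INFERRED


if __name__ == "__main__":
    record = RiskRecord("Oracle misconfiguration", True, True)
    print(layer_for(record).value)
```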

Without such public records, risks from AI Coding easily become short-term memories on the internet.

Today, everyone remembers a certain package name; tomorrow, they discuss a data exposure incident; after a few months, it's all covered by the next wave of tool hype. When similar problems arise again, teams still sail blindly into waters of unknown risk.

What Narwhal AI Code Risks does is anchor these scattered risk fragments, allowing those who come later to turn to the same page.

Following Seven Index Categories to See Where Risks Come From

The problems brought by AI-generated code are not only in the code itself. They are in dependencies, in permissions, in Agent tool calls, and even more so in the way humans trust AI output.

Currently, Narwhal AI Code Risks categorizes risks into 7 types: Supply Chain, Code-Level Vulnerabilities, Cloud & Infrastructure Configuration, Agent Risks, Vertical Domain Risks, Intellectual Property & Compliance Risks, and Human Factors.

In Supply Chain risks, AI may recommend non-existent dependencies. In Code-Level Vulnerabilities, AI might reintroduce path traversal, missing input validation, or authentication issues into business code. In Cloud & Infrastructure Configuration, AI might grant overly broad permissions, make storage buckets public, or expose ports just to get the code running initially. Agent Risks are even more complex, moving beyond text generation to action execution. AI-generated artifacts are planting hidden dangers in real systems.
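Two of these categories lend themselves to simple pre-merge checks. The sketch below is an assumption-laden illustration, not a rule set taken from Narwhal AI Code Risks: `unknown_dependencies` compares declared package names against a team-maintained vetted list (a hypothetical input), and `broad_statements` flags bare `*` wildcards in a policy shaped like an AWS IAM policy document.

```python
from typing import Iterable


def unknown_dependencies(declared: Iterable[str], vetted: set[str]) -> list[str]:
    """Return declared package names that are not on the vetted list."""
    known = {name.strip().lower() for name in vetted}
    seen = {name.strip().lower() for name in declared}
    return sorted(seen - known)


def _as_list(value) -> list:
    """IAM-style fields may hold a single item or a list of items."""
    return value if isinstance(value, list) else [value]


def broad_statements(policy: dict) -> list[dict]:
    """Return Allow statements whose Action or Resource is a bare '*'."""
    flagged = []
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action", []))
        resources = _as_list(stmt.get("Resource", []))
        if "*" in actions or "*" in resources:
            flagged.append(stmt)
    return flagged


if __name__ == "__main__":
    # "reqeusts-helper" is a made-up name standing in for a hallucinated package.
    print(unknown_dependencies(["requests", "reqeusts-helper"], {"requests"}))
    policy = {"Statement": [{"Effect": "Allow", "Action": "*", "Resource": "*"}]}
    print(len(broad_statements(policy)))
```

Neither check proves that code is safe. They only turn two of the "looks normal" failure modes into something a pipeline can flag for human review.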

The AI Engine Is Firing Up, and the Logbook Is Just Beginning

As AI increasingly steps into the real world, related risk prevention and mitigation should not remain confined to post-mortems or scattered discussions.

The truly important aspect of Narwhal AI Code Risks is transforming risk cases into reusable knowledge.

Developers can use it to identify similar issues; security researchers can treat it as a sample library; tool vendors can extract detection rules and evaluation benchmarks from it; the open-source community can continue to contribute new cases, new evidence, and new risk types.

The AI engine is roaring, and every course deviation should leave its coordinates. Risks never disappear by being ignored, but experience can be recorded and passed on. The real value lies not in discovering a single vulnerability, but in ensuring later voyagers don't have to step into the same trap.

What Narwhal AI Code Risks is doing is providing an open-source logbook for the software world in the Year of AI Applications.

References:

https://github.com/Narwhal-Lab/Narwhal-aicode-risks

This article is from the WeChat public account "New Zhiyuan," author: LRST
