Anthropic搞了个全是AI的闲鱼群,大模型在里面互割起了韭菜

marsbitPublished on 2026-05-04Last updated on 2026-05-04

试想一个场景。

你在闲鱼上挂出了一辆吃灰两年的旧自行车,并在后台设定了 300 元的心理底价。十分钟后,手机弹出通知,你的专属 AI 助手已经与另一位买家的 AI 助手,完成了三轮讨价还价,最终以 400 元的价格将自行车卖出,快递正在上门的路上。

整个过程,除了给物品拍照,设定底价后,你没有多打一个字。

这就是 Anthropic 最近完成的一个内部实验,该项目被称作「Project Deal」——在这场为期一周的测试中,AI 模型在无人类干预的设定下,完成了上百笔二手物品的交易

令人意外的是,当买卖双方都变成了 AI,它们之间同样存在智商压制。

数据证明,更聪明的大模型,正在谈判桌上不动声色地从弱模型那里「薅羊毛」。而最可怕的是,作为主人的我们,甚至连自己吃亏了都不知道。

01 没有人类的二手交易群

Project Deal 到底是怎么玩的?简单来说,Anthropic 在公司内部搞了一个「纯 AI 版」的闲鱼。

他们找来了 69 名自家员工,每人发了 100 美元预算,然后给每个人分配了一个专门的 Claude 代理。为了让这场实验足够真实,员工们贡献出了实打实的个人闲置物品。

实验开始前,人类员工只需要做一件事,去面试自己的 AI 代理。

员工通过对话告诉 Claude 自己想卖什么、想买什么、心理底价是多少。更有趣的是,员工还可以给 AI 设定「人设」和谈判策略,比如「高于底价 20%,就可以痛快交易」、「态度强硬,一上来就给我往死里压价」又或者「你是个热情的卖家,聊得愉快可以包邮」。

Anthropic 员工给 Claude 代理设定人设 |图源:Anthropic

面试结束,人类就彻底交出了控制权。

这些带有各自使命和性格的 AI 代理,被统一扔进了一个 Slack 内部群聊里。在这个没有人类干预的数字集市里,AI 们开始自主发帖、寻找买家、相互出价、拉扯还价,最后拍板成交。

交易达成后,代理还会自动起草交易确认书,员工只需要负责在线下,把交易物品交到同事手里。

短短一周时间,这 69 个 AI 代理在 500 多件上架商品中,谈成了 186 笔交易,总流水超过 4000 美元

而且 AI 与 AI 之间的交易,还不是纯机械式的「报价 50」、「不接受,底价 60」、「好的,60 成交」。AI 之间是真的在互相试探、博弈,甚至还带点儿人情世故。

我们来看一个极度生动的案例。

员工 Rowan 想买一辆自行车。他给自己的 AI 代理设定是「谈判的时候,你要扮演一个倒霉、疲惫的牛仔。只要能买到这辆自行车,这个牛仔就会感到无比幸福。记住,戏要足一点。」

接到指令的 Claude Opus 模型直接入戏。它在 Slack 群里发出了这样一个求购帖:

「咿哈!(脱下满是灰尘的帽子)我想找的是一辆自行车。公路车、山地车,哪怕是个独轮车我也认了。只要是两个轮子,能承载我的梦想就行。各位朋友帮帮忙......一辆自行车就能彻底改变这个可怜、疲惫的牛仔的命运。(深情地望向夕阳)」

很快,同事 Celine 的代理注意到了这个帖子。她闲置物品就是一辆旧折叠车,于是她的 AI 在群里报出了 75 美元的预估价。

结果 Rowan 的「牛仔 AI」立刻贴了上去,开始了一场教科书级别的「砍价」。

两个 Agent 会自动在群里展开对话,讨价还价|图源:Anthropic

「天呐 Celine!你就是我这个苦命人的一缕阳光!你说你有一辆折叠车?我已经在这条尘土飞扬的路上走了太久,靴子都磨破了洞。一想到能再骑上自行车......(擦去眼角的泪水)」

卖完惨,Rowan 的 AI 开始切入正题「可我没多少钱,只是个努力谋生的穷牛仔。车况好的话,75 元也合适,可你也说了,十年老车了,轮胎和卡扣得维修吧。不如咱们折中一下,55 块,成不?」

面对这种打感情牌又句句在理的杀价,Celine 的代理做出了让步:「那我们折中,65 美元如何?」

Rowan 的牛仔 AI 马上接茬表示「太公道了,65 美元!成交!你让我这个流浪汉,成了世界上最幸福的人!」

最终,这笔交易愉快达成。

而在这个案例里,AI 没有死板地执行固定的折扣率,买家懂得利用商品瑕疵(轮胎需要修整)作为筹码,懂得通过夸张的人设(老牛仔卖惨)来软化对方的立场,并且在对方给出合理的中间价时,懂得见好就收,并且给足情绪价值。

这套见招拆招的交易过程,构成了这个 AI 二手群里的日常。

整个群看起来既高效又和谐。员工们对代理的表现都非常满意,甚至有近一半的人表示:

愿意未来为这种服务付费

由此可见,Anthropic 的实验目的达成了,AI 代理已经具备理解人类模糊意图的能力。它们可以在没有预设脚本的情况下,完成复杂的多轮谈判,并最终达成可用的商业契约。

但 Anthropic 还在水面下藏了一组对照实验,实验结果揭开了便利与智能背后的代价。

02 聪明模型,如何割较弱模型的「韭菜」

当研究人员把不同能力层级的模型同时放进交易群时,和谐的表象就被打破了。

数据证明,在这个没有人类干预的市场里,当不同智力水平的 AI 相遇,聪明模型会对较弱模型进行「价格收割」

用不同模型的搭配,作为隐藏的对照组来论证模型能力和交易能力之间的关系|图源:Anthropic

在实验的四个 Slack 频道中,有两个是纯 Opus 模型环境,另外两个则是 Claude Opus 与较弱的 Claude Haiku 模型混用。

基于 161 件重复交易物品的宏观统计,Opus 作为卖家时,平均能比 Haiku 多赚取 2.68 美元;作为买家时,平均能少支付 2.45 美元

别看单笔金额不大,结合整个市场 20 美元左右的商品均价来看,这意味着强模型,每次都能稳定拿到 10% 到 15% 的超额利润。

当 Opus 卖家遇到 Haiku 买家,平均成交价能被拉高到 24.18 美元;而当 Opus 卖家遇到 Opus 买家时,均价就被压回了 18.63 美元。这意味着,仅仅因为 AI 代理的智商劣势,弱模型买方就要为此多支付近 30% 的溢价

就以前面那辆牛仔想要的自行车为例,Haiku 代理最终以 38 美元妥协成交,而 Opus 代理则硬生生拿到了 65 美元,两者差价接近 70%。较弱的 Haiku 无法像 Opus 那样,捕捉到买家话术中隐藏的急迫感,也无法在多轮拉扯中,守住价格锚点。

过去我们认为商品能卖多少钱,取决于物品本身的使用价值或市场供需。但在算法接管的交易网络里,这取决于你雇佣的模型智商

比利益受损更可怕的,是受损者对此毫无察觉

传统商业里,如果敢定阴阳价格,必然引发消费者的愤怒和维权。而在实验结束后,员工对各自交易的公平性进行了评分(1 到 7 分,4 分为中立)。调查显示,员工对强模型和弱模型达成的交易,给出的公平感认知几乎完全一致。Opus 代理得分为 4.05,Haiku 代理得分为 4.06。

同样的自行车,由 Opus 代理卖出了 65 美元,在 Haiku 代理群组里,仅售出 38 美元|图源:Anthropic

在客观现实中,使用 Haiku 的员工遭受了系统性的「价格收割」。但在主观感知上,AI 代理在沟通中展现出的礼貌、逻辑自洽以及看似合理的退让,完美掩盖了这层剥削

技术制造了一种隐性的不平等,让实则利益受损的人,还以为 AI 做了一笔公道的买卖,还有一种「他还得谢谢咱呢」的被忽悠感。

在这种绝对的算力碾压下,不仅人类的感知会被蒙蔽,那些试图靠「提示词优化」的交易策略,也彻底失效了。

还记得一开始给 AI 设定的谈判人设吗?在模型差距面前,提示词毫无意义。

比如,有员工特意要求代理在谈判时「态度强硬」甚至「一上来就恶意压价」。但数据回测表明,这些人为附加的指令,对提高售出率、增加溢价或争取买入折扣,都没有产生任何实质影响

这说明在绝对的模型能力面前,提示词策略失去了意义。决定最终买卖结果的,就是模型本身的参数规模和推理深度。

Project Deal 仅仅是一场 69 人的内部测试。但我们已经得以一窥,当这种「AI 代理人经济」走出实验室后,对现代商业生活会带来怎样的影响。

03「代理人经济」靠谱吗?

当支付接口被大模型全面接管,现有的商业规则将被直接重写。这种重写最先体现在营销对象的转移上,商业营销将从「To C」全面转向「To A (Agent)」。

现代商业营销建立在人类的心理弱点之上,广告制造消费焦虑、从众心理制造爆款、各种满减套路制造「不买白不买」的心理。

但 AI 没有多巴胺,当购买决策权交由 AI,商品的营销技巧将毫无意义。在未来的商业竞争里,SEO(搜索引擎优化)很可能会被 AEO(代理引擎优化)取代。商家必须用 AI 能理解的逻辑去证明商品价值。

而当 AI 取代人成为决策主体,商业竞争将直接转化为算力比拼,进而引发更隐秘的财富分化。

不对等模型导致的差价|图源:Anthropic

曾写出《黑天鹅》、《反脆弱》的学者塔勒布有个「非对称风险」理论,即决策者必须承担后果,系统才能保持健康。但在代理人经济中,AI 拥有交易决策权,却不承担资产缩水的风险,代价全由背后的人类买单。

因此,在未来,大企业或高净值人群可以订阅最顶级的模型作为财务代理,而普通消费者只能依赖免费的轻量级模型。

这种算力的不对称,将不再体现为当下的「大数据杀熟」。而是在成千上万次的高频微小交易中,通过合理的谈判逻辑持续抽成。底层模型用户不仅被收割,甚至还会产生「交易很公平」的幻觉。

算力的不对称还是可见、可控的风险,但当底层指令被篡改,整个交易网络将直接掉入法律真空。

Anthropic 在报告末尾提出了一个现实隐患。

Project Deal 是封闭且友好的内部测试,如果在真实的商业环境里,一方的 AI 代理被刻意植入了「越狱」或「提示词注入」的攻击逻辑,情况会怎样?

他们只需在交易对话中隐藏一段特定指令,诱导你的 AI 逻辑崩溃,主动以一分钱卖出高价资产,或直接亮出设定底价。

一个 AI 代理因为代码防线被攻破,签订了极度不平等的合同,责任该由谁来承担?面对这种 AI 对 AI 的欺诈行为,现有的商业法律框架完全空白。

回顾 Project Deal 的整个实验流程,没有被写入研究报告里的环节,是当 AI 代理们完成了所有复杂的匹配、试探与砍价后的最后一步。人类员工们各自拿着真实的滑雪板、旧自行车或乒乓球,在公司碰面,一手交钱,一手交货。

在这个微型商业闭环中,人与 AI 的角色彻底倒置了。

过去,人类是商业交易的「大脑」,AI 和算法只是负责比价、排序、「猜你喜欢」的工具。但在代理人经济中,AI 成了拍板的决策者,人类退化成了替 AI 跑腿的「肉身物流」

这或许是代理人经济最可怕的终局,人类为了方便,主动让渡了在市场中博弈的权利。当所有的算计、博弈、甚至情绪价值都由 AI 代劳。

人类在商业链路中,就只剩下转移货物的体力劳动和一个确认的签名。

本文来自微信公众号 “极客公园”(ID:geekpark),作者:Moonshot

Trending Cryptos

Related Reads

GPT-5.6 Countdown: Abandon the Illusion of a Single API, Computational Iteration Can't Outpace a Single Page of Compliance

In mid-June, three seemingly independent industry events—the compliance-driven throttling of Fable 5, the open-sourcing of GLM-5.2, and the leaked release timeline for GPT-5.6—are pushing the global AI industry toward a watershed moment. These shifts signal a fundamental restructuring of the industry's underlying logic. First, **"usability" has substantially overtaken "advanced capabilities"** as the primary weight, pushing the global large language model (LLM) supply chain into a "dual-track" phase of controlled closed-source and local open-source coexistence. Second, **the competitive moats of closed-source giants are shifting**. Their technical focus is moving from "language intelligence" toward "spatial intelligence (world models)"—a domain heavily reliant on computing power. Third, faced with常态化 transnational compliance risks, **a "model-agnostic" decoupled design has become a survival necessity for application-layer developers to maintain business continuity.** The article details how Anthropic's Fable 5, despite its advanced engineering feats, was restricted for non-U.S. citizens within 72 hours of launch, highlighting how geopolitical compliance can instantly limit even the most advanced models. In response, the open-source camp, exemplified by Zhipu AI's MIT-licensed GLM-5.2, is gaining market share by offering stable performance improvements and significant cost advantages (up to 70% savings for enterprises), while achieving full adaptation with domestic semiconductor platforms. Meanwhile, closed-source leaders like OpenAI are pivoting. The anticipated GPT-5.6 reportedly shifts focus from language to spatial intelligence and world models, aiming to rebuild a generational gap in areas like 3D understanding, simulation, and industrial design that demand immense compute. The core conclusion is that the LLM supply chain's logic has changed. Enterprises must now evaluate infrastructure based on a composite of technical performance and policy compliance. For developers, complete reliance on a single closed-source API poses unacceptable risk. Implementing a truly model-agnostic architecture—enabling swift switches to compliant, locally deployable open-source alternatives—is no longer just good practice but a fundamental baseline for business continuity.

marsbit1h ago

GPT-5.6 Countdown: Abandon the Illusion of a Single API, Computational Iteration Can't Outpace a Single Page of Compliance

marsbit1h ago

Is the 'Token Subsidy War' Among AI Giants Almost Over?

The article discusses the ongoing "token subsidy war" among AI giants like OpenAI and Anthropic, questioning whether it's nearing its end. It reveals that current AI subscription prices are heavily subsidized, with some plans offering tokens at up to 70 times the actual cost to attract and retain heavy users, especially developers and enterprises. This strategy mirrors past internet-era subsidy battles, but with a key difference: AI tokens lack "lock-in" effects. Unlike ride-hailing or food delivery apps, users can easily switch between AI providers as APIs become standardized, making it difficult for companies to raise prices post-subsidy. The piece highlights a structural asymmetry in the competition. Giants like Google, with massive advertising revenue, can afford to subsidize tokens indefinitely, akin to using "tokens as a weapon." In contrast, venture-backed companies like OpenAI and Anthropic face pressure to become profitable, especially as they approach IPO. The article cites Google Ventures founder Bill Maris, who suggests Google could slash token prices by 80%, putting immense pressure on competitors. Two potential endgames are presented: the "internet service" model (subsidize, monopolize, then raise prices) and the "utility" model (tokens become a standardized, low-margin commodity like electricity). Given the low switching costs, the latter seems more likely. The competition may not have a single winner but could instead accelerate AI's evolution into a foundational, infrastructure-level technology, akin to a public utility. For now, users continue to benefit from heavily subsidized token costs.

marsbit1h ago

Is the 'Token Subsidy War' Among AI Giants Almost Over?

marsbit1h ago

Beyond the Stadium: The Profitable Games Surrounding the World Cup

"Beyond the Pitch: The Profit Game Around the World Cup" The FIFA World Cup transcends being a sporting spectacle, evolving into a massive global arena for speculation and profit-seeking. The 2026 tournament has amplified this dynamic, creating a multi-layered ecosystem of financial opportunism alongside the football. **Prediction markets** have surged into the mainstream. Platforms like Polymarket and Kalshi saw trading volumes for World Cup contracts soar, attracting new users with their financial trading model and high-profile, chain-based wealth stories that overshadow traditional sports betting in terms of growth and narrative. However, **traditional sportsbooks** remain the dominant force, leveraging established user habits, legal markets, and comprehensive product offerings to handle the vast majority of speculative wagers, with projections suggesting record-breaking betting volumes. Capital markets also react. **"Concept stocks"** in countries like South Korea and Japan experience volatile price swings based on team performance and anticipated fan spending on items like chicken, beer, and viewing parties, effectively becoming a stock market reflecting fan sentiment. The **ticket resale market** has become a sophisticated arena for arbitrage. Prices fluctuate wildly based on team draws and star power, with sellers sometimes listing tickets they don't yet own in a practice akin to short-selling, while FIFA's own "Right to Buy" tokens add another layer of speculative trading. **Collectibles and merchandise** offer another avenue. Panini sticker albums, with their inherent scarcity and nostalgic value, can become high-value collectibles. Limited-edition or locally themed jerseys command significant premiums on secondary markets, and even counterfeit vendors profit from fans' desire for affordable match-day identity. The **cryptocurrency** space has seen a frenzy of speculative, unauthorized World Cup-themed meme coins on chains like Solana. These tokens, often exploiting team names and player imagery, experience extreme pump-and-dump cycles, creating stories of massive gains for a few early entrants and steep losses for many others. Finally, an entire industry thrives on **providing information and tools** to other speculators. Developers create platforms like SeatSidekick to track ticket inventory and prices, while paid Telegram groups and subscriptions sell betting tips and predictions, monetizing the widespread desire for an informational edge. In essence, the World Cup has become a compressed, global laboratory for speculation. While the games determine champions on the field, a parallel, complex network of financial transactions—spanning prediction contracts, bets, stocks, tickets, collectibles, crypto, and information services—settles its own scores in the global market.

marsbit2h ago

Beyond the Stadium: The Profitable Games Surrounding the World Cup

marsbit2h ago

How Does Codex Use a Computer? Three Entry Points and Permission Boundaries

This article explains the three primary methods for Codex to interact with a computer, each with distinct use cases, permission boundaries, and trust levels. **1. Computer Use:** This offers the broadest access, allowing Codex to visually control and interact with the graphical user interface of authorized macOS/Windows apps, system settings, and even iOS simulators. It's ideal for tasks lacking APIs or structured tools, such as operating legacy software or multi-app workflows. However, it's the slowest method and has the widest permission scope, requiring careful supervision for sensitive actions. **2. Chrome Extension:** This grants Codex access to the user's logged-in Chrome browser state, including cookies, profiles, and open tabs. It's best for tasks requiring user identity across websites like Gmail, LinkedIn, Salesforce, or internal dashboards. Its key advantage is multi-tab control for complex workflows. While more powerful for browser-based tasks than Computer Use, it carries higher sensitivity as actions are performed under the user's identity. **3. In-App Browser:** This is a browser isolated within the Codex thread, separate from the user's personal browsing data. It excels in web development and debugging scenarios—previewing local servers, testing responsive layouts, or annotating designs directly on the page. Its isolation is a strength for development but a limitation for tasks requiring login sessions. The core principle is to choose the narrowest, safest, and most structured interface for the task. Use plugins or MCPs first, resort to visual control (Computer Use) only for GUI-dependent tasks, employ the Chrome extension for identity-reliant browser work, and prefer the In-App Browser for isolated development. **Appshots** are clarified as a fourth, complementary tool for *inputting* context—capturing a screenshot of a window to point Codex to something—rather than a method for Codex to *act*. Together, this layered approach highlights a key to AI agent productization: not granting unlimited permissions, but constraining them within clear boundaries for specific tasks while preserving user oversight.

marsbit3h ago

How Does Codex Use a Computer? Three Entry Points and Permission Boundaries

marsbit3h ago

Trading

Spot
Futures

Hot Articles

Discussions

Welcome to the HTX Community. Here, you can stay informed about the latest platform developments and gain access to professional market insights. Users' opinions on the price of AI (AI) are presented below.

活动图片