Flower AI and Vana Are Building Advanced AI Models Without Data Centers

深潮 | Published on 2025-05-02 | Last updated on 2025-05-02

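A new crowdsourced approach to training large language models (LLMs) over the internet could shake up the AI industry later this year with a giant 100-billion-parameter model.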

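Researchers have trained a new kind of large language model (LLM) using GPUs scattered around the world and a mix of private and public data, a move that suggests the dominant way of building artificial intelligence could be disrupted. Two startups, Flower AI and Vana, took this unconventional approach to jointly create the new model, called Collective-1.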

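Flower developed techniques that allow training to be spread across hundreds of computers connected over the internet. Its technology is already used by some companies to train AI models without pooling compute resources or data. Vana, for its part, provided data sources including private messages from X, Reddit, and Telegram.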

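Collective-1 is small by modern standards, with 7 billion parameters (the values that combine to give the model its abilities), compared with hundreds of billions for today's most advanced models, such as those behind ChatGPT, Claude, and Gemini. Nic Lane, a computer scientist at the University of Cambridge and cofounder of Flower AI, says the distributed approach promises to scale far beyond the size of Collective-1. Lane adds that Flower AI is training a 30-billion-parameter model on conventional data and plans to train another model with 100 billion parameters later this year, close to the size offered by industry leaders. "It could really change the way everyone thinks about AI, so we're chasing this pretty hard," Lane says. He says the startup is also incorporating images and audio into training to create multimodal models.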

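Distributed model building could also unsettle the power dynamics that have shaped the AI industry. Today, AI companies build their models by combining vast amounts of training data with huge quantities of compute concentrated in data centers, which are packed with advanced GPUs networked together over super-fast fiber-optic cables. They also rely heavily on datasets created by scraping publicly accessible (although sometimes copyrighted) material, including websites and books.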

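This approach means that only the richest companies, and nations with access to large quantities of powerful chips, can feasibly develop the most powerful and valuable models. Even open source models, such as Meta's Llama and DeepSeek's R1, are built by companies with large data centers. Distributed approaches could make it possible for smaller companies and universities to build advanced AI by pooling disparate resources. Or they could allow countries that lack conventional infrastructure to network several data centers together to build a more powerful model.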

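Lane believes the AI industry will increasingly look toward new methods that allow training to break out of individual data centers. "The distributed approach lets you scale compute much more elegantly than the data center model," he says.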

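Helen Toner, an expert on AI governance at the Center for Security and Emerging Technology, says Flower AI's approach is "interesting and potentially very relevant" to AI competition and governance. "It will probably continue to struggle to keep up with the frontier, but could be an interesting fast-follower approach," Toner says.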

Divide and Conquer

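Distributed AI training involves rethinking how the computation used to build powerful AI systems is divided up. Creating an LLM involves feeding huge amounts of text into a model, which adjusts its parameters in order to produce useful responses to a prompt. Inside a data center, the training process is divided up so that parts can run on different GPUs and then be periodically consolidated into a single master model.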

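The new approach allows work normally done inside a large data center to be performed on hardware that may be miles apart and connected over a relatively slow or unreliable internet connection.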
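The idea of training separate copies and periodically consolidating them into one master model can be illustrated with a toy example. The sketch below is not Photon or any Flower AI code; it assumes a simple periodic parameter-averaging scheme (often called local SGD or federated averaging) on a two-parameter linear model, and every function name and setting in it is hypothetical. Each worker takes many training steps on its own data shard without communicating, and only the resulting parameters are exchanged and averaged once per round, which is what makes slow links tolerable.

```python
import random

def local_sgd_round(weights, shard, lr, local_steps, rng):
    """Run several SGD steps on one worker's own shard, fitting y = w*x + b."""
    w, b = weights
    for _ in range(local_steps):
        x, y = rng.choice(shard)      # sample one local example
        err = (w * x + b) - y         # prediction error for squared loss
        w -= lr * err * x             # gradient step on w
        b -= lr * err                 # gradient step on b
    return (w, b)

def average(models):
    """Consolidate worker models by averaging each parameter."""
    n = len(models)
    return (sum(m[0] for m in models) / n, sum(m[1] for m in models) / n)

def train(shards, rounds=200, local_steps=20, lr=0.05, seed=0):
    """Alternate between independent local training and one averaging step.

    Workers exchange parameters only once per round, so communication
    happens every `local_steps` updates rather than after every update.
    """
    rng = random.Random(seed)
    global_model = (0.0, 0.0)
    for _ in range(rounds):
        local_models = [
            local_sgd_round(global_model, shard, lr, local_steps, rng)
            for shard in shards
        ]
        global_model = average(local_models)
    return global_model

def make_shards(num_workers=4, per_worker=50, seed=1):
    """Build one synthetic data shard per worker from the line y = 3x + 1."""
    rng = random.Random(seed)
    shards = []
    for _ in range(num_workers):
        shard = []
        for _ in range(per_worker):
            x = rng.uniform(-1.0, 1.0)
            shard.append((x, 3.0 * x + 1.0))
        shards.append(shard)
    return shards

if __name__ == "__main__":
    w, b = train(make_shards())
    print(f"learned w={w:.3f}, b={b:.3f}")
```

Raising `local_steps` reduces how often workers must communicate, at the cost of letting their copies drift further apart between averaging steps. Real systems for LLM-scale training add much more on top of this, such as compression, fault tolerance, and handling workers that join or leave.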

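Some big players are also exploring distributed learning. Last year, researchers at Google demonstrated a new scheme for dividing and consolidating computations, called DIstributed PAth COmposition (DiPaCo), that makes distributed learning more efficient.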

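To build Collective-1 and other LLMs, Lane and academic collaborators in the UK and China developed a new tool called Photon that makes distributed training more efficient. Lane says Photon is more efficient than Google's approach in how it represents data and in how it shares and consolidates training. The process is slower than conventional training but more flexible, allowing new hardware to be added to speed up training.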

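Photon was developed in collaboration with researchers at Beijing University of Posts and Telecommunications and Zhejiang University. The group released the tool under an open source license last month, allowing anyone to use the approach.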

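Flower AI's partner in the effort to build Collective-1, Vana, is developing new ways for users to share personal data with AI builders. Vana's software allows users to contribute private data from platforms like X and Reddit to the training of a large language model, and potentially to specify which end uses are permitted or even to benefit financially from their contributions.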

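Anna Kazlauskas, cofounder of Vana, says the idea is to make untapped data available for AI training while giving users more control over how their information is used for AI. "This is data that isn't usually able to be included in AI models because it's not publicly available," Kazlauskas says, "and it is the first time that data directly contributed by users is being used to train a foundation model, with users given ownership of the AI model their data creates."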

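Mirco Musolesi, a computer scientist at University College London, says a key benefit of distributed AI training is likely to be that it unlocks new kinds of data. "Scaling this to frontier models would allow the AI industry to leverage vast amounts of decentralized and privacy-sensitive data, for example in health care and finance, for training without the risks associated with data centralization," he says.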

What do you think about distributed machine learning?
