抱抱脸模型TOP榜，我现在只服yuxinlu1

marsbitPubblicato 2026-06-28Pubblicato ultima volta 2026-06-28

Introduzione

个人开发者yuxinlu1凭借两个开源模型冲上Hugging Face热榜前列，力压多家大厂模型，总下载量超70万。其模型基于Gemma 4-12B，主打代码编程（V1）和智能体工具调用（V2）能力，并以GGUF量化格式发布，最低仅需约4.5GB显存即可本地运行，兼顾隐私与免费。V1专注生成可验证代码，V2增强了多步任务处理能力，在特定基准测试上表现达到基座的3.5倍。作者逯雨鑫是在美AI方向研究生，项目纯属自费的个人探索。他投入约40小时，重点处理了高质量训练数据，并积极响应用户反馈。他认为个人开发者的优势在于能更专注解决具体痛点，而非追求全能。其成功也源于对本地、低门槛AI助手的定位，满足了大量用户对隐私和免费使用的需求。除代码模型外，他早期还发布过中文网文生成LoRA等。他强调开源需真诚说明模型能力，并坚持应对挑战。目前其V3及基于Qwen3.6-27B的更大版本已在计划中。模型最适配llama.cpp平台。

一位个人开发者,竟然在一众大厂中,杀进了抱抱脸Models Trending榜的前排?!

这是普通的一天,我也普通地刷着抱抱脸的Trending榜。

第一是GLM-5.2,智谱最新开源模型,老熟人了,下载量6万多,不足为奇。

第二是百度的无限OCR,最近悄悄开源的,一口气能解析40多页文档,下载量也来到了7万。

再往下看,突然出现了一个个人账号:yuxinlu1。

嗯......嗯?!

而且一占就是两个位置。

再一看下载量——最新数据已高达20.7万和53.6万。好家伙,这是什么神仙模型来了?

甚至在此前一周,这位个人开发者的模型一度霸榜抱抱脸,力压GLM-5.2一头,连智谱负责人都在X上公开推荐:

也就是说,在智谱、百度、Qwen、NVIDIA...这些名字中间,一个个人开发者账号硬生生挤进了TOP,而且下载量还这么高。

不禁令人好奇:luyuxin究竟是谁?怎么能量这么大?

“素人模型”冲上抱抱脸热榜

这波Hugging Face热榜,前排基本是大厂、明星团队和热门赛道在卡位。

比如智谱GLM-5.2,753B超大参数,国产明星大模型;百度Unlimited-OCR,踩中了最近很火的OCR和文档理解方向。

再往下还有Qwen的AgentWorld、英伟达的 LocateAnything、微软的FastContext。

国产开源大模型的熟面孔也都在列:MiniMax M3、Kimi-K2.7-Code、DeepSeek-V4-Pro。

图像生成方向也有Krea,新模型Krea-2-Turbo和Krea-2-Raw都在榜上。

结果里面还夹了两个luyuxin的12B GGUF模型。

不er...luyuxin你也太醒目了吧...

仔细一看,这两个新模型,主要把Fable 5的编程推理能力,蒸进了一个本地能跑的Gemma4-12B小模型里。

4.5GB显存就能跑,本地、离线、零API成本。普通玩家一张消费级显卡,甚至一台带统一内存的Mac,就能把它跑起来。

两个模型的分工也不同。

V1是Coder版,主打写代码、解题、生成可运行代码。

据模型卡,它的训练数据是“可验证”的代码推理:每条思维链对应的代码,都得真跑过测试、通过了才留下。

教师数据主要来自Cursor的Composer 2.5,外加Fable 5——Composer 2.5做错的题,会交给Fable 5重新推一遍,生成新的推理链和正确代码。

V1发布后,曾连续多日霸榜抱抱脸Trending榜榜首。

V2是agentic版,加了多步工具调用能力,能当本地Agent用,会自己读、推理、动手、再验证。

作者还跑了benchmark——在tau2-bench的telecom子集上,基座gemma-4-12B得分15%,V2版模型得分55%,大概是基础性能的3.5倍。

不过作者也表示,这是本地自测、单一领域、20个任务跑出来的相对值,不能跟官方榜直接比,他也坦白跟frontier大模型还有不小差距。

作者还提到:Fable 5后来被下线了,只有他自己的数据集还保留着Fable 5“原始”的那份推理过程。

而社区贡献数据里缺失的那部分reasoning,他改用Claude Opus 4.8(xhigh)重新生成、一条条补了回来。

他也承认,重建出来的轨迹“可能和原版Fable 5有出入”,但这是当时唯一可行的方案。

他还在discussion里透露,这套微调数据其实只有约1万条examples。他强调,数据量没有大家想象得那么重要,真正关键的是质量、筛选和验证。

这套模型之所以能在抱抱脸上有这么高的热度,还有一个很现实的原因:本地能跑。

这两个模型都是GGUF量化版。

GGUF是llama.cpp生态里常见的本地模型格式,用户可以用llama.cpp、Ollama、LM Studio、Jan等工具直接加载。

这对coding场景尤其有吸引力。毕竟写代码、看仓库、跑命令、调bug,经常涉及私有项目和本地环境。能在自己机器上跑,就意味着不用把代码传到云端,也不用每次都付API调用成本。

更关键的是,它门槛不算高。

V1模型卡里写到,最小的Q2_K版本约4.5GB,只要有约4.5GB显存或统一内存,就能跑一个私有、离线的编程助手。

作者推荐的甜点位是Q4_K_M,大小约6.87GB;更高质量的Q8_0则约11.8GB。

V2因为更偏agentic,作者没有放Q2_K。理由是压力测试没过,不够可靠。

所以V2的最小可靠版本从Q3_K_M开始,约5.7GB;推荐的Q4_K_M依然是约6.87GB。

作者还提前剧透了后续计划——V3已经在路上。

他表示,V3仍然会沿着12B这条线继续做coding+agentic方向。作者说,自己也没想到这次后训练的提升会这么大,所以接下来会继续往前推。

尤其是在tau2-bench telecom上,V2还有一些“过度尝试、反复retry”的问题,V3会继续通过更多训练来改。

另一方面,他还在做一个更大的版本:Qwen3.6-27B。相当于把同一套coding+agentic配方放到更大的底座上,给显存更宽裕的用户用。

一个人,40小时,杀进大厂中间

能单枪匹马冲上抱抱脸热榜,下载量加起来超70万,在一众大厂机构间杀出一席之地。

这位作者究竟是何方神圣?

量子位与作者取得联系后,也得知了他的故事。

他叫逯雨鑫,目前是美国一所高校在读的AI方向研究生,本科念的是数据与商业分析,中间还专门去补过一轮全栈开发,把前后端、软件开发、数据处理都学了。

这两个爆火模型,并不是他的主业,而是纯自费的个人项目。

“开源这东西其实只是花钱,并不会让你有任何收入。”他很清楚这一点,因此他做V1的最初动机,反而是“自我提升”:

学校教的知识更新太慢,他读研时教授讲的还是两三年前的内容,而AI日新月异,他干脆拿这个项目来逼自己追上最新的东西。

为了做这些模型,他烧掉了整整一个Claude Max 20×套餐,单是V2就花了40多个小时。

一条条合成数据、手动清洗、训练、评测、再训练,几乎全是一个人扛下来的。

硬件上,他用的是一张RTX 5090,显存为32GB VRAM;另外还有约96GB的本地SSD资源可配合使用。实际能调动的资源规模大约在128GB左右。

对个人开发者来说不算差,但跟大厂和AI Lab的算力池完全不是一个量级。

他告诉量子位,整个过程里最耗时的其实不是训练,而是数据处理。

尤其是agentic数据,真实对话往往很长,一个任务可能有十几步,几千甚至几万个token。但受限于显存,他训练时一次最多只能喂2048 token。

所以他做了类似“滑动窗口”的处理:在每段多轮会话里,以最近一次用户消息为锚点,围绕一次工具调用,把上下文裁到预算以内。

V1和V2都以Gemma 4-12B为底座。选它不是因为好做,恰恰相反,Gemma 4的格式和工具协议都比较特殊,适配起来很麻烦,甚至很多客户端支持并不完善。

逯雨鑫表示,一方面是挑战自己;另一方面,是因为12B这个尺寸很有吸引力。

他算过,如果量化到3bit左右,很多8GB统一内存的Mac用户也能跑起来,还能留出一定上下文窗口。

我现在知道,很多人使用的电脑还是8GB左右的统一内存。所以我想在最大可能的参数量下,让更多人使用到。

逯雨鑫把本地模型的价值总结成两个词:

隐私,免费。

他觉得,很多人只是想让AI帮自己整理文件、处理数据、做PPT,或者体验一下agent,并不一定愿意每个月为Claude、GPT付费。

人可能就是想玩一玩,为什么非得要收费呢?

V1发布后,他一开始没太关注榜单,只是像往常一样在模型卡里说:如果大家喜欢、下载量和likes多,他就继续做V2。

没想到两三天后,模型突然从不知道多少名跳到第八;睡了一觉,又冲到第一。

随后,评论和issue大量涌进来。

他几乎每条都看。最多的时候,每天花三四个小时看Hugging Face评论、回复问题、测试用户反馈,再把结果告诉对方。

他表示:“社区有需求,我是真的在去做,这才是最关键的。”

原来还是个爱看网文的...

在HF上,逯雨鑫总共发布了9个公开模型,除了两个爆火模型,他还做过“直接蒸Claude”的模型。

比如gemma-4-12B-it-Claude-4.6-4.8-Opus-GGUF,可以理解成通用版Gemma4-12B蒸馏模型。

它不只限定编程,更像是在把Claude Opus的回答风格、推理习惯、thinking能力,往这个12B本地模型里压。

另一个模型则干脆换上JetBrains的编程模型Mellum2当底座,专做推理蒸馏。

再继续往下看...

等等,怎么还有网文的微调模型啊?

好家伙,还分了四个题材,都是中文网文LoRA,而且全都基于Qwen3.6。

逯雨鑫告诉量子位,这其实是他最早开始做Hugging Face模型的入口。

因为他自己本来就喜欢看小说。追一本没完结的小说时,读者焦虑;作者日更码字也很辛苦。

于是,他想做一整套免费的小说生成pipeline,用不同风格的中文小说LoRA,让作者能用AI提速,读者也能更快看到内容。

但中文小说LoRA在HF上并不算热门,后来他发现用户更关注coding和agentic,于是方向慢慢转到了现在这条线上。

当问及他对其他个人开发者有什么建议时,逯雨鑫说:真诚和坚持最重要。

真诚,是不要夸大模型能力。哪里强,哪里弱,都说清楚。

你要如实告诉大家。我骗你说我这有多强,但真实使用下来出现很多问题,下次我一发东西,你就不相信我了。

坚持,则是开源作者必须接受这件事:你一定会遇到不好的声音。

模型火了以后,逯雨鑫也遇到过质疑,但他还是决定坚持下去。

在他看来,开源这条路本来就很难。

就算登顶Hugging Face热榜,也不会直接带来收入。更多时候,是自己花钱买算力、花时间处理数据、回复评论、修bug,然后还要面对少数负面声音。

而支撑他一路做下来的,还有一种很个人的工作节奏。

逯雨鑫提到,自己患有ADHD。

过去这可能意味着很难长期按部就班推进一件事,但在AI这个变化极快的领域,快速切换兴趣、迅速进入hyperfocus,反而成了某种优势。

他甚至认为:“AI时代是ADHD的天下。”因为一个方向凉下来后,如果还一直钻在里面,等再转去学新的东西,可能已经晚了。

聊到最后,我们也抛出了那个最初的问题:

作为个人开发者,凭什么能在大厂中间挤进前排?

逯雨鑫的回答很中肯。

他认为大厂当然能做得更好,有更多researcher,也有更强算力。

但大厂发布开源小模型,往往还承担品牌宣传、API引流等目标;而个人开发者没有这些包袱,反而可以更专注地解决一个具体痛点。

我很高兴,但不是说我真的全面打败了他们,只是可能更认真一些。

在他看来,这正是个人开源作者的机会:不必做全能模型,而是把一个足够具体的问题做到好用。

如果你也想体验一下这款本地模型,链接已经放在下方。

温馨提示:目前最适配的平台是llama.cpp,优先推荐大家使用~

HF地址:https://huggingface.co/yuxinlu1

本文来自微信公众号 “量子位”(ID:QbitAI),作者:关注前沿科技

Crypto di tendenza

CitreaCTR

wrapped stUSDTWSTUSDT

Velodrome FinanceVELODROME

BrevisBREV

ZRX（0X）ZRX

PancakeSwapCAKE

Domande pertinenti

Q文章中提到，个人开发者yuxinlu1在抱抱脸Models Trending榜上取得了什么成就？

A这位个人开发者yuxinlu1（本名逯雨鑫）在抱抱脸Models Trending榜上，凭借两个基于Gemma 4-12B的微调模型（Coder版和agentic版）进入了榜单前列，下载量合计超过70万，一度力压GLM-5.2等大厂模型成为榜首。

Qyuxinlu1发布的这两个热门模型主要针对什么需求，有何特点？

A这两个模型主要针对本地编程和AI助手（Agent）需求。其核心特点是将强大的编程推理能力（融合了Fable 5和Composer 2.5等技术）蒸馏到了一个较小的Gemma 4-12B模型中，并以GGUF格式发布。这使得模型只需数GB显存或统一内存即可在本地离线运行，兼顾了性能、隐私和零API成本，特别适合代码开发、私有项目处理等场景。

Q根据文章，作者逯雨鑫制作这些模型的动机和过程是怎样的？

A逯雨鑫制作这些模型的初始动机是自我提升，以跟上AI领域的最新进展。这是一个纯自费的个人项目，他投入了大量时间进行数据处理、清洗、训练和评测，其中V2版本就花费了40多小时。整个过程最耗时的是数据处理，尤其是处理长对话的Agentic数据。他使用了一张RTX 5090显卡和约128GB的硬件资源，单枪匹马完成了所有工作。

Q逯雨鑫如何看待个人开发者模型能在大厂模型中脱颖而出的原因？

A他认为，大厂有资源和能力做得更好，但其开源小模型往往还承担品牌宣传、API引流等商业目标。而个人开发者没有这些包袱，可以更专注、更真诚地解决一个具体、明确的用户痛点（如本地、免费、好用的编程助手），并把这个问题做到足够好用。这种专注和解决实际问题的态度，是他认为模型能受欢迎的关键。

Q除了编程模型，逯雨鑫还在抱抱脸上发布过什么其他类型的模型？

A除了编程和Agent模型，逯雨鑫还发布过基于Claude Opus进行通用能力蒸馏的模型，以及一系列基于Qwen3.6的中文网络小说题材LoRA模型（如玄幻、都市等）。网文LoRA是他最初进入Hugging Face的切入点，旨在为作者和读者提供一个免费的小说生成辅助工具链。

Letture associate

Bitmine’s Ethereum stash rises to $9.8B: ‘The best years for crypto remain ahead’

Bitmine Immersion Technologies significantly increased its Ethereum holdings, adding 27,084 ETH last week to reach a total of 5,700,040 ETH, valued at approximately $9.01 billion. This accumulation occurred despite a decline in ETH's price and net outflows from Ethereum ETFs. The article draws parallels between Bitmine and Michael Saylor's company, MicroStrategy, noting similar scrutiny over large crypto holdings and market volatility. However, Bitmine's Chairman Tom Lee remains optimistic, citing the company's strong financial position with substantial staking revenue, cash reserves, and recent inclusion in the Russell 1000 Index. He asserts that the best years for crypto are ahead, driven by tokenization and AI advancements.

ambcrypto59 min fa

Bitmine’s Ethereum stash rises to $9.8B: ‘The best years for crypto remain ahead’

ambcrypto59 min fa

UK FCA unveils crypto rulebook: Risk-based approach starts October 2027

UK's Financial Conduct Authority (FCA) has unveiled a new, risk-based regulatory framework for cryptoasset companies, set to take effect in October 2027. This approach moves away from rigid, uniform rules to a system where capital requirements and disclosure obligations will vary based on a firm's individual risk profile. Companies will conduct their own annual risk assessments and stress tests, subject to FCA review. The rules aim to lower compliance costs for smaller firms, boost market confidence, and attract millions of new UK crypto users. The framework also establishes baseline rules for stablecoin issuers, including consumer protections, while allowing for stricter oversight of larger, systemic players. FCA executives state the rules provide needed clarity, though experts caution that regulation reduces but does not eliminate consumer risks.

ambcrypto1 h fa

UK FCA unveils crypto rulebook: Risk-based approach starts October 2027

ambcrypto1 h fa

You Use Claude and Codex Every Day, but Meta Has Restricted Internal Use

In May, Meta imposed internal restrictions on its engineers regarding the use of Claude Code and Codex, two widely used AI programming tools. Despite being a major client, Meta's guidelines, still in effect, prohibit these external models from being used for specific tasks to prevent potential "escalations with partners." The core concern is "distillation"—the risk that outputs from Claude or Codex could inadvertently contaminate the training data and evaluation processes for Meta's in-house AI coding assistant, MetaCode. If MetaCode is trained or evaluated using data generated by these external models, it risks learning their capabilities rather than developing its own, blurring the line of intellectual origin. The restrictions are precise: engineers cannot use the external models to generate test questions, debug source code, or suggest test cases. AI-generated content is also barred from environments accessible to MetaCode. However, AI can still assist with peripheral tasks like workflow setup and code organization, provided all outputs are manually reviewed. This caution reflects a broader industry dilemma. While distillation is a common technique, using a competitor's model output for training raises legal and ethical questions about the ownership of derived capabilities. Contractual terms from companies like OpenAI and Anthropic explicitly forbid using their outputs to build competing products, putting enforcement power in the hands of rivals. The move is also financially motivated, as Meta seeks to reduce its hefty internal AI spending, estimated in the billions this year. Meta's policy illustrates the delicate balance companies must strike: leveraging powerful external AI tools while safeguarding the integrity and independence of their own AI development. As AI systems increasingly help build other AIs, distinguishing the origin of capabilities becomes a fundamental challenge for the entire industry.

marsbit1 h fa

You Use Claude and Codex Every Day, but Meta Has Restricted Internal Use

marsbit1 h fa

Why Do We Need an AI Content Perspective Today?

The article "Why Do We Need an AI Content Perspective Today?" explores the complex and often contentious integration of AI into the cultural and creative industries, particularly film and television. It begins with the cancellation of Amazon's AI-generated animation "Punky Duck," highlighting the ethical debates surrounding AI content. AI's rapid advancement is transforming video production, enabling cost-effective, full-length AI films (e.g., "RAPHAEL," "Dreams of Violets") while sparking industry resistance over issues like "synthetic actors." The core debate has shifted from whether to use AI to how to use it responsibly. The article analyzes why AI's entry into film is uniquely unsettling. It distinguishes between "cultural fast food" (short-form, fast-paced content like micro-dramas) and "cultural main courses" (traditional, long-form film/TV). AI currently excels at the former, matching its fragmented narratives, shallow emotional needs, and free-to-consumer models. However, venturing into the latter challenges the human-centric essence of storytelling—creativity, emotional depth, and the unique value of human labor and experience. While AI can generate massive volumes of content and lower costs, it risks devaluing human creativity, leading to homogenized output, and creating unfair competition through potential intellectual property infringement. Its efficiency also amplifies content safety risks, making preemptive governance crucial. To counter these risks, the article proposes establishing clear boundaries guided by a human-centered AI content perspective. It outlines four principles: 1) Amplify, rather than displace, human creative space; 2) Respect and protect human creative output; 3) Ensure human creative control and responsibility remain paramount; and 4) Guarantee transparency and traceability in AI creation. The conclusion emphasizes that humans must act as the "helmsmen" of technology, steering AI development to enhance, not replace, the core human values at the heart of cultural expression.

marsbit2 h fa

Planck Retracted? The Father of Quantum Tripped by an Algorithm

The recent discovery that two articles (published in 1940 and 1942) by Max Planck, the Nobel laureate and founder of quantum theory, are marked as "retracted" on Springer's digital platform highlights a curious clash between historical publishing practices and modern automated systems. An investigation suggests these retractions are algorithmic errors, not due to fraud or misconduct. The papers, philosophical reflections on science published in *Die Naturwissenschaften*, were likely flagged by the platform's systems. One article, a republished lecture, may have been mistaken for duplicate publication. Another, sharing a title with a prior article by a different author (a common practice for continuing debates at the time), may have triggered a similar automated check. The digital versions have even been replaced with blank pages, contrary to normal practice of preserving retracted texts. This incident underscores how contemporary digital infrastructure, built around concepts like "self-plagiarism" and strict copyright, can misclassify and obscure legitimate historical scholarly communication. It serves as a warning that digital archives are not neutral mirrors of the past but are filtered by platform rules, potentially distorting the scientific record. As AI systems increasingly rely on such databases, such erroneous metadata could propagate, affecting how future tools interpret and access historical knowledge.

marsbit2 h fa

Planck Retracted? The Father of Quantum Tripped by an Algorithm

marsbit2 h fa

Trading

Spot

Articoli Popolari

Come comprare TOP

Benvenuto in HTX.com! Abbiamo reso l'acquisto di TOP AI Network (TOP) semplice e conveniente. Segui la nostra guida passo passo per intraprendere il tuo viaggio nel mondo delle criptovalute.Step 1: Crea il tuo Account HTXUsa la tua email o numero di telefono per registrarti il tuo account gratuito su HTX. Vivi un'esperienza facile e sblocca tutte le funzionalità,Crea il mio accountStep 2: Vai in Acquista crypto e seleziona il tuo metodo di pagamentoCarta di credito/debito: utilizza la tua Visa o Mastercard per acquistare immediatamente TOP AI NetworkTOP.Bilancio: Usa i fondi dal bilancio del tuo account HTX per fare trading senza problemi.Terze parti: abbiamo aggiunto metodi di pagamento molto utilizzati come Google Pay e Apple Pay per maggiore comodità.P2P: Fai trading direttamente con altri utenti HTX.Over-the-Counter (OTC): Offriamo servizi su misura e tassi di cambio competitivi per i trader.Step 3: Conserva TOP AI Network (TOP)Dopo aver acquistato TOP AI Network (TOP), conserva nel tuo account HTX. In alternativa, puoi inviare tramite trasferimento blockchain o scambiare per altre criptovalute.Step 4: Scambia TOP AI Network (TOP)Scambia facilmente TOP AI Network (TOP) nel mercato spot di HTX. Accedi al tuo account, seleziona la tua coppia di trading, esegui le tue operazioni e monitora in tempo reale. Offriamo un'esperienza user-friendly sia per chi ha appena iniziato che per i trader più esperti.

160 Totale visualizzazioniPubblicato il 2024.12.10Aggiornato il 2026.06.02

Discussioni

Benvenuto nella Community HTX. Qui puoi rimanere informato sugli ultimi sviluppi della piattaforma e accedere ad approfondimenti esperti sul mercato. Le opinioni degli utenti sul prezzo di TOP TOP sono presentate come di seguito.