Yao Shunyu's 88 Days

marsbit | Published 2026-04-23 | Updated 2026-04-23

Summary

Yao Shunyu, a 27-year-old AI expert with a background from Princeton and OpenAI, joined Tencent in September 2025. Within 88 days, he led a major overhaul of Tencent’s AI strategy and organization, resulting in the release of Hunyuan Hy3 preview—a MoE model with 295B total parameters and 21B active parameters, supporting up to 256K context length. The launch came after Tencent leadership, including CEO Ma Huateng and President Martin Lau, openly criticized Hunyuan's earlier underperformance—citing slow development, over-reliance on superficial benchmark optimization, and poor generalization in real-world applications. Internal adoption was low, with key business units like WeChat and gaming seeking external AI solutions. Yao reshaped Tencent’s AI approach by integrating previously siloed teams, dissolving the ten-year-old Tencent AI Lab, and establishing new units focused on AI infrastructure and data. Hy3 preview was developed using co-design principles, closely aligned with product teams to ensure practical usability from the start. It has already been integrated into core products like Yuanbao, QQ, and enterprise tools. The release signals a shift from chasing rankings to building usable, scalable AI grounded in Tencent’s ecosystem. While external partnerships (like with DeepSeek and OpenClaw) helped retain users temporarily, the focus is now on making Hunyuan a reliable internal foundation. The real test lies in sustaining this new organizational momentum amid fierce c...

By Beyond the Headlines | Written by Huahua

Tencent Hunyuan Hy3 preview has been released. This is the first product delivered by Yao Shunyu since joining Tencent.

MoE architecture, total parameters 295B, activated parameters 21B, maximum support for 256K context length. Training started at the end of January, released in less than three months.

A model of this scale today can easily be drowned out.

But it becomes interesting when viewed against the backdrop of three months ago.

This release appears to be a model upgrade, but the real change happened outside the model. Tencent has begun using AI to rewrite its own organizational methods.

On January 26 this year, at the Tencent annual meeting, Tencent President Martin Lau did something executives rarely do: he publicly conducted a post-mortem on why the Hunyuan large model wasn't working.

He used an analogy: a high school student memorizing answers for an exam. The report card looks good, but they are exposed once they actually take the test. The review found missing key modules.

Pony Ma's wording was more direct: Too slow. Slower by 9 months to a year.

From that annual meeting to the launch of Hy3 preview today: 88 days.
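The 88-day figure can be checked with simple date arithmetic. The sketch below assumes the launch date equals this article's publication date (2026-04-23), which the text implies with "today" but does not state outright; the constant names are illustrative only.

```python
from datetime import date

# Annual meeting date as given in the article.
ANNUAL_MEETING = date(2026, 1, 26)
# Assumption: the launch day is the article's publication date.
LAUNCH = date(2026, 4, 23)

# Days elapsed between the two dates (end date not counted).
elapsed = (LAUNCH - ANNUAL_MEETING).days
# Counting both the meeting day and the launch day.
inclusive = elapsed + 1

print(elapsed, inclusive)
```

Under that assumption, the count reaches 88 only when both the meeting day and the launch day are included; the plain difference between the two dates is one day shorter.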

I. Memorizing Answers Doesn't Score Real Points

The story of Tencent Hunyuan begins in September 2023, when Tencent officially released the Hunyuan large model at the Global Digital Ecosystem Conference. A major player entered the field with significant fanfare.

Then it entered a logic of self-justification.

It wasn't a lack of investment or people. The problem was the path.

Martin Lau's review at the annual meeting provided the clearest diagnosis. The base model's capabilities were limited, so the team took a shortcut, using SFT (Supervised Fine-Tuning) to climb leaderboards. The effect was immediate and the scorecard was beautiful, but once the model entered real business scenarios it was exposed: generalization was poor, and its capabilities couldn't be reused in products.

Digging deeper, cracks appeared at every layer: insufficient data work, unstable pre-training, AI Infra that could not scale, a lack of factors and objectives for reinforcement learning, and a base model that couldn't support upper-layer applications.

This directly led to a product-side dilemma. Yuanbao, Tencent's AI assistant application, had about 57 million monthly active users in the first quarter of this year. It sounds like growth, but over the same period Doubao had 345 million MAU and Qwen had 166 million. The gap wasn't shrinking; it was widening.

The gap is no longer one of scale, but of who holds the power to define the entry point.

More troublesome was the situation internally. Business lines within the Tencent system (WeChat, games, advertising, enterprise services) needed AI capabilities, but the previous Hunyuan couldn't handle it. It wasn't that the businesses didn't want to use it; it was that the results weren't up to standard. Some core businesses even dared not connect to Hunyuan, preferring to find their own solutions.

A self-developed large model from a major company couldn't even get a seat at the main table in its own house. This is the most direct portrayal of Hunyuan's past predicament.

At that time, Tencent's organization hadn't kept pace with the development of large models. Tencent has long been product-engineering-centric, with AI teams playing a supporting role: first make the product, then adapt the AI to it. In Martin Lau's words, Tencent's AI development was like having a product without a product manager; the R&D team had no one to control the direction, and much work was done in vain.

During the same period, ByteDance spent about 90 billion yuan on AI chip procurement alone, DeepSeek shook the world with the extreme efficiency of a small team creating R1, and Alibaba's Qwen's global API call volume climbed to the forefront.

Hunyuan didn't lose to any single opponent; the organizational structure itself prevented it from entering the battlefield.

II. Borrowing a Life First

Around the Spring Festival of 2025, DeepSeek R1 exploded in popularity, hijacking the global AI industry's attention.

Tencent made an extremely pragmatic and clever decision. On February 13, Yuanbao integrated the full version of DeepSeek-R1 and opened it for free.

Yuanbao's daily active users surged more than 20 times within a month. On February 22, it surpassed Doubao to reach second place on Apple's China free App download chart, even briefly taking the top spot.

The entire industry watched Tencent's reaction speed during that window. WeChat Search, QQ Browser, Sogou Input Method, ima—a whole suite of product matrices intensively integrated DeepSeek. Even the mobile game "Peace Elite" stuffed DeepSeek into its digital spokesperson.

While the entire industry was watching DeepSeek, Tencent was the first major company to fully integrate it into its own ecosystem.

But Tencent knew better than anyone that this opportunity was borrowed.

DeepSeek helped Yuanbao attract users, but retention was another matter. The search chain was split, with part of it going through Hunyuan and part through DeepSeek, so the experience was not unified.

Simply put, the essence of embracing DeepSeek was this: while Hunyuan itself wasn't ready, use external capabilities to catch the users first and support the scenarios.

But the problem was that Tencent's core businesses—WeChat ecosystem, enterprise services, game AI, advertising intelligent delivery—required deeply customized, controllable, and adjustable AI capabilities. A generic API couldn't solve this.

Hunyuan had to stand on its own. The question was how.

III. The House Demolisher

In September 2025, a 27-year-old young man quietly joined Tencent.

Yao Shunyu did his undergraduate studies in Tsinghua's Yao Class and earned his Ph.D. at Princeton University, advised by Karthik Narasimhan, one of the core authors of the GPT foundational paper. During his Ph.D., he proposed the ReAct framework and Tree of Thoughts, both foundational works in the global AI Agent field.

After graduating in 2024, he joined OpenAI, deeply participating in the R&D of two core agent projects, Operator and Deep Research.

But his resume wasn't the key; more important were the architectural-level changes he brought after joining.

In December 2025, Tencent issued an internal organizational adjustment announcement, officially appointing Yao Shunyu as Chief AI Scientist of the CEO/President's Office and concurrently head of the newly established AI Infra Department and Large Language Model Department, with dual-line reporting directly to Tencent President Martin Lau and TEG (Technology and Engineering Group) head Lu Shan.

At 27 years old, going straight to Tencent's number two and commanding two core AI departments—such promotion and authority are extremely rare in Tencent's development history.

According to media reports, the first thing he did after joining was to investigate the reasons for Hunyuan's long-term poor performance module by module, often communicating with colleagues and interns until midnight. The diagnostic results were reported to Martin Lau, directly prompting a series of subsequent organizational surgeries.

He took on not a model optimization task, but an entire set of working methods that needed to be overturned.

In December 2025, Tencent established three new core departments in one go: AI Infra Department, AI Data Department, and Data Computing Platform Department. Infrastructure came first: demolishing and rebuilding the underlying technical foundation. At the same time, the company comprehensively accelerated its recruitment of top global AI talent to fill technical gaps.

On March 20 this year, the ten-year-old Tencent AI Lab was officially disbanded. Core R&D personnel were all merged into the Large Language Model Department, incorporated into the Hunyuan large model R&D main line, and now report uniformly to Yao Shunyu.

Since then, Tencent no longer retains dedicated AI research institutions independent of the large model system. All AI research forces were gathered, focusing on the single main line of Hunyuan.

This was a full-link rebuild, from underlying Infra to data pipelines to training processes to organizational structure. It wasn't patching the old system; it was demolishing and rebuilding from scratch, setting up a complete R&D closed loop.

In the words of Yao Shunyu's team, Hy3 preview is the beginning of Hunyuan large language model's journey from reading ten thousand books to traveling ten thousand miles.

Contrasting with the reality of the past two years where Hunyuan read books but couldn't solve problems, this statement points clearly: no more self-congratulation on test sets; go do things in the real world.

IV. Preview, Not Answer

Back to the product itself.

MoE architecture fusing fast and slow thinking, total parameters 295B, activated parameters 21B, maximum support for 256K context. Training started at the end of January 2026, launched in April.
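To put the two parameter counts in proportion, here is a minimal sketch of the arithmetic, using only the rounded figures quoted above. The constant names are illustrative, and the sketch says nothing about how Hy3 preview actually routes tokens among its experts.

```python
# Rounded figures quoted in the article for Hy3 preview.
TOTAL_PARAMS = 295e9    # total parameters in the MoE model
ACTIVE_PARAMS = 21e9    # parameters activated per token

# Share of the model that does work on any single token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
# How many times larger the full model is than its active slice.
sparsity_ratio = TOTAL_PARAMS / ACTIVE_PARAMS

print(f"active share: {active_fraction:.1%}")
print(f"total/active: {sparsity_ratio:.1f}x")
```

On these figures, roughly 7% of the parameters are active for any given token, so the full model is about 14 times larger than the slice doing the per-token computation.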

Less than three months, from zero to usable. This itself is an important signal of Hunyuan's accelerated R&D evolution.

Industry model R&D typically includes high-quality data preparation, pre-training, post-training, and reinforcement learning/fine-tuning. Including early-stage architecture exploration and late-stage evaluation and optimization, the cycle from 0 to 1 for a complete major version is about 6-12 months.

Tencent went against the grain: rather than chasing homogenized models, it co-designed the model around the needs of its core business scenarios in social, gaming, and advertising. The advantage is that Tencent's huge investment in AI can be quickly validated by the market.

This is the complete opposite of the old Hunyuan. In the past, the approach was to climb the leaderboards first and then look for scenarios, only to find the model couldn't be used. Now it enters the scenarios first, then shows the outside world.

Before release, Hy3 preview had already undergone actual testing and collaborative adaptation in core Tencent products like Yuanbao, WorkBuddy, CodeBuddy, ima, and QQ. The model and product advanced in sync from the design stage.

This is Co-design: train while using, letting product feedback force model iteration.

In a sense, this is a direct response to Martin Lau's statement "a product without a product manager."

For Tencent internally, the changes brought by Hy3 preview might be greater than perceived externally. In the past, business lines didn't dare or want to connect to Hunyuan, each finding their own way out. A wall stood between the model team and the product team.

This time, Hunyuan truly became the model foundation for Tencent's internal businesses, no longer a vanity project requiring business line cooperation for leaderboards.

When internal businesses are willing to stake their product experience on Hunyuan, that in itself is a signal.

But Preview is Preview. The meaning is very candid: this is the first version, take it to real users and businesses to grind, use feedback to iterate.

The attitude is right, the direction is set, the product is launched. As for the result, the exam has just begun.

V. Lobster is the Bridge, Hunyuan is the Foundation

In fact, before the launch of Hy3 preview, Tencent did something else that was easily overlooked.

Earlier this year, OpenClaw exploded in popularity, the lobster frenzy swept the entire AI industry. Tencent's reaction speed was once again surprising, being almost the earliest and most comprehensive major company to embrace lobster.

WorkBuddy, QClaw, Lighthouse: a series of products based on the lobster protocol launched in rapid succession, and Tencent's product matrix was fully integrated in a short time.

Looking back now, the lobster frenzy has slowly cooled down. But for Tencent, the value of this matter was not in lobster itself; it was more like a transition device.

It did two things. First, it allowed Tencent's product strength, scattered across various business lines, to regroup into a combined force. WeChat, Yuanbao, enterprise services, and developer tools truly collaborated for the first time on lobster's public protocol layer. Second, and more crucially, it bought time for Hunyuan.

When users poured in through various Agent entry points, Tencent used the lobster ecosystem to catch them first, while Hunyuan completed the rebuild from Infra to model behind the scenes.

There can be many Agent entry points. But what ultimately determines whether users stay is the capability of the underlying model. Lobster is the bridge, Hunyuan is the foundation. The bridge is built, and the foundation has finally caught up.

VI. The Window Won't Wait

April this year might be the most crowded month in the history of Chinese AI.

Alibaba released three strategic-level models within 72 hours. Kimi released and open-sourced the Kimi K2.6 model, with comprehensive improvements in general Agent, code, and visual understanding capabilities. ByteDance's Seed continued to iterate, and Doubao's ecosystem kept expanding. DeepSeek V4 was also rumored to be released in late April. (Reference reading: Liang Wenfeng and Yao Shunyu, Submitting Papers in April)

Hunyuan chose to submit its paper during this window, facing not just a technical competition, but a practical question: How long is the window period?

Tencent has China's largest social ecosystem, the most user touchpoints, and the richest application scenarios. WeChat has over 1.4 billion monthly active users. QQ, Tencent Meeting, Tencent Docs, Enterprise WeChat are all natural AI landing entry points.

But for these resources to work, the prerequisite is that the underlying model can hold up.

For over a year, Hunyuan's product capabilities were weak. Tencent had to borrow the heat and power of DeepSeek, had to watch Doubao pull away from itself on the user side.

Hy3 preview shows that Yao Shunyu heard Pony Ma's criticism. He heard it, and he acted.

In less than 90 days: dismantling the old pipeline, rebuilding Infra, disbanding AI Lab, merging teams, poaching core talent, co-designing with products, and delivering a usable version.

This speed itself is evidence of changed organizational efficiency.

But there is distance between hearing and doing.

Whether Hunyuan catches up fast enough ultimately doesn't depend on the parameter count of one Preview, but on whether this rebuilt organizational efficiency can be sustained.

This time, Yao Shunyu's answer sheet says Preview. Clearly, there are bigger moves behind.

Words from Beyond the Headlines:

Hunyuan's biggest problem in the past was not that the model wasn't big enough, but that the organization wasn't right.

A large model that even its own businesses are unwilling to connect to is self-congratulatory no matter how many parameters it has.

The most important change with Hy3 preview is not that the parameters changed, but that the walls were demolished: the wall between model and product, the wall between research and engineering, the wall between Hunyuan and the Tencent ecosystem.

Demolishing walls is much harder than stacking parameters.

But the significance of this matter is not only for Tencent. In the large model competition, parameters, algorithms, and talent can all be chased.

What is truly difficult to replicate is whether a company has the determination to rewrite itself for AI.

Related Questions

Q: What was the main issue with Tencent's Hunyuan model before the recent changes?

A: The main issue was that the model was optimized for benchmark performance through supervised fine-tuning (SFT), but it lacked generalization ability and performed poorly in real-world business scenarios. This led to internal business units being reluctant to use it, and key modules like data, pre-training, AI infrastructure, and reinforcement learning were underdeveloped.

Q: Who is Yao Shunyu and what role did he play in the transformation of Hunyuan?

A: Yao Shunyu is a 27-year-old AI scientist with a background from Tsinghua Yao Class and Princeton University, who previously contributed to foundational AI Agent work and worked at OpenAI. He was appointed as Tencent's Chief AI Scientist and head of both the AI Infra and Large Language Model departments. He led a comprehensive overhaul of the organization, infrastructure, and development process, resulting in the rapid release of Hy3 preview.

Q: What organizational changes did Tencent make to support the new development approach for Hunyuan?

A: Tencent established new core departments including AI Infra, AI Data, and Data Computing Platform, dissolved the ten-year-old Tencent AI Lab and integrated its researchers into the large language model team under Yao Shunyu, and focused all AI research efforts on the Hunyuan project to create a unified, efficient development pipeline.

Q: What is the significance of the 'Co-design' approach mentioned in the development of Hy3 preview?

A: The 'Co-design' approach means that the model and products were developed simultaneously, with real-world product feedback directly influencing model iteration. This ensured the model was built to meet actual business needs in Tencent's core products like Yuanbao, WorkBuddy, and QQ, moving away from just optimizing for benchmarks.

Q: How did Tencent use external AI models like DeepSeek and OpenClaw during Hunyuan's transformation period?

A: Tencent pragmatically integrated DeepSeek-R1 into products like Yuanbao to quickly gain users and maintain market presence while Hunyuan was being rebuilt. Similarly, they adopted the OpenClaw protocol across their product matrix to unify various business lines and buy time for Hunyuan to complete its underlying infrastructure and model upgrades.
