Post-95s Doctoral Graduate Ventures into World Models, FaceMind Raises Tens of Millions of Yuan

marsbit, published 2026-06-26, last updated 2026-06-26

Introduction

The AI startup FaceMind, led by post-95 founder Dr. Lu Hongyuan, has secured tens of millions of yuan in a Pre-A funding round from new investor StarLink Capital, with existing shareholder 360 significantly over-subscribing its follow-on investment. Originally focused on on-device multimodal models, the company has pivoted to developing foundational world models, which aim to predict environmental changes for applications such as GUI agents and embodied AI (e.g., robotics). Dr. Lu's team gained recognition for research on low-frequency word processing and "Adam's Law," work later acknowledged by Anthropic. FaceMind's self-developed system emphasizes a parameter-efficient, loop-iterative architecture for long-horizon prediction. An early product, DieDieShe (叠叠社), serves as a testing ground for GUI understanding. Investors highlight the team's research depth, rapid execution, and Dr. Lu's focus on fundamental model architecture over mere scale. The funding will support continued R&D and multi-scenario validation of the company's world model technology.

Investment Realm has learned that the world model company FaceMind recently completed a Pre-A financing round of tens of millions of yuan. StarLink Capital joined as the new investor, and existing shareholder 360 significantly over-subscribed in a follow-on investment.

FaceMind's next round of financing is reportedly already under way, with Shendu Capital among the financial advisors (FA), and investment institutions have already expressed interest.

This is a young AI company. At the helm is Lu Hongyuan, born after 1995, who founded FaceMind while still a student. Over the past two years, the company has pivoted from focusing on on-device full-modal models to more fundamental world models.

As AI enters screens, software, and robots, understanding the world is becoming the next major topic.

Post-95s Doctoral Graduate Leads the Team

A World Model Team Emerges

The story of FaceMind begins with Lu Hongyuan.

A post-95s founder, Lu Hongyuan completed his bachelor's and master's studies at Imperial College London and earned his PhD from the Natural Language Processing Laboratory at The Chinese University of Hong Kong, under the supervision of Professor Lin Wei. He has long researched natural language processing and the underlying mechanisms of large models. During his PhD, he produced 14 first-author/corresponding-author papers at top-tier conferences, with several becoming highly cited in the field.

FaceMind was founded in 2023, initially targeting the research, development, and application of on-device full-modal models.

What truly brought them to public attention was the earlier discussion about "Ma Jiaqi causing a large model to fail." At that time, a large model could accurately state Ma Jiaqi's relevant career details but couldn't stably output the three characters for "Ma Jiaqi." An ordinary name unexpectedly exposed a fundamental issue in large models' language processing: before text enters a model, it is first tokenized; when the model encounters low-frequency words, uncommon names, or words from less common languages, comprehension and generation can become unstable.
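The tokenization effect described above can be illustrated with a toy example. This is only a sketch under stated assumptions: the vocabulary `VOCAB` and the greedy longest-match `tokenize` function are hypothetical and far simpler than the subword tokenizers real large models use. It shows how a frequent word can map to a single token while a rare name fragments into many small pieces.

```python
def tokenize(text, vocab):
    """Greedy longest-match segmentation.

    At each position, take the longest substring found in `vocab`;
    if nothing matches, fall back to a single character.
    """
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest candidate first
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No vocabulary entry starts here: emit one character.
            tokens.append(text[i])
            i += 1
    return tokens


# Hypothetical vocabulary: the frequent word "model" has its own entry,
# while the rare name "jiaqi" does not.
VOCAB = {"model", "the", "qi"}

common_tokens = tokenize("model", VOCAB)  # ['model']
rare_tokens = tokenize("jiaqi", VOCAB)    # ['j', 'i', 'a', 'qi']
print(common_tokens, rare_tokens)
```

The rare name costs four tokens where the common word costs one, which is one intuition for why uncommon names can be harder for a model to reproduce stably.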

Lu Hongyuan's team had turned its attention to this issue earlier. In 2025, they published a paper on SLoW, discussing how low-frequency words affect the translation performance of large models. By 2026, their paper on Adam's Law extended the issue to the sentence level: among expressions with the same meaning, the more frequent and common one tends to be processed and learned more easily by the model.

More unexpectedly, the technology related to this paper was adopted by Anthropic and was liked and reposted by an Anthropic investor on X. A Chinese post-95s researcher's judgment on the underlying principles of large models was thus seen by more people.

Following this line further, FaceMind began shifting its focus to world models.

Simply put, while large language models are good at predicting the next piece of text, world models aim to predict what will happen next in an environment. In the context of screens, it's about GUI Agents understanding web pages, documents, buttons, and user intent; in the field of robotics, it's about understanding space, actions, and task outcomes.

FaceMind's self-developed world model system is built around this direction. The company aims to improve the model's stability in long-horizon temporal prediction, screen understanding, and embodied tasks through a loop-iterative, parameter-efficient model architecture.

DieDieShe (叠叠社) serves as an early testing ground for this capability. On the surface, it appears to be an AI bullet-comment (danmu) product that can generate interactive bullet comments in real-time based on the web pages, documents, videos, or game content a user is browsing. Looking deeper, for a GUI Agent to complete tasks, it must understand the screen, comprehend page structure, judge button positions, and predict results after clicks. Each page jump, input feedback, and task completion forms a type of high-density world model data.
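To make the idea of page jumps as world model data more concrete, here is a minimal sketch. It is not FaceMind's method: the `TransitionModel` class, the screen names, and the action names are all hypothetical, and a real world model would learn from pixels or page structure, not from string labels. The sketch only shows the shape of the data, (current screen, action, resulting screen), and the simplest possible predictor built from it.

```python
from collections import Counter, defaultdict


class TransitionModel:
    """Count-based predictor: given a screen and an action, guess the next screen."""

    def __init__(self):
        # Maps (screen, action) to a tally of observed outcomes.
        self._counts = defaultdict(Counter)

    def observe(self, screen, action, next_screen):
        """Record one observed transition."""
        self._counts[(screen, action)][next_screen] += 1

    def predict(self, screen, action):
        """Return the most frequently observed outcome, or None if unseen."""
        outcomes = self._counts.get((screen, action))
        if not outcomes:
            return None
        return outcomes.most_common(1)[0][0]


# Hypothetical interaction log.
model = TransitionModel()
model.observe("login_page", "click_submit", "dashboard")
model.observe("login_page", "click_submit", "dashboard")
model.observe("login_page", "click_submit", "error_banner")

print(model.predict("login_page", "click_submit"))  # dashboard
print(model.predict("dashboard", "click_logout"))   # None
```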

This is also the opportunity FaceMind wants to seize: world models are becoming a new underlying entry point for AI.

StarLink Capital and 360 Invest

The Hottest Battlefield in Embodied AI

The latest financing comes to light.

Recently, FaceMind announced the completion of a tens-of-millions-yuan Pre-A round financing. This round not only introduced new investor StarLink Capital but also secured a significant oversubscription from existing shareholder 360.

Xiang Qiqi, the pre-investment head of 360 Group, stated, "Dr. Lu is one of the most outstanding young AI researchers I have ever met."

In his view, Lu Hongyuan focuses not on local optimizations but on the fundamental principles and architectural innovations of models. While the industry was still discussing the concept of world models, FaceMind had already started training world models from scratch, achieving industry SOTA-level results on various benchmarks. Subsequently, Adam's Law gained attention and validation from Anthropic, a leading overseas model developer, and the team's recently proposed Loop architecture further explores long-horizon temporal training issues in world models.

"The iteration speed is astonishing. Before every communication, I first look at their latest published papers and technical reports," Xiang Qiqi remarked, adding that the experience showed him what it means to "invest once, learn for a lifetime."

Li Wenjue, a partner at StarLink Capital, said the most prominent characteristic of the FaceMind team is its combination of solid research capabilities and complex engineering implementation abilities. The core team members have long been deeply involved in fundamental AI technology, capable of forming independent judgments on cutting-edge directions and rapidly validating research results in real scenarios.

"What we see is a team with high talent density, forward-looking technical judgment, and strong execution capabilities." In her view, Lu Hongyuan embodies both the exploratory drive of a young researcher and the action-oriented mindset of an entrepreneur, capable of leading the team to continually tackle high-difficulty problems and translate technical judgments into clear R&D directions. This founder trait and team cohesion were important reasons for StarLink Capital's decision to invest.

Over the past year, world models have become a new keyword in the AI industry. Amidst the excitement, disagreements have also emerged: Will the next stage of competition continue to rely on larger data and parameters, or will it improve model efficiency in utilizing limited data through new architectures?

FaceMind has chosen the latter.

According to the company, the core characteristics of its self-developed model are loop iteration and parameter efficiency. Simply put, it attempts to give the model stronger long-horizon temporal prediction and environmental reasoning capabilities at the same parameter scale. The company disclosed that the performance of its 1B-scale model already rivals that of comparable strong international models, with improved parameter efficiency.
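The article does not describe FaceMind's architecture in detail, so the following is only a sketch of one general way a looped design can be parameter-efficient: reusing the same weights across iterations. All names here (`make_layer`, `apply_layer`, `DIM`, `STEPS`) are hypothetical, and the "layer" is a toy matrix, not a real network. The sketch compares a stack of separate layers with a single layer applied repeatedly for the same number of steps.

```python
def make_layer(dim, scale):
    """A toy layer: a dim x dim weight matrix (here a scaled identity)."""
    return [[scale if r == c else 0.0 for c in range(dim)] for r in range(dim)]


def apply_layer(weights, vec):
    """Matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def count_params(layers):
    """Total number of weights across a list of layers."""
    return sum(len(row) for layer in layers for row in layer)


DIM, STEPS = 4, 6

stacked = [make_layer(DIM, 0.5) for _ in range(STEPS)]  # six separate layers
looped = [make_layer(DIM, 0.5)]                         # one layer, reused


def run_stacked(vec):
    for layer in stacked:
        vec = apply_layer(layer, vec)
    return vec


def run_looped(vec):
    for _ in range(STEPS):  # same weights applied at every iteration
        vec = apply_layer(looped[0], vec)
    return vec


print(count_params(stacked), count_params(looped))  # 96 16
print(run_looped([1.0, 1.0, 1.0, 1.0]))
```

Both variants perform six computation steps, but the looped one stores one sixth of the weights. In this toy case the outputs coincide only because every stacked layer was given identical weights; separately trained layers would normally differ.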

FaceMind has begun validating this model capability in multiple scenarios. According to available information, its world model capability has been validated in simulated embodied environments, GUI Agent environments, and real-world robotic arm environments. For downstream applications, the company plans to offer partners such as robot OEMs, content platforms, and chip and cloud vendors a full suite of capabilities: scenario validation, model training, architectural deployment, inference services, and continuous optimization.

In Lu Hongyuan's view, the opportunity for world models will unfold alongside GUI Agents and embodied intelligence. At that point, the competition among models will be about their ability to understand tasks, predict changes, and stably complete actions. After the financing round, FaceMind will continue to invest in world model R&D and multi-scenario validation.

A young company is squeezing its way onto the table for the next generation of AI infrastructure.

This article is from the WeChat public account "Investment Realm AI," author: Wang Lu

Related Questions

Q: What is FaceMind, and what recent funding milestone did it achieve?

A: FaceMind is a young AI company focusing on world models and on-device multimodal models. It recently completed a Pre-A financing round of tens of millions of yuan, with StarLink Capital as the new investor and existing shareholder 360 making a follow-on investment.

Q: What is the background of FaceMind's founder, Lu Hongyuan, and how does it relate to the company's research direction?

A: Lu Hongyuan is a post-95 founder with a PhD from the NLP Lab of The Chinese University of Hong Kong. His research on the underlying mechanisms of large models, including work on how word frequency affects model performance (SLoW, Adam's Law), laid the groundwork for FaceMind's pivot to world models.

Q: What specific problem does FaceMind's world model aim to address, and how is it different from a large language model (LLM)?

A: While LLMs predict the next piece of text, FaceMind's world model aims to predict what will happen next in an environment. This includes understanding screen elements (GUI Agents) or predicting spatial actions and outcomes in robotics.

Q: What are the key technical features and validation scenarios of FaceMind's self-developed world model system?

A: Its core features are loop iteration and parameter efficiency, aiming for stronger long-horizon prediction capability at the same parameter scale. It has been validated in simulated embodied environments, GUI Agent environments, and real-world robotic arm environments.

Q: How do the investors (StarLink Capital and 360) view the strengths of the FaceMind team?

A: 360's representative praised Lu Hongyuan as a top-tier young researcher focused on underlying principles. StarLink Capital highlighted the team's combination of deep research capability, independent judgment on cutting-edge directions, and strong engineering execution to validate ideas in real-world scenarios.
