Physical AI is Hot, Some New Thoughts from Me

marsbit | Published 2026-05-18 | Updated 2026-05-18

Introduction

The term "Physical AI" is gaining significant traction, marking a shift from AI that processes information to AI that understands and interacts with the physical world. Unlike traditional AI confined to screens, Physical AI involves integrating intelligence into robotic bodies to perform tasks in environments governed by gravity, friction, and inertia. The concept, formally defined in a 2020 paper, focuses on creating embodied systems that can complete perception-to-action cycles. 2026 is identified as a pivotal "deployment year," where the focus moves from demonstrations to practical utility. Companies like China's Zhiyuan Robotics have transitioned to live, unscripted factory deployments and announced mass production targets. Internationally, Figure AI, after a major funding round, shifted to its own neural system, while NVIDIA partnered with major industrial robot firms to upgrade millions of existing units with AI capabilities. A key trend is the crossover from the automotive supply chain. Companies like Aptiv and Valeo are entering the Physical AI space, leveraging their expertise in sensors, control systems, and mass production from the autonomous vehicle sector. This "technology spillover" is accelerating development, as seen with Tesla's plans to repurpose automotive production lines for its Optimus robot. The technical breakthrough enabling this progress is the engineering maturity of "world models." Previously theoretical, these AI models can now simulate physica...

Article | New Mou, Author | Lu Yao

Recently, a term has been buzzing in certain circles: "Physical AI".

Jensen Huang actually used the term more than ten times in his keynote at CES in Las Vegas early last year, but it was only this year that "Physical AI" truly took off.

So, what exactly is "Physical AI"?

A couple of days ago, I saw a video of a robot watering flowers. The robot first walked to the faucet, turned on the valve, filled the watering can, then turned around, walked to the flower pot, adjusted its angle, and poured the water in evenly. The spout didn't hit the edge of the pot, and no water spilled out.

For a machine to understand "carrying a cup of water," it needs to know the cup is cylindrical, calculate the precise force needed to grip it without slipping or crushing it, understand that water is a liquid and will spill if shaken, and constantly adjust its arm angle while walking to compensate for body movement.
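The grip-force part of that calculation can be sketched with basic Coulomb friction. The snippet below is a minimal illustration, not any robot's actual controller: it assumes a two-finger parallel gripper and a rigid cup, and the function name, the safety factor, and all numeric values are hypothetical.

```python
G = 9.81  # gravitational acceleration, m/s^2


def grip_force_window(mass_kg, friction_coeff, crush_limit_n,
                      accel_ms2=0.0, safety=1.5):
    """Return (min_force, max_force) in newtons for a two-finger pinch grasp.

    Each of the two contact faces supplies friction up to mu * F, so holding
    the object against gravity plus any upward acceleration requires
    2 * mu * F >= m * (g + a). The safety factor adds margin on top.
    """
    min_force = safety * mass_kg * (G + accel_ms2) / (2.0 * friction_coeff)
    if min_force > crush_limit_n:
        # The force needed to avoid slipping would already crush the cup.
        raise ValueError("no safe grip force exists for these parameters")
    return min_force, crush_limit_n


# Assumed example: a 0.3 kg cup of water, friction coefficient 0.5,
# and a cup that deforms above 20 N of squeeze.
low, high = grip_force_window(mass_kg=0.3, friction_coeff=0.5, crush_limit_n=20.0)
print(f"squeeze between {low:.2f} N and {high:.2f} N")
```

Passing a non-zero accel_ms2 (for example, the vertical jolt of a walking step) raises the lower bound, which is why the grip has to be adjusted continuously while the body moves.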

A three-year-old human can do all of this intuitively. But for AI, this is a huge leap. Over the past decade, AI learned to see, hear, speak, and draw, but it remained trapped within screens. What Physical AI aims to do is put this smart brain into a body that can run, jump, grasp, and manipulate objects in the real world.

Simply put, Physical AI is about making AI understand and act upon the physical world. It's no longer just processing text and images; it's about performing correct actions in an environment governed by gravity, friction, and inertia.

A fact seldom discussed domestically is that the term "Physical AI" didn't originate from some chip giant's PR department. This concept first appeared in a 2020 paper published in *Nature Machine Intelligence*. The paper systematically defined Physical AI for the first time:

A class of embodied systems capable of performing tasks typically associated with intelligent organisms. The core lies in deeply integrating physical laws into the AI system, so machines are no longer "physically blind" and can complete the perception-to-action loop.

From the academic world's opening shot in 2020 to the industry's full embrace in 2026, there was a gap of six whole years. In these six years, sensor costs dropped by several orders of magnitude, edge AI computing power moved from theory to engineering, and the reliability and mass production capability of robot bodies quietly reached an inflection point — these were the hidden forces pushing Physical AI from papers to production lines.

From Demonstration to Working

If the large language models of 2023 taught AI to chat, then the keyword for Physical AI in 2026 is just one thing: work.

The change is visible to the naked eye.

This time last year, robot companies still flexed their muscles by filming demo videos: set up the scene, rehearse repeatedly, and shoot it in one take. Impressive to watch, but you never knew how many takes it took.

This year, the playbook is completely different. On a 3C production line in Nanchang, Zhi Yuan Robotics threw a robot into a real factory and had it work continuously for several hours, live-streaming the entire process. No preset script, no restricted scene, just the same production line that workers face daily. Hundreds of thousands of people watched online.

A month later, Zhi Yuan announced in Hong Kong the mass production of 10,000 humanoid robots. The leap from one prototype in the lab to 10,000 on a production line is a milestone that changes the game.

Zhi Yuan's approach is interesting. Most robotics startups focus on a specific segment — some only on the body, some only on the large model, some only on dexterous hands. Zhi Yuan chose another path: doing the full stack, simultaneously developing the body manufacturing, AI model, dexterous manipulation, and data collection, while also investing in over 60 upstream and downstream companies in the industry chain.

The cost of this approach is clear: the parent company has over a thousand employees, expected to grow further by the end of this year, with an annual salary expenditure alone reaching billions. This path burns cash, but once proven, its moat is also the deepest.

Zhi Yuan's founder Deng Taihua proposed an analytical framework called the "XYZ Curve." He said embodied intelligence development has three stages: X is the development and experimentation phase, where people are still playing with demos; Y is the deployment and growth phase, where robots actually start working on production lines; Z is the ultimate intelligent emergence phase.

He characterized 2026 as: "the first year of deployment phase, officially moving from 'can move' to 'can work'." The difference between "can move" and "can work" is just one word, but it marks the entire industry's coming of age.

The pace overseas is just as intense.

American humanoid robot company Figure AI is an unavoidable name on this track. In September last year, they completed a funding round of over $1 billion, raising their valuation to $39 billion, making them the world's highest-valued humanoid robot company at the time.

A month later, they released a new generation product, Figure 03, standing 1.68 meters tall and weighing about 60 kilograms, demonstrating household chores like watering plants, serving dishes, and folding clothes. Founder Brett Adcock specifically added on social media: all actions were autonomously completed by the robot, with no human remote control.

Technologically, it's noteworthy that Figure made a major strategic pivot, terminating its cooperation with OpenAI and fully transitioning to its self-developed neural network system, Helix.

This system mimics human cognition with a three-layer structure: the bottom layer handles balance and instinctive reactions, the middle layer translates the brain's commands into motor control commands 200 times per second, and the top layer is the logical brain, responsible for understanding scenes and making decisions. This "instinct-reflex-thought" three-tier architecture is quite clever, essentially giving the robot a nervous system that doesn't crash.
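The layering can be pictured as nested loops running at different rates. The sketch below only illustrates that idea and is not Figure's Helix code: apart from the 200 Hz middle-layer rate quoted above, every rate, class name, method name, and number is an assumption.

```python
from dataclasses import dataclass, field

BASE_HZ = 1000   # assumed rate of the fast balance/reflex layer
MIDDLE_HZ = 200  # middle-layer rate quoted in the article
TOP_HZ = 8       # assumed rate of the slow scene-understanding layer


@dataclass
class LayeredController:
    goal: str = "hold"
    motor_target: float = 0.0
    counts: dict = field(
        default_factory=lambda: {"top": 0, "middle": 0, "reflex": 0})

    def think(self, scene):
        # Top layer: interpret the scene and choose a goal (slow).
        self.goal = "pour" if scene == "can_over_pot" else "hold"
        self.counts["top"] += 1

    def translate(self):
        # Middle layer: turn the current goal into a motor set-point.
        self.motor_target = 30.0 if self.goal == "pour" else 0.0
        self.counts["middle"] += 1

    def reflex(self, tilt_error):
        # Bottom layer: apply a small balance correction every tick (fast).
        self.counts["reflex"] += 1
        return self.motor_target - 0.5 * tilt_error


def run(controller, seconds=1, scene="can_over_pot"):
    """Simulate the three layers sharing one clock at different rates."""
    command = 0.0
    for tick in range(seconds * BASE_HZ):
        if tick % (BASE_HZ // TOP_HZ) == 0:
            controller.think(scene)
        if tick % (BASE_HZ // MIDDLE_HZ) == 0:
            controller.translate()
        command = controller.reflex(tilt_error=0.2)
    return command


controller = LayeredController()
final_command = run(controller)
print(controller.counts, final_command)
```

The design point is that the fast layer never waits for the slow one: if the top layer stalls, the lower layers keep acting on the last goal they were given.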

One more development is worth mentioning. At this year's GTC conference, NVIDIA announced deep cooperation with the world's four industrial robotics giants: ABB, KUKA, Yaskawa, and Fanuc. The more than 2 million industrial robots already installed on production lines worldwide can now use NVIDIA's simulation platform for virtual commissioning and AI training.

These four companies combined account for over half of the global industrial robot market. In the next decade, these robots will be upgraded from "traditional programming" to "AI-driven." Whichever software platform can embed itself into this process will essentially secure the "operating system" layer for the next generation of industrial automation. NVIDIA clearly doesn't want to miss this boat.

Crossover Sprint from the Supply Chain

Another interesting phenomenon: automotive supply chain companies are entering the Physical AI track en masse.

At this year's Beijing Auto Show, traditional automotive suppliers like Aptiv, Valeo, Horizon Robotics, and Qianxun SI showcased robotics-related solutions in clusters. Many industry insiders realized then that embodied intelligent perception is the same as automotive intelligent driving perception; automotive solutions can be directly applied to humanoid robots.

Thinking about it carefully, it makes sense. The automotive intelligent driving system is essentially a perception-decision-execution loop for a "mobile robot." Its three core modules — visual perception, path planning, and real-time control — are highly homologous in technical architecture with traditional industrial robots and humanoid robots.

Automotive suppliers' cameras, radars, steer-by-wire chassis, and real-time operating systems can be migrated to the robotics field with slight adaptation. In this sense, the hundreds of billions in R&D spending the automotive industry burned over the past decade on intelligence are now flowing into the Physical AI track as "technology spillover."

This might explain why Chinese robotics companies can so quickly enter the mass production stage. Manufacturing capabilities and supply chain management aren't built from scratch; many are readily available. Those component suppliers already honed on automotive production lines for over a decade are now applying their skills on a new battlefield.

There are ready-made cases abroad, too. Take Tesla, whose first-generation humanoid robot, Optimus, is also speeding toward production. On its Q1 2026 earnings call, Tesla stated plainly that the company would transition to "a future centered on AI, autonomous taxis, and humanoid robots," with the first-generation robot production line planned for a capacity of 1 million units and replacing the current Model S and Model X production lines.

The number 1 million might seem exaggerated in today's context, but Tesla's logic is clear: it wants to directly replicate the large-scale production capabilities and supply chain management experience accumulated in automobile manufacturing into the humanoid robotics field.

What Musk wants is not a "robot that can move," but a "mass-produced tool" that can work alongside humans in factories. Once this path is proven, its impact on the manufacturing automation landscape will be no less than that of the Model 3 on the fuel vehicle market.

World Models: Why They Became Usable This Year

Having covered the major players' moves at the industry level, let's zoom in one layer deeper: what's the technological foundation of this Physical AI race?

To sum it up in one sentence: the engineering breakthrough of world models. I think this is also the most critical point for understanding this wave.

The concept of "world model" isn't new; it was proposed back in 2018. The core idea is simple: let AI develop an internal understanding of how the physical world operates, so it can predict "what will happen if I push this cup." But previously, this mostly existed only in papers — too computationally expensive, unstable generation quality, unsuitable for real-time interaction.
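To make the idea concrete, here is a toy forward model. It is a minimal sketch, not how any production world model works. It assumes a cup sliding in one dimension on a table, and all function names and numbers are hypothetical. The model never sees the friction law, only logged (velocity, next velocity) pairs, from which it fits a single deceleration parameter and then predicts how far a new push will carry the cup.

```python
DT = 0.01  # simulation step in seconds (assumed)


def true_step(v, decel=2.0):
    """Ground-truth physics, hidden from the model: constant friction."""
    return max(0.0, v - decel * DT)


def collect_transitions(push_speeds):
    """Log (v, v_next) pairs by 'pushing the cup' at several speeds."""
    data = []
    for v in push_speeds:
        while v > 0.0:
            v_next = true_step(v)
            data.append((v, v_next))
            v = v_next
    return data


def fit_decel(data):
    """Learn deceleration as the mean observed velocity drop per second."""
    # Skip the final clipped step, where the cup has already stopped.
    drops = [(v - v_next) / DT for v, v_next in data if v_next > 0.0]
    return sum(drops) / len(drops)


def predict_slide(v0, decel):
    """Roll the learned model forward to predict total sliding distance."""
    x, v = 0.0, v0
    while v > 0.0:
        x += v * DT
        v = max(0.0, v - decel * DT)
    return x


learned_decel = fit_decel(collect_transitions([0.5, 1.0]))
distance = predict_slide(0.8, learned_decel)  # a push speed never observed
print(f"learned deceleration: {learned_decel:.3f} m/s^2, "
      f"predicted slide: {distance:.3f} m")
```

Real world models replace the single fitted number with a large neural network and the one-dimensional state with video frames, but the loop is the same: observe transitions, fit a predictor, roll it forward to answer "what happens if".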

The turning point came within the past year. NVIDIA launched a series of models called Cosmos, whose core capability is generating action data that conforms to physical laws from text or images.

For example: if you want to train a robot to move boxes in various weather conditions, you don't need to actually film videos in factories during rain, snow, or at night. Set the parameters in a simulation environment, and Cosmos can directly generate massive amounts of highly realistic training data covering various extreme scenarios.

Early this year, the Ant Lingbo team open-sourced a framework called LingBot-World, specifically for interactive world models. It can achieve nearly 10 minutes of continuous, stable video generation, with end-to-end interaction latency controlled within seconds. Users can control virtual characters in real-time with a keyboard and mouse like playing a game, with the model providing instant feedback on scene changes. The significance is that world models moved from "offline rendering" to "online interaction," boosting training efficiency by an order of magnitude.

Another startup, Jijia Vision, released the GigaWorld-1 platform, positioned as a "digital sandbox" for the physical world. A month later, Alibaba's ABot-PhysWorld surpassed it on a benchmark called WorldArena, topping the comprehensive rankings. Competition is advancing month by month.

The importance of these open-source projects lies not in how many parameters they have, but in turning a game "only giants could play" into a tool "small teams can also use." When enough people are building wheels, more cars will actually start running.

The reason world models have become a core component in the Physical AI era is that they answer that long-unresolved question: how to enable robots to learn the complex laws of the physical world in a low-cost, high-efficiency way?

Training data from the real world is extremely costly to obtain and inherently carries distribution bias. It's hard to gather all edge scenarios in reality, like factory night shifts during a blizzard, emergency situations during a logistics warehouse blackout, or sudden human intervention on a production line. But synthetic data can. By manipulating scene parameters with prompts in a simulation environment, researchers can generate large-scale training videos covering extreme conditions within hours, which would take months or even years under the traditional real-data collection route.
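The parameter sweep described above can be sketched in a few lines. This is a hypothetical illustration, not the API of Cosmos or any other platform: the parameter names, value lists, and prompt template are all assumptions, and a real pipeline would hand each prompt to a world model or simulator instead of printing it.

```python
import itertools
import random

# Assumed scene parameters; a real setup would have many more axes.
SCENE_PARAMS = {
    "weather": ["clear", "rain", "snow", "blizzard"],
    "lighting": ["day", "dusk", "night", "power outage"],
    "event": ["none", "person steps onto the line", "box falls off conveyor"],
}


def scenario_grid(params):
    """Yield every combination of the scene parameters as a dict."""
    keys = list(params)
    for values in itertools.product(*(params[k] for k in keys)):
        yield dict(zip(keys, values))


def to_prompt(scenario):
    """Render one scenario as a text prompt for a (hypothetical) generator."""
    return (f"A robot moves boxes in a factory; weather: {scenario['weather']}; "
            f"lighting: {scenario['lighting']}; event: {scenario['event']}.")


def sample_scenarios(params, n, seed=0):
    """Draw n distinct scenarios reproducibly from the full grid."""
    rng = random.Random(seed)
    return rng.sample(list(scenario_grid(params)), n)


all_scenarios = list(scenario_grid(SCENE_PARAMS))
print(len(all_scenarios), "scenarios in the grid")
for scenario in sample_scenarios(SCENE_PARAMS, 3):
    print(to_prompt(scenario))
```

Three short lists already give 48 combinations, including rare ones such as a blizzard during a power outage. That combinatorial coverage is what real-world filming cannot match.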

The leverage effect of this breakthrough might exceed any single algorithm improvement.

The Paradigm Has Changed

The breakthrough in world models is actually just one part of the evolution of the Physical AI tech stack. Changes in underlying technology are driving a fundamental architectural rebuild of the entire robotics industry.

Traditional robots use a "sense, plan, act" three-stage approach. First, sensors perceive the environment, then engineers write rules telling the machine how to plan its path, and finally, it executes the action. This works fine in structured environments like factory assembly lines, but once the scenario gets complex, its shortcomings are exposed. The machine only follows the preset script and gets stuck when encountering unseen situations.

Physical AI takes a different path: "perception, reasoning, execution." After perception, it doesn't go through human-written rules but uses a trained neural network to reason what to do and then execute. The essential difference is that the former is "the engineer thinks for the machine," while the latter is "the machine understands the physical world itself."
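The contrast can be caricatured in code. The sketch below is deliberately tiny and entirely hypothetical: the rule table, the two-number observation format, and the nearest-neighbor lookup standing in for a trained neural network are all assumptions made for illustration.

```python
import math

# Sense-plan-act: an engineer enumerates the situations ahead of time.
RULES = {
    ("box", "on_conveyor"): "pick",
    ("box", "on_pallet"): "skip",
}


def rule_based_act(obj, place):
    """Follow the script; anything outside it leaves the machine stuck."""
    try:
        return RULES[(obj, place)]
    except KeyError:
        return "halt"


# Perception-reasoning-execution: behavior is generalized from examples.
# Each demonstration is ((object_width_m, distance_m), action).
DEMOS = [
    ((0.30, 0.5), "pick"),
    ((0.32, 0.6), "pick"),
    ((0.30, 2.5), "approach"),
    ((0.28, 3.0), "approach"),
]


def learned_act(observation):
    """Stand-in for a trained policy: act like the closest demonstration."""
    nearest = min(DEMOS, key=lambda demo: math.dist(demo[0], observation))
    return nearest[1]


print(rule_based_act("box", "on_floor"))  # a case the rules never listed
print(learned_act((0.29, 2.0)))           # an observation not in the demos
```

The rule table has no answer for a box on the floor, while the learned stand-in still returns a plausible action for an observation it has never seen. That is the gap the new architecture is meant to close, though a real policy network generalizes far beyond nearest-neighbor matching.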

The International Federation of Robotics released a technology roadmap this year, predicting that within the next three years, 80% of new robot models will adopt this new architecture, with the traditional three-stage approach gradually exiting the mainstream. This isn't a minor tweak; it's a full paradigm shift.

As an industry expert aptly summarized: Physical AI is the ultimate form of AI development because it needs to understand not only human instructions but also all the laws of the physical world.

Jensen Huang said the "ChatGPT moment" for robotics has arrived. In my view, though, Physical AI's "moment" is completely different in nature from that of language models. For language models, the moment was when ordinary people worldwide first got their hands on AI. For Physical AI, the moment is when AI truly starts doing work for the first time.

Currently, this track is at a very special stage: the direction is locked in, the concept is validated, but the landscape isn't settled.

On one hand, making demos and achieving mass production are two completely different capability systems. Getting one prototype to work is one thing; having ten thousand products perform consistently in real-world scenarios tests manufacturing consistency, supply chain resilience, scenario generalization ability, and operational systems. These have little to do with AI algorithms, but each is enough to halt a batch of players. On the other hand, real-world data collection is expensive, time-consuming, and has limited coverage, which almost predestines that large-scale training for Physical AI will heavily rely on synthetic data.

At the same time, from automotive supply chains and traditional industrial automation to consumer electronics manufacturing, industries that seem unrelated to "AI" are accelerating their entry into Physical AI through technology spillover. Their manufacturing capabilities, supply chain management experience, and scenario resources might be the key variables determining the speed of Physical AI's practical application.

An intuitive judgment is this: look back at the AI wave ignited by ChatGPT in early 2023. The ones who captured the most value weren't the model makers, but the infrastructure providers. Will this wave of Physical AI replay the same script?

NVIDIA's moves suggest it's betting on this direction, but the story isn't finished. 2026 is the first year of the deployment phase; industrial competition has just begun. Looking back three years from now, which names are still at the table and which have been eliminated might surprise most people.

Related Questions

Q: What is Physical AI and how is it fundamentally different from previous AI developments?

A: Physical AI refers to an intelligent, embodied system that can understand and interact with the physical world by integrating physical laws into its AI framework. Unlike earlier AI models confined to processing digital data like text and images, Physical AI operates within environments governed by gravity, friction, and inertia, enabling it to perform tasks like grasping, moving, and manipulating real-world objects.

Q: What were the key industry developments in 2026 that marked the transition of Physical AI into a deployment phase?

A: In 2026, key developments included Zhiyuan Robotics conducting live, unscripted demonstrations of its humanoid robots on real 3C production lines, announcing mass production of 10,000 units. Internationally, Figure AI released its Figure 03 model and shifted to its in-house Helix neural system. Additionally, NVIDIA partnered with four major industrial robotics firms to integrate AI training into existing robotic fleets, signaling a shift from prototype demonstrations to practical, scalable deployment.

Q: How is the automotive supply chain contributing to the advancement of Physical AI?

A: Automotive suppliers are leveraging their expertise in sensors (cameras, radar), drive-by-wire systems, and real-time operating systems developed for autonomous vehicles. This technology is highly transferable to robotics for perception, planning, and control. Companies like Aptiv, Valeo, and Horizon Robotics are applying these solutions to the Physical AI domain, providing mature manufacturing capabilities and supply chain management that accelerate the transition of robots from labs to mass production.

Q: What is a 'World Model' and why has it become a critical technological foundation for Physical AI in 2026?

A: A 'World Model' is an AI system that learns an internal understanding of physical world dynamics, allowing it to predict outcomes of actions (e.g., what happens if a cup is pushed). In 2026, its engineering breakthrough, led by models like NVIDIA's Cosmos and open-source frameworks like LingBot-World, enabled the efficient generation of massive, realistic synthetic training data. This allows robots to learn complex physical interactions and edge-case scenarios in simulation at low cost and high speed, which is impractical with real-world data collection alone.

Q: How is the traditional robotics architecture being transformed by the Physical AI paradigm?

A: The traditional 'Sense, Plan, Act' architecture, which relies on pre-programmed rules for specific environments, is being replaced by Physical AI's 'Perception, Reasoning, Execution' paradigm. Instead of following fixed scripts, robots now use trained neural networks to reason and make decisions based on their understanding of the physical world. This shift enables adaptability in unstructured environments. Industry forecasts suggest that 80% of new robot models will adopt this new architecture within three years, representing a fundamental paradigm change in the field.
