# Related Articles on Robotics

The HTX News Center provides the latest articles and in-depth analysis on "Robotics", covering market trends, project updates, technological developments, and regulatory policy in the crypto industry.

Li Fei-Fei's Latest Long-Form Article: When Video Generation, Robotics, and NVIDIA All Call Themselves World Models, We Need a Taxonomy

In a new article, Dr. Fei-Fei Li addresses the widespread and often inconsistent use of the term "world model" in AI. She proposes a functional taxonomy rooted in the classic Partially Observable Markov Decision Process (POMDP) loop (agent → action → state → observation → agent). Under this framework, the systems currently called "world models" are different projections of that loop, categorized by their primary output:

1. **Renderers** output observations (pixels). Their goal is visual fidelity for human consumption (e.g., video generation models like Sora). They are the most commercially mature but are limited by a focus on appearance over physical accuracy.
2. **Simulators** output states (geometric, physical, and dynamic representations). They provide a structurally accurate world for both human professionals (e.g., architects) and computational agents (e.g., robots in training). Li argues that simulators are the crucial, underappreciated bridge, because they can underpin both rendering and planning.
3. **Planners** output actions. Given an observation and a goal, they decide what an agent should do next (e.g., robotic action models). This area is highly promising but remains the least mature for real-world deployment.

Li highlights a key trend: the boundaries between these three categories are beginning to blur, because all three rely on a shared underlying understanding of geometry, physics, and dynamics. The logical endpoint is a unified world foundation model capable of switching between rendering, simulation, and planning based on downstream needs. This convergence, she concludes, is central to advancing spatial intelligence: enabling machines not just to talk about the world, but to understand, imagine, and interact with it.

marsbit, 10 hours ago
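
The three-way split summarized above can be made concrete with a toy sketch. The code below is not taken from Li's article; it is a minimal illustration, assuming a hypothetical one-dimensional track, in which each category is a function distinguished only by what it outputs: the simulator returns a state, the renderer returns an observation, and the planner returns an action. All names (`State`, `simulate`, `render`, `plan`, `run_loop`, `WIDTH`) are invented for this example.

```python
from dataclasses import dataclass

WIDTH = 8  # hypothetical track length, chosen only for this illustration


@dataclass(frozen=True)
class State:
    position: int  # cell index of the agent on a 1-D track


def simulate(state: State, action: int) -> State:
    """Simulator: outputs the next state, given an action in {-1, 0, +1}."""
    new_position = min(max(state.position + action, 0), WIDTH - 1)
    return State(new_position)


def render(state: State) -> str:
    """Renderer: outputs an observation (toy 'pixels') of a state."""
    return "".join("A" if i == state.position else "." for i in range(WIDTH))


def plan(observation: str, goal: int) -> int:
    """Planner: outputs an action, given an observation and a goal cell."""
    position = observation.index("A")
    return (goal > position) - (goal < position)  # sign of (goal - position)


def run_loop(start: State, goal: int, max_steps: int = 20) -> list[str]:
    """Close the loop: observation -> action -> state -> observation."""
    state = start
    frames = [render(state)]
    for _ in range(max_steps):
        action = plan(frames[-1], goal)
        if action == 0:  # goal reached, nothing left to do
            break
        state = simulate(state, action)
        frames.append(render(state))
    return frames


if __name__ == "__main__":
    for frame in run_loop(State(1), goal=4):
        print(frame)
```

In this toy setup the planner sees only the rendered observation, never the state itself, which mirrors the partial-observability framing of the loop. A real system would replace each function with a learned model; the sketch only shows how the three outputs relate.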

Li Feifei's Latest Article: When Video Generation, Robotics, and NVIDIA All Claim to Have 'World Models,' We Need a Taxonomy

"World model" has become a widely used yet ambiguous term in AI. Drawing on the classic POMDP framework (agent → action → state → observation), this article proposes a functional taxonomy to clarify the concept. It identifies three distinct types, categorized by their output in the perception-action loop:

1. **Renderers** output visual observations (pixels). These models, like advanced video generators, prioritize visual fidelity but often lack underlying physical accuracy.
2. **Simulators** output the state of the world (geometry, physics, dynamics). They provide a structurally accurate representation for professionals (e.g., architects) and serve as training environments for robots and AI agents.
3. **Planners** output actions. Given an observation and a goal, they determine what an agent should do next, closing the perception-action loop (e.g., vision-language-action models).

While renderers are currently the most commercially mature and planners are the most aspirational, the article argues that **simulators are the crucial, underappreciated hub**. By working at the level of geometry and physics, a simulator can project upwards to create visuals for humans and downwards to predict the consequences of actions for agents.

The future lies in the convergence of these three functions. Emerging research and products, such as World Labs' Marble model, which outputs both visual splats and physical collision meshes, are beginning to blur these boundaries. The logical endpoint is a unified world foundation model capable of rendering, simulating, and planning based on a shared understanding of spatial and temporal structures, ultimately enabling machines to understand, imagine, and interact with the physical world.

ChainCatcher (链捕手), 11 hours ago

In the First Half of the Year, Half of VC Money Flowed to AI, with These 30 Companies Alone Raising Over 170 Billion Yuan

**First Half of 2026: VC Investment in AI Explodes, with 30 Top Companies Raising Over 170 Billion RMB**

In the first half of 2026, China's AI sector saw a massive surge in venture capital, with total equity financing exceeding 300 billion RMB, already surpassing the entire 2025 total. Key trends include:

* **Massive Funding Scale:** The AI track recorded 1,203 financing events totaling over 300 billion RMB. Investment peaked in June, partly driven by DeepSeek's landmark 51-billion-RMB Series A round.
* **Geographic Concentration:** Beijing, Hangzhou, Shanghai, and Shenzhen dominated, accounting for 74% of deals and 86% of total funding. Beijing led with 95.5 billion RMB, while Hangzhou surged to second place on the strength of DeepSeek's round.
* **Sector Focus:**
  * **Large Models** were the top draw, securing over half of all funds (nearly 160 billion RMB).
  * **AI Infrastructure** (compute, chips) and **Embodied AI** (e.g., robotics) were the other major investment areas, with the latter the most active by number of deals.
  * **AIGC Applications** attracted significant capital (59.6 billion RMB), indicating strong belief in near-term commercialization.
* **Investment Stage Logic:** Capital followed a clear strategy: heavy bets on growth-stage companies (A/B rounds), major funding for mature leaders, and widespread, smaller-scale seeding of early-stage innovators.
* **Notable Early-Stage Trends:** World models (seen as the "OS" for embodied AI) attracted the most early capital. Angel and seed rounds reached unprecedented sizes ("inflation"), and investment shifted from foundational large models to downstream applications like robotics and physical AGI.
* **Top Companies:** The 20 largest mid/late-stage deals raised 156.5 billion RMB. Leaders include the "Big Three" large model firms (DeepSeek, StepFun, Kimi), seven leading humanoid robot companies (the "Seven Samurai"), and top AIGC application players.
* **Outlook:** Full-year 2026 funding is projected to exceed 600 billion RMB. However, consolidation is expected in the large model sector, with the window for pure-play general AI startups closing. Survival will depend on finding niche verticals or securing strategic backing.

marsbit, 2 days ago, 09:01

Introduction to the Concept of World Models: A Story from Psychology to the Main Battlefield of AI

**World Models: From Psychology to AI's Core Concept**

"World model" is a trending but often confusing term in AI. It describes a system that lets machines internally simulate, predict, and rehearse potential outcomes before taking real-world action, like a mental "sandbox." Definitions vary: Yann LeCun emphasizes physical understanding, OpenAI's Sora is a video-based "world simulator," Google DeepMind's Genie 3 creates interactive 3D environments, and companies like Alibaba and Tesla focus on practical applications. The core goal is nonetheless consistent: reduce reliance on vast real-world data by building an internal, predictive model for safer and more efficient AI.

The concept has deep roots, tracing back to psychologist Kenneth Craik (1943). In AI, it was revitalized by researchers like David Ha and Jürgen Schmidhuber (2018). Major technical approaches include:

1. generative video models (e.g., Sora) for visual realism;
2. abstract predictive models (e.g., LeCun's JEPA) for efficiency and physical reasoning; and
3. explicit 3D simulators (e.g., NVIDIA Omniverse) for precision.

Fei-Fei Li proposes a classification based on the AI action loop: renderers (output observations), simulators (output world states), and planners (output actions). The emerging "World Action Model" (WAM) paradigm aims to unify future prediction and action generation.

An industry framework is forming: upstream (data, compute, sensors), midstream (general and vertical platforms), and downstream applications (autonomous driving, robotics, gaming, etc.). Autonomous driving is currently the most mature use case.

The current lack of a unified definition reflects the field's early, dynamic stage, similar to past tech revolutions. Different approaches, focusing on pixels, physics, or behavior, represent parallel explorations of how best to compress and understand the world. This diversity, while seemingly chaotic, signals that world models have moved from an academic idea to a critical industrial battleground, ultimately aiming to give machines the ability to understand, imagine, and reason about the world.

marsbit, 06/29 05:09

A 380% Soar, Shenzhen’s 100-Billion-Yuan IPO Rings the Bell

HKC Holdings, a major Chinese display panel manufacturer, has listed on the Shenzhen Stock Exchange's main board. The company's shares surged over 380% on their debut, putting its market capitalization at around 350 billion yuan (having earlier reached 500 billion yuan). Founded by Wang Zhiyong in Shenzhen's Huaqiangbei electronics market nearly three decades ago, HKC evolved from assembling monitors into a global top-tier supplier of semiconductor display panels for TVs, monitors, and smartphones.

The IPO marks a significant milestone for HKC and its backers. The company's expansion into the capital-intensive panel manufacturing sector was supported by partnerships with state-owned capital from regions such as Chongqing, Mianyang, and Chuzhou. Its shareholder list also includes BOE Technology's investment arm. In recent years, HKC reported strong financials, with its core panel business contributing over 70% of revenue and clients including Samsung, TCL, and Xiaomi.

The listing is seen as part of a broader trend in Shenzhen's evolving tech landscape. Beyond established giants, the city is nurturing clusters of leading companies in specialized sectors such as robotics, exemplified by the "Shenzhen Robot Valley," and storage chips, where a group of firms dubbed the "Storage Five Tigers" has reached a combined trillion-yuan market valuation. Shenzhen's strategic focus on emerging industries such as AI terminals, the low-altitude economy, and humanoid robotics aims to build new industrial depth and foster the next generation of tech champions.

marsbit, 06/26 04:38

The War Without a Unified Name: The Domestic Tech Giants' World Model Landscape

The article outlines the diverse and fragmented landscape of "world models" in China's tech industry, where major players pursue similar goals under different names, such as world foundational models or physical AI, or fold them into autonomous driving and embodied intelligence systems. The core aim is to let AI build an internal, dynamic environment for simulation, reasoning, and learning, reducing its reliance on vast amounts of real-world data. This "data engine" allows for unlimited generation, experimentation, and iteration. The report categorizes the approaches of different companies:

* **Internet Giants:** Alibaba is developing models for linguistic, virtual, and physical worlds (Qwen-AgentWorld, HappyOyster, Qwen-RobotWorld). Tencent's HY-World focuses on 3D, game, and social scenarios. ByteDance leverages its vast video data for a potential "digital twin" model. Huawei integrates its model into industrial applications like smart cars and robotics without branding it separately. Baidu embeds world model capabilities within its Apollo autonomous driving and Ernie systems.
* **Automakers:** Companies like NIO, Li Auto, XPeng, and Geely use world models as virtual "driving schools" and "testing grounds." They generate complex scenarios (e.g., rain, snow) to train and validate autonomous driving systems in simulation, aiming for more capable and safer AI drivers.
* **Autonomous Driving Suppliers:** Firms such as Momenta, Horizon Robotics, Haomo.ai, and DeepRoute.ai are building the underlying "world engines." They focus on large-scale video generation for simulation, reinforcement learning, and enhancing end-to-end autonomous driving models, often integrating these capabilities into commercial products.

While startups bring focus and innovation, they face challenges such as limited data, compute resources, and deployment channels. Large companies have these advantages and are rapidly turning world models from research projects into core business infrastructure that powers products in vehicles, games, and industry. The article concludes that world models represent an evolution and convergence of existing AI fields into crucial industrial infrastructure, shifting the competition from simply building a model to deploying it effectively to understand and interact with the physical world.

marsbit, 06/25 06:52

SoftBank CEO Masayoshi Son's New Trillion-Dollar "Gamble"

SoftBank founder Masayoshi Son is making a new trillion-dollar "bet" on Physical AI and humanoid robotics, even as his massive wager on OpenAI faces uncertainty ahead of its potential IPO. Recent reports reveal OpenAI's steep losses, an $85 billion net loss by Q1 2026 and a $38.5 billion loss in 2025, casting doubt on its path to a trillion-dollar valuation. SoftBank, OpenAI's second-largest external shareholder with a planned 13% stake, stands to gain hugely if OpenAI succeeds.

Undeterred, Son is already pushing forward with his next venture: consolidating SoftBank's AI and robotics assets into a new U.S.-based company named "Roze," targeting a $100 billion IPO as early as late 2026. The move aligns with his belief that Physical AI, which merges AI cognition with robotic physical execution, is the next trillion-dollar frontier.

Son's confidence stems from recent AI wins: SoftBank's stock surged and he briefly regained the title of Asia's richest person, largely due to OpenAI's soaring valuation. However, his aggressive strategy has raised internal concerns about over-reliance on OpenAI and strained finances. With competitors like Anthropic advancing rapidly and the timing of OpenAI's IPO uncertain, Son is racing to capitalize on the AI boom.

His long-term vision for Physical AI includes a decade of investments in robotics, from Boston Dynamics to recent acquisitions like ABB's robotics unit, and a planned $1 trillion investment in U.S.-based AI robotics industrial parks. Challenges remain: humanoid robotics firms like Figure AI lack the clear revenue paths of AI software companies, and Roze's lofty valuation faces skepticism. For Son, these bets are also driven by an unfulfilled promise of massive returns to key investors like Saudi Arabia's PIF. Despite the risks, he continues to double down, betting that the fusion of AI and physical machines will define the next technological era.

marsbit, 06/25 00:05
