The AI Industrial Revolution: Where Are We Now?

marsbit · Published 2026-05-27 · Updated 2026-05-27

Introduction

This article explores the current stage of the AI industrial revolution, arguing we are still merely attaching new tools to old workflows rather than fundamentally redesigning production. The author compares this to the early Industrial Revolution, where factories simply replaced waterwheels with steam engines without changing their core structure. Similarly, today we embed AI chat windows into existing software but leave organizational processes unchanged. While massive investment floods into AI infrastructure (data centers, chips), akin to railway manias of the past, the real transformation lies in "dismantling the old workshop"—reorganizing companies around AI. Examples include Notion's use of hundreds of AI Agents and Y Combinator's experiments with self-improving AI systems that operate autonomously. The author notes a critical gap: while China has vast AI user growth, few companies have rebuilt core workflows. AI is beginning to impact entry-level jobs, and early adopters are gaining a compounding advantage. The conclusion is that the pivotal moment will not be the invention of better models, but when organizations decide to tear down old structures and rebuild around AI, shifting the bottleneck from human coordination to computing power. The future workplace and job titles are yet to be defined, but the imperative is to move away from legacy processes and position oneself where the new "railway" is being built.

Written by: Will Awang

Over the past year, I've attended several AI-themed industry conferences. Guests on stage took turns demonstrating the wonders of AI, while people in the audience held up their phones to film the screen, posted on social media, and then went back to scrolling. But back in the office, it was the same weekly meetings, the same approval processes, the same status reports. Big tech companies have already written token consumption into KPIs, and some employees become model performers by writing scripts to inflate their usage. The same people cheer on social media: "Claude is a revolution" today, "Codex is amazing" tomorrow, "long live Gemini" the day after. Is this embracing a revolution, or just rushing from one spectacle to another?

All of this is noise, not the answer I'm looking for.

The real question isn't whether AI is powerful enough—the steam engine has been built. The question is who will be the first to tear down the old workshop.

The day the Industrial Revolution truly began wasn't when Watt improved the steam engine; it was when the factory owners in Lancashire decided to move away from the river and rebuild their workshops around the steam engine. The most important moment for AI is the same—not the day the large language model was invented, but the day the first organization decides to dismantle its old processes and rebuild its mode of production around AI. That day hasn't arrived yet. But it's on its way.

Two people saw this early. Notion CEO Ivan Zhao wrote an article at the end of 2025 titled "Steam, Steel, and Infinite Minds", offering a cold judgment: we are still in the "replacing the waterwheel" stage—attaching AI chatbots to existing tools, but no one is redesigning the factory. Former OpenAI employee Leopold Aschenbrenner took another path: he wrote a 165-page document titled "Situational Awareness", then started a fund that grew from $225 million to $13.68 billion, all betting on AI infrastructure. One looks inward, the other bets outward.

This article is not about them. It's about us—where we stand now, and which part of history we are repeating.

( Power-loom weaving, engraving by J. Tingle after Thomas Allom, 1835 / Wikimedia Commons )

I. The Workshop Is Still Old

Most people's day goes like this: use AI to write an email in the morning, saving ten minutes; then spend two hours in an unnecessary weekly meeting; copy and paste the same set of data between three tools in the afternoon; post on social media at night saying "AI is so great." The ten minutes saved are completely eaten back by the old processes.

Similarly, when the steam engine first appeared, factory owners initially just replaced the waterwheel with the steam engine, leaving everything else unchanged—factories were still built by the river, still multi-story buildings, still with a central drive shaft powering the entire production line. We embed ChatGPT into Slack, add Copilot to Office, place AI chat windows into workflows—we're doing the same thing. The tool is upgraded, but the workshop remains the same.

But replacing the machine is not the same as replacing the workshop. As McLuhan famously said:

"We look at the present through a rear-view mirror. We march backwards into the future."

Using old processes to accommodate new tools is like early films being merely recorded stage plays. The real breakthrough comes when someone completely frees the steam engine from the river and redesigns the entire production system around the new power source.

Looking at the Industrial Revolution timeline and comparing it to AI, we can roughly locate where we are on the map:

Now the timeline is extremely compressed. The Industrial Revolution took 60 years from the steam engine to the railway mania; AI took only 7 years from the Transformer to the data center construction boom.

Speed is not the problem; the problem is where we are stuck—the first four rows are still the stage of installing new machines in old workshops. The steam engine is installed, railways are being laid, but the mode of production remains intact. The sixth row is the real watershed. We are most likely stuck between these two steps.

The steam engine is in our hands, but the workshop is still old.

II. All the Money Is Bet on the Layer Farthest from the Factory

Infrastructure is always overbuilt. It's the investors who go bankrupt, not the infrastructure.

In 1846, the British Parliament passed 263 Railway Acts, approving the construction of 9,500 miles of new railway. At its peak, railway investment accounted for 13% of Britain's GDP. Railway shares could be bought with only a 10% down payment, and the middle class flocked to invest. The bubble burst in 1847. One-third of the approved lines were never built, and countless investors lost everything. Darwin lost 60% on railway stocks, and he was luckier than most.

But the railways remained.

Today's AI infrastructure is following the same path. Goldman Sachs' latest estimate puts global AI infrastructure capital expenditure at $765 billion in 2026, projected to reach $1.6 trillion annually by 2031. The proportion of capital expenditure to operating cash flow for hyperscale cloud providers has risen from about 40% in 2023 to nearly 70% in 2025. AI-related investments already account for about a quarter of all US investment. Aschenbrenner's $13.68 billion is betting on this layer—he's not betting on which application will win, but on the underlying compute power itself.

This capital cycle is isomorphic to real estate development. Building data centers is like building buildings: land is electricity, building materials are GPUs and storage, contractors are data center builders, developers are cloud providers, tenants are AI application companies, and rent is API revenue. The cloud providers' business model is to "rent to cover the loan"—using API revenue to cover data center capital expenditure, waiting for the valuation leap brought by the explosion of AI applications.

(Compute Power Real Estate: Each generation has its own infrastructure)

The core risk is the same: is the growth rate of API call volume offsetting the decline in API unit price? If rent falls below the loan repayment line—this is a nightmare familiar to real estate developers. The lesson of 2008 was not that too many houses were built, but that the structure of the houses built did not match the structure of real demand. The equivalent risk for AI is: an oversupply of general-purpose compute power, while specialized capabilities that can truly handle high-value scenarios like financial compliance or medical diagnosis remain scarce.
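The "rent versus loan" question above comes down to simple arithmetic: revenue is call volume times unit price, so revenue holds steady only if volume growth outruns the price decline. The sketch below is a minimal illustration under that single assumption. The function names and the sample figures are hypothetical and are not taken from the article or from any provider's data.

```python
def breakeven_volume_growth(price_decline: float) -> float:
    """Volume growth needed to keep revenue flat when the unit price falls.

    Revenue = call volume x unit price, so revenue stays flat when
    (1 + g) * (1 - d) = 1, which gives g = d / (1 - d).
    """
    if not 0 <= price_decline < 1:
        raise ValueError("price_decline must be in [0, 1)")
    return price_decline / (1 - price_decline)


def revenue_change(volume_growth: float, price_decline: float) -> float:
    """Fractional change in API revenue over one period."""
    return (1 + volume_growth) * (1 - price_decline) - 1


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    d = 0.5  # the unit price halves
    print(breakeven_volume_growth(d))  # growth needed just to stand still
    print(revenue_change(0.5, d))      # 50% more calls at half the price
```

The relationship is nonlinear: the deeper the price cut, the faster the required volume growth rises, which is why a steep price war can push "rent" below the repayment line even while usage is growing quickly.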

Railways, real estate, AI—infrastructure investments across three eras share the same rule: overbuilding is the norm, material suppliers always lose pricing power, and long-term returns always belong to the owners of "prime locations." Look at the Q1 fund holdings on Wall Street—probably 80% is concentrated in this infrastructure layer: NVIDIA, data centers, cloud infrastructure. But the railway mania teaches us: this is not the full picture of the AI revolution, and it's not even the layer with the highest returns.

What is the prime location for AI? It's unique industry data and deeply embedded workflows. For individuals, the real "prime location" is not the stocks you hold, but your own irreplaceable judgment and industry knowledge—provided you have already rebuilt the way you use them around AI.

The real returns are in the next layer. But between infrastructure and value creation, there is no seamless connection. There is a gap in the middle—historically, this gap has swallowed decades.

III. Who Is Tearing Down the Workshop

Those tearing down the workshop and those "using AI for efficiency" are not doing the same thing.

Notion co-founder Simon used to be a "10x programmer"; now he rarely writes code himself—he simultaneously controls three or four AI coding agents, achieving 30x to 40x efficiency. Notion now has 1,000 employees and over 700 AI agents. The gap isn't the tools; it's that Simon tore down his old workshop, while most people just replaced their waterwheel.

600 million Chinese users have used generative AI tools, a year-on-year increase of 142%—this is the world's largest pool of AI demand. But almost no Chinese company has rebuilt its core workflows around AI. The world's largest demand side, paired with a nearly stagnant supply side in terms of organizational change. This contrast itself is a signal: it's not that the tools are lacking, it's that organizations haven't kept up. The context of knowledge work is scattered across dozens of tools and dozens of minds, outputs are not verifiable, and no one knows how to judge whether a strategic memo is effective.

(Labor market impacts of AI: A new measure and early evidence)

Anthropic is already moving on a larger scale. It released an Economic Index that uses real usage data to show which tasks and industries AI is replacing first, and it is building according to this blueprint: forming an AI-native enterprise services joint venture with Goldman Sachs, Blackstone, and Hellman & Friedman; establishing a global alliance with KPMG that connects 276,000 employees to Claude; while Accenture is forming a business group and training 30,000 people, with a focus on finance, life sciences, and healthcare.

The role these consulting firms play is not that of AI users, but AI railway engineers—they don't build steam engines or lay tracks; they help enterprises tear down old factories and rebuild production lines around the new power source. Without this role, most factory owners wouldn't know where to start.

The signals are already flashing. One of the sharpest comes from the job market.

Young people aged 22-25 entering professions with high AI exposure are 14% less likely to find a job than their peers entering low-exposure professions. Junior positions are already being squeezed.

If I were a new graduate, this number would directly affect my job search. If I were a manager, the next batch of junior positions I hire might not be people.

Organizations are dismantling. What about individuals? My degree, my resume, the industry experience I've accumulated over the years—these are my waterwheels. They once drove my entire production line, but the steam engine has arrived. A degree from a top university is no longer a moat; it just proves I once built a decent factory by the river.

Now the question is, do we have the ability to leave that river?

Anthropic's data shows that users who have used AI tools for more than 6 months have a task success rate 10% higher than new users. Those who started half a year earlier are already leading by 10%, and this gap compounds over time.

But no company has gone bankrupt yet from not using AI; at least my law firm is still advancing full steam ahead around AI. The winners haven't been selected by the market yet. The learning curve is real: early adopters are already accumulating advantages, but most are still at the starting line.

IV. My Next Job Doesn't Have a Name Yet

Will my current job title exist ten years from now? How many of the tools I used daily five years ago are still in use today? The likely answers are "no" and "very few." But I don't know what the things that replace them are called, because those things don't exist yet.

It's been this way every time in history. New things aren't planned; they grow on their own after old constraints disappear.

Before railways were built, Britain was a collection of isolated local economies. The price of cotton cloth in Manchester and London could differ by 30%. Each city had its own time standard, and no one saw a problem. In the twenty years after railways were built, everything changed. A national unified market appeared for the first time, price differences were smoothed out; standard time was forced by railways, not invented; stationmasters, telegraph operators, travel agents—these jobs didn't exist at all before railways.

No one foresaw department stores when laying railways. No one foresaw standard time when building steam engines.

(Steam, Steel, and AI Infinite Intelligence)

The history of cities tells the same story. Cities hundreds of years ago were human-scale—forty minutes to walk across Florence. Steel frames made skyscrapers possible, railways connected cities to their hinterlands, and elevators, subways, and highways followed. Tokyo, Chongqing, Dallas—these are not bigger versions of Florence; they are entirely new ways of life.

Current knowledge work is also human-scale. Teams of a few dozen people, meetings and emails set the rhythm, becoming unmanageable beyond a few hundred people. We are building Florence with stone and wood. AI makes "Tokyo" possible—organizations composed of thousands of AI agents and people, with workflows running continuously across time zones. Old weekly meetings, quarterly planning, annual reviews may no longer make sense.

Simon no longer writes code—his job has become "managing AI agents." This position didn't exist two years ago. My next job title might not have a name yet. But someone is already building that future we cannot yet name.

V. What Does the New Workshop Look Like

After tearing down the old workshop, what do you build? Y Combinator's answer is: let the company improve itself.

Their internal system now modifies its own code at night. An employee ran a query during the day that failed. A supervising agent read about this failure, deduced the cause, wrote code to fix it, submitted it for review, and deployed it. The same query ran successfully the next day. The whole thing happened while everyone was asleep.

This isn't AI helping people produce 30% more. This is the system running through an entire closed loop on its own, figuring out how to become better.

In an internal talk, YC partner Tom Blomfield called this company form a "recursive self-improving AI loop." His judgment is direct: most companies are still Roman legions—information trickles down layer by layer and aggregates up layer by layer, with people acting as conduits. What AI breaks is not the efficiency of a certain link, but the very premise of this entire hierarchical structure.

His new logic is: burn tokens, not headcount. The bottleneck is shifting from manpower to compute power. The data YC sees shows that companies reaching Demo Day have about 5 times higher revenue per capita than 18 months ago. The role of middle management is being taken over by AI—"coordination" no longer requires humans. Everyone should be an IC, a builder, an operator. Every task has a named owner, not a committee.

There's another prerequisite: the company must be "readable" to AI. Things that aren't recorded are, to AI, as if they never happened. YC now archives all partner emails, Slack messages, and office-hour recordings. One partner used 2,000 hours of recordings accumulated over three months to have AI regenerate a 150-page internal manual that is much better than the old version. This manual updates automatically every month, becoming a perpetually fresh "living brain."

Tom left a question:

If you were building your company from scratch today, would you set it up in this form? If your company already has a hierarchical structure, you face a harder question—will the pain of rebuilding be less than the cost of continuing to operate as a Roman legion?

People are not at the center of the workshop; they are on the periphery—responsible for the places AI can't yet reach: on-the-ground judgment, entirely new situations, high-stakes, high-emotion moments. The center of the company is a "corporate brain" pieced together from data, records, and industry knowledge. The software running on it is consumable—if it can be generated, it can be regenerated. What's valuable resides in people's minds—how the business runs, which steps involve judgment; this understanding is the real asset.

What Ivan Zhao describes in "Steam, Steel, and Infinite Minds" is the other side of this direction—an organization of 1,000 employees and over 700 AI agents collaborating, where people are responsible for judgment, and agents are responsible for execution. Aschenbrenner bets on compute infrastructure; Zhao bets on organizational reconstruction. Both paths ultimately point to the same destination: a new mode of production rebuilt around AI.

VI. Conclusion

Between the 1840s and 1850s—the railways were laid, but the factories hadn't been rebuilt.

Where are we? Simon no longer writes code. He tore down his own waterwheel.

The question has never been whether the steam engine is good enough. The question is who will be the first to tear down the old workshop.

I don't intend to predict the future department stores. I only intend to take care of myself—ensuring I stand along the railway line, not guarding a river that is drying up.

What about you?

Related Questions

Q: What is the central argument of the article regarding our current phase in the 'AI Industrial Revolution'?

A: The central argument is that we are currently in the 'steam engine' phase of the AI revolution. We have powerful new tools like Large Language Models, but are largely using them to perform old tasks within existing organizational structures and workflows ('the old workshop'). The real revolution hasn't begun yet. It will start not when AI is invented, but when organizations fundamentally dismantle old processes and rebuild their core production methods around AI.

Q: According to the author, what is the key difference between simply 'using AI for efficiency' and truly transforming work?

A: The key difference is between 'replacing the waterwheel' and 'dismantling the old workshop.' Using AI for efficiency means attaching AI chatbots to existing tools (like adding ChatGPT to Slack) to save time on discrete tasks, but leaving the underlying workflows and organizational logic unchanged. True transformation involves redesigning the entire 'factory' around the new 'engine': reimagining processes, roles, and structures from the ground up with AI as the core driver, as exemplified by individuals like Notion's Simon who manage AI agents instead of writing code.

Q: What historical parallel does the author draw to the massive investment in AI infrastructure (like data centers and GPUs), and what warning does this imply?

A: The author draws a parallel to the British 'Railway Mania' of the 1840s. This implies a warning of a potential investment bubble where capital is overwhelmingly poured into foundational infrastructure (the 'rail lines' and 'real estate' of AI: data centers, chips, cloud capacity), far ahead of proven, high-value applications. The risk is that the 'rent' (API revenue) may not cover the 'mortgage' (capital expenditure) if the growth in usage doesn't outpace price declines, leading to overcapacity in generic compute while specialized, industry-transforming capabilities remain scarce.

Q: What does the author suggest is the 'prime location' or most valuable asset for individuals in the AI era, and why?

A: For individuals, the author suggests the true 'prime location' is not stock holdings in infrastructure companies, but their own 'irreplaceable judgment and industry knowledge.' This is because AI commoditizes execution and information retrieval. The unique value lies in human capabilities like on-the-ground judgment, navigating novel situations, and handling high-stakes, high-emotion moments, which are areas AI cannot yet reach. However, this knowledge is only a valuable asset if the individual has already rebuilt their way of working to leverage AI effectively.

Q: What is the 'recursive self-improving AI loop' described in the Y Combinator example, and what does it signify about future organizations?

A: The 'recursive self-improving AI loop' is when an AI system autonomously identifies a failure in its own operations, diagnoses the cause, writes and deploys code to fix it, and verifies the solution, all without human intervention. This signifies a move beyond AI as a productivity tool for humans. It points to future organizations where AI forms a 'corporate brain' from data and communications, enabling systems to self-optimize. This undermines traditional hierarchical ('Roman legion') structures, as coordination and middle-management tasks are automated. The future organization is 'AI-readable,' with humans focused on high-judgment roles at the periphery.
