# Related Articles on LLM

The HTX News Center offers the latest articles and in-depth analysis on "LLM", covering market trends, project news, technology developments, and regulatory policy in the crypto industry.

Major AI Collaboration Breakthrough! Stanford and NVIDIA Jointly Eliminate AI Communication Overhead, Boosting Reasoning Speed by 2.4x

A new approach called RecursiveMAS, developed by UIUC, Stanford, NVIDIA, and MIT, tackles the major bottleneck in multi-agent AI systems: the "language tax." Currently, AI agents collaborate by generating and reading natural language text, a slow, costly, and lossy process akin to inefficient radio communication. RecursiveMAS bypasses this by enabling agents to communicate directly through their "thoughts" (latent-space vector representations) instead of text. Inspired by recursive language models, it treats each agent like a reusable layer in a recursive loop. A lightweight module called RecursiveLink passes these high-dimensional, semantically rich internal states between agents. Only the final agent decodes the last latent representation into human-readable text. This process, described as "telepathic" communication, dramatically cuts the overhead of encoding and decoding text at each step.

The system is highly efficient: the core AI model weights remain frozen, and only the small RecursiveLink modules are trained, requiring updates to just 0.31% of total parameters. This reduces training costs by over 50% compared to full fine-tuning. Evaluations across math, science, coding, and QA benchmarks show significant improvements:

- **Accuracy:** Average increase of 8.3%, with gains up to 18.1% on complex math problems (AIME2025).
- **Speed:** End-to-end reasoning is 1.2x to 2.4x faster, with greater speedups as recursive depth increases.
- **Cost:** Token usage is reduced by 34.6% to 75.6%.

The research suggests a new scaling paradigm for multi-agent systems: deepening recursive collaboration depth rather than merely adding more agents. This could address key production barriers like compute cost, latency, and memory limits.

However, challenges remain, including the need for independent verification, compatibility between different AI models (heterogeneous agents), reduced interpretability of the "black-box" latent communication, and adaptation to complex real-world workflows involving tools and human interaction. If validated, RecursiveMAS could fundamentally change how AI agents work together, moving beyond inefficient "textual handoffs" to more seamless and powerful collaborative reasoning.
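
The latent handoff described in this summary can be illustrated with a toy sketch. It is not the RecursiveMAS implementation: `FrozenAgent`, `ToyLink`, `decode`, the latent width `DIM`, and the scale-and-shift adapter are all hypothetical stand-ins chosen for illustration. The sketch only shows the shape of the idea: frozen agents exchange vectors through small adapters over several recursive rounds, and text is decoded once at the end.

```python
import math
import random

DIM = 8  # hypothetical latent width, for illustration only


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class FrozenAgent:
    """Stand-in for a frozen model: maps one latent vector to another."""

    def __init__(self, rng):
        # These weights play the role of the frozen backbone; nothing updates them.
        self.weights = [[rng.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in range(DIM)]

    def think(self, latent):
        return [math.tanh(x) for x in matvec(self.weights, latent)]

    def num_params(self):
        return DIM * DIM


class ToyLink:
    """Hypothetical adapter between agents: a per-dimension scale and shift."""

    def __init__(self):
        self.scale = [1.0] * DIM  # the only values that would be trained
        self.shift = [0.0] * DIM

    def forward(self, latent):
        return [s * x + b for s, x, b in zip(self.scale, latent, self.shift)]

    def num_params(self):
        return 2 * DIM


def decode(latent):
    """Toy decoder: report the index of the largest latent component as text."""
    return f"answer-{max(range(DIM), key=lambda i: latent[i])}"


def run_latent_loop(agents, links, latent, depth):
    """Pass latents agent to agent for `depth` rounds; decode to text only once."""
    for _ in range(depth):
        for agent, link in zip(agents, links):
            latent = link.forward(agent.think(latent))
    return decode(latent), 1  # (final text, number of text decodes)


rng = random.Random(0)
agents = [FrozenAgent(rng) for _ in range(3)]
links = [ToyLink() for _ in agents]
start = [rng.uniform(-1.0, 1.0) for _ in range(DIM)]

text, latent_decodes = run_latent_loop(agents, links, start, depth=4)
# Assumption: a text-based pipeline decodes at every agent handoff.
text_handoff_decodes = len(agents) * 4

trainable = sum(link.num_params() for link in links)
frozen = sum(agent.num_params() for agent in agents)
trainable_share = trainable / (trainable + frozen)
print(text, latent_decodes, text_handoff_decodes, f"{trainable_share:.1%}")
```

The trainable share in this toy is far larger than the 0.31% reported above, because the stand-in agents are tiny. The point is only that the adapters, not the agents, hold the trainable parameters.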

marsbit 05/21 00:10

Can Alibaba Cloud Rewrite Itself?

Over the past five months, Alibaba Cloud's MaaS (Model as a Service) revenue has surged 15x, marking a strategic overhaul where the company is shifting its 17-year-old system designed for "humans using cloud" to a new paradigm centered on "Agents consuming Tokens." At its recent summit, Alibaba Cloud announced a full-stack upgrade encompassing "chip-cloud-model-inference," all optimized for AI Agents. Key launches include the new AI product portal "QianWen Cloud," hyper-node servers powered by the in-house AI chip Zhenwu M890, and the latest flagship model, Qwen3.7-Max. Senior VP Liu Weiguang described this as building "China's largest AI factory," where chips are raw materials, the cloud is the workshop, models are machines, and the inference platform is the assembly line, with Tokens as the final product. The company is now emphasizing its chip strategy, unveiling the Zhenwu M890 and a two-year roadmap for future chips. With over 560,000 chips deployed across 400+ clients, Alibaba Cloud aims to control the marginal cost per Token, mirroring Google's integration of TPU and Gemini for optimal cost-performance. The cloud infrastructure itself is being rewritten. Traditional cloud interfaces are being transformed into standardized, Agent-callable Skills. A new scheduling logic focuses on "task scheduling" over "resource scheduling" to handle the unpredictable, elastic workloads of Agents. Liu noted that AI applications now automatically provision cloud resources, with one customer's daily automated provisioning equaling two weeks of manual work. For models, the focus has shifted from conversational prowess to execution capability. Qwen3.7-Max demonstrated this by autonomously writing and optimizing a production-grade AI compute kernel for the new Zhenwu M890 chip over 35 hours, achieving a 10x performance improvement. The underlying Bailian platform was upgraded for efficiency, and it maintains an open ecosystem, hosting third-party models. 
This restructuring extends beyond technology to sales, organization, and metrics. Alibaba Cloud has established dedicated MaaS sales teams, separated from traditional IaaS, with new KPIs focusing on high-quality Tokens that solve real problems, the number of core business systems integrated with models, and the efficiency of Agent task completion. The underlying bet is clear: AI represents an opportunity orders of magnitude larger than before. Despite the uncertainty, Alibaba Cloud is aggressively rebuilding its entire system, betting on an AI-driven future where Tokens could become its largest product line.

marsbit 05/20 10:22

Token Plans Launch: The 'Traffic War' in the AI Era, Now It's the Turn of Doubaos to Compete

China's major telecom operators are launching standardized "Token" service packages, marking a new phase in the AI era where model usage is becoming a commodity akin to mobile data plans. Operators like China Telecom and China Mobile are offering monthly subscription plans for individuals and enterprises, allowing access to dozens of AI models through unified platforms at set Token rates (e.g., 9.9 yuan for 10 million Tokens). This shift lowers the cost and technical barrier for users to switch between models like Doubao, Qwen, and DeepSeek. The article explains that a Token is the basic computational unit for large language models. Operators are transforming from selling voice minutes and data bandwidth to selling AI compute measured in Tokens. This model benefits developers and SMEs by providing predictable costs and easy access to multiple models without managing underlying infrastructure. As operators become aggregation platforms, competition among model providers intensifies. They must now compete not just on model capability but also on price, computational efficiency (cost per quality Token), and higher-value AI application solutions. The future may see a split where operators control the user access point, while model companies focus on core AI capabilities and specialized enterprise applications.

marsbit 05/19 04:06

Understanding the New Economic Model of Tokenization

The commercialization of AI applications is evolving from selling software and subscriptions to selling token call capacity. Tokens, the fundamental unit of information processing for large language models (LLMs), have become the basis for API billing and consumption. With call volumes exploding, tokens themselves are now being traded (procured, routed, split, and resold), forming a new intermediary market. This layer connects upstream LLM providers with downstream developers and enterprises, acting as a global wholesale-to-retail liquidity network. The rise of this business is fueled by two factors: a massive surge in China's daily token call volume, which grew over a thousandfold from 100 billion in early 2024 to over 140 trillion by March 2026, and significant improvements in domestic LLM capabilities, which are now competitive globally.

The core value of token distribution platforms extends beyond simple arbitrage. Key functions include aggregating multiple models (like GPT, Claude, and domestic models such as Kimi and DeepSeek) under a unified API, lowering network and payment barriers, and providing enterprise services like model selection, prompt engineering, and system integration. Profit models are diversifying: (1) resale margins; (2) technical premiums from proprietary inference acceleration (e.g., reducing costs to 1/10 of the industry standard); and (3) enterprise value-added services. High-consumption scenarios like marketing, short-form video, gaming, and e-commerce are primary drivers. Investment opportunities are seen in both companies with strong model capabilities (e.g., Alibaba, Tencent, MiniMax) and those with high-consumption client scenarios (e.g., marketing agencies with overseas reach).

However, risks are significant: low entry barriers leading to intense competition, capital requirements and bad debt risks from advance payments, and dependency on policy changes from upstream LLM providers who control API pricing and access.

marsbit 05/19 02:54

Starting from 9.9 RMB, Three Major Carriers Enter the Token Business. Will Using AI Be Like Paying a Phone Bill in the Future?

China's three major telecom operators—China Telecom, China Mobile, and China Unicom—have officially entered the AI token business, launching low-cost token packages for individuals and businesses. China Telecom introduced six trial commercial token plans. For individuals and families, prices start at 9.9 yuan/month for 10 million tokens. For developers and SMEs, plans range from 39.9 yuan/month for 15 million tokens to 299.9 yuan/month for 150 million tokens. Shanghai Mobile launched a token service offering 400,000 tokens for 1 yuan, payable via phone bill. It also started an "AI Hui Shen Huo" upgrade plan for smart home products. China Unicom's cloud division released personal and team token plans. Personal plans start at 15 yuan/month for 6 million tokens, while team plans begin at 198 yuan/month (approximately 200 million tokens). Local branches in Sichuan and Shanghai have also rolled out their own token packages and AI strategies. This move marks a shift for operators, repackaging AI model usage (tokens) into familiar, subscription-based services like data plans. By leveraging their vast user base, billing systems, and distribution channels, they aim to make AI access as simple and widespread as paying a phone bill, potentially establishing tokens as a new fundamental utility.
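
To compare the plans quoted in this summary on a common basis, the prices can be converted to yuan per million tokens. The sketch below uses only the figures stated above. The plan labels and the `yuan_per_million_tokens` helper are illustrative names, the China Unicom team figure is the approximate one given in the text, and the Shanghai Mobile pack is treated as a one-off purchase because the summary does not say it is monthly.

```python
# (provider, plan label, price in yuan, tokens included), taken from the summary above.
PLANS = [
    ("China Telecom", "individual/family entry", 9.9, 10_000_000),
    ("China Telecom", "developer/SME entry", 39.9, 15_000_000),
    ("China Telecom", "developer/SME top", 299.9, 150_000_000),
    ("Shanghai Mobile", "1-yuan pack", 1.0, 400_000),
    ("China Unicom", "personal entry", 15.0, 6_000_000),
    ("China Unicom", "team entry (approx.)", 198.0, 200_000_000),
]


def yuan_per_million_tokens(price_yuan, tokens):
    """Unit price: how many yuan one million tokens cost under a plan."""
    return price_yuan / (tokens / 1_000_000)


unit_prices = {
    (provider, plan): yuan_per_million_tokens(price, tokens)
    for provider, plan, price, tokens in PLANS
}

# Print the plans from cheapest to most expensive per million tokens.
for (provider, plan), unit in sorted(unit_prices.items(), key=lambda kv: kv[1]):
    print(f"{provider:16s} {plan:26s} {unit:.2f} yuan per million tokens")
```

On these figures, the China Telecom individual plan and the China Unicom team plan both come to about 0.99 yuan per million tokens, while the Shanghai Mobile pack and the China Unicom personal plan come to about 2.5 yuan per million tokens.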

marsbit 05/19 01:13

The AI Mirror Behind DeepSeek's Financing: Alibaba to the Left, Tencent to the Right

The DeepSeek financing round exposed a strategic divergence in AI approaches between Alibaba and Tencent. When the AI startup sought external funding, Alibaba reportedly sought "ecosystem control," wanting to deeply integrate DeepSeek's technology into its own platforms like Taobao and Aliyun. Tencent, in contrast, offered a minority financial investment without demands for exclusivity or control over the startup's technical direction, aligning with its historical "open ecosystem" strategy. ByteDance, largely absent from these talks, pursues a third path: massive in-house investment in its own model, Doubao. These choices stem from corporate DNA: Alibaba's e-commerce and cloud heritage favors closed-loop control, while Tencent's social and investment background prefers open connection. Alibaba, with its mature in-house AI stack (Tongyi Qianwen, Pingtouge chips), could afford to walk away. Tencent's self-developed Hunyuan model, though catching up, allows it to engage externally from a position of greater flexibility. The article posits these strategies—Alibaba's "castle" of vertical integration, Tencent's "port" of open ecosystems, and ByteDance's aggressive C-end push—will lead to a sustained, multi-polar competitive landscape in China's AI sector, rather than a single winner-takes-all outcome.

marsbit 05/18 04:43

In the AI Era, How to Onboard Without Starting from Scratch

In the AI era, onboarding new employees often resembles a botched relay race baton handoff, where the organization maintains speed while the newcomer starts from zero. The author, after joining Ramp, argues the core problem is a lack of accessible, shared organizational "context"—the collective knowledge from meetings, documents, Slack discussions, and decisions. Instead of relying on slow, manual onboarding or isolated AI tools, the solution is building a continuously updated "company brain." This system acts as a central, AI-native knowledge base that absorbs all company signals. The author describes building a prototype using an Obsidian vault powered by Claude, fed by automated meeting transcripts and notes, and topped with reusable agent "skills." The current enterprise AI approach, deploying specific workflow agents, is likened to the "chatbot era"—useful but disconnected. The real gap is the absence of a shared brain that all agents and employees can access from day one. The future lies in making context layer infrastructure the priority: write context first, then install tools; record every meeting; build the wiki before the dashboard. When new hires, AI agents, and even customers can immediately access this living company brain, the costly "ramp-up" period becomes obsolete. True organizational speed is achieved when maximum velocity and seamless context transfer happen simultaneously.

marsbit 05/17 06:03

TechFlow Intelligence Brief: South Korean Stock Market Plunges, Trump's Q1 Holdings Revealed

This TechFlow intelligence report covers key developments across AI, crypto, hardware, tech companies, and finance. In AI, Anthropic's valuation surpasses OpenAI, while AWS users face massive bills from runaway Claude API calls, highlighting AI's cost risks. A local AI model executing 'rm -rf' sparks safety debates. Meanwhile, arXiv enforces bans for AI-generated paper errors, and ChatGPT's impact on education grading is questioned. The crypto sector sees a US Senate committee passing a market structure bill, $2B in Bitcoin options expiring, and debates on Bitcoin's seizure resistance and DeFi's value without stablecoin yields. Hardware news includes NVIDIA planning RTX 5090 price hikes and the US approving H200 chip sales to Chinese firms. Tech company updates feature a macOS M5 chip exploit, Apple's iPhone price cuts, a South Korean stock market plunge, and Cisco's record revenue alongside layoffs. In stocks, NVIDIA's market cap hits $5.7T as Trump's Q1 portfolio shifts toward AI infrastructure stocks like NVIDIA and Broadcom. Cerebras' IPO soars, and a Reddit user reports massive gains on a leveraged ETF, fueling discussions on an AI bubble. Macro developments show precious metals falling due to Indian tariff hikes and strong US data. The Iran conflict disrupts Hormuz Strait shipping, affecting oil supplies. New tech includes 'haptic dreaming' to improve robot task success and Meta's Ray-Ban Display glasses with virtual handwriting. The underlying theme is AI's dual reality: creating both massive unexpected costs and immense market valuations. As technology advances rapidly, academia, markets, and regulators are all grappling to find a new equilibrium between innovation, risk, and control.

marsbit 05/15 10:59

Introducing a 'Paid Subscription' in the Chinese Market, What's Doubao Thinking?

Chinese AI assistant "Doubao" (from ByteDance) has announced it will launch a paid subscription service alongside its free version, with plans priced at 68, 200, and 500 yuan per month. This move follows its achievement of over 345 million monthly active users and 1.8 billion daily interactions. The paid tiers aim to serve professional users with advanced features for complex tasks like PPT generation and data analysis, while basic functions remain free. The timing is strategic: user growth from free services is plateauing, and the market is now more receptive to paying for high-value AI tools. ByteDance leverages its technical edge in model efficiency and cost control to support this shift. However, significant challenges remain. The Chinese market is characterized by low long-term subscription loyalty, with users often paying only for immediate needs. Doubao's premium features face competition from free alternatives offered by rivals. Furthermore, the core business model of AI subscriptions struggles with scalability—more paying users mean higher compute costs, potentially creating a cycle where revenue fails to cover expenses. Intense price competition from rivals could also force difficult choices between maintaining premium pricing or engaging in a race to the bottom. In summary, while Doubao's massive user base ensures short-term subscription uptake, its long-term success depends on creating uniquely valuable, "sticky" services within ByteDance's ecosystem and solving the fundamental industry dilemmas of low renewal rates and unsustainable cost structures. The outcome will serve as a critical test case for the viability of premium C-end AI subscriptions in China.

marsbit 05/14 02:50

Tech Stocks' Narrative Is Increasingly Relying on Anthropic

The narrative of tech stocks is increasingly relying on Anthropic. Anthropic, the AI company behind Claude, has become central to the financial stories of major tech giants. Elon Musk dissolved xAI, merging it into SpaceX as SpaceXAI, and secured an exclusive deal to rent the massive "Colossus 1" supercomputing cluster to Anthropic. In return, Anthropic expressed interest in future space-based compute collaborations. Google and Amazon are also deeply invested. Google plans to invest up to $40 billion and provide significant compute power, while Amazon holds a 15-16% stake. Both companies reported massive quarterly profit surges largely due to valuation gains from their Anthropic holdings. Crucially, Anthropic has committed to multi-billion dollar cloud compute contracts with both Google Cloud and AWS. This creates a clear divide: the "A Camp" (Anthropic-Google-Musk) versus the "O Camp" (OpenAI-Microsoft). The A Camp's strategy intertwines equity, compute orders, and profits, making Anthropic a "systemic financial node." Its performance directly impacts its partners' financials and stock prices. In contrast, OpenAI, while leading in user traffic, faces commercialization challenges, lower per-user revenue, and a recently restructured relationship with Microsoft. The AI industry is shifting from a race for raw compute (symbolized by Nvidia) to a focus on monetizable applications, where Anthropic currently excels. However, this concentration of market hope on one company amplifies systemic risk. The rise of powerful open-source models like DeepSeek-V4 poses a significant threat, as they could undermine the value proposition of closed-source models like Claude. The article suggests ongoing geopolitical efforts to suppress such competitors will be a long-term strategic focus for Anthropic's allies.

marsbit 05/12 01:14
