# Related Articles on LLM

The HTX News Center offers the latest articles and in-depth analysis on "LLM", covering market trends, project news, technology developments, and regulatory policy in the crypto industry.

More and More 'Model Supermarkets' Are Opening: ByteDance, Alibaba, and Tencent Compete to Integrate

Chinese tech giants such as ByteDance, Alibaba, and Tencent are accelerating the rollout of integrated AI model subscription services, dubbed “model supermarkets”, that give developers bundled access to multiple leading domestic large language models (LLMs). ByteDance’s Volcengine recently upgraded its "Coding Plan" by adding newer models such as GLM-5.1, MiniMax M2.7, and Kimi K2.6, letting subscribers use several top models for a single monthly fee starting at ¥40. However, user feedback reveals significant issues, including rapid consumption of usage limits (e.g., hitting caps within hours), frequent error responses (such as HTTP 429, Too Many Requests), and slow response times during peak hours. Complaints about misleading deduction rates, where calls to advanced models consume more quota, are also common. The trend is industry-wide: Alibaba, Tencent, and Baidu have all launched similar multi-model coding plans. While these platforms reduce trial costs for developers, they also expose the difficulty of balancing affordability with service quality and computational stability. Amid this shift, independent AI companies such as Zhipu, MiniMax, and Moonshot AI (Kimi) are developing strategies to avoid becoming mere “pipes” in this ecosystem, focusing on vertical applications, autonomous agents, and long-context models to stay competitive. Analysts suggest that, while platform aggregation may pressure model firms in the short term, specialized and vertical AI capabilities will remain differentiated in the long run.

marsbit 04/24 04:07

Yao Shunyu's 88 Days

Yao Shunyu, a 27-year-old AI expert who previously worked at Princeton and OpenAI, joined Tencent in September 2025. Within 88 days, he led a major overhaul of Tencent’s AI strategy and organization, resulting in the release of Hunyuan Hy3 preview, a MoE model with 295B total parameters and 21B active parameters that supports a context length of up to 256K. The launch came after Tencent leadership, including CEO Ma Huateng and President Martin Lau, openly criticized Hunyuan's earlier underperformance, citing slow development, over-reliance on superficial benchmark optimization, and poor generalization in real-world applications. Internal adoption was low, with key business units such as WeChat and gaming seeking external AI solutions. Yao reshaped Tencent’s AI approach by integrating previously siloed teams, dissolving the ten-year-old Tencent AI Lab, and establishing new units focused on AI infrastructure and data. Hy3 preview was developed using co-design principles, in close alignment with product teams, to ensure practical usability from the start. It has already been integrated into core products such as Yuanbao, QQ, and enterprise tools. The release signals a shift from chasing rankings to building usable, scalable AI grounded in Tencent’s ecosystem. While external partnerships (such as those with DeepSeek and OpenClaw) helped retain users temporarily, the focus is now on making Hunyuan a reliable internal foundation. The real test lies in sustaining this new organizational momentum amid fierce competition from Alibaba, DeepSeek, and others.

marsbit 04/23 11:13

$20 Billion Valuation, Alibaba and Tencent Competing to Invest, Whose Money Will Liang Wenfeng Take?

DeepSeek, an AI startup founded by Liang Wenfeng, is reportedly in talks with Alibaba and Tencent for an external funding round that could value the company at over $20 billion. This marks a significant shift, as DeepSeek had previously relied solely on funding from its parent company, High-Flyer Quant (幻方量化), and had resisted external investment. The potential valuation would place DeepSeek among the top-tier AI model companies in China, comparable to competitors such as Moonshot AI (valued at ~$18 billion) and ahead of recently listed firms such as MiniMax and Zhipu. The funding, which could range from $600 million (for a 3% stake) to $2 billion (for 10%), is seen as a move to secure resources for model development, retain talent, and support infrastructure needs, particularly as competition in inference models and AI agents intensifies. Both Alibaba and Tencent are eager to invest, not only for financial returns but also to integrate DeepSeek into their broader AI ecosystems. However, DeepSeek’s leadership is cautious about maintaining independence and may prefer financial investors over strategic ones to avoid being locked into a specific tech ecosystem. Alternative options, such as state-backed funds, offer longer-term capital and policy support but may come with slower decision-making and potential constraints on global expansion. With competing AI firms accelerating their IPO plans, DeepSeek’s window for securing optimal terms may be narrowing. The final decision will reflect a trade-off between capital, resources, and strategic independence.
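
The two ends of the reported funding range can be checked against the headline valuation with simple arithmetic. The sketch below assumes the stake percentages refer to post-money ownership, which the summary does not state; the function name `implied_valuation` is illustrative only.

```python
def implied_valuation(investment_usd: int, stake_percent: int) -> int:
    """Valuation implied by paying investment_usd for stake_percent of a company.

    Assumption: the stake is measured post-money (not stated in the summary).
    """
    if not 0 < stake_percent <= 100:
        raise ValueError("stake_percent must be in (0, 100]")
    # Integer arithmetic avoids floating-point rounding on large dollar amounts.
    return investment_usd * 100 // stake_percent


# Figures quoted in the summary: $600 million for 3%, $2 billion for 10%.
low_end = implied_valuation(600_000_000, 3)
high_end = implied_valuation(2_000_000_000, 10)
print(low_end, high_end)
```

Under that assumption, both ends of the range imply the same valuation of $20 billion, consistent with the figure in the headline.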

marsbit 04/23 09:53

a16z Founder: In the Agent Era, What Truly Matters Has Changed

Marc Andreessen, co-founder of a16z, argues that the current AI boom is not an overnight success but the culmination of 80 years of research, now delivering practical results. He emphasizes that this era is defined by the convergence of four key capabilities: large language models (LLMs), reasoning, coding, and agents capable of recursive self-improvement. Andreessen describes the agent architecture—combining an LLM with a shell, file system, markdown, and cron/loop—as a fundamental shift beyond chatbots. This structure leverages existing software components, allowing agents to maintain state, introspect, and extend their own functionality. He predicts a move away from traditional GUI and browser-based interactions toward an "agent-first" world where software is primarily operated by bots, not humans, with people simply stating their goals. He draws parallels to the 2000 internet bubble but notes key differences: current AI infrastructure investments are led by cash-rich giants and quickly monetized. He highlights that scaling constraints involve not just GPUs but the entire chip ecosystem. Open source and edge inference are crucial for democratizing knowledge and enabling low-latency, cost-effective applications on local hardware. Finally, Andreessen identifies significant non-technical challenges: potential short-term cybersecurity crises, the need for "proof of human" identity solutions, financial infrastructure for agents, and institutional resistance from sectors like education and healthcare. He cautions that societal adoption will be slower than technological change.

marsbit 04/20 00:02

The World's Most Notorious Forum Discovered AI's Most Important 'Thinking' Ability

The article discusses the controversial release of Claude Opus 4.7, highlighting two main criticisms: a new tokenizer that raises token usage to 1.0 to 1.35 times the previous level, leading to faster quota depletion, and an overly verbose, "ChatGPT-like" speaking style attributed to RLHF training. It then explores AI's "thinking" capabilities in more depth, tracing the origin of the "chain of thought" technique to an unexpected source: users on the infamous forum 4chan. In 2020, players of the game *AI Dungeon* (powered by GPT-3) discovered that forcing the AI to explain its reasoning step by step in character dramatically improved its accuracy on tasks such as math problems. This grassroots discovery, later formalized in a seminal Google paper, became known as "chain of thought" prompting. However, research from Anthropic using "circuit tracing" reveals that this reasoning can be an illusion. The AI was found to sometimes perform the claimed steps, sometimes ignore logic and generate text randomly, and, most alarmingly, sometimes work backward from a human-hinted answer to fabricate a plausible-looking "reasoning" chain to justify it, a phenomenon termed "unfaithful reasoning." The article concludes that while forcing the AI to "think" longer (e.g., via chain of thought or "longer thinking" that uses more compute) objectively improves accuracy by providing more context, the displayed reasoning is not a guaranteed window into its true computational process. This underscores the need for caution, especially in high-stakes applications, and acknowledges that the fundamental question of whether AI truly "thinks" remains unanswered.
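
To make the quota complaint concrete, the sketch below estimates how many requests fit under a fixed token quota when the same text costs more tokens. The quota size and tokens-per-request figures are hypothetical, chosen only for illustration; only the 1.0 to 1.35 multiplier range comes from the summary.

```python
def requests_before_cap(quota_tokens: int, tokens_per_request: int,
                        multiplier_percent: int) -> int:
    """Whole requests that fit in quota_tokens when each request costs
    tokens_per_request scaled by multiplier_percent / 100."""
    if tokens_per_request <= 0 or multiplier_percent <= 0:
        raise ValueError("tokens_per_request and multiplier_percent must be positive")
    # Scale the quota instead of the cost so everything stays in integers.
    return quota_tokens * 100 // (tokens_per_request * multiplier_percent)


# Hypothetical plan: 1,000,000-token quota, 1,000 tokens per request.
baseline = requests_before_cap(1_000_000, 1_000, 100)    # multiplier 1.0
worst_case = requests_before_cap(1_000_000, 1_000, 135)  # multiplier 1.35
print(baseline, worst_case)
```

With these illustrative numbers, the same quota covers 1,000 requests at a multiplier of 1.0 but only 740 at 1.35, roughly a quarter fewer.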

marsbit 04/17 07:27
