# Related Articles on LLM

The HTX News Center provides the latest articles and in-depth analysis on the topic of "LLM", covering market trends, project updates, technological developments, and regulatory policy in the crypto industry.

AI PCs Are Here, Going Toe-to-Toe with 120B Models Locally! NVIDIA Redefines the "Personal AI Computer" Foundation with RTX Spark

NVIDIA has redefined the "AI PC" standard with the launch of the RTX Spark super chip at GTC 2026. Boasting 1 petaflop (1000 TOPS) of AI performance, it dwarfs the 45-50 TOPS NPUs in current AI PCs. The SoC features a Blackwell GPU, a 20-core Arm CPU co-designed with MediaTek, and crucially, up to 128GB of unified memory shared between CPU and GPU. This architectural shift enables local execution of 120-billion-parameter large language models with million-token context windows, a massive leap from the 9B-40B models typical on current consumer hardware. Beyond AI, use cases include 12K video editing and high-fps ray-traced gaming. Key to enterprise adoption is a security collaboration with Microsoft. Windows security is upgraded, and NVIDIA's OpenShell sandbox runtime is integrated to safely contain AI agent actions. Major software support comes from Adobe, which announced a deep, low-level rewrite of Photoshop and Premiere to leverage the unified memory for up to 2x performance gains. Six OEMs, including Dell, HP, Lenovo, and Microsoft Surface, will release RTX Spark-based thin-and-light laptops and compact desktops this fall. However, questions remain about real-world performance, power consumption, thermal management in laptops, pricing, and the actual impact of the OpenShell sandbox. The RTX Spark represents a fundamental power shift in the PC industry, moving from an x86 CPU-centric model to a GPU-centric SoC platform, but its ultimate success hinges on the upcoming product rollouts and ecosystem validation.
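To put the 120B-parameter and 128GB figures side by side, here is a rough back-of-the-envelope sketch of weight memory at a few precisions. The summary does not say which numeric precision RTX Spark systems would use, so the 16-, 8-, and 4-bit options are assumptions, and the estimate covers weights only (no KV cache, activations, or operating-system overhead).

```python
# Back-of-the-envelope weight memory for a 120B-parameter model.
PARAMS = 120_000_000_000      # 120 billion parameters (from the summary)
UNIFIED_MEMORY_GB = 128       # maximum unified memory (from the summary)


def weight_memory_gb(params: int, bits_per_param: int) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_param / 8 / 1e9


# Assumed precisions; the summary does not state which one is used.
footprints = {bits: weight_memory_gb(PARAMS, bits) for bits in (16, 8, 4)}
fits = {bits: gb <= UNIFIED_MEMORY_GB for bits, gb in footprints.items()}

for bits, gb in footprints.items():
    print(f"{bits:>2}-bit weights: {gb:6.1f} GB, fits in 128 GB: {fits[bits]}")
```

Under these assumptions, 16-bit weights (240 GB) would not fit, 8-bit weights (120 GB) would leave very little headroom, and 4-bit weights (60 GB) would fit comfortably, which suggests some form of quantization is involved, though the summary does not confirm it.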

marsbit, 10 min ago

Running MoE on Mobile Phones? Meta Proposes MobileMoE, Speeding Up iPhone 16 Pro by 3.8x

Meta's MobileMoE, a mobile-optimized Mixture-of-Experts (MoE) language model architecture, enables efficient on-device large language model (LLM) inference for the first time on commercial smartphones. Designed for decoder-only Transformers, it replaces dense feed-forward layers with MoE layers. Key design choices include 8 experts with granularity g=8, top-4 routing, and a shared expert. The model undergoes a four-stage training process: pre-training, intermediate training, supervised fine-tuning, and quantization-aware training. Results show MobileMoE models, with similar memory footprint, achieve equal or higher average accuracy across 14 foundational benchmarks while using only 1/2 to 1/4 of the FLOPs compared to dense baselines. After INT4 quantization, they remain competitive. Notably, on an iPhone 16 Pro, MobileMoE-S demonstrates significant speedups: up to 3.8x faster in the prompt phase and 2.2-3.4x faster in per-token generation compared to a dense counterpart, with lower peak memory usage. While MobileMoE establishes a new Pareto frontier for on-device LLMs in accuracy-compute trade-offs, particularly excelling in code and math tasks, it currently lags behind models like Qwen3.5 2B in advanced instruction following and knowledge reasoning. Future work includes improving post-training techniques, exploring NPU deployment, and managing the runtime memory sensitivity of MoE models to varying inputs.
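The routing scheme described above (8 experts, top-4 routing, plus a shared expert) can be illustrated with a minimal pure-Python sketch. This is not Meta's implementation: the softmax-then-renormalize gating, the use of plain linear maps as "experts", the tiny dimension `DIM`, and all function names are assumptions chosen for illustration.

```python
import math
import random

NUM_EXPERTS = 8   # from the summary
TOP_K = 4         # from the summary
DIM = 4           # hypothetical toy hidden size


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def make_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def route(router_weights, token, top_k=TOP_K):
    """Return [(expert_index, gate_weight)] for the top_k highest-scoring experts."""
    probs = softmax(matvec(router_weights, token))
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    # Assumption: gate weights are renormalized over the selected experts.
    return [(i, probs[i] / norm) for i in chosen]


def moe_layer(token, router_weights, experts, shared_expert, top_k=TOP_K):
    # The shared expert processes every token, regardless of routing.
    out = matvec(shared_expert, token)
    # Only the top_k routed experts are evaluated; the rest are skipped.
    for idx, gate in route(router_weights, token, top_k):
        expert_out = matvec(experts[idx], token)
        out = [o + gate * e for o, e in zip(out, expert_out)]
    return out


rng = random.Random(0)
router_weights = make_matrix(NUM_EXPERTS, DIM, rng)
experts = [make_matrix(DIM, DIM, rng) for _ in range(NUM_EXPERTS)]
shared_expert = make_matrix(DIM, DIM, rng)
token = [0.1, -0.2, 0.3, 0.4]

selected = route(router_weights, token)
output = moe_layer(token, router_weights, experts, shared_expert)
print("routed experts:", [i for i, _ in selected])
```

Because only 4 of the 8 routed experts run for each token, the per-token compute is lower than evaluating every expert, which is the general mechanism behind the FLOP savings the summary reports.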

marsbit, 43 min ago

Three Years Later: Looking Back at My Predictions About ChatGPT in 2023

In March 2023, shortly after ChatGPT's launch, I made 20 predictions about its future. Now, in mid-2026, I've used AI agents to fact-check each one against the latest data. Overall, most major directional forecasts were correct, with only one outright error (incorrectly stating GPT-4 had 100 trillion parameters). Key successes included predicting that RAG and retrieval architectures would become the standard for handling knowledge and hallucinations, that natural language interfaces (LUI) would create a massive new industry layer beyond the models themselves, and that China would develop viable large language models, significantly closing the performance gap with Western counterparts within about three years. Predictions about the absence of mass unemployment, the rise of a new "robot network" for agent communication, and ChatGPT not possessing consciousness also held true in their core arguments. However, the "devil was in the details." Errors frequently involved specific numbers, timelines, or overlooking distributional effects. I tended to overestimate the speed of adoption (e.g., for agent networks) while underestimating the ultimate scale of capabilities or costs (e.g., AI winning IMO gold without tools, or the extreme capital required for frontier models). Other misjudgments included: underestimating how AI would reinforce, not dissolve, information filter bubbles; incorrectly assuming AI-generated content would easily circumvent copyright (it has instead triggered record-breaking settlements); and misidentifying where value would be captured (it accrued overwhelmingly to the compute layer, like Nvidia, not just the application or model layers). Key lessons from reviewing these predictions are: 1) Directional and mechanistic insights are far more reliable than precise numbers or absolute statements. 2) There's a consistent bias to overestimate short-term speed but underestimate long-term magnitude. 3) Errors often lie in missing distributional impacts within a generally correct aggregate trend. 4) Predictions phrased with nuance and caveats aged the best. 5) Some fundamental debates (e.g., on machine consciousness or the ultimate value chain) remain unresolved even after three years. This exercise is less about scoring the past and more about establishing rules for clearer thinking about the next three years of AI.

marsbit, 14 hr ago

Three Years Later: Looking Back on My 2023 Predictions for ChatGPT

In March 2023, shortly after ChatGPT's debut and before GPT-4's release, I made over twenty predictions about AI's future based on limited information and intuition. Now, in May 2026, I revisited those forecasts using an AI-driven analysis with 41 Opus 4.8 agents to cross-reference them with the latest data. The assessment used symbols: ✅ Correct, 🟢 Mostly Correct, 🟡 Partially Correct, ❌ Incorrect. Overall, the directional judgments held up well, with only one major factual error regarding GPT-4's rumored parameter size (incorrectly cited as 100T). However, nuances and degrees of accuracy revealed more. **What Was Largely Correct:** Predictions about mechanisms and directions proved accurate. The rise of RAG (Retrieval-Augmented Generation) as the standard architecture for combating AI hallucination was confirmed, as was the transformative potential of LUI (Language User Interface) in creating a new industry layer atop GUIs. The emergence of "robot networks" (agent-to-agent communication protocols) and China's rapid catch-up in developing capable large models (closing the performance gap with top models to ~2.7%) were also on point. The analysis affirmed that LLMs lack consciousness and that the Turing Test merely measures perceived intelligence. **What Was Off Target:** Errors often involved specific numbers, over-optimistic timelines, or misjudged distributions. The prediction that value would primarily accrue to the application layer was half-right but missed NVIDIA's dominance as the profitable infrastructure layer. Forecasts about AI circumventing copyright issues and fostering a "global common ground" by averaging human viewpoints were incorrect; instead, major copyright settlements occurred and AI personalization is increasing. Estimates for model training costs ("$5-10 billion cap") were significantly off, underestimating frontier costs and overestimating replication costs. The notion that LLMs could never do complex math without tools was disproven by later models winning IMO gold. **Key Patterns from the Review:** 1. **Direction over precision:** Judgments about mechanisms and trends were more reliable than specific numbers or definitive statements. 2. **Timing bias:** There was a tendency to overestimate short-term speed but underestimate long-term magnitude and transformation. 3. **The distribution blind spot:** Aggregate-level correctness often masked uneven impacts (e.g., on young professionals' employment). 4. **The value of qualifiers:** Predictions framed with caution (e.g., "reportedly," "for now," "prototype in 2-3 years") aged better. 5. **Some debates continue:** Issues like the nature of "emergent abilities" or machine consciousness remain unresolved. This three-year review highlights that while seeing the big picture is crucial, humility regarding specifics, timelines, and disparate impacts is essential for future forecasting.

链捕手 (ChainCatcher), 17 hr ago

6 Questions to Understand the Business Trends of AI

The AI industry has entered its "summer" phase, according to a six-dimensional scoring framework assessing its development cycle. Each dimension—narrative vs. delivery, system connectivity, delivery capability, ROI rationalization, common industry trends, and capital environment—scores 1 point, totaling 6 points. This places the industry firmly in summer (5-7 points), characterized by a coexistence of grand promises and tangible deliverables, with increasing pressure to demonstrate value and profitability. Key signals mark this shift. ByteDance's Doubao launched paid subscriptions, while OpenAI introduced an advertising platform. These moves are driven by dual forces: immense cost pressures from scaling user bases and massive compute requirements, and the maturation of commercial opportunities. Major players like Anthropic report explosive growth, highlighting AI's transition into core productivity infrastructure. For businesses, the path forward involves three strategic steps. First, identify a small, high-impact use case to quickly demonstrate a closed-loop value proposition, such as automating customer service or content generation. Second, systematically replicate successful pilots across the organization by standardizing processes, building shared AI capabilities, and aligning talent, incentives, and leadership. Finally, move beyond simply adding AI to existing workflows and undertake systemic reconstruction—redesigning processes for parallel AI-human collaboration, implementing real-time dashboards, and establishing automated trigger chains. The era where storytelling alone secured funding is over. The focus has shifted to delivering measurable efficiency gains, cost savings, and new revenue streams, as evidenced by real-world implementations in companies like Semir, Anta, and Midea. Success now depends on starting with a focused proof point, scaling it organization-wide, and ultimately allowing AI to redefine operational paradigms.

marsbit, yesterday 00:21

Shanghai's Leading Large Model Company Initiates A-Share Listing

Shanghai-based AI large language model leader MiniMax has initiated the process for an A-share listing in China, having filed a pre-IPO tutoring report with the Shanghai Securities Regulatory Bureau on May 29. This move positions it to compete with Zhipu AI for the title of the first major domestic LLM company to list on the A-share market. Having already completed an IPO in Hong Kong in January 2026, MiniMax's stock price has surged approximately 409% since its debut, with its market capitalization reaching around HK$263.45 billion (approximately RMB 227.55 billion) as of May 29. The company's rapid growth is supported by strong business performance. Its Annual Recurring Revenue (ARR) has grown over 100% in the past two months and now exceeds $300 million. It serves over one million global enterprise and developer clients and has around 300 million users worldwide. For the full year 2025, MiniMax reported revenue of $79.038 million, with a gross margin of 25.4%. While it reported an adjusted net loss of $250 million, the loss rate has narrowed significantly year-over-year. On the product front, MiniMax has released several flagship models this year, including MiniMax-M2.5, M2.6, and M2.7, with the first and last being open-sourced. Its models gained significant traction earlier in the year, briefly becoming the top model provider by usage share on the OpenRouter platform in February. The company has also upgraded its AI agent product, now named Mavis, and is preparing to launch its next-generation MiniMax-M3 model. Technical previews indicate M3 will feature a novel "MiniMax Sparse Attention" mechanism, promising substantial improvements in inference speed. MiniMax's push for an A-share listing reflects a broader trend among China's leading AI firms, including Zhipu AI, Moonshot AI, StepFun, and 01.AI, to seek public listings. This strategy aims to secure broader financing channels to support the immense computational costs and ongoing commercialization efforts inherent in developing advanced large language models.

marsbit, 2 days ago 02:45

Will the US AI Bull Market Crash?

Will the U.S. AI bull market collapse? SoftBank has invested $34.6 billion in OpenAI, with Masayoshi Son selling stakes in Nvidia, Deutsche Telekom, Alibaba, and T-Mobile to fund it. He plans to invest another $30 billion this year, raising his stake to 13%, even taking on debt. This frenzy is driven by OpenAI's valuation surging to $852 billion in February, generating over $45 billion in paper gains for SoftBank. Similarly, Anthropic is reportedly negotiating funding at a $900 billion valuation, up from $61.5 billion a year ago. The article draws a parallel to the dot-com bubble, comparing OpenAI and Anthropic to Yahoo. Back then, Yahoo's portal model seemed unassailable, but it was disrupted by more targeted services. Today, the core assumption is that all AI applications must rely on foundational models like OpenAI and Anthropic, making them permanent "toll booths" of the AI era. However, as AI becomes a ubiquitous utility, this "model-as-gateway" advantage may erode. Financially, to justify trillion-dollar valuations with high P/E ratios (30-40x), these companies would need annual net profits of $25-30 billion, implying revenues of $50-80 billion. Current metrics like Annual Recurring Revenue (ARR), at $25 billion for OpenAI and $30 billion for Anthropic, are based on monthly subscription extrapolations and include promotional, less-sticky API usage. Aggressive price cuts on tokens to capture market share further squeeze margins. A critical risk is that the entire AI industry's profitability depends on downstream applications generating substantial revenue. Currently, besides some coding and content assistance, no "killer app" has emerged to create massive new markets. If enterprises pause AI spending due to performance plateaus, economic downturns, or poor ROI, the foundation for these valuations could crumble. Two potential outcomes are outlined: 1) A Yahoo-style crash where valuations collapse, companies downsize, and AI becomes a low-margin utility business. 2) A successful reinvention where companies find sustainable monetization, perhaps by replacing SaaS or achieving AGI. However, the market's impatience could trigger a downturn before such a breakthrough. The article concludes that while AI will undoubtedly transform society as a fundamental infrastructure, the current speculative frenzy mirrors past bubbles. A correction wouldn't mean the end of AI but could remove financial hype, leading to more grounded integration into industries. The rapid rise warrants caution, as a collapse in trillion-dollar valuations could cause significant economic damage, surpassing the fallout from the dot-com bust.
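The valuation arithmetic in the summary can be checked with a short sketch. It assumes "trillion-dollar" means exactly $1 trillion and uses the standard relation valuation = P/E × annual net profit; the function names are illustrative, and the figures are the summary's, not independent estimates.

```python
VALUATION = 1_000_000_000_000  # assumption: "trillion-dollar" taken as exactly $1T


def implied_net_profit(valuation: float, pe_ratio: float) -> float:
    """Annual net profit needed to support a valuation at a given P/E ratio."""
    return valuation / pe_ratio


def net_margin(profit: float, revenue: float) -> float:
    """Net profit as a fraction of revenue."""
    return profit / revenue


profit_at_40x = implied_net_profit(VALUATION, 40)  # lower bound on required profit
profit_at_30x = implied_net_profit(VALUATION, 30)  # upper bound on required profit

# Net margins implied by pairing the summary's profit range ($25-30B)
# with its revenue range ($50-80B).
lowest_margin = net_margin(25e9, 80e9)
highest_margin = net_margin(30e9, 50e9)

print(f"Required profit: ${profit_at_40x / 1e9:.1f}B to ${profit_at_30x / 1e9:.1f}B")
print(f"Implied net margin: {lowest_margin:.0%} to {highest_margin:.0%}")
```

Under these assumptions, a $1 trillion valuation needs roughly $25 billion (at 40x) to $33 billion (at 30x) in annual net profit, broadly in line with the article's $25-30 billion range. Pairing that profit range with $50-80 billion in revenue implies net margins somewhere between about 31% and 60%.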

marsbit, 2 days ago 09:11

Large Language Models Ace All Exams, Yet Move Farther from AGI: What Does This Paper Reveal?

The article discusses the ongoing challenge of defining and achieving Artificial General Intelligence (AGI). It notes that industry leaders have set vague, often profit- or time-based benchmarks for AGI, while the concept itself lacks a consensus definition—a situation the article compares to a "Rorschach test." It highlights a recent 2025 paper by researcher Michael Timothy Bennett, who proposes a new, measurable definition. Bennett frames AGI not as mimicking human performance on tests, which current large language models (LLMs) have already mastered, but as an "artificial scientist." A true AGI, according to this view, should be able to widely and efficiently adapt to new environments and tasks within real-world constraints (like computational and energy limits), focusing on the *discovery of new knowledge* rather than the replication of existing data. The author contrasts this with the current dominant approach of "scale-maxing"—massively scaling up data, parameters, and compute. While powerful, this method leads to models that fail on out-of-distribution problems and lack core intelligent abilities: they are passive learners, cannot reason causally, and cannot actively experiment or balance exploration with exploitation. The article argues that Bennett's framework offers a crucial shift. It makes AGI a quantifiable engineering problem and proposes new evaluation "adaptation benchmarks" that test an AI's ability to actively learn in novel scenarios. The conclusion is that achieving AGI will require a fundamental reset—a fusion of multiple methodologies beyond simple scaling, moving AI from mimicking patterns to embodying the scientific spirit of inquiry and discovery.

marsbit, 05/28 00:24
