# LLM Related Articles

The HTX News Center provides the latest articles and the most in-depth analysis on "LLM", covering market trends, project updates, technological developments, and regulatory policies in the crypto sector.

Claude Code Slashes 80% of Prompt Tokens, But Opus 5 Just Adds Them Right Back In

Claude Code, Anthropic's AI coding assistant, recently announced a reduction of more than 80% in its system prompt content for models such as Opus 5 and Fable 5. The goal was to remove verbose, often conflicting rules (such as strict commenting and documentation requirements) and replace them with a simpler directive: write code that matches the style of the surrounding project. This "pruning" aims to make the model more efficient by reducing internal conflict between overlapping instructions, with no measurable performance drop reported. However, an investigation by developer @chenchengpro revealed a twist. The prompt was cut drastically, from 15,225 characters in Opus 4.7 to 4,467 in Opus 4.8, but then *increased* by approximately 72% to 7,694 characters in Opus 5. This isn't a contradiction. The "over 80% cut" refers to the overall shift from the old, detailed rulebook-style prompts to a new, streamlined system. The 72% increase for Opus 5 represents new, targeted instructions added to manage the model's enhanced capabilities. Opus 5 is more proactive: it likes to report progress, generate longer outputs, use sub-agents, and expand task scope. The added prompt content (roughly 3,755 characters) primarily provides guidelines for "Delivering work" (controlling task scope and progress reporting) and "Corrections" (limiting excessive self-correction). These new rules are meant to curb over-engineering on simple tasks, preserving efficiency even as the model becomes more independent. In short, the old, restrictive manual was deleted, but new guidelines were written to harness the model's newfound initiative.
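The percentages in this summary can be checked directly from the three character counts it quotes. The sketch below is only that arithmetic; the helper name `percent_change` is an illustrative choice, not something from the article. By these counts the 4.7 to 4.8 cut works out to roughly 71%, so the "over 80%" headline figure presumably rests on a different baseline than these three numbers. The net growth from 4.8 to 5 is also smaller than the "roughly 3,755 characters" of added content, which would be consistent with some text being removed at the same time, though the source does not say so.

```python
def percent_change(old: int, new: int) -> float:
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100


# System prompt sizes in characters, as quoted in the summary.
prompt_chars = {"Opus 4.7": 15_225, "Opus 4.8": 4_467, "Opus 5": 7_694}

cut_47_to_48 = percent_change(prompt_chars["Opus 4.7"], prompt_chars["Opus 4.8"])
growth_48_to_5 = percent_change(prompt_chars["Opus 4.8"], prompt_chars["Opus 5"])
net_47_to_5 = percent_change(prompt_chars["Opus 4.7"], prompt_chars["Opus 5"])

# Net character growth between Opus 4.8 and Opus 5.
net_added_chars = prompt_chars["Opus 5"] - prompt_chars["Opus 4.8"]

print(f"Opus 4.7 -> 4.8: {cut_47_to_48:+.1f}%")    # -70.7%
print(f"Opus 4.8 -> 5:   {growth_48_to_5:+.1f}%")  # +72.2%
print(f"Opus 4.7 -> 5:   {net_47_to_5:+.1f}%")     # -49.5%
print(f"Net characters added in Opus 5: {net_added_chars}")  # 3227
```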

marsbit · Yesterday 11:37

One-Third of arXiv 'Contaminated', 65% of CS Papers Smell of AI, Only 0.7% in Math

Approximately one-third of recently posted arXiv papers show signs of significant AI-generated text, according to a new study. An analysis of 12,750 papers from January 2023 to July 2026 across ten disciplines found a sharp increase in AI text markers following ChatGPT's release, with the overall detection rate reaching 32% in the latest quarter and peaking near 39% in early 2026. The rate varies drastically by field. Computer Science papers lead at 65%, followed by Quantitative Biology (56.3%) and Electrical Engineering (51.3%). Mathematics, however, has the lowest detection rate at just 0.7%. The study's authors note this could be due to mathematicians using AI less or because the detector struggles with the high volume of formulas and symbolic notation in math papers, leaving the true cause unclear. The research highlights that the detector identifies a statistical "AI style" in the text rather than proving full AI authorship. It cannot distinguish between light AI-assisted editing and fully AI-generated content. Furthermore, the detector can produce false positives, as some pre-ChatGPT academic writing also exhibits patterns now flagged as "AI-like." The growing use of AI, particularly in highly competitive fields, is creating a cycle where researchers may feel pressured to adopt AI tools to keep pace. The findings raise questions about the changing nature of academic writing and the emergence of a new "AI style" that is increasingly difficult to distinguish from human prose, potentially undermining trust in written text regardless of its true origin.

marsbit · Yesterday 11:35

2 Months, Valuation Soars from $1.3B to Nearly $10B! The Largest AI Model Hub OpenRouter May Be Acquired

Stripe is reportedly in talks to acquire AI model marketplace OpenRouter for a price nearing $10 billion, a dramatic increase from its $1.3 billion valuation just two months prior. The deal, which could be announced within a month, would see the payment giant absorb a key "router" or aggregation layer in the AI infrastructure stack. OpenRouter provides developers with a single API to access over 400 large language models (LLMs), automatically routing queries to the most suitable model based on cost, capability, and speed. This allows AI applications to optimize expenses while maintaining user experience. Founded in 2023 by ex-OpenSea co-founder Alex Atallah and Louis Vichy, OpenRouter has grown rapidly, reaching $50 million in annualized revenue by April and serving over one million developers. For Stripe, the acquisition of OpenRouter follows its late-2025 purchase of usage-based billing platform Metronome. The combined strategy aims to create an integrated suite for the AI economy: OpenRouter would handle model selection and routing, Metronome would manage granular usage-based billing, and Stripe's core platform would process payments. This positions Stripe to control a critical part of the AI application value chain, influencing which models get used while simplifying cost management for enterprise customers.
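The routing idea described here, choosing a model per query based on cost, capability, and speed, can be sketched as a constrained selection. This is a toy illustration only, not OpenRouter's actual algorithm or API: the `ModelOption` type, the `route` function, and every model name, price, score, and latency in the catalog are made-up placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    usd_per_million_tokens: float  # hypothetical price
    capability: float              # hypothetical quality score in [0, 1]
    latency_ms: float              # hypothetical typical latency


def route(options, min_capability: float, max_latency_ms: float) -> ModelOption:
    """Return the cheapest option that meets the capability and latency limits."""
    eligible = [
        m for m in options
        if m.capability >= min_capability and m.latency_ms <= max_latency_ms
    ]
    if not eligible:
        raise ValueError("no model satisfies the constraints")
    return min(eligible, key=lambda m: m.usd_per_million_tokens)


# Placeholder catalog; none of these entries describe real models.
CATALOG = [
    ModelOption("small-fast", 0.30, 0.55, 200),
    ModelOption("mid-tier", 2.00, 0.75, 600),
    ModelOption("frontier", 15.00, 0.95, 1500),
]

# A simple query tolerates a weaker model; a hard one forces the expensive tier.
chosen_simple = route(CATALOG, min_capability=0.5, max_latency_ms=1000)
chosen_hard = route(CATALOG, min_capability=0.9, max_latency_ms=2000)
print(chosen_simple.name, chosen_hard.name)  # small-fast frontier
```

Filtering on hard constraints first and then minimizing cost is one simple policy; a production router would likely weigh these factors more flexibly.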

链捕手 (ChainCatcher) · 07/24 08:59

Claude Opus 5 Leaks, First Wave of User Tests Arrive

Claude Opus 5 appears to have been leaked, with early test results circulating online. Users report generating highly detailed 3D scenes, such as a catapult attacking a castle with intricate parameter cards, dynamic weather interfaces, and realistic kitchen renders. Comparisons show Opus 5 outperforming the current Fable 5 in detail density for similar prompts. Other demos include impressive Minecraft recreations and detailed SVG graphics. Evidence of the leak includes sightings in Cursor's model selector (under the code "Honeycomb EAP") and Google Vertex AI, followed by reports of the model appearing for some users, though officially labeled as version 4.8. Speculation is growing that Opus 5 could match Fable 5's capabilities at half the price per token, though concerns exist about potentially higher token consumption. The model's formal release seems imminent.

marsbit · 07/24 07:53

As Consensus Accelerates, What Are Young Investors Betting On?

In the rapid evolution of tech investment, a new generation of young investors is navigating a landscape where AI, robotics, commercial aerospace, and quantum computing are advancing simultaneously. Traditional investment logic based on financial models is giving way to a need for deep technical understanding and the ability to act before industry consensus forms. An analysis of trends from the "WAIC FUTURE TECH" list of young investment leaders reveals key shifts in focus. The first major trend is the movement of AI from the digital screen into the physical world. Investment is shifting from large language models and chatbots towards embodied AI, robotics, AI hardware, and edge computing. While demonstrations generate excitement, the real challenge lies in achieving scalable, reliable, and cost-effective delivery in complex real-world environments like factories and logistics. Success depends not just on algorithms but on the integration of sensors, actuators, and control systems. Second, the competitive focus for large models is moving beyond raw capability toward building an "intelligence flywheel." The goal is to create self-reinforcing systems where user interaction generates data that improves the model, which in turn enhances the user experience and attracts more engagement. Companies that successfully embed AI into workflows to create these closed-loop systems can build lasting value that isn't easily erased by the next model upgrade. Third, facing a potential bottleneck in high-quality human-generated data, investors are looking at new underlying technologies. Reinforcement learning and self-play, as demonstrated by AlphaGo Zero, offer paths for AI to generate its own experience. Scientific foundation models, which aim to build general AI capabilities for fields like life sciences and materials discovery, represent a non-consensus direction that could unlock new frontiers of knowledge and data. Finally, in deep-tech areas like quantum computing, commercial aerospace, and space-based infrastructure, patient capital is essential. These fields have long, uncertain development and validation cycles involving complex engineering, supply chains, and regulations. Investment here requires a long-term view, focusing on foundational team capabilities and the eventual emergence of market demand, even if commercial returns are distant. Collectively, these trends illustrate how young investors are adapting to a new era. They are learning to make earlier, technically informed judgments, balance hype with real-world viability, and provide the patient capital needed to build the deep-tech foundations of the future.

marsbit · 07/22 03:34

A Repeat of the "DeepSeek Moment"? Wall Street Agrees: Kimi K3 Strengthens Compute Demand Instead

Following the release of Moonshot AI's powerful open-source model Kimi K3, the initial market reaction mirrored the "DeepSeek moment" that sparked a sell-off in compute stocks in early 2025, as investors feared reduced demand for AI infrastructure. However, major Wall Street banks including UBS, Nomura, BofA, and Citi argue the opposite: K3 will accelerate, not weaken, demand for compute, memory, storage, and networking. Their analysis centers on K3's specifications (2.8 trillion parameters, a 1M-token context window, and an MoE architecture), which represent a "scale" story rather than a pure "efficiency" one like DeepSeek R1. These features increase pressure on inference, memory (especially the KV cache), and storage. Analysts invoke the Jevons paradox: as high-quality models become more affordable (K3 is cheaper than top closed models but not the cheapest), usage and token volumes expand, ultimately increasing total compute consumption. The reports highlight that competition will force leading US AI labs (OpenAI, Anthropic, Google) to invest more in training and iteration to maintain their edge. Furthermore, the rise of capable open-source models like K3 is expanding the global AI developer ecosystem, with Chinese models now accounting for over 45% of developer traffic. Key beneficiaries identified across the AI infrastructure chain include memory/storage players (e.g., Micron, Samsung), compute leaders (Nvidia, TSMC), networking suppliers (due to the "super-node" cluster needs of deploying K3), and cloud platforms (e.g., Alibaba) that host diverse model ecosystems. The consensus is that stronger open-source models are an entry point for the next wave of infrastructure demand, provided workload growth outpaces efficiency gains.

链捕手 (ChainCatcher) · 07/21 06:12

AI Claims Erdős's $100 Bounty, Solves in One Page What a 44-Page Top Journal Paper Couldn't

An AI, in collaboration with a mathematician, has produced a one-page proof for a long-standing Erdős problem (#119), claiming the $100 bounty originally offered by Paul Erdős. The problem concerns the maximum modulus of polynomials with zeros on the unit circle. The new result, generated with the help of GPT-5.6 Sol and posted on the erdosproblems.com forum, is notably simpler than a famous 44-page proof by József Beck published in the Annals of Mathematics in 1991, which addressed a related but distinct part of the problem. Thomas Bloom, a mathematician and the maintainer of the Erdős problems website, stated that the AI's proof uses straightforward harmonic analysis techniques and contains "interesting ideas," suggesting the problem was less inherently difficult than previously believed. This follows other recent AI-assisted proofs, such as for the Cycle Double Cover conjecture. The development has sparked debate within the mathematical community. While some argue AI has hit a wall in pure mathematics, others point to incremental but genuine progress on tough problems, indicating that AI's relentless, non-intuitive exploration can uncover overlooked paths that human mathematicians might dismiss after initial failures. This event highlights a potential shift: some "open" problems may persist not due to sheer difficulty, but due to the limits of human patience in exploring all possible avenues.

marsbit · 07/20 12:34

Large Model Memory Anxiety Solved on a USB Drive?

To address the memory and bandwidth challenges of large language model inference, SanDisk and SK Hynix are developing a new technology called High Bandwidth Flash (HBF). This approach adapts the advanced 3D packaging and stacking techniques used for High Bandwidth Memory (HBM) to NAND Flash, which is traditionally seen as slow for compute tasks. The key insight is that while NAND Flash is slow for writes, its read speeds can be significantly enhanced through parallelization. By stacking up to 16 NAND Flash chips, the first-generation HBF aims for capacities up to 512GB and read bandwidths up to 1.6 TB/s, with roadmaps targeting 3.2 TB/s. This positions HBF not as a replacement for HBM, but as a complementary, cost-effective tier in a hierarchical memory system for AI inference. During inference, model weights are largely static and read-intensive, making them suitable for HBF storage. This can alleviate the severe capacity bottleneck of expensive HBM, potentially reducing the number of accelerators needed per server, lowering overall system cost and power consumption. The technology, currently being standardized within the Open Compute Project, signals a shift towards more specialized memory architectures tailored for the specific demands of large-scale AI deployment.
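The headline figures for first-generation HBF imply a few derived numbers that help put the bandwidth in context. The sketch below is back-of-the-envelope arithmetic on the values quoted in the summary. It assumes decimal units (1 TB = 1,000 GB), an even split of capacity across dies, and that the maximum die count and maximum capacity describe the same configuration, none of which the source states.

```python
# Figures quoted in the summary for first-generation HBF.
DIES_PER_STACK = 16
STACK_CAPACITY_GB = 512
# 1.6 TB/s (first generation) and 3.2 TB/s (roadmap), in GB/s with decimal units.
READ_BANDWIDTH_GB_S = {"gen1": 1_600, "roadmap": 3_200}

# Assumption: capacity is divided evenly across the stacked dies.
capacity_per_die_gb = STACK_CAPACITY_GB / DIES_PER_STACK


def full_read_seconds(capacity_gb: float, bandwidth_gb_s: float) -> float:
    """Time to stream the whole stack once at the given read bandwidth."""
    return capacity_gb / bandwidth_gb_s


sweep_times = {
    name: full_read_seconds(STACK_CAPACITY_GB, bandwidth)
    for name, bandwidth in READ_BANDWIDTH_GB_S.items()
}

print(f"Capacity per die: {capacity_per_die_gb:.0f} GB")  # 32 GB
for name, seconds in sweep_times.items():
    print(f"Full-stack read at {name} bandwidth: {seconds:.2f} s")  # 0.32 s, 0.16 s
```

A full sweep matters because model weights are read repeatedly during inference, so the time to stream a stack once gives a rough sense of how quickly weights stored there could be served.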

marsbit · 07/20 00:18

DeepSeek V4 'Full-Blooded Edition' Leaked, Could Be Released As Early As Tomorrow

The highly anticipated full release of DeepSeek V4 is imminent, expected to launch as early as tomorrow after nearly three months of waiting. A select group has already received access to the GA (General Availability) beta, which includes two versions: DeepSeek V4 Flash and DeepSeek V4 Pro. Early testers report that V4's overall performance is close to the level of Opus 4.8, with coding capabilities rivaling GPT-5.6 Sol. Its agent abilities are significantly enhanced, and 3D/SVG generation has improved notably. While it may not surpass the recently released Kimi K3 in performance, its expected price point is significantly lower. The official release will introduce a new "peak/off-peak" pricing model for its API. For example, deepseek-v4-pro will cost $0.87 per million output tokens during standard times and $1.74 during peak hours. The flash version is even more aggressive at $0.28/$0.56 per million tokens, with cached input tokens priced extremely low at $0.0028. This makes V4 a strong contender in terms of cost-effectiveness, potentially offering Opus-level capabilities at a fraction of the cost, continuing DeepSeek's reputation as a "price disruptor" in the AI market. Initial demos showcasing V4's capabilities have begun circulating, including generated 3D simulation games, HTML games blending elements of Minecraft and No Man's Sky, and classic games like a "Cut the Rope" clone. The final GA version is set to replace the older deepseek-chat and deepseek-reasoner models, which will be retired on July 24th.
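To see what the peak/off-peak split means for a bill, the quoted deepseek-v4-pro output prices can be plugged into a small calculator. This is a minimal sketch: the function names are illustrative, the token volumes are invented for the example, and the caller must decide which tokens count as peak, since the summary does not say which hours are peak hours. It covers output tokens only.

```python
# Output-token prices for deepseek-v4-pro as quoted in the summary (USD per million tokens).
PRO_OUTPUT_USD_PER_MILLION = {"standard": 0.87, "peak": 1.74}


def output_cost_usd(output_tokens: int, period: str) -> float:
    """Cost of generating output_tokens during the given pricing period."""
    return output_tokens / 1_000_000 * PRO_OUTPUT_USD_PER_MILLION[period]


def blended_cost_usd(standard_tokens: int, peak_tokens: int) -> float:
    """Total cost for a workload split across standard and peak periods."""
    return (output_cost_usd(standard_tokens, "standard")
            + output_cost_usd(peak_tokens, "peak"))


# Hypothetical workload: 5 million output tokens.
all_standard = output_cost_usd(5_000_000, "standard")
all_peak = output_cost_usd(5_000_000, "peak")
mixed = blended_cost_usd(standard_tokens=3_000_000, peak_tokens=2_000_000)

print(f"All standard: ${all_standard:.2f}")  # $4.35
print(f"All peak:     ${all_peak:.2f}")      # $8.70
print(f"3M + 2M mix:  ${mixed:.2f}")         # $6.09
```

Because the peak price is exactly double the standard one, shifting batch workloads to off-peak hours halves their output cost under these figures.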

marsbit · 07/19 05:31

Opening Claude's Brain Is Useless; The True Key to the AI Black Box Lies in Ontology Engineering

This article critiques the limitations of Anthropic's "J-Space" research, which attempts to explain AI models by observing their internal neural activation patterns, akin to fMRI brain scans. While this "internalist" approach offers unprecedented visibility into model states, it fundamentally conflates observability with true explainability. The core issue is that understanding a model's output requires more than tracing neural activity; it requires examining the meaning of the information the model processes: its relationship to the world, semantic norms, and human cognitive frameworks. The author proposes a paradigm shift: moving from a neuroscience-inspired focus on the model itself to an "information ontology" approach centered on the knowledge the model handles. Drawing on Kant's philosophical categories, the argument posits that true explainability lies in structuring and understanding information within a formal conceptual framework, not in peering into the "black box." The practical application of this theory is ontology engineering. Ontologies provide a structured, computable framework for knowledge, serving as a semantic anchor for model outputs. The article details a bidirectional synergy: Large Language Models (LLMs) can automate and scale ontology construction, while ontologies, in turn, enhance AI explainability. They act as a verification framework, allowing model reasoning to be traced back to defined concepts, properties, and relationships. This transforms explainability from the impossible task of making neural networks transparent into the achievable engineering goal of making their outputs and impacts understandable, traceable, and accountable. The future of AI explainability, therefore, lies not in explaining the model's internal mechanics but in explaining and governing the knowledge structures and real-world effects of its outputs.

marsbit · 07/17 07:39
