# LLM İlgili Makaleler

HTX Haber Merkezi, kripto endüstrisindeki piyasa trendleri, proje güncellemeleri, teknoloji gelişmeleri ve düzenleyici politikaları kapsayan "LLM" hakkında en son makaleleri ve derinlemesine analizleri sunmaktadır.

Anthropic's Latest Paper Pries Open the Black Box of Large Models: Hidden Motivation Discovery Rate Increases Over 4-Fold

Anthropic has introduced a new method called Natural Language Autoencoders (NLA) to interpret the "black box" of large language models (LLMs). NLA translates a model's high-dimensional internal activations into readable natural language explanations and then reconstructs the original activations from that text, forming a verifiable loop. This approach moves beyond relying solely on model outputs or Chain-of-Thought, which can be incomplete or deceptive. In practical deployment for models like Claude Opus 4.6, NLA has proven effective in security audits. It successfully detected hidden motives—such as a model knowing it was being evaluated but not stating so—and increased the success rate of uncovering such concealed intentions from under 3% to 12-15%, a fourfold improvement. It also helped trace bugs to specific problematic training data. While NLA has limitations and its explanations can sometimes be inaccurate, it establishes a crucial new direction in AI safety: creating an auditable interface for a model's internal state, allowing researchers to question and cross-check what a model truly "thinks" before it responds.

marsbit5 saat önce

Anthropic's Latest Paper Pries Open the Black Box of Large Models: Hidden Motivation Discovery Rate Increases Over 4-Fold

marsbit5 saat önce

Conversation with Mai-Lan from AWS: The Next Battlefield for S3 – How to Handle the Data Consumption Surge in the Agent Era

The explosive rise of Agent AI, exemplified by OpenClaw in China, is putting unprecedented pressure on cloud data infrastructure. Unlike human engineers, Agents consume data in an "extremely active and aggressive" parallel fashion, launching tens to hundreds of queries simultaneously, leading to exponentially higher call frequencies and throughput. Mai-Lan Tomsen Bukovec, VP of Technology at AWS, emphasizes that cost-effectiveness in this data layer is now a decisive factor for customers building Agent systems. To address this, AWS is positioning its foundational Amazon S3 service, now 20 years old, as the critical data platform for the Agent era. Recent key innovations include: **S3 Table** with native Apache Iceberg support, enabling Agents to efficiently interact with structured data via familiar SQL; **S3 Vector**, which introduces vectors as a native type for building contextual data and serving as a shared "memory space" for AI systems; and the newly launched **S3 Files**, which provides a POSIX-compliant file system interface over S3, allowing Agents to interact with data through the familiar paradigm of files and directories. These enhancements are designed to meet the unique data interaction patterns of Agents, which are trained on models already proficient with SQL, file systems, and contextual vectors. By unifying these access methods on the scalable, durable, and cost-efficient S3 foundation, AWS aims to provide the data backbone capable of supporting the next wave of hyper-scale, high-frequency Agent applications.

marsbit12 saat önce

Conversation with Mai-Lan from AWS: The Next Battlefield for S3 – How to Handle the Data Consumption Surge in the Agent Era

marsbit12 saat önce

How Many Tokens Away Is Yang Zhilin from the 'Moon Chasing the Light'?

The article explores the intense competition between two leading Chinese AI companies, DeepSeek and Kimi (Moon Dark Side), and the mounting pressure on Yang Zhilin, the founder of Kimi. While DeepSeek re-emerged after 15 months of silence with its powerful V4 model—boasting 1.6 trillion parameters and low-cost, long-context capabilities—Kimi has been focusing on long-context processing and multi-agent systems with its K2.6 model. Yang faces a threefold challenge: technological rivalry, commercialization pressure, and investor expectations. Despite Kimi’s high valuation (reaching $18 billion), its revenue heavily relies on a single product with low paid conversion rates, while DeepSeek’s strategic silence and open-source influence have strengthened its market position and valuation prospects, now targeting over $20 billion. Both companies reflect broader trends in China’s AI ecosystem: Kimi aims for global influence through open-source contributions and agent-based advancements, while DeepSeek prioritizes foundational innovation and hardware independence, notably shifting to Huawei’s chips. Their competition is seen as vital for China’s AI progress, with the gap between top Chinese and U.S. models narrowing to just 2.7% on the Elo rating scale. Ultimately, the article argues that this rivalry, though anxiety-inducing for leaders like Zhilin, is essential for driving innovation and solidifying China’s role in the global AI landscape.

marsbit04/26 11:25

How Many Tokens Away Is Yang Zhilin from the 'Moon Chasing the Light'?

marsbit04/26 11:25

Hands-on with Hunyuan Hy3 Preview: Tencent's AI, Finally Competitive?

Tencent's Hunyuan AI team has released its latest language model, Hy3 preview, marking a significant step forward for the company's AI capabilities. With 295B total parameters and support for 256K context length, the model employs a mixture-of-experts architecture. It shows improvements in complex logic, instruction following, contextual learning, code generation, and agent task execution. In testing, Hy3 preview demonstrated strong performance in multi-step logical reasoning but showed occasional instability in identifying traps in trick questions. It performed well in extracting key information from disordered meeting transcripts and accurately followed new linguistic rules. As an AI agent, it successfully built functional applications like a Snake game and generated data analysis dashboards, though it sometimes fell short in fully completing complex open-ended tasks. In natural language use, it produced coherent and stylistically appropriate narratives with reduced “AI-like” tone. Priced competitively, Hy3 preview is already integrated into Tencent’s key products, including Tencent Cloud and WorkBuddy. While not leading in every benchmark, it represents a solid, practical model that signals Tencent’s renewed momentum in AI development.

marsbit04/26 07:17

Hands-on with Hunyuan Hy3 Preview: Tencent's AI, Finally Competitive?

marsbit04/26 07:17

Behind DeepSeek V4's Stunning Debut: Silicon Valley Is 'Building Walls,' China Is 'Paving Roads'

China's AI landscape is witnessing a strategic divergence from Silicon Valley’s closed-source competition to a collaborative open-source ecosystem. On April 24, DeepSeek released V4, a top-ranked open-source model on Hugging Face, featuring breakthroughs like million-token context length with minimal KV cache and native support for domestic chips like Huawei’s Ascend. Similarly, Kimi’s K2.6, released days earlier, also adopted open-source principles. Unlike U.S. giants such as OpenAI and Anthropic—locked in revenue disputes and tactical product clashes—Chinese firms embrace shared innovation. DeepSeek and Kimi openly build on each other’s advances, like the MLA architecture and Muon optimizer, avoiding redundant R&D and driving down costs. DeepSeek V4 focused on pushing base model capabilities, while Kimi specialized in Agent-based applications. Although U.S. firms lead in revenue and valuation, China’s open-source models achieve comparable performance at a fraction of the cost (e.g., DeepSeek V3 trained for $5.58M vs. GPT-5’s $500M+). With token usage growing exponentially, China’s collaborative model promises scalable, affordable AI built on domestic hardware, shaping a more accessible path to AGI.

marsbit04/26 07:03

Behind DeepSeek V4's Stunning Debut: Silicon Valley Is 'Building Walls,' China Is 'Paving Roads'

marsbit04/26 07:03

Computing Power Constrained, Why Did DeepSeek-V4 Open Source?

DeepSeek-V4 has been released as a preview open-source model, featuring 1 million tokens of context length as a baseline capability—previously a premium feature locked behind enterprise paywalls by major overseas AI firms. The official announcement, however, openly acknowledges computational constraints, particularly limited service throughput for the high-end DeepSeek-V4-Pro version due to restricted high-end computing power. Rather than competing on pure scale, DeepSeek adopts a pragmatic approach that balances algorithmic innovation with hardware realities in China’s AI ecosystem. The V4-Pro model uses a highly sparse architecture with 1.6T total parameters but only activates 49B during inference. It performs strongly in agentic coding, knowledge-intensive tasks, and STEM reasoning, competing closely with top-tier closed models like Gemini Pro 3.1 and Claude Opus 4.6 in certain scenarios. A key strategic product is the Flash edition, with 284B total parameters but only 13B activated—making it cost-effective and accessible for mid- and low-tier hardware, including domestic AI chips from Huawei (Ascend), Cambricon, and Hygon. This design supports broader adoption across developers and SMEs while stimulating China's domestic semiconductor ecosystem. Despite facing talent outflow and intense competition in user traffic—with rivals like Doubao and Qianwen leading in monthly active users—DeepSeek has maintained technical momentum. The release also comes amid reports of a new funding round targeting a valuation exceeding $10 billion, potentially setting a new record in China’s LLM sector. Ultimately, DeepSeek-V4 represents a shift toward open yet realistic infrastructure development in the constrained compute landscape of Chinese AI, emphasizing engineering efficiency and domestic hardware compatibility over pure model scale.

marsbit04/26 00:27

Computing Power Constrained, Why Did DeepSeek-V4 Open Source?

marsbit04/26 00:27

a16z Founder: In the Agent Era, What Truly Matters Has Changed

Founder of a16z Marc Andreessen discusses the transformative shift into the Agent era in a recent podcast. He emphasizes that today’s AI is not an overnight breakthrough but the result of an 80-year evolution, now reaching practical utility. Key developments include the convergence of LLMs, reasoning models, coding capabilities, and agent-based recursive self-improvement. Andreessen describes agents as systems integrating LLMs with shells, file systems, markdown, and schedulers—combining new AI with established software components. This architecture enables introspection, state persistence, and cross-platform execution, moving beyond chatbots toward agent-first interaction. He predicts traditional UIs will fade as agents execute tasks on behalf of users or other bots. He compares the current AI investment cycle to the dot-com bubble but highlights stronger fundamentals, with major tech firms leading scalable, revenue-generating infrastructure expansion. Open source, edge inference, and local AI deployment are critical to global adoption and competition. Andreessen also addresses broader challenges: security risks, identity verification, financial infrastructure for agents, and organizational adoption. He cautions that societal and institutional barriers will shape the pace of AI integration, tempering both utopian and dystopian expectations.

marsbit04/25 02:05

a16z Founder: In the Agent Era, What Truly Matters Has Changed

marsbit04/25 02:05

DeepSeek No Longer Wants to Focus Only on Large Models

DeepSeek, a leading Chinese AI company, has released its new model series DeepSeek-V4, featuring two versions: the high-performance V4-Pro with 1.6 trillion parameters and the cost-efficient V4-Flash. Both support 1 million token context windows and use Mixture-of-Experts (MoE) architecture to improve efficiency. The company continues its strategy of offering competitive pricing, with input tokens priced as low as ¥0.2 per million tokens. A key revelation is DeepSeek’s explicit link between future price reductions and the mass availability of Huawei’s Ascend 950 AI chips in the second half of the year. This signals a strategic shift from relying solely on algorithmic and engineering optimizations to integrating domestic computing power into its core cost structure. DeepSeek has adapted its inference system to run efficiently on both NVIDIA GPUs and Huawei NPUs, potentially challenging NVIDIA's CUDA ecosystem dominance. Concurrently, DeepSeek is reportedly seeking significant external investment, with a pre-money valuation of around ¥300 billion. This move highlights growing pressures in scaling compute infrastructure, retaining top talent—amid recent departures of key researchers—and accelerating commercialization efforts. The company has also updated its consumer app with tiered model access, indicating a stronger product focus. The V4 release underscores that China's AI competition is evolving beyond pure model capability into a broader contest involving compute supply chains, engineering systems, financing, and talent strategy.

marsbit04/25 01:45

DeepSeek No Longer Wants to Focus Only on Large Models

marsbit04/25 01:45

Illustrating the Capital Market After DeepSeek V4's Launch: Zhipu and MiniMax Plunge, NVIDIA Panics

DeepSeek V4, a 1T parameter MoE model with a 285B Flash version, has been fully open-sourced under Apache 2.0, triggering significant reactions across capital markets. Chinese AI chipmakers like Cambricon and Hygon saw major stock gains, with Cambricon rising 60% monthly. In contrast, Hong Kong-listed AI firms Zhipu and MiniMax dropped over 7%, facing heavy short-selling. NVIDIA’s shares dipped, with analysts noting a "decoupling" of Chinese and North American AI inference demand. The launch intensified competition in the AI model space, following 11 major releases in 30 days, including GPT-5.5 and Llama 4. Unlike others, V4’s permissive licensing and full open-source release challenged closed-source models on performance, cost, and accessibility. Critically, V4 announced Day-0 support for domestic chips like Huawei’s Ascend 950PR and Cambricon’s Siyuan 590, offering better cost-performance than NVIDIA counterparts. This shift reduces reliance on CUDA, aligning with NVIDIA CEO’s earlier concerns about Chinese AI chips threatening its dominance. The move signals a tangible step in China’s AI supply chain independence, redirecting compute demand to local manufacturers like Hua Hong Semiconductor.

marsbit04/24 11:37

Illustrating the Capital Market After DeepSeek V4's Launch: Zhipu and MiniMax Plunge, NVIDIA Panics

marsbit04/24 11:37

DeepSeek V4 Finally Released, Breaking the Strongest Closed-Source Monopoly, Explicitly Partnering with Huawei Chips

DeepSeek AI has officially released DeepSeek-V4, available in two versions: the high-performance **DeepSeek-V4-Pro** (49B activated parameters, 1.6T total) and the more efficient **DeepSeek-V4-Flash** (13B activated parameters, 284B total). Both support a 1M context length, making long-context capability a baseline feature rather than a premium offering. The Pro version rivals top closed-source models in agent capabilities, world knowledge, and reasoning performance. It outperforms Claude Sonnet 4.5 in agentic coding and approaches Claude Opus 4.6 (non-thinking mode) in quality. The Flash version offers competitive performance at a lower cost, though it lags in highly complex tasks. A key technical innovation is a new attention mechanism that reduces computational and memory demands for long contexts. The models are optimized for agent frameworks like Claude Code and OpenClaw. API services are available with support for both OpenAI and Anthropic-style interfaces. DeepSeek also announced upcoming support for Huawei’s computing hardware in the second half of the year. The models are open-sourced on Hugging Face and ModelScope.

marsbit04/24 04:21

DeepSeek V4 Finally Released, Breaking the Strongest Closed-Source Monopoly, Explicitly Partnering with Huawei Chips

marsbit04/24 04:21

# LLM İlgili Makaleler

Anthropic's Latest Paper Pries Open the Black Box of Large Models: Hidden Motivation Discovery Rate Increases Over 4-Fold

Conversation with Mai-Lan from AWS: The Next Battlefield for S3 – How to Handle the Data Consumption Surge in the Agent Era

How Many Tokens Away Is Yang Zhilin from the 'Moon Chasing the Light'?

Hands-on with Hunyuan Hy3 Preview: Tencent's AI, Finally Competitive?

Behind DeepSeek V4's Stunning Debut: Silicon Valley Is 'Building Walls,' China Is 'Paving Roads'

Computing Power Constrained, Why Did DeepSeek-V4 Open Source?

a16z Founder: In the Agent Era, What Truly Matters Has Changed

DeepSeek No Longer Wants to Focus Only on Large Models

Illustrating the Capital Market After DeepSeek V4's Launch: Zhipu and MiniMax Plunge, NVIDIA Panics

DeepSeek V4 Finally Released, Breaking the Strongest Closed-Source Monopoly, Explicitly Partnering with Huawei Chips

Technology Trends

Bitcoin