Articles Related to Optimization

The HTX News Center offers the latest articles and in-depth analysis on "Optimization", covering market trends, project updates, technology developments, and regulatory policy in the crypto industry.

How Token-Hungry is Claude Code? A Comparative Experiment Shows Up to 30x Difference Across Three Frameworks

A recent experiment by the Composio team ran the same model (Kimi K3) in three different agent frameworks (Claude Code, Hermes, and Kimi Code) on 28 identical tasks. Task completion rates were similar, but token consumption varied dramatically. Median token usage was approximately 61k for Kimi Code, 67k for Hermes, and 340k for Claude Code, about 6 times the Kimi Code figure. On individual tasks, the gap reached as much as 30x. In terms of cost, Claude Code averaged $2 per task, compared with $0.22 for Kimi Code and $0.28 for Hermes (based on Kimi K3 pricing). Speed also differed, with Hermes being the fastest. Analysis suggests Claude Code's high token usage stems from its harness repeatedly feeding extensive context (previous messages, tool calls, command outputs, file contents) back into the model across multiple interaction rounds, which inflates input tokens rather than producing longer outputs. This highlights a crucial trend: the agent framework (harness) is becoming as important as the model itself for cost and efficiency. A separate study from Writer showed that simply switching the orchestration layer to their optimized harness reduced average task cost by 41% and latency by 44% across various models without sacrificing quality. The conclusion: for cost-effective AI agents, optimizing the harness may yield greater savings than changing the model. The future of agent competition may hinge not just on capability ("can it do it?") but on efficiency ("who does it for less?").

marsbit · Yesterday 12:26
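
The mechanism described in the summary above, a harness resending accumulated context on every round, can be illustrated with a toy model. This is only a sketch under stated assumptions: the function name, the round count, and the per-round token figure are hypothetical and are not taken from the Composio experiment, and the model ignores prompt caching and context compaction.

```python
def cumulative_input_tokens(rounds, tokens_per_round, resend_history):
    """Total input tokens sent to the model over `rounds` calls.

    Hypothetical model: every round adds `tokens_per_round` tokens of new
    context (messages, tool calls, command output, file contents).
    """
    total = 0
    history = 0
    for _ in range(rounds):
        history += tokens_per_round
        # Resending everything costs the whole history on each call;
        # sending only the new material costs just the increment.
        total += history if resend_history else tokens_per_round
    return total


# Illustrative numbers only (assumptions, not measurements).
full = cumulative_input_tokens(rounds=20, tokens_per_round=2_000, resend_history=True)
incremental = cumulative_input_tokens(rounds=20, tokens_per_round=2_000, resend_history=False)

print(f"resend full history: {full:,} input tokens")
print(f"send increments only: {incremental:,} input tokens")
print(f"ratio: {full / incremental:.1f}x")
```

Under these assumptions the resend-everything strategy grows quadratically with the number of rounds while the incremental one grows linearly, which is one way a large gap in input tokens could arise even when the outputs are similar.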

Aave Gradually Phasing Out 75 Low-Utilization Reserve Assets! Here Are the Details

The leading decentralized finance (DeFi) lending protocol, Aave, has announced a major restructuring to improve platform efficiency. Founder Stani Kulechov stated the protocol will gradually phase out underutilized assets and scale back operations on certain blockchains. Specifically, Aave plans to cease support for 50 underused reserve assets on its native network and an additional 25 reserve assets across the Sonic, Scroll, zkSync, Metis, Soneium, and Aptos networks, totaling 75 assets. Management explained this move aims to allocate resources more efficiently and focus on assets with higher user demand. The restructuring will affect approximately $98.1 million in reserve assets and $15.6 million in loan positions. Aave stated the phase-out will be gradual, with transition plans to minimize user disruption. Experts note that periodically removing low-liquidity or underused assets is common in DeFi to reduce maintenance costs and enhance security and operational efficiency. As one of the world's largest DeFi protocols by Total Value Locked (TVL), Aave's decision may cause short-term liquidity shifts in affected assets but is viewed as a step toward building a more efficient and sustainable ecosystem in the long term.

cryptonews.ru · 2 days ago 09:21

NVIDIA's 20-Year CUDA Moat Collapsed Over a Weekend, Claude Single-Handedly Got AMD's New GPU Running

In a single weekend, Claude, an AI agent from Anthropic, successfully ported and optimized its cutting-edge model to run on a brand-new AMD MI355X server rack without any manual code intervention. This feat demonstrates a potential breakthrough in overcoming NVIDIA's long-established CUDA software ecosystem dominance, built over two decades. Anthropic's team simply instructed Claude to get the AMD machine running. By Monday, it not only worked but was showing a continuously improving performance curve. The achievement impressed AMD CEO Lisa Su and accelerated a major deployment partnership: Anthropic plans to deploy up to 2GW of AMD Instinct GPUs starting in 2027. The key enabler is AMD's new ROCm.AI platform, a toolbox designed specifically for AI agents like Claude. It provides AI-readable documentation, including chip instruction sets (ISA), and tools like the Hyperloom service that allows agents to autonomously profile performance, identify bottlenecks, test configurations, and generate optimized kernels. In a demo, Hyperloom boosted the output speed of a model by 38%. This represents a fundamental shift. While CUDA's strength lies in its vast, human-expert-driven ecosystem of tools and tacit knowledge, AMD's strategy is to make its hardware and software stack directly accessible and optimizable by AI agents. An agent can parallelize tasks—debugging, profiling, coding—that would take human engineers years to master, compressing the traditional software adaptation timeline from years to tasks. The competition is no longer just about peak hardware specs but also about how well AI can read, utilize, and tune a platform.

marsbit · 07/28 00:09

Claude Code Slashes 80% of Prompt Tokens, But Opus 5 Just Adds Them Right Back In

Claude Code, the AI coding assistant from Anthropic, recently announced a massive reduction of over 80% in its system prompt content for models like Opus 5 and Fable 5. The goal was to remove verbose, often conflicting, rules (like strict commenting and documentation requirements) and replace them with a simpler directive: write code that matches the style of the surrounding project. This "pruning" aims to make the model more efficient by reducing internal conflict from overlapping instructions, with no measurable performance drop reported. However, a developer's (@chenchengpro) investigation revealed a twist. While the prompt was drastically cut from 15,225 characters in Opus 4.7 to 4,467 in Opus 4.8, it *increased* by approximately 72% to 7,694 characters in Opus 5. This isn't a contradiction. The "over 80% cut" refers to the overall shift from the old, detailed rulebook-style prompts to a new, streamlined system. The 72% increase for Opus 5 represents new, targeted instructions added to manage the model's enhanced capabilities. Opus 5 is more proactive—it likes to report progress, generate longer outputs, use sub-agents, and expand task scope. The added prompt content (roughly 3,755 characters) primarily provides guidelines for "Delivering work" (controlling task scope, progress reporting) and "Corrections" (limiting excessive self-correction). These new rules are necessary to curb potential over-engineering on simple tasks, ensuring efficiency even as the model becomes more independent. In short, the old, restrictive manual was deleted, but new guidelines were written to harness the model's newfound initiative.

marsbit · 07/27 11:37
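
The percentages in the summary above can be checked directly from the three character counts it quotes. The sketch below only redoes that arithmetic; the helper name `percent_change` is an illustrative choice, not something from the original investigation.

```python
def percent_change(old, new):
    """Relative change from `old` to `new`, in percent."""
    return (new - old) / old * 100


# System prompt lengths in characters, as quoted in the summary.
opus_4_7, opus_4_8, opus_5 = 15_225, 4_467, 7_694

cut_47_to_48 = percent_change(opus_4_7, opus_4_8)
growth_48_to_5 = percent_change(opus_4_8, opus_5)
net_47_to_5 = percent_change(opus_4_7, opus_5)
net_added_chars = opus_5 - opus_4_8

print(f"Opus 4.7 -> 4.8: {cut_47_to_48:.1f}%")
print(f"Opus 4.8 -> 5:   {growth_48_to_5:.1f}%")
print(f"Opus 4.7 -> 5:   {net_47_to_5:.1f}%")
print(f"Net characters added from 4.8 to 5: {net_added_chars}")
```

The 4.8 to 5 step reproduces the roughly 72% increase. The net growth is 3,227 characters, so the "roughly 3,755 characters" of added content would be consistent with some older text having been removed at the same time, although the summary does not say so.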

Large Model Memory Anxiety Solved on a USB Drive?

To address the memory and bandwidth challenges of large language model inference, SanDisk and SK Hynix are developing a new technology called High Bandwidth Flash (HBF). This approach adapts the advanced 3D packaging and stacking techniques used for High Bandwidth Memory (HBM) to NAND Flash, which is traditionally seen as slow for compute tasks. The key insight is that while NAND Flash is slow for writes, its read speeds can be significantly enhanced through parallelization. By stacking up to 16 NAND Flash chips, the first-generation HBF aims for capacities up to 512GB and read bandwidths up to 1.6 TB/s, with roadmaps targeting 3.2 TB/s. This positions HBF not as a replacement for HBM, but as a complementary, cost-effective tier in a hierarchical memory system for AI inference. During inference, model weights are largely static and read-intensive, making them suitable for HBF storage. This can alleviate the severe capacity bottleneck of expensive HBM, potentially reducing the number of accelerators needed per server, lowering overall system cost and power consumption. The technology, currently being standardized within the Open Compute Project, signals a shift towards more specialized memory architectures tailored for the specific demands of large-scale AI deployment.

marsbit · 07/20 00:18

OpenAI Officially Teaches You 8 Tricks to Master ChatGPT

OpenAI has released an updated guide with eight key strategies to get better results from ChatGPT:

1. **Use the latest model** (e.g., GPT-5.6 Sol) for best performance with prompt engineering.
2. **Provide clear, specific instructions**, detailing the desired content, format, style, and length. Avoid vague requests.
3. **Structure prompts effectively**: Place the core instruction at the beginning and use delimiters like `###` or `"""` to separate instructions from the text to be processed.
4. **Use examples and explanations** to clarify the exact output format and style you want.
5. **Adopt a stepwise approach**: Start with a zero-shot prompt (instruction only), then add a few examples (few-shot) if needed, and consider fine-tuning only as a last resort.
6. **Avoid vague or imprecise descriptions**. Use concrete terms (e.g., "3-5 sentences") instead of phrases like "keep it brief."
7. **Specify what to do, not just what to avoid**. After stating restrictions, guide the model toward the correct action.
8. **For code generation, use "leading words"** like `import` for Python or `SELECT` for SQL to steer the model into the correct pattern.

Additionally, OpenAI's new "Generate Anything" feature can automatically create suitable prompts based on a simple description of your task. Mastering these techniques helps users get more accurate and useful outputs from ChatGPT.

marsbit · 07/16 08:26
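
Tips 3, 6, and 8 from the guide summarized above (instruction first, delimiters around the input text, concrete lengths, and a leading word for code) can be sketched as plain string assembly. The helper `build_prompt`, its parameters, and the sample texts are hypothetical and exist only for illustration; no model API is called.

```python
def build_prompt(instruction, text, delimiter='"""', leading_word=""):
    """Assemble a prompt: instruction first, then the delimited input text.

    Hypothetical helper for illustration; not part of any OpenAI library.
    """
    parts = [instruction.strip(), "", f"Text: {delimiter}", text.strip(), delimiter]
    if leading_word:
        # A "leading word" nudges the model toward the expected code pattern.
        parts += ["", leading_word]
    return "\n".join(parts)


# Concrete length ("3-5 sentences") instead of "keep it brief".
summary_prompt = build_prompt(
    "Summarize the text below in 3-5 sentences.",
    "Aave plans to phase out 75 underused reserve assets.",
)

# For code generation, end the prompt with a leading word such as SELECT.
sql_prompt = build_prompt(
    "Write a SQL query that returns the total order value per customer.",
    "Table orders(id, customer_id, total)",
    delimiter="###",
    leading_word="SELECT",
)

print(summary_prompt)
print()
print(sql_prompt)
```

Keeping the instruction and the material to be processed in separate, clearly delimited parts makes it harder for the model to confuse the two, which is the point of tip 3.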

Can Large Models Write Industrial-Grade Optimization Algorithms? MIT Proposes FrontierOR to Set an Exam for AI

Can large language models (LLMs) design industrial-grade optimization algorithms? MIT researchers introduced FrontierOR, a benchmark evaluating LLMs on their ability to design scalable, high-quality algorithms for complex, large-scale optimization problems—going beyond simple modeling or solver calls. The benchmark, constructed from 180 real-world problems published in OR journals (1992-2025), assesses models in one-shot algorithm generation and self-evolution settings. Key findings show top models achieve high code execution rates (~0.98), but struggle to maintain feasibility and near-optimal solution quality on hard instances. Models like Claude Opus 4.6 exhibit more diverse algorithm design (e.g., decomposition, heuristics, hybrids), correlating with better performance. Self-evolution frameworks (e.g., CORAL) significantly boost results, raising the quality-time efficiency metric from 0.15 to 0.50 on the hardest tasks by iteratively refining algorithms. The study highlights a shift in failure modes from basic modeling errors to deeper challenges in heuristic search and structural exploitation. FrontierOR points toward future AI-driven optimization systems where LLMs act as algorithm designers, dynamically composing strategies and learning from feedback for applications in supply chain, energy, and transportation.

marsbit · 07/10 09:08

Vitalik’s Rollup Proof Work Shows Ethereum Scaling Still Runs Through Cryptography

Vitalik Buterin's latest technical work focuses on proof optimization for Ethereum rollups, highlighting that the network's scaling path remains fundamentally tied to advances in cryptography. His research delves into improving the efficiency of polynomial commitments and other proof systems, which are essential for rollups to securely compress and verify transactions. While dense and technical, this ongoing development is crucial for enabling cheaper, faster, and more scalable execution on Ethereum without compromising security. The work underscores that beneath market noise around ETFs and prices, Ethereum's research layer remains actively focused on the foundational improvements needed for long-term growth and its position as a leading settlement layer.

bitcoinist · 07/09 23:12

Fable 5 Crafts First CUDA 'Megakernel' from Scratch, Achieves 18.7x Speedup in 2.5 Hours

AI model Fable 5 (a safety-limited version of Anthropic's Claude Mythos) has achieved a breakthrough in GPU kernel optimization. In the rigorous KernelBench-Mega benchmark—which requires fusing an entire model's compute block into a single kernel—Fable 5 autonomously wrote a highly optimized CUDA "megakernel." This kernel executes a complete Kimi-Linear W4A16 hybrid decoding task within a single GPU kernel launch, using 14 grid barriers to sequence operations. The result was a performance increase of 18.7x over the baseline on an RTX PRO 6000 GPU, significantly outpacing competitors like Claude Opus 4.8 (14.4x) and GPT-5.5 (4.34x). Notably, its performance advantage widened with longer context lengths. The model spent the majority of its 2.5-hour, 550k-token session analyzing benchmarks and theoretical limits before coding, leading to an exceptionally efficient final design. Anthropic co-founder Jack Clark described this as the beginning of a "recursive self-improvement (RSI) loop," where AI's ability to optimize its own underlying computational infrastructure could rapidly accelerate its own development cycle. This advance highlights AI's growing capability in complex, low-level engineering tasks that were previously a human stronghold.

marsbit · 07/07 07:36

Sui Testnet Update v1.74.1 Slashes Transaction Gas Costs Via Protocol Version 128

Sui blockchain developer Mysten Labs has deployed testnet update v1.74.1, introducing protocol version 128. The primary outcome of this upgrade is a significant reduction in transaction gas costs for users and developers operating on the testnet. These optimizations are designed to enhance network performance and scalability in preparation for a future mainnet deployment. It is crucial to note that these changes are currently confined to the testnet environment. The development provides a concrete, source-verified data point regarding ongoing protocol improvements. For market participants, this represents a confirmed technical advancement to consider, though it should be weighed alongside broader market factors and does not in itself guarantee specific price movements. The story highlights a focus on foundational development amid typical market volatility.

bitcoinist · 07/03 16:07
