Articles Related to Optimization

The HTX News Center offers the latest articles and in-depth analysis on "Optimization", covering market trends, project updates, technology developments, and regulatory policy in the crypto industry.

Major AI Collaboration Breakthrough! Stanford and NVIDIA Jointly Eliminate AI Communication Overhead, Boosting Reasoning Speed by 2.4x

RecursiveMAS, a new approach developed by researchers at UIUC, Stanford, NVIDIA, and MIT, tackles a major bottleneck in multi-agent AI systems: the "language tax." Today, AI agents collaborate by generating and reading natural-language text, a slow, costly, and lossy process akin to inefficient radio communication. RecursiveMAS bypasses this by letting agents communicate directly through their "thoughts" (latent-space vector representations) instead of text. Inspired by recursive language models, it treats each agent like a reusable layer in a recursive loop. A lightweight module called RecursiveLink passes these high-dimensional, semantically rich internal states between agents, and only the final agent decodes the last latent representation into human-readable text. This "telepathic" communication sharply cuts the overhead of encoding and decoding text at each step.

The core model weights remain frozen and only the small RecursiveLink modules are trained, which requires updating just 0.31% of total parameters and reduces training costs by over 50% compared with full fine-tuning.

Evaluations across math, science, coding, and QA benchmarks report:

- **Accuracy:** Average increase of 8.3%, with gains of up to 18.1% on complex math problems (AIME2025).
- **Speed:** End-to-end reasoning is 1.2x to 2.4x faster, with greater speedups as recursive depth increases.
- **Cost:** Token usage is reduced by 34.6% to 75.6%.

The research suggests a new scaling paradigm for multi-agent systems: deepening recursive collaboration rather than merely adding more agents. This could address key production barriers such as compute cost, latency, and memory limits.

Challenges remain, however, including the need for independent verification, compatibility between different AI models (heterogeneous agents), reduced interpretability of the "black-box" latent communication, and adaptation to complex real-world workflows involving tools and human interaction. If validated, RecursiveMAS could fundamentally change how AI agents work together, moving beyond inefficient "textual handoffs" to more seamless collaborative reasoning.
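To make the contrast between text handoffs and latent handoffs concrete, here is a toy sketch. It is not the RecursiveMAS implementation: `Codec`, `make_agent`, `link`, and both pipeline functions are hypothetical stand-ins, the "agents" are fixed arithmetic maps rather than language models, and the `link` function only marks where a RecursiveLink-style adapter would sit. The sketch just counts how many encode and decode steps each style of collaboration performs.

```python
from typing import Callable, List

Vector = List[float]
Agent = Callable[[Vector], Vector]


class Codec:
    """Toy text <-> vector codec that counts how often it is called."""

    def __init__(self) -> None:
        self.encodes = 0
        self.decodes = 0

    def encode(self, text: str) -> Vector:
        self.encodes += 1
        return [float(tok) for tok in text.split()]

    def decode(self, h: Vector) -> str:
        self.decodes += 1
        # Rounding to one decimal models information lost in a text handoff.
        return " ".join(f"{x:.1f}" for x in h)


def make_agent(scale: float) -> Agent:
    """Hypothetical frozen agent: a fixed map between latent vectors."""
    def agent(h: Vector) -> Vector:
        return [scale * x + 0.01 for x in h]
    return agent


def link(h: Vector, weight: float = 1.0) -> Vector:
    """Stand-in for a small trainable adapter between agents (assumption)."""
    return [weight * x for x in h]


def text_pipeline(prompt: str, agents: List[Agent], codec: Codec) -> str:
    """Every agent reads text and writes text: one encode/decode per agent."""
    text = prompt
    for agent in agents:
        text = codec.decode(agent(codec.encode(text)))
    return text


def latent_pipeline(prompt: str, agents: List[Agent], codec: Codec,
                    depth: int = 1) -> str:
    """Agents pass vectors; text is decoded once, after the last agent."""
    h = codec.encode(prompt)
    for _ in range(depth):          # recursive depth: reuse the same agents
        for agent in agents:
            h = link(agent(h))
    return codec.decode(h)


agents = [make_agent(0.5), make_agent(1.5), make_agent(2.0)]
text_codec, latent_codec = Codec(), Codec()
text_out = text_pipeline("1.0 2.0", agents, text_codec)
latent_out = latent_pipeline("1.0 2.0", agents, latent_codec)
print(text_codec.encodes, text_codec.decodes)      # 3 3
print(latent_codec.encodes, latent_codec.decodes)  # 1 1
```

In this toy setup the text pipeline's codec calls grow with the number of agents, while the latent pipeline's stay at one encode and one decode even when `depth` is raised.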

marsbit 2 days ago 00:10

Why the Establishment of SocialFi Originates from a Misunderstanding of Its Own Medium

This article critiques the failure of SocialFi projects by applying Marshall McLuhan's theory of "hot" and "cool" media. McLuhan posited that a medium's form, not its content, reshapes user behavior. "Hot" media (e.g., print, radio) deliver high-definition, complete information and promote passive consumption. "Cool" media (e.g., cartoons, telephone calls) provide low-definition, fragmented signals and require active participation from the user to complete the meaning.

Traditional social media platforms (like early Twitter) are quintessentially "cool." A tweet or like is an incomplete fragment whose significance emerges only through replies, shares, and community engagement; it is a participation engine disguised as a content system.

SocialFi (e.g., Friend.tech) aimed to monetize social capital by attaching real-time, tradable prices to follows and posts. This did not add an economic layer to a cool medium, however; it transformed the medium itself. The explicit, high-resolution signal of price replaced the ambiguous, low-resolution signal of social interaction, and the platform became a financial market dressed as a social network. Once the speculative profits faded, the underlying social fabric, which had been suffocated from the start, could not sustain it. The medium overheated and collapsed.

This "heat death" pattern is not unique to crypto. Over time, mainstream platforms often drift from cool to hot by adding features such as public metrics, verification badges, and algorithmic feeds that optimize for clarity over participation, leading to user disengagement.

The article proposes a viable alternative: the "condensation point." Here, capital is introduced locally and infrequently into a cool medium without saturating it. Examples include Substack (subscriptions), Patreon (memberships), and Bandcamp (music purchases). The core social medium remains cool and participatory, while capital condenses at specific, structurally separate points (e.g., a monthly fee). The key lesson: "Liquidity is heat." Adding it to a cool medium does not enhance the medium but alters its fundamental nature.

The NFT boom and bust provides a starker example. Collecting is a classic cool medium in which value is built slowly through stories and community. By making floor prices, rarity scores, and real-time charts omnipresent, NFT platforms rapidly overheated the medium, turning collectors into traders and destroying the participatory culture that gave collections meaning in the first place.

The conclusion is that for the next wave to succeed, designers must ask not how to price every social action, but how to let capital condense within a social system without disrupting the cool, participatory mechanics that create its enduring value.

marsbit 05/14 09:39

Auto Research Era: 47 Tasks Without Standard Answers Become the Must-Test Leaderboard for Agent Capabilities

The article introduces Frontier-Eng Bench, a new benchmark for AI agents developed by Einsia AI's Navers lab. Unlike traditional tests with clear answers, this benchmark presents 47 complex, real-world engineering tasks, such as optimizing underwater robot stability, battery fast-charging protocols, or quantum circuit noise control, where there is no single correct solution, only continuous optimization toward a limit.

The benchmark shifts AI evaluation from static knowledge retrieval to a dynamic "engineering closed loop": the AI must propose solutions, run simulations, interpret errors, adjust parameters, and re-run experiments to improve performance iteratively. This process tests an agent's ability to learn and evolve through long-term feedback, much like a human engineer weighing trade-offs between power, safety, and performance.

Key findings from the benchmark reveal two patterns:

1. Improvements follow a power-law decay, becoming harder and smaller as optimization progresses.
2. Exploring multiple solution paths (breadth) helps, but sustained depth along a single path is crucial for breakthrough innovations.

The research suggests this marks a step toward "Auto Research," where AI systems autonomously conduct continuous, tireless optimization in scientific and engineering domains. Humans would set high-level goals, while AI agents handle the iterative experimentation and refinement. This could fundamentally change research and development workflows.
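The propose, simulate, interpret, adjust cycle described above can be sketched as a small optimization loop. This is only an illustration under stated assumptions: `simulate` is a hypothetical toy objective standing in for a real engineering simulator, and the random-perturbation search is a generic hill climber, not the method any benchmarked agent actually uses.

```python
import random
from typing import List, Tuple

Params = Tuple[float, float]


def simulate(params: Params) -> float:
    """Hypothetical simulator: returns a score to maximize.

    Raises ValueError for parameters outside its valid range, standing in
    for the simulation errors an agent has to interpret.
    """
    x, y = params
    if abs(x) > 10 or abs(y) > 10:
        raise ValueError("parameters outside the simulator's valid range")
    return -((x - 3.0) ** 2) - ((y + 1.0) ** 2)


def optimize(start: Params, steps: int = 200, step_size: float = 0.5,
             seed: int = 0) -> Tuple[Params, float, List[float]]:
    """Closed loop: propose, simulate, keep the change only if it helps."""
    rng = random.Random(seed)
    best, best_score = start, simulate(start)
    history = [best_score]
    for _ in range(steps):
        # Propose: perturb the best known parameters.
        candidate = tuple(p + rng.uniform(-step_size, step_size) for p in best)
        try:
            score = simulate(candidate)          # run the experiment
        except ValueError:
            continue                             # interpret error: discard
        if score > best_score:                   # adjust: accept improvement
            best, best_score = candidate, score
        history.append(best_score)
    return best, best_score, history


best, best_score, history = optimize(start=(0.0, 0.0))
print(best_score >= history[0])  # True: the loop never accepts a worse result
```

The `history` list records the best score after each evaluated candidate, which is the kind of curve one would inspect to see gains shrinking as the optimum is approached.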

marsbit 05/13 07:06

How to Automate Any Workflow with Claude Skills (Complete Tutorial)

This is a comprehensive guide to mastering Claude Skills, a feature for creating permanent, reusable instruction sets that automate specific workflows. Unlike simple saved prompts, Skills function like trained employees, delivering consistent, high-quality outputs by defining the entire task process, standards, error handling, and output format. The guide is structured in four phases:

**Phase 1: Installation (5 minutes).** Skills are folders containing a `SKILL.md` file. The user is instructed to find a relevant Skill online, install it, test it on a real task, and compare its performance with one-off prompts.

**Phase 2: Building Your First Custom Skill.** Start by rigorously defining the Skill's purpose and trigger phrases, and by providing a concrete example of perfect output. The `SKILL.md` file has two parts: a YAML frontmatter with a specific name/description/triggers, and a detailed, step-by-step workflow written in natural language with examples and quality standards.

**Phase 3: Testing & Optimization for Production.** Test the Skill in three scenarios: 1) a standard, common task; 2) edge cases with missing or conflicting data; and 3) a pressure test with maximum complexity. Any failure points to a missing instruction. Implement a weekly optimization cycle to refine the Skill continuously based on real usage.

**Phase 4: Building a Complete Skill Library.** The goal is to create a team of Skills for all repetitive tasks. Examples are given for industries such as real estate, marketing, finance, consulting, and e-commerce. The user should list their tasks, prioritize them, and build one new Skill per week, maintaining a master document to track the library.

The conclusion emphasizes the compounding time savings: ten Skills that each save 30 minutes per week reclaim about 260 hours per year (10 × 0.5 hours × 52 weeks, or 6.5 forty-hour work weeks), fundamentally transforming one's work system.
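As a rough illustration of the folder layout described in Phases 1 and 2, the sketch below writes a Skill folder containing a `SKILL.md` with YAML frontmatter followed by a numbered workflow. The helper names (`render_skill`, `write_skill`), the example Skill, and its steps are all hypothetical. Only `name` and `description` are written to the frontmatter here, with trigger phrases folded into the description; whether a separate triggers field is supported is not confirmed by the summary, so treat the exact field set as an assumption and check the official documentation.

```python
import tempfile
from pathlib import Path
from typing import List


def render_skill(name: str, description: str, steps: List[str]) -> str:
    """Build SKILL.md text: YAML frontmatter, then a numbered workflow.

    Values are written verbatim, so avoid colons and newlines in them
    (this sketch does no YAML escaping).
    """
    lines = [
        "---",
        f"name: {name}",
        f"description: {description}",
        "---",
        "",
        "# Workflow",
        "",
    ]
    lines += [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    return "\n".join(lines) + "\n"


def write_skill(root: Path, name: str, description: str,
                steps: List[str]) -> Path:
    """Create <root>/<name>/SKILL.md and return its path."""
    folder = root / name
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / "SKILL.md"
    path.write_text(render_skill(name, description, steps), encoding="utf-8")
    return path


# Hypothetical example Skill, written to a temporary directory.
skill_path = write_skill(
    Path(tempfile.mkdtemp()),
    name="weekly-report",
    description=("Turn raw weekly metrics into a one-page status report. "
                 "Use when the user asks for a weekly report."),
    steps=[
        "Collect the metrics the user provides and flag any missing values.",
        "Draft the report using the house template.",
        "Check the draft against the quality standards before returning it.",
    ],
)
print(skill_path.read_text(encoding="utf-8"))
```

Keeping the rendering separate from the file write makes it easy to diff a revised Skill against the previous version during the weekly optimization cycle.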

marsbit 05/12 09:45

Your Claude Will Dream Tonight, Don't Disturb It

This article explores the recent phenomenon of AI companies increasingly using anthropomorphic language—like "thinking," "memory," "hallucination," and now "dreaming"—to describe machine learning processes. Focusing on Anthropic's newly announced "Dreaming" feature for its Claude Agent platform, the piece explains that this function is essentially an automated, offline batch processing of an agent's operational logs. It analyzes past task sessions to identify patterns, optimize future actions, and consolidate learnings into a persistent memory system, akin to a form of reinforcement learning and self-correction. The article draws parallels to similar features in other AI agent systems like Hermes Agent and OpenClaw, which also implement mechanisms for reviewing historical data, extracting reusable "skills," and strengthening long-term memory. It notes a key difference from human dreaming: these AI "dreams" still consume computational resources and user tokens. Further context is provided by discussing the technical challenges of managing AI "memory" or context, highlighting the computational expense of large context windows and innovations like Subquadratic's new model claiming drastically longer contexts. The core critique argues that this strategic use of human-centric vocabulary does more than market products; it subtly reshapes user perception. By framing algorithms with terms associated with consciousness, companies blur the line between tool and autonomous entity. This linguistic shift can influence user expectations, tolerance for errors, and even perceptions of responsibility when systems fail, potentially diverting scrutiny from the companies and engineers behind the technology. The article concludes by speculating that terms like "daydreaming" for predictive task simulation might be next, continuing this trend of embedding the idea of an "inner life" into computational processes.

marsbit 05/11 00:15

Turing Award Laureate Sutton's New Work: Using a Formula from 1967 to Solve a Major Flaw in Streaming Reinforcement Learning

New research titled "Intentional Updates for Streaming Reinforcement Learning" (arXiv:2604.19033v1), involving Turing Award laureate Richard Sutton, addresses a core challenge in deep reinforcement learning (RL): the "stream barrier." Current deep RL methods typically rely on replay buffers and batch training for stability, failing catastrophically when learning online from single data points (streaming). The authors propose a fundamental shift: instead of prescribing how far to move parameters (a fixed step size), their "Intentional Updates" method specifies the desired change in the function's output (e.g., a 5% reduction in value prediction error). It then calculates the step size needed to achieve that intent. This idea is inspired by the Normalized Least Mean Squares (NLMS) algorithm from 1967. Applied to value and policy learning, this yields algorithms like Intentional TD(λ) and Intentional AC. The method inherently stabilizes learning by adapting the step size based on the local gradient landscape, preventing overshooting/undershooting. In experiments on MuJoCo continuous control and Atari discrete tasks, Intentional AC achieved performance rivaling batch-based algorithms like SAC in a streaming setting (batch size=1, no replay buffer), while being ~140x more computationally efficient per update. The work demonstrates significant robustness, reducing reliance on numerous stabilization tricks. A remaining challenge is bias in policy updates due to action-dependent step sizes. Overall, this approach advances efficient, online, "learn-as-you-go" RL, enabling adaptive systems without massive data buffers or compute clusters.
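The core idea, choosing the step size from the intended change in the output rather than fixing it in advance, is easiest to see for a linear predictor, where it reduces to the NLMS rule. The sketch below covers only that linear case; it does not reproduce the paper's Intentional TD(λ) or Intentional AC algorithms. The function names and the `intent` and `eps` parameters are illustrative assumptions. For a prediction w·x with error e = target − w·x, the update w ← w + α·e·x changes the error to e·(1 − α·‖x‖²), so asking for a fractional reduction `intent` gives α = intent / ‖x‖².

```python
from typing import List


def predict(w: List[float], x: List[float]) -> float:
    """Linear prediction w . x."""
    return sum(wi * xi for wi, xi in zip(w, x))


def intentional_step(w: List[float], x: List[float], target: float,
                     intent: float = 0.05, eps: float = 1e-8) -> List[float]:
    """Update w so the error on (x, target) shrinks by the fraction `intent`.

    The step size is derived from the intent (NLMS-style normalization by
    the squared input norm) instead of being a fixed hyperparameter.
    """
    error = target - predict(w, x)
    norm_sq = sum(xi * xi for xi in x)
    step = intent / (norm_sq + eps)   # eps guards against a zero input
    return [wi + step * error * xi for wi, xi in zip(w, x)]


w = [0.0, 0.0]
for x in ([3.0, 4.0], [300.0, 400.0]):   # same direction, 100x larger scale
    before = 10.0 - predict(w, x)
    after = 10.0 - predict(intentional_step(w, x, target=10.0), x)
    print(round(after / before, 6))      # 0.95 for both inputs
```

A fixed step size would change the prediction 10,000 times more on the second input than on the first; normalizing by the squared norm keeps the realized change in the output at the intended 5% in both cases.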

marsbit 05/10 06:28
