Articles Related to Architecture

The HTX News Center offers the latest articles and in-depth analysis on "Architecture", covering market trends, project updates, technological developments, and regulatory policies in the crypto industry.

Why Did Zhipu Surge Nearly 30% in a Single Day?

"Global AI Model Unicorn" Zhipu's stock surged nearly 30% in a single day, reaching a new market cap high. The catalyst was the launch of its GLM-5.1-highspeed API, boasting a generation speed of **400 tokens per second**, setting a new global benchmark. This speed, roughly 3-5 times faster than industry leaders like OpenAI's GPT-4o and Anthropic's Claude, is achieved **without compromising the full-scale model's capabilities**. In the era of AI Agents requiring dozens of self-calls, such latency reduction is critical, transforming speed from a system metric into a determinant of intelligence limits. The breakthrough stems from a three-layer technical overhaul:

1. **TileRT Inference Engine**: Compiles the entire model into a continuous, always-on computation pipeline using "Warp Specialization," minimizing GPU idle time by having different processor groups handle data loading, computation, and communication in parallel.
2. **Heterogeneous Parallelism for MLA**: To efficiently run the GLM-5.1 model using the MLA attention mechanism, TileRT employs a heterogeneous strategy. One GPU handles sparse indexing/routing, while the others perform dense computation, optimizing for MLA's unique workflow.
3. **ZCube Network Architecture**: Replaces the standard Spine-Leaf (ROFT) network topology with a flat, dual-group interconnect. This design creates a single optimal path between any two GPUs, eliminating network congestion at scale and reducing latency.

The business impact is significant: a 15% increase in cluster throughput (free extra capacity), a 40.6% reduction in tail latency (improved stability), and a one-third cut in networking hardware costs. Long-term, this innovation challenges the dominance of NVIDIA's integrated hardware-software stack (GPU+NVLink+InfiniBand), potentially benefiting manufacturers of high-density Leaf switches and optical modules while lowering the software barrier for domestic AI chips like Huawei's Ascend. The innovation proves that more can be achieved with the same compute, reshaping the infrastructure beyond just GPUs.
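To make the latency argument concrete, here is a back-of-the-envelope sketch. It assumes a hypothetical agent task made of 30 sequential model calls producing 500 output tokens each (both figures are illustrative assumptions, not from the article) and compares 400 tokens per second against generation speeds 3 and 5 times slower. It counts decode time only and ignores network and prompt-processing overhead.

```python
def generation_seconds(calls: int, tokens_per_call: int, tokens_per_second: float) -> float:
    """Total decode time for sequential calls, ignoring network and prefill time."""
    return calls * tokens_per_call / tokens_per_second


# Illustrative assumptions, not figures from the article.
CALLS = 30
TOKENS_PER_CALL = 500

# Generation speed quoted in the summary above.
FAST_TOKENS_PER_SECOND = 400.0

for label, speed in [
    ("400 tok/s", FAST_TOKENS_PER_SECOND),
    ("3x slower", FAST_TOKENS_PER_SECOND / 3),
    ("5x slower", FAST_TOKENS_PER_SECOND / 5),
]:
    seconds = generation_seconds(CALLS, TOKENS_PER_CALL, speed)
    print(f"{label}: {seconds:.1f} s")
```

Under these assumptions the same task takes roughly half a minute at 400 tokens per second and several minutes at the slower speeds, which is why per-call speed compounds across a long agent chain.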

marsbit, Yesterday 01:23

A Decade's Bet on Cerebras: How the 'Wafer-Scale AI Chip' Reached NASDAQ

Cerebras, a pioneering AI chip company, successfully debuted on NASDAQ (CBRS) on May 14, 2026, with its stock price surging approximately 68% on the first day. This marks a significant milestone following a decade-long journey, as recounted by early investor Steve Vassallo. The story begins not in 2016, but with the deep, 19-year relationship between Vassallo and founder Andrew Feldman, which started with Feldman's previous company, SeaMicro (acquired by AMD in 2012). In 2016, Feldman and a core team of chip and system experts sought to challenge the emerging consensus. At a time when AI's practical utility was still debated and GPUs were becoming the default hardware, they envisioned a fundamentally new computer architecture purpose-built for AI workloads. They identified memory bandwidth, not raw compute power, as the critical bottleneck for neural networks. Defying industry inertia, Cerebras pursued a radical, wafer-scale chip design, 58 times larger than the biggest existing chips. This meant confronting and solving a cascade of unprecedented engineering challenges: power delivery, thermal management, and maintaining electrical continuity across tens of thousands of connections. It required reinventing nearly every aspect of modern computing: semiconductors, systems, data structures, software, and algorithms. The path was fraught with setbacks, including a prototype that caught fire on its first power-up. Progress was marked by intense, iterative problem-solving, with the board meeting every 6-8 weeks to tackle the latest technical frontier. Through disciplined perseverance and deep trust within the team, they achieved a breakthrough in August 2019 when their first wafer-scale computer successfully operated. Feldman's drive for a 1000x leap, his formative upbringing among intellectual giants who modeled both brilliance and kindness, and his belief in building a loyal, mission-driven team were central to Cerebras's culture. His competitive strategy was that of David vs. Goliath: finding innovative, human-centric approaches that larger incumbents would overlook. From the symbolic delivery of the first term sheet over a backyard fence in 2016 to the NASDAQ bell ringing in 2026, Cerebras's journey is a testament to long-term vision, technical audacity, and the power of foundational founder-investor relationships. It stands as a reminder that the computing revolution can come not just from more GPUs, but from a complete reimagining of the architecture itself.

marsbit, 05/15 03:55

Ant Digital Tech Proposes New Architecture for Agent Economy, Covering Four Layers: Identity, Payment, Risk Control, and Compliance

Ant Digital Technologies (Ant Digital) has introduced a new architectural framework for the agentic economy, named the "4R Full-Stack Architecture," at the Hong Kong Web3 Festival. The framework is designed to address four core challenges in AI agent operations: identity, payment, risk control, and compliance. The four layers include:

- **Agentic Runtime**, featuring DTClaw with the CARLI security model to enforce behavioral constraints and ensure controllability and auditability;
- **Payment Rails**, which provide on-chain payment channels supporting smart decision-making, verifiable credentials, instant settlement, and cross-chain asset transfers;
- **Agent Registry**, leveraging DIDs and the ERC-8004 standard to assign verifiable on-chain identities to agents;
- **Root Infrastructure**, built on Jovay Layer2 and ZKVM technology to enable high-speed micro-payments and trusted off-chain computation with on-chain verification.

According to CTO Yan Ying, the architecture aims to resolve fundamental gaps in the current agent economy (such as execution vulnerabilities, identity issues, payment barriers, and trust deficits) by redesigning underlying infrastructure rather than applying superficial fixes. The initiative builds on Ant Digital's extensive experience in financial-grade security, privacy computing, and blockchain.

marsbit, 04/20 09:24

a16z Founder: In the Agent Era, What Truly Matters Has Changed

Marc Andreessen, co-founder of a16z, argues that the current AI boom is not an overnight success but the culmination of 80 years of research, now delivering practical results. He emphasizes that this era is defined by the convergence of four key capabilities: large language models (LLMs), reasoning, coding, and agents capable of recursive self-improvement. Andreessen describes the agent architecture—combining an LLM with a shell, file system, markdown, and cron/loop—as a fundamental shift beyond chatbots. This structure leverages existing software components, allowing agents to maintain state, introspect, and extend their own functionality. He predicts a move away from traditional GUI and browser-based interactions toward an "agent-first" world where software is primarily operated by bots, not humans, with people simply stating their goals. He draws parallels to the 2000 internet bubble but notes key differences: current AI infrastructure investments are led by cash-rich giants and quickly monetized. He highlights that scaling constraints involve not just GPUs but the entire chip ecosystem. Open source and edge inference are crucial for democratizing knowledge and enabling low-latency, cost-effective applications on local hardware. Finally, Andreessen identifies significant non-technical challenges: potential short-term cybersecurity crises, the need for "proof of human" identity solutions, financial infrastructure for agents, and institutional resistance from sectors like education and healthcare. He cautions that societal adoption will be slower than technological change.

marsbit, 04/20 00:02

Thin Harness, Fat Skills: The True Source of 100x AI Productivity

The article "Thin Harness, Fat Skills: The True Source of 100x AI Productivity" argues that the key to massive productivity gains in AI is not more advanced models, but a superior system architecture. This framework, "fat skills + thin harness," decouples intelligence from execution. Core components are defined:

1. **Skill Files:** Reusable markdown documents that teach a model *how* to perform a process, acting like parameterized function calls.
2. **Harness:** A thin runtime layer that manages the model's execution loop, context, and security, staying minimal and fast.
3. **Resolver:** A context router that loads the correct documentation or skill at the right time, preventing context window pollution.
4. **Latent vs. Deterministic:** A strict separation between tasks requiring AI judgment (latent space) and those needing predictable, repeatable results (deterministic).
5. **Diarization:** The critical process where the model reads all materials on a topic and synthesizes a structured, one-page summary, capturing nuanced intelligence.

The architecture prioritizes pushing intelligence into reusable skills and execution into deterministic tools, with a thin harness in between. This allows the system to learn and improve over time, as demonstrated by a YC system that matches startup founders. Skills like `/enrich-founder` and `/match` perform complex analysis and matching that pure embedding searches cannot. A learning loop allows skills to rewrite themselves based on feedback, creating a compound improvement effect without code changes. The conclusion is that 10x to 1000x efficiency gains come from this disciplined system design, not just smarter models. Skills represent permanent upgrades that automatically improve with each new model release.
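The relationship between skill files, resolver, and harness might be sketched as follows. This is a minimal illustration, not the article's implementation: the class names `Skill` and `Resolver`, the function `run`, and the skill bodies are all hypothetical; only the command names `/enrich-founder` and `/match` come from the summary. The model is passed in as a plain callable so that the sketch runs without any AI service.

```python
from dataclasses import dataclass
from string import Template
from typing import Callable, Dict


@dataclass(frozen=True)
class Skill:
    """A reusable markdown document with $placeholders (hypothetical shape)."""
    name: str
    body: Template

    def render(self, **params: str) -> str:
        # Filling the placeholders is what makes a skill act like
        # a parameterized function call.
        return self.body.substitute(**params)


class Resolver:
    """Context router: returns only the skill a command needs."""

    def __init__(self, skills: Dict[str, Skill]) -> None:
        self._skills = skills

    def resolve(self, command: str) -> Skill:
        return self._skills[command]


def run(command: str, params: Dict[str, str], resolver: Resolver,
        model: Callable[[str], str]) -> str:
    """Thin harness: resolve one skill, render it, hand it to the model."""
    prompt = resolver.resolve(command).render(**params)
    return model(prompt)


# Skill bodies are invented for illustration.
SKILLS = {
    "/enrich-founder": Skill(
        "/enrich-founder",
        Template("# Enrich founder\nRead every source about $founder "
                 "and write a one-page profile."),
    ),
    "/match": Skill(
        "/match",
        Template("# Match\nCompare the profiles of $founder_a and "
                 "$founder_b and explain the fit."),
    ),
}

if __name__ == "__main__":
    # Stand-in model that just reports the prompt size.
    reply = run("/enrich-founder", {"founder": "Ada"}, Resolver(SKILLS),
                lambda prompt: f"[model received {len(prompt)} characters]")
    print(reply)
```

The harness here is three lines long by design: the process knowledge lives in the skill text, and the resolver keeps unrelated skills out of the prompt.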

marsbit, 04/13 04:19

Chaos Labs Exits, Who Will Take Over Aave's Risk?

Chaos Labs, the core risk management provider for Aave V2 and V3 markets, has announced its decision to terminate its partnership with Aave. Despite Aave Labs increasing the budget to $5 million to retain them, Chaos Labs chose to leave due to fundamental disagreements on how risk should be managed. Key reasons for the departure include: the loss of core Aave contributors increasing operational risk, the expanded scope and complexity introduced by Aave V4 (which requires rebuilding risk infrastructure from scratch), and the fact that Chaos Labs operated at a financial loss even with increased budgets. They estimate that proper risk management for both V3 and V4 should cost at least $8 million annually (≈5.6% of protocol revenue), closer to traditional banking standards, rather than the previous 2%. Chaos Labs emphasized that Aave's reputation and institutional adoption rely heavily on its risk management track record. They also highlighted unquantified costs like legal liability and operational security risks. The exit occurs as Aave plans its V4 upgrade and expands into institutional markets. Chaos Labs warns that migrating to V4 while maintaining V3 will double, not halve, the workload, and that accumulated operational experience cannot be easily transferred. The decision reflects a principled stance: Chaos Labs only attaches its name to work that meets its high standards for risk management, even at significant financial sacrifice.
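The budget figures in the summary can be cross-checked with simple arithmetic. The sketch below derives the protocol revenue implied by "$8 million ≈ 5.6%" and then applies the other percentages to that same base. The derived values are inferences from the summary's own numbers, and treating the "previous 2%" as a share of the same revenue figure is an assumption.

```python
# Figures taken from the summary above.
PROPOSED_BUDGET_USD = 8_000_000   # Chaos Labs' estimate for V3 + V4
PROPOSED_SHARE = 0.056            # stated as roughly 5.6% of protocol revenue
OFFERED_BUDGET_USD = 5_000_000    # increased budget offered by Aave Labs
PREVIOUS_SHARE = 0.02             # "the previous 2%"

# Derived values (inferred, not stated in the source).
implied_revenue = PROPOSED_BUDGET_USD / PROPOSED_SHARE
budget_at_previous_share = implied_revenue * PREVIOUS_SHARE
offered_share = OFFERED_BUDGET_USD / implied_revenue

print(f"Implied annual protocol revenue: ${implied_revenue / 1e6:.1f}M")
print(f"Budget at 2% of that revenue:    ${budget_at_previous_share / 1e6:.2f}M")
print(f"$5M offer as share of revenue:   {offered_share * 100:.1f}%")
```

On these assumptions the $5 million offer sits between the earlier 2% level and the 5.6% level Chaos Labs considers adequate.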

marsbit, 04/07 03:36

The Stock Tokenization Revolution: A Panoramic Report on Market Dynamics, Product Architecture, and Regulatory Moats

Tokenized stocks are emerging as a breakthrough sector in the real-world asset (RWA) market, with a total value exceeding $800 million (a 30x increase since the start of the year) and monthly trading volume reaching $1.8 billion. The core value proposition is enabling global, 24/7 access to U.S. equities with near-instant settlement, bypassing geographic restrictions and delays inherent in traditional finance. Three primary architectures are competing for dominance:

1. Instant execution (e.g., Ondo, CyberAlpha): maximizes capital efficiency.
2. Inventory model (e.g., xStocks, Backed): uses Swiss debt structures for superior DeFi composability.
3. Direct ownership (e.g., Securitize): offers full legal rights but limited on-chain flexibility.

The market is dominated by two players: Ondo (53% share) leverages liquidity engineering, while Backed/xStocks (23%) uses regulatory arbitrage via Swiss law. Regulatory licensing, not technology, is the key moat, with complex cross-jurisdictional compliance (U.S., EU, offshore) forming the highest barrier to entry. The sector faces a trilemma between liquidity/speed, regulatory safety, and DeFi composability, and is diverging into two paths: incremental integration with traditional systems (e.g., DTCC) and revolutionary on-chain issuance for full disintermediation. The convergence of the $150 trillion global equity market with blockchain infrastructure is already underway.

marsbit, 03/10 13:24
