# AI Research Related Articles

HTX News Center provides the latest articles and in-depth analysis on "AI Research", covering market trends, project updates, tech developments, and regulatory policies in the crypto industry.

Large Language Models Ace All Exams, Yet Move Farther from AGI: What Does This Paper Reveal?

The article examines the ongoing difficulty of defining and achieving Artificial General Intelligence (AGI). Industry leaders have set vague benchmarks for AGI, often tied to profit or timelines, while the concept itself has no consensus definition, a situation the article compares to a "Rorschach test." It highlights a 2025 paper by researcher Michael Timothy Bennett, who proposes a new, measurable definition. Bennett frames AGI not as matching human performance on tests, which current large language models (LLMs) have already mastered, but as an "artificial scientist." On this view, a true AGI should adapt broadly and efficiently to new environments and tasks under real-world constraints (such as compute and energy limits), with the focus on the *discovery of new knowledge* rather than the replication of existing data. The author contrasts this with the currently dominant approach of "scale-maxing": massively scaling up data, parameters, and compute. Powerful as it is, this approach yields models that fail on out-of-distribution problems and lack core abilities of intelligence: they learn passively, cannot reason causally, and cannot actively experiment or balance exploration against exploitation. The article argues that Bennett's framework offers a crucial shift. It turns AGI into a quantifiable engineering problem and proposes "adaptation benchmarks," a new kind of evaluation that tests an AI's ability to learn actively in novel scenarios. The conclusion is that achieving AGI will require a fundamental reset: a fusion of multiple methodologies beyond simple scaling, moving AI from mimicking patterns to embodying the scientific spirit of inquiry and discovery.

marsbit, 05/28 00:24

AlphaGo's Creator Puts AI into a 23-Year-Old Artificial Society: All Three Toughest Challenges for AI Agents Are Here

Demis Hassabis, CEO of Google DeepMind, has launched a new AI research effort in partnership with the long-running space MMO EVE Online. The collaboration, announced in early May, aims to use the game's 23-year-old, player-driven persistent universe as a testbed for three core challenges in AI agent research: long-horizon planning, memory, and continual learning. Unlike the games behind earlier DeepMind systems such as AlphaGo (Go) and AlphaStar (StarCraft II), EVE Online has no fixed end state. Its single-shard universe has fostered complex, emergent player societies with real economies, political alliances, and wars that can span months or years. These conditions naturally demand the skills that are hardest for current AI agents to master: long-term strategic planning, maintaining memories over extended periods, and adapting to constant change. The research will initially use an offline version of EVE, which provides a controlled yet complex sandbox and does not interfere with the live player server. The move continues DeepMind's trajectory of using increasingly complex and open-ended virtual worlds for AI training, from Atari games and Go to StarCraft II and the SIMA project. The EVE environment is a significant step toward testing AI in a persistent, socially complex, and continuously evolving world shaped by human behavior over decades.

marsbit, 05/25 00:08

a16z: Three Major AI Trends for 2026

a16z: Three AI Trends for 2026

1. AI Takes on Substantial Research Tasks: AI models are evolving to handle complex, abstract instructions and assist in research, particularly in reasoning. They are beginning to solve difficult problems and foster a new "generalist" research style that focuses on connecting ideas and making inferences from hypothetical answers. This requires new "nested agent" workflows where models collaborate and refine each other's outputs, though better model interoperability and compensation methods (potentially via blockchain) are needed.
2. The Shift from KYC to KYA (Know Your Agent): The bottleneck in the agent economy is shifting from intelligence to identity verification. With non-human identities vastly outnumbering humans in finance, there's a critical need for a "Know Your Agent" infrastructure. Agents require cryptographically signed credentials to transact, linking them to their principals, constraints, and liabilities.
3. Solving the Open Web's "Invisible Tax": AI agents are disrupting the economic foundation of the open web by extracting data from ad-supported sites (the context layer) while bypassing their revenue models. This creates an "invisible tax" that threatens content creation. Solutions are needed to automatically reward content creators, moving from static licensing to real-time compensation systems using technologies like nanopayments and attribution standards, potentially blockchain-enabled.

marsbit, 01/12 08:09
