DeepMind CEO and AlphaGo creator Demis Hassabis has been using games for AI research for over a decade.
This time, he has thrown AI into a "living universe" that has been running for 23 years: the space-themed massively multiplayer online game EVE Online, a game whose new player tutorial alone can deter players.
Chess games have an end, but EVE does not.
In early May, DeepMind officially announced a research collaboration with EVE Online for a simple reason: EVE's complex, player-driven universe is the perfect safe sandbox to test AI memory, continual learning, and long-term planning.
DeepMind's collaboration with EVE is not about pursuing fun gameplay or enhancing game mechanics. Instead, it aims to tackle the three toughest, most widely recognized challenges in current AI agent research. Hassabis is betting on finding answers in a 23-year-old game.
Fenris Creations (formerly CCP Games) announces partnership with DeepMind
On the same day, May 6th, the company behind EVE Online announced four things:
- Regained independence from its parent company Pearl Abyss;
- Renamed to Fenris Creations;
- Completed a $120 million transaction;
- As part of this independence, Google acquired a minority stake in Fenris Creations and simultaneously initiated a research partnership with Google DeepMind.
Fenris Creations CEO Hilmar Veigar Pétursson stated in the announcement:
This transition does not involve layoffs or restructuring. The team, products, and development plans remain unchanged. EVE continues.
Looking at operational figures, this company came to the table with "real ammunition" for collaboration, not to sell assets for survival.
EVE Online's revenue in 2025 exceeded $70 million, with November setting a historical revenue record, and Q4 becoming the second-highest revenue quarter in the game's 20-year history.
Fenris Creations' independence means EVE now has a parent company that can autonomously decide on research collaborations, no longer constrained by the strategic goals of a larger game publishing company.
A box of a board game product published by Fenris in 1997. The name "Fenris" predates EVE Online by 6 years. Renaming to Fenris Creations is a look back, not a fresh start.
Why did DeepMind choose EVE?
A 23-Year "Artificial Society"
An AI Benchmark Difficult to Replicate
When many people hear "games + AI research," their first thought is of AlphaGo or AlphaStar. EVE is different from both.
Go and StarCraft share a common characteristic: a match has a beginning, an end, and clear win/lose rules.
AlphaGo's goal was to win a Go game. AlphaStar's goal was to win a StarCraft match. Both represent a "single-game intelligence" research paradigm. But EVE has no endgame.
EVE Online is famous for its "single-shard / single shared universe," where a vast number of players compete, trade, form alliances, and wage war in a persistent world over the long term.
Players here have built real economic systems, political alliances, military coalitions, trade routes, historical grudges, and warfare plans that span years.
Some campaigns take an entire year from preparation to conclusion. The rise and fall of some alliances are studied by later players as real history.
Hilmar stated in the announcement: "EVE is one of the few places where we can explore questions of intelligence in an environment that already operates like the real world."
Hassabis further explained that he has played games since childhood, his career started with designing AI simulation games, and his work on AlphaGo, AlphaStar, and SIMA has been deeply tied to games. EVE is the choice for the next stage:
I'm thrilled to partner with Fenris Creations to safely explore new game experiences and advance AI research within this player-created, uniquely complex universe.
Most AI benchmarks are like medical checkups. EVE is more like throwing AI into an "artificial society" that has been running for 23 years.
The Three Toughest Challenges for Agents
Happen to be Daily Life for EVE Players
The official announcement explicitly lists three research directions: long-horizon planning, memory, and continual learning.
These three directions are widely acknowledged as the three toughest challenges in current AI agent research.
If you know someone who has played EVE Online for over ten years, ask them to open their account and show you their friend list. You'll likely see dozens of groups and hundreds of names, with notes in the remarks field like "Debt owed from the 2018 Delve campaign," "Traitor within Goonswarm, do not cooperate," "This guy is a spy, everyone in the corp knows."
This isn't a context window; it's cross-session long-term memory spanning at least a decade.
EVE players navigate the memory challenge every day. The continual learning challenge is the same.
In January 2014, the B-R5RB battle lasted about 21 hours, involving over 7,500 characters, the destruction of 75 Titans, with losses equivalent to roughly $300,000 in real-world currency. The trigger for the entire battle was a sovereignty bill that failed to auto-pay.
After this battle, the entire game's fleet tactics were rewritten. Alliance fleet compositions and tactical systems for years after revolved around post-battle analysis and iteration. Updates were made monthly, with every failure broken down into actionable strategic updates.
As for long-horizon planning, the standard time unit for EVE alliance warfare isn't hours; it's months. From preparation to execution, a cross-regional war involves shipbuilding, logistics, diplomacy, infiltration, and counter-espionage, with hundreds of players spontaneously collaborating without any task manager to advance a common goal over months.
This collaborative system evolved organically from the players over 23 years.
The three hardest challenges recognized in current AI agent evaluation happen to be the daily life of EVE players.
Twenty-three years of player-driven evolution in EVE have produced an environment that is always changing, always complex, with no shortcuts. This level of complexity cannot be synthetically created in a lab.
DeepMind's SIMA 2, released in November 2025, has evolved from "executing instructions" to "understanding goals, reasoning about processes, and learning while playing."
From a research question perspective, the EVE project shares the same "games as a training ground for agents" path as SIMA 2. The difference is that the venue has been swapped for a real universe that has been running for 23 years.
In-game battle scene from EVE Online. These large-scale, player-organized battles, often lasting for hours, are the core reason DeepMind chose EVE as a research environment for long-horizon planning and continual learning.
DeepMind is Entering an Offline Sandbox
Not the Live Player Universe
DeepMind's collaboration method with Fenris is more conservative than one might imagine. DeepMind does not have direct access to the live player servers.
DeepMind officially stated in the announcement: Initial research will be conducted on an offline version of EVE Online, using local servers in a controlled environment to test and evaluate models, without connecting to EVE Online's live operational servers.
On one hand, the offline version means DeepMind will not consume live player PvP data or disrupt the actual server economy, avoiding any privacy and compliance complexities.
On the other hand, the offline version of EVE can still retain the complex rule systems, ship and economic mechanics, star system structure, and other core design elements.
DeepMind is getting a "complex world pressure-tested by players for 23 years" as the examination hall where its agents must survive.
From Atari to EVE
Where This Path Leads
Looking back at DeepMind's choice of training grounds over the past decade, there's a clear evolutionary line.
2013 to 2015: Atari was the starting point. DQN put agents into games like *Breakout* and *Space Invaders* with clear levels and closed rules. It tested reaction and value estimation.
2016 to 2017: AlphaGo and AlphaZero. Go has neat rules, a huge but closed action space. It tested search and long-chain reasoning.
2019: AlphaStar entered *StarCraft II*. The first entry into a real-time, imperfect-information, multi-threaded博弈 environment. It tested decision-making under partial observability.
2024: SIMA aimed to be a generalist agent across multiple games. It tested transfer and generalization.
2025: SIMA 2 upgraded: not just executing instructions, but also conversing with users, reasoning about goals, and self-improving during gameplay.
DeepMind's SIMA 2, released in 2025, has evolved from "executing instructions" to "understanding goals, reasoning processes, and learning while playing."
Each generation of environment incorporates more aspects of the "real world" than the last: from closed rules to open rules, from perfect information to imperfect information, from single-game对抗 to cross-game migration.
However, these previous environments were still relatively closed, segmentable, and repeatable task fields. For example, Atari has fixed-rule arcade games; AlphaStar faced StarCraft matches that ended one by one; SIMA tested cross-game generalization in multiple 3D virtual environments.
The difference with EVE is that it is a persistent world that has been running long-term, driven by players, with continuously evolving economic and political structures.
It has been organically evolved over 23 years by real players in an open-ruled world: a complete player-driven economy (ISK price fluctuations comparable to real financial markets), political structures across alliances (diplomacy, espionage, ceasefires), and a whole warfare ecosystem from small skirmishes to 21-hour mega-battles.
The consensus within the field on agent evaluation is increasingly clear: running point task benchmarks hasn't produced anything new for a long time, but long-term memory, planning across weeks, and learning from failure still lack decent evaluation arenas.
Therefore, DeepMind's choice this time is: rather than creating another synthetic environment, step into an "artificial society" that has already been pressure-tested by human players for 23 years.
But a bigger question then emerges:
An AI agent that can persist, continually learn, and plan within EVE—what is still missing between it and an autonomous agent operating in the real world?
References:
https://x.com/GoogleDeepMind/status/2052011542707630461
https://www.ccpgames.com/news/2026/studio-behind-eve-online-goes-independent-rebrands-as-fenris-creations-enters-research-partnership-with-google-deepmind
https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/
This article is from the WeChat public account "新智元" (New Zhiyuan), author: ASI启示录 (ASI Revelation), editor: 元宇 (Yuanyu).













