Conversation with Mai-Lan from AWS: The Next Battlefield for S3 – How to Handle the Data Consumption Surge in the Agent Era

marsbit · Published 2026-05-08 · Last updated 2026-05-08

Introduction

The explosive rise of Agent AI, exemplified by OpenClaw in China, is putting unprecedented pressure on cloud data infrastructure. Unlike human engineers, Agents consume data in an "extremely active and aggressive" parallel fashion, launching tens to hundreds of queries simultaneously, leading to exponentially higher call frequencies and throughput. Mai-Lan Tomsen Bukovec, VP of Technology at AWS, emphasizes that cost-effectiveness in this data layer is now a decisive factor for customers building Agent systems. To address this, AWS is positioning its foundational Amazon S3 service, now 20 years old, as the critical data platform for the Agent era. Recent key innovations include: **S3 Table** with native Apache Iceberg support, enabling Agents to efficiently interact with structured data via familiar SQL; **S3 Vector**, which introduces vectors as a native type for building contextual data and serving as a shared "memory space" for AI systems; and the newly launched **S3 Files**, which provides a POSIX-compliant file system interface over S3, allowing Agents to interact with data through the familiar paradigm of files and directories. These enhancements are designed to meet the unique data interaction patterns of Agents, which are trained on models already proficient with SQL, file systems, and contextual vectors. By unifying these access methods on the scalable, durable, and cost-efficient S3 foundation, AWS aims to provide the data backbone capable of supporting the next w...

At the beginning of the year, the popularity of OpenClaw in the Chinese market showed everyone the enormous potential of Agents. But it also raised a question that every cloud vendor must answer: when Agents begin to multiply like cybernetic lobsters and call on data at high frequency, is the AI cloud infrastructure, especially the data layer, ready?

For example, when enterprise data teams deploy Agents into production environments, they often encounter bottlenecks at the data layer. Building Agents across different platforms such as vector databases, relational databases, graph databases, and data lakehouses requires synchronized data pipelines to maintain the timeliness of context information. But in real production environments, this context information gradually becomes outdated.

The urgency of this problem stems from the fundamentally different data consumption patterns of Agents compared to human engineers.

"Agents are consuming data in an extremely active and aggressive way. Their call frequency to data warehouses or data lakes is astonishing."

Mai-Lan Tomsen Bukovec, Vice President of Technology at Amazon Web Services, recently pointed out in a discussion with the author that Agents operate through a "parallel comparison and selection" mode of work. That is, instead of one query at a time, they run dozens or hundreds in parallel simultaneously, comparing results to find the optimal path. This makes Agents far more aggressive data consumers than humans—with call frequencies several orders of magnitude higher and data throughput growing exponentially.
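The "parallel comparison and selection" pattern Mai-Lan describes can be sketched in a few lines: fan out many candidate queries concurrently, then keep the best-scoring result. The `run_query` stub and its toy scoring rule below are illustrative stand-ins, not any AWS API.

```python
from concurrent.futures import ThreadPoolExecutor

def run_query(candidate: str) -> dict:
    # Stand-in for a real data-lake query; returns a result with a quality score.
    return {"plan": candidate, "score": len(candidate)}  # toy scoring rule

def best_of(candidates: list[str], workers: int = 32) -> dict:
    # An Agent issues dozens of queries at once and compares the results,
    # instead of iterating one query at a time as a human analyst would.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(run_query, candidates))
    return max(results, key=lambda r: r["score"])

print(best_of(["a", "bbb", "cc"])["plan"])  # prints "bbb"
```

The data layer sees every one of those parallel calls, which is why per-request cost and throughput under high concurrency become the deciding factors.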

Mai-Lan further pointed out, "Customers are now very eager to build Agent infrastructure. Cost, or rather cost-effectiveness, is no longer a secondary factor but has become a decisive one. In the next six months to a year, with the explosion of Agents, the choice of underlying data services will become crucial."

Now that the OpenClaw frenzy is subsiding, it leaves behind a stress-test warning for cloud vendors' underlying storage and compute. Mai-Lan believes that AWS holds a natural advantage here: the scale of Amazon S3 (Amazon Simple Storage Service), together with the cost efficiency of Amazon Redshift and Amazon Athena under high concurrency, is precisely suited to this ultra-large-scale, ultra-high-frequency mode of Agent data interaction.

Coinciding with the 20th anniversary of Amazon S3, and centered on customer demands for data processing in the AI era, Amazon S3 has recently undergone three major evolutions: S3 Table, S3 Files, and S3 Vector.

Take S3 Table's native support for Apache Iceberg, for example. Mai-Lan noted that when Agents process data, they tend to interact directly with data in Iceberg format via SQL. The underlying logic is that Agents are built on large language models (LLMs), and LLMs have developed mature processing capabilities for SQL syntax and Iceberg data formats during training. Storing all table data in Iceberg format on S3 allows Agents to efficiently handle data without needing to learn complex access APIs for multiple systems. Currently, Agents show a high degree of compatibility with S3 and Iceberg.
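Because LLMs already know SQL, the Agent's side of the interaction is just query text. The sketch below uses Python's built-in sqlite3 purely as a stand-in engine to show the shape of that exchange; the table name and columns are hypothetical, and in the article's setting the same SQL would instead be sent to an engine such as Athena reading Iceberg tables on S3.

```python
import sqlite3

# Stand-in engine with a hypothetical table; no Iceberg/S3 access here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("eu", 10.0), ("us", 25.0), ("eu", 5.0)])

# The Agent emits plain SQL -- no bespoke per-system access API to learn.
agent_sql = "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
rows = conn.execute(agent_sql).fetchall()
print(rows)  # [('eu', 15.0), ('us', 25.0)]
```

The point is the interface, not the engine: one open table format plus SQL means the model's training already covers the access path.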

When Iceberg capabilities were introduced to S3, it triggered a new wave of innovation. Data sources like Postgres and Oracle began writing directly to Iceberg, and Agent systems could interact directly with these tables. And with the launch of S3 Vectors, more and more AI applications are using vectors as a shared memory medium, thereby injecting "state" into AI interaction experiences.

Mai-Lan also pointed out that vectors have been introduced as a native data type in S3. The application of vectors mainly concentrates on two dimensions: one is using vectors to build contextual information for data stored in S3, and the other is using vectors as shared memory. In the five months since S3 Vectors was released, market feedback has met expectations. A large number of customers have started using this feature, generating vectors via embedding models to enrich the context of their data. The usage of S3 Vectors as the memory space for Agent systems has seen explosive growth.
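The "shared memory" use of vectors comes down to nearest-neighbor lookup over stored embeddings. A minimal, dependency-free sketch of that retrieval pattern follows; the 3-dimensional vectors and the `memory` store are toy stand-ins, not output of a real embedding model and not the S3 Vectors API itself.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical memory store: memory key -> embedding vector.
memory = {
    "user_prefers_python": [0.9, 0.1, 0.0],
    "last_topic_storage":  [0.1, 0.9, 0.2],
}

def recall(query_vec, store, top_k=1):
    # Rank stored memories by similarity to the query embedding.
    ranked = sorted(store, key=lambda k: cosine(query_vec, store[k]), reverse=True)
    return ranked[:top_k]

print(recall([0.8, 0.2, 0.0], memory))  # ['user_prefers_python']
```

A vector store used this way gives stateless model calls a persistent, queryable "memory space" shared across sessions and across Agents.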

It is worth mentioning that S3 Files, released a few weeks ago, enables Agents to process data in S3 via the POSIX standard—that is, through a file system interface. In Agent systems, LLMs are highly attuned to the "file" form: Python libraries and shell scripts alike are artifacts familiar from LLM training data, so Agents naturally prefer files as their data interface.

For this reason, the design concept of S3 Files is to mount an EFS file system on an S3 bucket. Through this mechanism, users can process S3 data in the file system based on POSIX standards: small files can be accessed faster via EFS caching, while large files are streamed directly from S3. This allows Agents to interact natively with S3 data using the familiar language of the file system and treat the shared file system as a "shared memory space" from S3.
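The split described above—small files served from cache, large files streamed—can be sketched as a size-based read path. Everything here (the threshold, the cache dict, and local files standing in for EFS and S3) is illustrative, not the actual S3 Files implementation.

```python
import os
import tempfile

SMALL_FILE_LIMIT = 1024  # hypothetical threshold, in bytes
cache = {}  # stands in for the EFS-style cache layer

def read(path, chunk_size=256):
    # Small files: serve whole from cache. Large files: stream in chunks.
    if os.path.getsize(path) <= SMALL_FILE_LIMIT:
        if path not in cache:
            with open(path, "rb") as f:
                cache[path] = f.read()
        yield cache[path]
    else:
        with open(path, "rb") as f:
            while chunk := f.read(chunk_size):
                yield chunk

# A 4 KiB file takes the streaming path and never enters the cache.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * 4096)
    big = f.name
print(sum(len(c) for c in read(big)))  # 4096
```

To the Agent, both paths look like ordinary file reads; the caching-versus-streaming decision stays inside the storage layer.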

From the perspective of the development of LLM memory capabilities, this progress is significant. Current AI experiences are gradually introducing deeper conversational context and personalized interactions—whether between Agents, between humans and Agents, or between Agents and data, model performance is continuously evolving. By further extending this natural interface of the file system, the memory capabilities of Agent systems are expected to achieve deeper enhancements.

The author notes that from its start in 2006, primarily handling unstructured data such as images, to later analytical data—from the initial data warehouse to the rise of the data lake—AWS is now vigorously promoting Amazon S3 as the key foundation for AI workloads to meet current customer demands. Mai-Lan believes that the design core of Amazon S3 is to support the growth of mainstream data types cost-effectively while always adhering to principles of data availability, durability, and resilience. That is precisely why customers have entrusted their data operations to S3 for the past 20 years, and why it can carry them for the next 20.

(Author | Yang Li, Editor | Yang Lin)

Related Questions

Q: What is the core difference in data consumption patterns between AI agents and human engineers as highlighted in the article?

A: The article emphasizes that AI agents consume data in an "extremely active and aggressive" manner. They operate on a "parallel comparison and selection" model, issuing dozens or even hundreds of parallel queries simultaneously to find the best path. This results in a data consumption frequency and throughput several orders of magnitude higher than that of human engineers.

Q: What are the three major innovations recently implemented for Amazon S3 to meet the demands of the AI era?

A: To address AI-era data processing needs, Amazon S3 has recently introduced three major innovations: S3 Table (with native support for the Apache Iceberg format), S3 Files (enabling POSIX file system access to S3 data), and S3 Vector (introducing vectors as a native data type for building context and shared memory).

Q: Why does the article suggest that S3's support for Apache Iceberg is particularly beneficial for AI agents?

A: The article states that AI agents, built on large language models (LLMs), are already proficient in handling SQL syntax and Iceberg data formats due to their training. By storing all table data in Iceberg format on S3, agents can interact with the data efficiently without needing to learn multiple complex access APIs. This creates a high degree of compatibility between agents and the S3/Iceberg ecosystem.

Q: How does the newly released S3 Files feature enable better interaction for AI agents with data in S3?

A: S3 Files allows agents to interact with S3 data via the POSIX file system standard. It works by mounting an EFS file system on an S3 bucket. This lets agents use familiar file system operations: small files are accelerated via the EFS cache, while large files are streamed directly from S3. This provides agents with a natural "file" interface, treating the shared file system as a "shared memory space" sourced from S3.

Q: According to Mai-Lan, what has become a decisive factor for customers looking to build Agent infrastructure, moving beyond just being a secondary consideration?

A: Mai-Lan points out that for customers eager to build Agent infrastructure, "cost, or rather cost-effectiveness, is no longer a secondary factor but has become a decisive one." She emphasizes that in the coming 6 to 12 months, as Agent adoption explodes, the choice of underlying data services will be crucial.
