Uncovering the Truth About Agent Commerce, Payments, and Infrastructure

marsbit | Published 2026-06-07 | Last updated 2026-06-07

Summary

Over the past year, I've been building infrastructure for the Agent economy, engaging with major players like Stripe, Visa, Coinbase, Google, and dozens of startups. A clear conclusion emerges: true, large-scale demand does not yet exist, and startups face structural challenges. Data points illustrate the gap. Stripe's Agent commerce platform has over 1,000 merchants but only a single-digit number of transacting Agents. Visa's Agent payment tokens require 3 to 9 months of KYC and a $250M revenue threshold, so they are accessible only to giants like Amazon. On-chain analysis puts actual daily Agent transaction volume at around $17k, about half of which is test transactions. The article analyzes four potential markets: **1. Agent-to-Merchant:** Current AI shopping UX is often inferior to traditional e-commerce for visual, comparison-heavy purchases (clothing, electronics). Chat interfaces are a step back. Real merchant interest is defensive "Agent Engine Optimization," driven by fear of future obsolescence rather than current demand. Potential exists in high-frequency, low-decision purchases (e.g., food delivery) or in simplifying terrible UX (complex checkouts, non-native shoppers), but these require massive consumer distribution channels dominated by giants like DoorDash and Amazon. **2. Agent-to-API:** Developers already have subscriptions and billing for core APIs (compute, data). The argument for micro-payments via crypto for sub-dollar API calls ...

Author: jessy

Compiled by: Jiahuan, ChainCatcher

Over the past year, I've been dedicated to building infrastructure for the Agent economy, engaging with teams from Stripe, Visa, Coinbase, Google, and dozens of startups pushing Agent commerce forward. I've mapped out the entire industry, launched products, and sought market fit.

There is no real demand yet, and startups face numerous structural issues when venturing into this space.

Last month, Stripe announced 288 new products at its Sessions conference. Nearly 40% of the total documentation views were for its Agent commerce docs. Their Agent commerce market boasts over 1,000 activated merchants. Yet, during the Sessions conference, the number of registered Agents conducting transactions was in the single digits.

Visa mentioned that their Agent payment tokens (tokenized payment credentials bound to an Agent for making payments on a user's behalf) currently require 3 to 9 months for KYC approval and essentially need a minimum revenue threshold of $250 million to qualify. Today, only companies at the scale of Amazon or Walmart can complete this identity verification loop.

Coinbase reported that as of April, there were 69,000 active Agents and 165 million transactions on the x402 protocol. However, independent on-chain analysis shows actual daily transaction volume is around $17,000, with about half being test transactions (according to CoinDesk, March 2026).

Agent to Merchant

We built shop.fast.xyz to directly validate the real-world application of concierge-style commerce. It includes real products, merchants, and transactions.

For most product categories, the current user experience of AI shopping is entirely inferior to traditional e-commerce. When buying clothes, electronics, or furniture, you want to see pictures, browse options, and compare them side-by-side.

The conversational format of a chatbot is actually a step backward. You're replacing a rich visual interface with a plain text conversation, and humans are fundamentally visual shoppers.

Agents excel in areas we expected to be difficult. They understand user needs and handle instructions like "similar to this but cheaper" well. The model layer works.

But a conversation cannot replace the experience of lining up ten products side by side and choosing one. Chat interfaces can be enhanced with carousels and interactive displays, but at that point you're essentially rebuilding an e-commerce frontend inside a chat window. For visually driven comparison shopping, we haven't found a compelling reason why a chat interface is better than a native e-commerce interface.

We see real demand from merchants, but it's a defensive demand.

Merchants want their stores to be queryable by Agents. Not because current customers are buying through Agents, but because they fear being left behind if this becomes a mainstream channel.

This is an "Agent Engine Optimization (AEO)" strategy, but it's currently a nice-to-have, not a necessity. Merchants are preparing for a wave that hasn't arrived.

Conversational commerce can indeed improve the experience in certain scenarios: high-frequency, low-decision-cost purchases where the user already knows exactly what they want. Ordering food delivery is the clearest example. It's a huge market, extremely frequent, with quick decisions ("order me Pad Thai from the usual place"). Conversational Agents have a chance here.

But major food delivery platforms haven't opened their APIs. The only way is "computer use": letting the AI navigate the app visually like a human. This is slow, fragile, and the inference cost simply doesn't work for a $15 lunch order.

Another potential breakthrough lies in stores whose UI navigation is extremely complex and painful: layers of discounts, promo codes, loyalty programs, and confusing checkout processes.

An Agent that understands "use my coupons, deduct my reward points, find the cheapest shipping, operate in my native language" can simplify today's dreadful user experiences. This is particularly important for elderly users, non-native speakers shopping on foreign websites, or highly specific scenarios with very niche needs.

Both of these breakthrough points require massive consumer (B2C) distribution channels. You're competing with DoorDash (the largest US delivery platform with 56% market share) and Amazon for user entry points.

Consumer-scale distribution is a strength of giants. The supply side for concierge-style commerce is ready, while the demand side is constrained by user experience and distribution channels. Building more infrastructure doesn't solve these two problems.

Agent to API

We discussed actual payment needs with dozens of developers, and the picture is strikingly consistent: Agent-to-API usage today is recurring and covers compute, inference, and data sources. Developers already have subscriptions, stored API keys, and billing relationships with their core providers.

The typical stablecoin argument is: on Stripe, the minimum effective cost for credit card processing is about 2.9% plus 30 cents, making sub-one-dollar API calls uneconomical. But with today's low transaction volume, prepaid credits solve this. Developers top up their accounts in advance, problem solved.
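The arithmetic behind this argument is easy to check. The sketch below uses the 2.9% plus 30 cents figure quoted above; the $0.50 call price and the $20.00 top-up are hypothetical numbers chosen only for illustration.

```python
from decimal import Decimal

# Card-processing fee quoted in the text: about 2.9% plus 30 cents per charge.
PERCENT_FEE = Decimal("0.029")
FIXED_FEE = Decimal("0.30")


def card_fee(charge: Decimal) -> Decimal:
    """Fee on a single card charge, in dollars."""
    return charge * PERCENT_FEE + FIXED_FEE


def fee_share(charge: Decimal) -> Decimal:
    """Fee as a fraction of the charged amount."""
    return card_fee(charge) / charge


# Hypothetical numbers: a $0.50 API call billed on its own,
# versus a $20.00 prepaid top-up that covers 40 such calls.
per_call = fee_share(Decimal("0.50"))
top_up = fee_share(Decimal("20.00"))

print(f"per-call charge: {per_call:.1%} lost to fees")
print(f"prepaid top-up:  {top_up:.1%} lost to fees")
```

Under these assumptions, the fixed 30 cents dominates a sub-dollar charge, while a single top-up spreads it across many calls. That is why prepaid credits are enough at today's volumes.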

The deeper issue is the supplier market. Most mainstream SaaS companies don't want to offer ephemeral API access costing fractions of a cent. Their business model is multi-year enterprise contracts. Companies whose revenue relies on large commitment contracts will resist pricing mechanisms that circumvent their existing models.

Machine commerce is structurally a long-tail market, comprising smaller services, niche data sources, individual developers, and MCP servers. Protocols like MPP and x402 are well-suited for this niche.

But by definition, this is a market serving advanced users with specific needs, and historically, developers have been one of the groups least willing to pay.

When Stripe Projects launched, it came with 32 supplier partners, including Vercel, Supabase, Cloudflare, and Twilio. Together they cover most of the tools developers use to build and deploy software, all accessible through existing billing systems. The top-of-stack demand for developers is already met.

The opportunity for new payment rails lies in everything beyond these roughly 30 top services. It is real, but its scale is inherently much smaller than the flashy numbers suggest.

The same pattern applies to content acquisition. Agents are already scraping and summarizing articles, and publishers are pushing back.

But when content monetization arrives at scale, it will happen through the CDN providers already sitting between publishers and the internet (Cloudflare has already launched AI audit tools for this), or through large-scale licensing agreements between publishers and AI labs.

This infrastructure opportunity will ultimately flow to giants that already own distribution channels.

Agent to Agent

The Agent-to-Agent business model is a long-term vision, currently almost entirely theoretical, with no one achieving meaningful transaction volume. Startups are tackling core challenges: Agent discovery, trust establishment, terms negotiation, and dispute resolution.

When this transaction structure truly materializes, it will be fundamentally different from existing payment rails. Neither party in the transaction involves a human identity. Latency is sub-second. Funds ranging from fractions of a cent to millions of dollars move in the same flow.

Add to that multi-party settlements, which completely defy the bilateral buyer-seller model assumed by existing payment rails. When this happens, we believe it will come fast and at scale.
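To make the contrast with bilateral rails concrete, here is a minimal sketch of multi-party netting: several transfers inside one job collapse into a single net balance per participant. Everything in it is an assumption made for illustration (the party names, the micro-dollar unit, the amounts); it does not describe any existing protocol or our own system.

```python
from collections import defaultdict

MICROS_PER_DOLLAR = 1_000_000  # assumed unit: one millionth of a dollar


def net_positions(transfers):
    """Collapse (payer, payee, amount_micros) transfers into one net balance per party."""
    net = defaultdict(int)
    for payer, payee, amount in transfers:
        if amount <= 0:
            raise ValueError("transfer amounts must be positive")
        net[payer] -= amount
        net[payee] += amount
    return dict(net)


# Hypothetical job: a buyer agent pays an orchestrator, which pays two sub-agents.
transfers = [
    ("buyer", "orchestrator", 2_500_000 * MICROS_PER_DOLLAR),    # $2.5M
    ("orchestrator", "data_agent", 4_000),                       # $0.004
    ("orchestrator", "compute_agent", 150 * MICROS_PER_DOLLAR),  # $150
]

positions = net_positions(transfers)
assert sum(positions.values()) == 0  # money is only moved, never created
print(positions)
```

Integer micro-units let a $0.004 payment and a $2.5M payment share one ledger without rounding, and the net positions always sum to zero regardless of how many parties take part.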

This is a long-term bet on dedicated settlement infrastructure, and it's real. But a "real long-term bet" and "current market" are two different things.

For months, we were among those evangelizing this market, and over the past few years we built a complete infrastructure around it. In theory, our distributed network can scale to over 1 billion TPS, with latency under 50 milliseconds and average consensus of 10 milliseconds. But we must meet the market where it actually is today.

Agent to Finance

This is arguably the only category with existing demand. The customer base already exists and is willing to pay. Today, fund managers, finance teams, and DeFi users are paying for financial tools. Embedding AI into existing workflows is a natural product evolution.

Agent finance also creates entirely new behaviors. An Agent that can autonomously monitor and rebalance hundreds of positions in real-time operates in a way humans cannot replicate manually. This isn't just automation; it's a substantive capability enhancement.
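As a rough illustration of what such monitoring involves, the sketch below implements simple threshold-based rebalancing: compare each position's current weight with its target and emit a trade only when the drift exceeds a tolerance. The function name, asset labels, prices, and the 2% threshold are all hypothetical, and a production agent would also need to account for fees, slippage, and risk limits.

```python
def rebalance_orders(holdings, prices, targets, drift_threshold=0.02):
    """Return {asset: value_to_trade}; positive means buy, negative means sell.

    holdings: units held per asset; prices: price per unit;
    targets: target portfolio weights that sum to 1.
    """
    values = {a: holdings.get(a, 0.0) * prices[a] for a in targets}
    total = sum(values.values())
    if total <= 0:
        return {}
    orders = {}
    for asset, target_weight in targets.items():
        drift = values[asset] / total - target_weight
        if abs(drift) > drift_threshold:
            orders[asset] = -drift * total  # trade back toward the target weight
    return orders


# Hypothetical three-asset portfolio worth $110,000 in total.
holdings = {"A": 30, "B": 1, "C": 20_000}
prices = {"A": 1_000.0, "B": 60_000.0, "C": 1.0}
targets = {"A": 0.30, "B": 0.50, "C": 0.20}

print(rebalance_orders(holdings, prices, targets))
```

In this example, A is underweight and B is overweight by more than the threshold, so both get orders. C is within tolerance and is left alone. The same loop runs just as easily over hundreds of positions, which is the point of the paragraph above.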

The challenge is the competitive landscape. The financial industry is heavily regulated and depends on established business relationships. Incumbents have licenses, compliance infrastructure, and client relationships. Startups can find a niche in less-regulated areas (like DeFi), in areas where giants move slowly, or where AI creates capabilities giants don't possess.

But compared to the other three categories, the competitive dynamics here are more favorable to incumbents, as layering AI on top of existing products and customer bases is far easier than the reverse.

The Real Game

So, why are people still building this stuff? Two reasons.

First, incentives. Industry giants have ample cash flow to bet on a future that may take years to materialize. For them, the cost of entering five years early is a rounding error, while the cost of entering one year late is existential. So they must build.

Second, cognitive bias. When your core business is payments, every problem looks like a payments problem. The Agent economy needs a payments layer, so build the payments layer.

But payments are just one piece of a much larger puzzle. The real challenge isn't moving money between Agents, but coordinating work between Agents and humans, verifying the work was done, and settling the outcome. Payments are just part of settlement. Settlement is just part of coordination. And coordination is the real prize.

Coordination at scale will naturally give rise to settlement mechanisms as a necessity. Payments are just one instrument in the symphony, not the entire score. The companies solving coordination will swallow the payments business, not the other way around.

Most incumbents are building defensively for a future of large-scale machine transactions. Because their funding runway is effectively unlimited, the timeline doesn't matter to them.

But startups don't have that luxury. We must go where the market actually is; we can't wait for the wave to arrive.

A year of building has led us in an unexpected direction. There is real market activity there: it is growing rapidly and underserved. It lies outside the four categories we've outlined.

Related Questions

Q: According to the article, what are the main structural problems startups currently face in the Agent economy?

A: The article outlines several key structural problems: 1. A lack of genuine, current demand for Agent commerce. 2. Major payment infrastructure players like Visa require extremely high minimum revenue thresholds ($250M) and lengthy KYC processes, making them accessible only to giants like Amazon. 3. For Agent-to-Merchant commerce, the chat-based user experience is inferior to traditional visual e-commerce for most product categories, and consumer-scale distribution is dominated by large platforms like DoorDash and Amazon. 4. For Agent-to-API payments, there is an opportunity in the long tail of services, but its scale is limited and most major SaaS providers prefer enterprise contracts over micro-payments.

Q: What is the author's assessment of the 'Agent-to-Agent' business model?

A: The author assesses the Agent-to-Agent business model as a long-term vision that is currently almost entirely theoretical, with no one having achieved meaningful transaction volume yet. Startups are working on core challenges like Agent discovery, trust establishment, terms negotiation, and dispute resolution. The author believes that when this model materializes, it will involve transactions fundamentally different from those on existing payment rails (e.g., sub-second latency, amounts from fractions of a cent to millions of dollars, multi-party settlements) and will happen quickly and at scale. However, they stress that a 'real long-term bet' is different from the 'current market.'

Q: Why do major companies like Stripe continue to build Agent commerce infrastructure despite the current lack of demand?

A: Major companies continue to build for two main reasons: 1. **Motivation/Incentive:** They have ample cash flow to bet on a future that may take years to materialize. For them, the cost of being five years early is negligible, while the cost of being one year late could be catastrophic. Therefore, they feel they must build. 2. **Cognitive Bias:** When a company's core business is payments, every problem looks like a payments problem. They see the Agent economy needing a payment layer, so they build it, even though payment is just one part of a larger challenge.

Q: What does the author identify as the 'real prize' in the Agent economy, beyond payment infrastructure?

A: The author identifies **coordination** as the real prize. The core challenge is not just moving money between Agents, but coordinating work between Agents and humans, verifying that the work was done, and settling the outcome. Payment is just one part of settlement, and settlement is just one part of coordination. The author argues that coordination at scale will naturally give rise to settlement mechanisms as a necessity. Companies that solve the coordination problem will subsume payments, not the other way around.

Q: Which category of Agent interaction does the author say has existing demand and paying customers?

A: The author states that **Agent-to-Finance** is arguably the only category with existing demand. A customer base already exists and is willing to pay, as fund managers, finance teams, and DeFi users already pay for financial tools. Integrating AI into these existing workflows is a natural product evolution. Furthermore, Agent finance enables entirely new behaviors, like autonomous, real-time monitoring and rebalancing of hundreds of positions, which is a substantive capability enhancement over manual processes.
