Uncovering the Truth About Agent Commerce, Payments, and Infrastructure

marsbit · Published on 2026-06-07 · Last updated on 2026-06-07

Abstract

Decoding Agent Commerce, Payments, and Infrastructure: The Reality

Over the past year, I've been building infrastructure for the Agent economy, engaging with major players like Stripe, Visa, Coinbase, Google, and dozens of startups. A clear conclusion emerges: true, large-scale demand does not yet exist. Startups face structural challenges.

Data points illustrate this gap. Stripe's Agent commerce platform has over 1,000 merchants but only single-digit transacting agents. Visa's Agent payment token requires 9-month KYC and a $250M revenue threshold, accessible only to giants like Amazon. On-chain analysis reveals actual daily Agent transaction volume is around $17k, half of which are test transactions.

The article analyzes four potential markets:

**1. Agent-to-Merchant (A2M):** Current AI shopping UX is often inferior to traditional e-commerce for visual, comparison-heavy purchases (clothing, electronics). Chat interfaces are a step back. Real merchant interest is defensive "Agent Engine Optimization," fearing future obsolescence, not current demand. Potential exists in high-frequency, low-decision purchases (e.g., food delivery) or simplifying terrible UX (complex checkouts, non-native shoppers), but these require massive consumer distribution channels dominated by giants like DoorDash and Amazon.

**2. Agent-to-API (A2A):** Developers already have subscriptions and billing for core APIs (compute, data). The argument for micro-payments via crypto for sub-dollar API calls ...

Author: jessy

Compiled by: Jiahuan, ChainCatcher

Over the past year, I've been dedicated to building infrastructure for the Agent economy, engaging with teams from Stripe, Visa, Coinbase, Google, and dozens of startups pushing Agent commerce forward. I've mapped out the entire industry, launched products, and sought market fit.

There is no real demand yet, and startups face numerous structural issues when venturing into this space.

Last month, Stripe announced 288 new products at its Sessions conference. Nearly 40% of the total documentation views were for its Agent commerce docs. Their Agent commerce market boasts over 1,000 activated merchants. Yet, during the Sessions conference, the number of registered Agents conducting transactions was in the single digits.

Visa mentioned that their Agent payment tokens (tokenized payment credentials bound to an Agent for making payments on a user's behalf) currently require 3 to 9 months for KYC approval and essentially need a minimum revenue threshold of $250 million to qualify. Today, only companies at the scale of Amazon or Walmart can complete this identity verification loop.

Coinbase reported that as of April, there were 69,000 active Agents and 165 million transactions on the x402 protocol. However, independent on-chain analysis shows actual daily transaction volume is around $17,000, with about half being test transactions (according to CoinDesk, March 2026).

Agent to Merchant

We built shop.fast.xyz to directly validate the real-world application of concierge-style commerce. It includes real products, merchants, and transactions.

For most product categories, the current user experience of AI shopping is entirely inferior to traditional e-commerce. When buying clothes, electronics, or furniture, you want to see pictures, browse options, and compare them side-by-side.

The conversational format of a chatbot is actually a step backward. You're replacing a rich visual interface with a plain text conversation, and humans are fundamentally visual shoppers.

Agents excel at areas we thought would be difficult. They can understand user needs and handle instructions like "similar to this but cheaper" well. The model layer works.

But it cannot replace the experience of lining up ten products side by side and choosing one. Chat interfaces can be enhanced with carousels and interactive displays, but at that point, you're essentially rebuilding an e-commerce frontend inside a chat window. For visually-driven comparison shopping, we haven't found a compelling reason why a chat interface is better than a native e-commerce interface.

We see real demand from merchants, but it's a defensive demand.

Merchants want their stores to be queryable by Agents. Not because current customers are buying through Agents, but because they fear being left behind if this becomes a mainstream channel.

This is an "Agent Engine Optimization (AEO)" strategy, but it's currently a nice-to-have, not a necessity. Merchants are preparing for a wave that hasn't arrived.

Conversational commerce can indeed improve the experience in certain scenarios: high-frequency, low-decision-cost purchases where the user already knows exactly what they want. Ordering food delivery is the clearest example. It's a huge market, extremely frequent, with quick decisions ("order me Pad Thai from the usual place"). Conversational Agents have a chance here.

But major food delivery platforms haven't opened their APIs. The only way is "computer use": letting the AI navigate the app visually like a human. This is slow, fragile, and the inference cost simply doesn't work for a $15 lunch order.

Another potential breakthrough lies in stores whose UI navigation is extremely complex and painful: layers of discounts, promo codes, loyalty programs, and confusing checkout processes.

An Agent that understands "use my coupons, deduct my reward points, find the cheapest shipping, operate in my native language" can simplify today's dreadful user experiences. This is particularly important for elderly users, non-native speakers shopping on foreign websites, or highly specific scenarios with very niche needs.

Both of these breakthrough points require massive consumer (B2C) distribution channels. You're competing with DoorDash (the largest US delivery platform with 56% market share) and Amazon for user entry points.

Consumer-scale distribution is a strength of giants. The supply side for concierge-style commerce is ready, while the demand side is constrained by user experience and distribution channels. Building more infrastructure doesn't solve these two problems.

Agent to API

We discussed actual payment needs with dozens of developers. The picture is strikingly consistent: Agent-to-API usage today is recurring spend on compute, inference, and data sources. Developers already have subscriptions, stored API keys, and billing relationships with core providers.

The typical stablecoin argument is: on Stripe, the minimum effective cost for credit card processing is about 2.9% plus 30 cents, making sub-one-dollar API calls uneconomical. But with today's low transaction volume, prepaid credits solve this. Developers top up their accounts in advance, problem solved.
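To make the arithmetic concrete, here is a minimal sketch that applies the 2.9% plus 30 cents figure quoted above to two cases. The $0.05 call price and the $20.00 top-up are hypothetical numbers chosen for illustration, not figures from any provider.

```python
from decimal import Decimal

PERCENT_FEE = Decimal("0.029")  # 2.9% of the charge, as quoted above
FIXED_FEE = Decimal("0.30")     # 30 cents per charge, as quoted above


def card_fee(charge: Decimal) -> Decimal:
    """Processing fee for a single card charge."""
    return charge * PERCENT_FEE + FIXED_FEE


def fee_share(charge: Decimal) -> Decimal:
    """Fee expressed as a fraction of the charge."""
    return card_fee(charge) / charge


# Hypothetical: one card charge per $0.05 API call...
per_call = Decimal("0.05")
# ...versus one $20.00 top-up that prepays 400 such calls.
top_up = Decimal("20.00")

print(f"per-call fee: ${card_fee(per_call)} ({fee_share(per_call):.0%} of the charge)")
print(f"top-up fee:   ${card_fee(top_up)} ({fee_share(top_up):.1%} of the charge)")
```

Charged one call at a time, the fee is roughly six times the price of the call. Batched into a single top-up, the same fee schedule costs 4.4% of the amount, which is why prepaid credits remove most of the pressure at today's volumes.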

The deeper issue is the supplier market. Most mainstream SaaS companies don't want to offer ephemeral API access costing fractions of a cent. Their business model is multi-year enterprise contracts. Companies whose revenue relies on large commitment contracts will resist pricing mechanisms that circumvent their existing models.

Machine commerce is structurally a long-tail market, comprising smaller services, niche data sources, individual developers, and MCP servers. Protocols like MPP and x402 are well-suited for this niche.

But by definition, this is a market serving advanced users with specific needs, and historically, developers have been one of the groups least willing to pay.

When Stripe Projects launched, it partnered with 32 supplier partners like Vercel, Supabase, Cloudflare, Twilio, etc., covering most of the tools developers use to build and deploy software, all accessible through existing billing systems. The top-of-stack demand for developers is already met.

The opportunity for new payment rails lies in everything beyond these top 30 or so services. It is real, but its scale is inherently much smaller than the flashy numbers suggest.

The same pattern applies to content acquisition. Agents are already scraping and summarizing articles, and publishers are pushing back.

But when content monetization arrives at scale, it will happen through the CDN providers already sitting between publishers and the internet (Cloudflare has already launched AI audit tools for this), or through large-scale licensing agreements between publishers and AI labs.

This infrastructure opportunity will ultimately flow to giants that already own distribution channels.

Agent to Agent

The Agent-to-Agent business model is a long-term vision, currently almost entirely theoretical, with no one achieving meaningful transaction volume. Startups are tackling core challenges: Agent discovery, trust establishment, terms negotiation, and dispute resolution.

When this transaction structure truly materializes, it will be fundamentally different from existing payment rails. Neither party in the transaction involves a human identity. Latency is sub-second. Funds ranging from fractions of a cent to millions of dollars move in the same flow.

Add to that multi-party settlements, which completely defy the bilateral buyer-seller model assumed by existing payment rails. When this happens, we believe it will come fast and at scale.
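As a rough illustration of what a multi-party flow looks like compared with a single buyer-seller pair, the sketch below collapses several payment legs into one net position per participant. Everything in it is an assumption made for illustration: the agent names, the amounts, and the choice of integer micro-dollars as the smallest unit. It does not describe any existing protocol.

```python
from dataclasses import dataclass

MICRO = 1_000_000  # hypothetical smallest unit: one millionth of a dollar


@dataclass(frozen=True)
class Leg:
    payer: str
    payee: str
    amount_micro: int  # integer micro-dollars, so tiny and huge amounts share one type


def net_positions(legs: list[Leg]) -> dict[str, int]:
    """Collapse many payment legs into one net position per party."""
    net: dict[str, int] = {}
    for leg in legs:
        if leg.amount_micro <= 0:
            raise ValueError("leg amounts must be positive")
        net[leg.payer] = net.get(leg.payer, 0) - leg.amount_micro
        net[leg.payee] = net.get(leg.payee, 0) + leg.amount_micro
    return net


# Hypothetical flow: four parties, amounts from $0.0015 to $3,000,000.
legs = [
    Leg("buyer_agent", "planner_agent", 2 * MICRO),
    Leg("planner_agent", "search_agent", 1_500),
    Leg("planner_agent", "data_agent", 250_000),
    Leg("buyer_agent", "data_agent", 3_000_000 * MICRO),
]
positions = net_positions(legs)

# Every unit paid by one party is received by another.
assert sum(positions.values()) == 0
for party, amount in positions.items():
    print(f"{party}: {amount / MICRO:+,.6f} USD")
```

Using integers for the smallest unit avoids rounding drift when a fraction of a cent and a seven-figure amount pass through the same ledger.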

This is a long-term bet on dedicated settlement infrastructure, and it's real. But a "real long-term bet" and "current market" are two different things.

For months, we were also among those evangelizing this market and had built a complete infrastructure around it over the past few years. With our distributed network, we could theoretically scale to over 1 billion TPS, with latency under 50 milliseconds and average consensus of 10 milliseconds. But we must meet the market where it actually is today.

Agent to Finance

This is arguably the only category with existing demand. The customer base already exists and is willing to pay. Today, fund managers, finance teams, and DeFi users are paying for financial tools. Embedding AI into existing workflows is a natural product evolution.

Agent finance also creates entirely new behaviors. An Agent that can autonomously monitor and rebalance hundreds of positions in real-time operates in a way humans cannot replicate manually. This isn't just automation; it's a substantive capability enhancement.
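A minimal sketch of the kind of check such an Agent might run on every price update is threshold rebalancing: compare each position's current weight with its target and trade only when the drift exceeds a tolerance. The asset names, target weights, and 2% tolerance below are illustrative assumptions, and the sketch ignores fees, slippage, and execution.

```python
def rebalance_orders(
    holdings: dict[str, float],  # current market value per asset
    targets: dict[str, float],   # target weight per asset, summing to 1.0
    drift_tolerance: float = 0.02,
) -> dict[str, float]:
    """Return the value to buy (+) or sell (-) for each asset that has drifted."""
    total = sum(holdings.values())
    if total <= 0:
        raise ValueError("portfolio value must be positive")
    orders: dict[str, float] = {}
    for asset, target_weight in targets.items():
        current = holdings.get(asset, 0.0)
        drift = current / total - target_weight
        if abs(drift) > drift_tolerance:
            # Trade back to the target value for this asset only.
            orders[asset] = target_weight * total - current
    return orders


# Hypothetical three-asset portfolio worth 10,000 in total.
holdings = {"ETH": 6000.0, "BTC": 2600.0, "USDC": 1400.0}
targets = {"ETH": 0.50, "BTC": 0.25, "USDC": 0.25}
orders = rebalance_orders(holdings, targets)
print(orders)
```

Here BTC sits within tolerance and is left alone, so the remaining orders do not net exactly to zero; a production system would reconcile that residual. The loop itself is trivial for three assets, and the point of the passage above is that an Agent can run it continuously across hundreds.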

The challenge is the competitive landscape. The financial industry is heavily regulated and relies heavily on established business relationships. Incumbents have licenses, compliance infrastructure, and client relationships. Startups can find a niche in less-regulated areas (like DeFi), areas where giants move slowly, or where AI creates capabilities giants don't possess.

But compared to the other three categories, the competitive dynamics here are more favorable to incumbents, as layering AI on top of existing products and customer bases is far easier than the reverse.

The Real Game

So, why are people still building this stuff? Two reasons.

First, incentives. Industry giants have ample cash flow to bet on a future that may take years to materialize. For them, the cost of entering five years early is a rounding error, while the cost of entering one year late is existential. So they must build.

Second, cognitive bias. When your core business is payments, every problem looks like a payments problem. The Agent economy needs a payments layer, so build the payments layer.

But payments are just one piece of a much larger puzzle. The real challenge isn't moving money between Agents, but coordinating work between Agents and humans, verifying the work was done, and settling the outcome. Payments are just part of settlement. Settlement is just part of coordination. And coordination is the real prize.
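One way to picture this nesting is as a task lifecycle in which payment is only the final transition. The sketch below is an illustration under stated assumptions, not a description of any product: the stage names, the `Task` fields, and the in-memory ledger are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, Optional


class Stage(Enum):
    ASSIGNED = auto()   # coordination: who does what, for how much
    DELIVERED = auto()  # the worker has handed in an output
    VERIFIED = auto()   # the output passed an agreed check
    SETTLED = auto()    # payment has moved


@dataclass
class Task:
    description: str
    worker: str
    price_cents: int
    stage: Stage = Stage.ASSIGNED
    output: Optional[str] = None


def deliver(task: Task, output: str) -> None:
    if task.stage is not Stage.ASSIGNED:
        raise ValueError("task is not awaiting delivery")
    task.output = output
    task.stage = Stage.DELIVERED


def verify(task: Task, check: Callable[[str], bool]) -> bool:
    """Advance to VERIFIED only if the agreed check accepts the output."""
    if task.stage is not Stage.DELIVERED or task.output is None:
        raise ValueError("nothing to verify")
    if check(task.output):
        task.stage = Stage.VERIFIED
        return True
    return False


def settle(task: Task, ledger: dict[str, int]) -> None:
    """Pay the worker; refuses to pay for unverified work."""
    if task.stage is not Stage.VERIFIED:
        raise ValueError("cannot settle unverified work")
    ledger[task.worker] = ledger.get(task.worker, 0) + task.price_cents
    task.stage = Stage.SETTLED


ledger: dict[str, int] = {}
task = Task("summarize a filing", worker="research_agent", price_cents=250)
deliver(task, "three-paragraph summary")
if verify(task, lambda out: len(out) > 0):
    settle(task, ledger)
print(task.stage.name, ledger)
```

Three of the four stages involve no money at all, which is the sense in which payments are a small part of the overall problem.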

Coordination at scale will naturally give rise to settlement mechanisms as a necessity. Payments are just one instrument in the symphony, not the entire score. The companies solving coordination will swallow the payments business, not the other way around.

Most incumbents are building defensively for a future of large-scale machine transactions. Because their funding runway is effectively unlimited, the timeline doesn't matter to them.

But startups don't have that luxury. We must go where the market actually is; we can't wait for the wave to arrive.

A year of building has led us in an unexpected direction. There is real market activity there; it is growing rapidly and underserved. It lies outside the four categories we've outlined.

Related Questions

Q: According to the article, what are the main structural problems startups face in the Agent economy currently?

A: The article outlines several key structural problems: 1. A lack of genuine, current demand for Agent commerce. 2. Major payment infrastructure players like Visa require extremely high minimum revenue thresholds ($250M) and lengthy KYC processes, making them accessible only to giants like Amazon. 3. For Agent-to-Merchant commerce, the chat-based user experience is inferior to traditional visual e-commerce for most product categories, and consumer-scale distribution is dominated by large platforms like DoorDash and Amazon. 4. For Agent-to-API payments, while there is an opportunity in the long tail of services, the scale is limited and most major SaaS providers prefer enterprise subscription models over micro-payments.

Q: What is the author's assessment of the 'Agent-to-Agent' business model?

A: The author assesses the Agent-to-Agent business model as a long-term vision that is currently almost entirely theoretical, with no one having achieved meaningful transaction volume yet. Startups are working on core challenges like Agent discovery, trust establishment, terms negotiation, and dispute resolution. The author believes that when this model materializes, it will involve transactions fundamentally different from existing payment rails (e.g., sub-second latency, micro-cent to million-dollar transactions, multi-party settlements) and will happen quickly and at scale. However, they stress that a 'real long-term bet' is different from the 'current market.'

Q: Why do major companies like Stripe continue to build Agent commerce infrastructure despite the current lack of demand?

A: Major companies continue to build for two main reasons: 1. **Motivation/Incentive:** They have ample cash flow to bet on a future that may take years to materialize. For them, the cost of being five years early is negligible, while the cost of being one year late could be catastrophic. Therefore, they feel they must build. 2. **Cognitive Bias:** When a company's core business is payments, every problem looks like a payments problem. They see the Agent economy needing a payment layer, so they build it, even though payment is just one part of a larger challenge.

Q: What does the author identify as the 'real prize' in the Agent economy, beyond payment infrastructure?

A: The author identifies **coordination** as the real prize. The core challenge is not just moving money between Agents, but rather coordinating work between Agents and humans, verifying the work output, and settling on the results. Payment is just one part of settlement, and settlement is just one part of coordination. The author argues that large-scale coordination will naturally give rise to settlement mechanisms as a necessity. Companies that solve the coordination problem will subsume payments, not the other way around.

Q: Which category of Agent interaction does the author say has existing demand and paying customers?

A: The author states that **Agent-to-Finance** is arguably the only category with existing demand. A customer base already exists and is willing to pay, as fund managers, finance teams, and DeFi users already pay for financial tools. Integrating AI into these existing workflows is a natural product evolution. Furthermore, Agent finance enables entirely new behaviors, like autonomous, real-time monitoring and rebalancing of hundreds of positions, which is a substantive capability enhancement over manual processes.

