DWF In-Depth Report: AI Outperforms Humans in DeFi Yield Optimization, but Lags 5x Behind in Complex Trading

marsbit | Published on 2026-04-17 | Last updated on 2026-04-17

Abstract

DWF Ventures' report highlights the growing role of AI agents in DeFi, accounting for nearly 19% of on-chain activity. While agents excel in rule-based tasks like yield optimization—achieving returns of over 9.75% for assets like USDC—they significantly underperform humans in complex trading. In head-to-head competitions, top human traders outperformed the best AI agents by more than 5x. Key factors affecting agent performance include model selection, risk management, position holding time, and leverage levels. The path to full autonomy remains challenging due to infrastructure limitations, trust issues, and risks like Sybil attacks and strategy crowding. As agent adoption grows, robust infrastructure and transparency will be critical for scalability and trust.

Author: DWF Ventures

Compiled by: Deep Tide TechFlow

Deep Tide Guide: AI agents already account for nearly one-fifth of on-chain activity and do outperform humans in well-defined scenarios like yield optimization. But in autonomous trading, the best AI agents deliver less than one-fifth of the performance of the best human traders. This research breaks down how AI actually performs across different DeFi scenarios and is a must-read for anyone interested in automated trading.

Key Points

Automation and agent activity currently account for about 19% of all on-chain activity, but true end-to-end autonomy has not yet been achieved.

In narrow, well-defined use cases like yield optimization, agents have demonstrated performance superior to humans and bots. However, for multifaceted actions like trading, humans outperform agents.

Among agents themselves, model selection and risk management have the greatest impact on trading performance.

As agents are adopted at scale, there are multiple risks concerning trust and execution, including Sybil attacks, strategy crowding, and privacy trade-offs.

Agent Activity Continues to Grow

Agent activity has grown steadily over the past year, with both trading volume and number of transactions increasing. We've seen significant developments led by Coinbase's x402 protocol, with players like Visa, Stripe, and Google also joining in to launch their own standards. Most of the infrastructure being built currently aims to serve two types of scenarios: channels between agents or agent invocations triggered by humans.

While stablecoin transactions are widely supported, the current infrastructure still relies on traditional payment gateways as the underlying layer, meaning it remains dependent on centralized counterparties. Therefore, the ultimate "full autonomy" scenario, where agents can self-fund, self-execute, and continuously optimize based on changing conditions, has not yet been realized.

Agents are not entirely new to DeFi. Automation via bots has existed in on-chain protocols for years, capturing MEV or obtaining excess returns not achievable without code. These systems operate very well under well-defined parameters that don't change frequently or require additional supervision. However, markets have become more complex over time. This is where we see a new generation of agents emerging, with the on-chain space becoming an experimental ground for such activity over the past few months.

The Actual Performance of Agents

According to the report, agent activity has grown exponentially, with over 17,000 agents launched since 2025. Automated and agent activity is estimated to make up about 19% of all on-chain activity. This is not surprising, given that bots are estimated to generate over 76% of stablecoin transfer volume. It also points to substantial room for growth in agent activity in DeFi.

There is a broad spectrum of agent autonomy, ranging from chatbot-like experiences requiring high human supervision to agents that can formulate strategies adapting to market conditions based on goal inputs. Compared to bots, agents have several key advantages, including the ability to respond to and execute on new information within milliseconds, and the ability to extend coverage to thousands of markets while maintaining the same rigor.

Currently, most agents sit between the analyst and co-pilot levels of this spectrum, as most are still in the testing phase.

Yield Optimization: Agents Excel

Liquidity provision is an area where automation already occurs frequently, with the total TVL held by agents exceeding $39 million. This figure primarily measures assets deposited directly into agents by users but does not include capital routed through vaults.

Giza Tech is one of the largest protocols in this space, launching its first agent application, ARMA, late last year, designed to enhance yield capture on major DeFi protocols. It has attracted over $19 million in assets under management and generated over $4 billion in agent trading volume. The high ratio of trading volume to total assets under management indicates that agents frequently rebalance capital, enabling higher yield capture. Once capital is deposited into the contract, execution is automated, thus providing users with a simple one-click experience requiring almost no supervision.
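As a rough illustration of that ratio, the sketch below divides the quoted trading volume by the quoted assets under management. Both inputs are the article's lower-bound figures ("over $19 million" and "over $4 billion"), and volume is cumulative while assets under management is a snapshot. The function name `turnover_ratio` is ours, and the result is an order-of-magnitude estimate rather than a reported metric.

```python
# Figures quoted in the article; both are lower bounds ("over ...").
AUM_USD = 19_000_000
AGENT_VOLUME_USD = 4_000_000_000


def turnover_ratio(volume_usd: float, aum_usd: float) -> float:
    """Cumulative trading volume per dollar of assets under management."""
    if aum_usd <= 0:
        raise ValueError("aum_usd must be positive")
    return volume_usd / aum_usd


ratio = turnover_ratio(AGENT_VOLUME_USD, AUM_USD)
print(f"Agent volume is roughly {ratio:.0f}x assets under management")
```

A ratio in the hundreds is consistent with capital being moved between venues many times over, which is what frequent rebalancing would look like.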

ARMA's performance is measurably excellent, generating an annualized yield of over 9.75% for USDC. Even after considering additional rebalancing fees and the agent's 10% performance fee, the yield still exceeds ordinary lending on Aave or Morpho. Nonetheless, scalability remains a key issue, as these agents have not yet been battle-tested to manage or scale to the size of major DeFi protocols.
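To make the fee arithmetic concrete, here is a minimal sketch. It assumes the 9.75% figure is a gross yield and that the 10% performance fee is charged on the yield earned; the article does not spell out either point. The rebalancing cost is a purely hypothetical placeholder, since the article gives no number for it.

```python
def net_apy(gross_apy: float, performance_fee: float,
            rebalancing_cost: float = 0.0) -> float:
    """Net annual yield after a fee on earned yield and a flat annual cost drag.

    All arguments are fractions (0.0975 means 9.75%).
    """
    if not 0.0 <= performance_fee <= 1.0:
        raise ValueError("performance_fee must be between 0 and 1")
    return gross_apy * (1.0 - performance_fee) - rebalancing_cost


GROSS_APY = 0.0975                       # quoted in the article ("over 9.75%")
PERFORMANCE_FEE = 0.10                   # quoted in the article
HYPOTHETICAL_REBALANCING_COST = 0.0025   # assumption for illustration only

net = net_apy(GROSS_APY, PERFORMANCE_FEE, HYPOTHETICAL_REBALANCING_COST)
print(f"Net yield under these assumptions: {net * 100:.3f}%")
```

Under these assumptions the performance fee alone takes a little under one percentage point. The comparison with Aave or Morpho then depends on the lending rates available at the time, which are not quoted here.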

Trading: Humans Lead Significantly

However, for more complex actions like trading, the results are much more varied. Current trading models operate on human-defined inputs and produce outputs according to preset rules. Machine learning extends this by letting models update their behavior based on new information without explicit reprogramming, which moves them into a co-pilot role. As fully autonomous agents join, the trading landscape will change dramatically.

Several trading competitions have been held between agents and between humans and agents, showing significant variation between models. Trade XYZ hosted a human vs. agent trading competition for stocks listed on its platform. Each account had an initial capital of $10,000, with no restrictions on leverage or trading frequency. The results were overwhelmingly in favor of humans, with top humans outperforming top agents by more than 5 times.

Meanwhile, Nof1 hosted an agent trading competition among models, pitting several models (Grok-4, GPT-5, Deepseek, Kimi, Qwen3, Claude, Gemini) against each other, testing different risk configurations from capital preservation to maximum leverage. The results revealed several factors that can help explain performance differences:

Holding Time: There was a strong correlation; models that held positions for an average of 2-3 hours significantly outperformed those that flipped frequently.

Expected Value: This measures whether a model makes money on average per trade. Interestingly, only the top 3 models had a positive expected value, meaning most models lost money on the average trade.

Leverage: Models running lower average leverage of 6-8x performed better than those running over 10x, as higher leverage accelerated losses.

Prompting Strategy: Monk Mode was by far the best-performing strategy, while Situational Awareness performed the worst. Given how the two modes differ, this suggests that focusing on risk management and drawing on fewer external sources led to better performance.

Base Model: Grok 4.20 significantly outperformed other models by over 22% across different prompting strategies and was the only model that was profitable on average.
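The expected value factor above can be sketched with the standard per-trade formula: win rate times average win, minus loss rate times average loss. The two profiles below are hypothetical numbers chosen for illustration, not results from the Nof1 competition. They show why expected value is not the same thing as win rate.

```python
def expected_value(win_rate: float, avg_win: float, avg_loss: float) -> float:
    """Average profit per trade.

    win_rate is a fraction in [0, 1]; avg_win and avg_loss are positive
    amounts in the same currency (avg_loss is a magnitude, not a negative).
    """
    if not 0.0 <= win_rate <= 1.0:
        raise ValueError("win_rate must be between 0 and 1")
    return win_rate * avg_win - (1.0 - win_rate) * avg_loss


# Hypothetical profiles, for illustration only.
frequent_winner = expected_value(win_rate=0.60, avg_win=50.0, avg_loss=100.0)
selective_winner = expected_value(win_rate=0.40, avg_win=200.0, avg_loss=80.0)

print(f"Wins 60% of trades: {frequent_winner:+.2f} per trade")
print(f"Wins 40% of trades: {selective_winner:+.2f} per trade")
```

The first profile wins more often than it loses yet gives money back on average, while the second loses most of its trades and still comes out ahead. The size of the wins and losses matters as much as how often they occur.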

Other factors like long/short preference, trade size, and confidence score did not have sufficient data or were not proven to have any positive correlation with model performance. Overall, the results indicate that agents tend to perform better within clearly defined constraints, meaning humans are still very much needed for target configuration.
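The leverage finding above follows from simple arithmetic, sketched below. The model is deliberately simplified and is our assumption: a single long position, no fees or funding, and liquidation only when equity reaches zero (real venues liquidate earlier, at a maintenance margin). The $10,000 margin and the 5% adverse move are illustrative inputs.

```python
def equity_after_move(margin: float, leverage: float, price_move: float) -> float:
    """Equity left on a long position after a fractional price move.

    price_move is a fraction (-0.05 means the price fell 5%). Equity is
    floored at zero, which stands in for liquidation in this simplified model.
    """
    if margin <= 0 or leverage <= 0:
        raise ValueError("margin and leverage must be positive")
    pnl = margin * leverage * price_move
    return max(margin + pnl, 0.0)


def wipeout_move(leverage: float) -> float:
    """Adverse fractional price move that takes equity to zero."""
    if leverage <= 0:
        raise ValueError("leverage must be positive")
    return 1.0 / leverage


MARGIN = 10_000.0   # illustrative starting capital
ADVERSE_MOVE = -0.05

for lev in (6, 8, 10, 20):
    left = equity_after_move(MARGIN, lev, ADVERSE_MOVE)
    print(f"{lev:>2}x: equity {left:,.0f}, wiped out by a "
          f"{wipeout_move(lev) * 100:.1f}% adverse move")
```

The same adverse move costs proportionally more equity as leverage rises, and the move needed to wipe out a position shrinks in inverse proportion to leverage.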

How to Evaluate Agents

Because agents are still at an early stage, there is no comprehensive evaluation framework yet. Historical performance is often used as the benchmark, but it is driven by underlying factors that give a stronger indication of whether an agent's performance is robust:

Performance Across Different Volatilities: Includes disciplined loss control when conditions deteriorate, indicating the agent's ability to identify off-chain factors that could affect trade profitability.

Transparency vs. Privacy: Both sides have their trade-offs. Transparent agents essentially have no strategic advantage if their trades can be actively copied. Private agents face the risk of insider extraction by the creator, who could easily front-run their own users.

Information Sources: The data sources an agent accesses are crucial for determining how it makes decisions. Ensuring sources are credible and without single dependencies is vital.

Security: It is important to have smart contract audits and proper fund custody architecture to ensure there are fallbacks in black swan events.

The Next Steps for Agents

For the mass adoption of agents, there is still a lot of work to be done on the infrastructure side. This boils down to key issues around agent trust and execution. The actions of autonomous agents have no guardrails, and instances of poor fund management have already occurred.

ERC-8004 went live in January 2026, becoming the first on-chain registry enabling autonomous agents to discover each other, establish verifiable reputations, and collaborate securely. This is a key unlock for DeFi composability, as trust scores are embedded in the smart contracts themselves, allowing for permissionless activity between agents and protocols. It does not guarantee that agents will always act non-maliciously, as attacks such as reputation collusion and Sybil attacks can still occur. Significant gaps therefore remain to be filled in areas like insurance, security, and economic staking for agents.

As agent activity expands in DeFi, strategy crowding becomes a structural risk. Yield farming is the clearest precedent, where returns compress as strategies become popular. The same dynamic could apply to agent trading. If a large number of agents are trained on similar data and optimize for similar goals, they will converge on similar positions and similar exit signals.
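The yield-farming precedent can be sketched with a deliberately simple dilution model. It assumes a fixed annual reward pool shared pro rata across all deposited capital, which is our simplification rather than how any specific protocol works. The dollar amounts are hypothetical.

```python
def diluted_apy(annual_rewards_usd: float, tvl_usd: float) -> float:
    """Yield per dollar when a fixed reward pool is split pro rata across TVL."""
    if tvl_usd <= 0:
        raise ValueError("tvl_usd must be positive")
    return annual_rewards_usd / tvl_usd


ANNUAL_REWARDS_USD = 1_000_000.0  # hypothetical fixed reward pool

for tvl in (10_000_000.0, 20_000_000.0, 40_000_000.0):
    apy = diluted_apy(ANNUAL_REWARDS_USD, tvl)
    print(f"TVL ${tvl:,.0f}: {apy * 100:.2f}% per year")
```

Each doubling of capital chasing the same opportunity halves the return per dollar. This is the compression the paragraph describes, and it could repeat if many agents converge on the same trades.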

A January 2026 Cornell University paper, CoinAlg, formalized a version of this problem. Transparent agents can be arbitraged because their trades are predictable and can be front-run. Private agents avoid this risk but introduce a different risk where the creator retains an information advantage over their own users and can extract value through the very opacity meant to protect them.

Agent activity will only continue to accelerate, and the infrastructure laid today will determine how the next phase of on-chain finance operates. As usage increases, agents will self-iterate and become sharper at adapting to user preferences. The main differentiator will therefore be trustworthy infrastructure, and the platforms that provide it will capture the largest market share.

Related Questions

Q: What percentage of on-chain activity is currently estimated to be covered by automation and agent activity?

A: Automation and agent activity is estimated to cover approximately 19% of all on-chain activity.

Q: In which specific DeFi use case have agents demonstrated superior performance compared to humans and bots?

A: Agents have demonstrated superior performance in the narrow, well-defined use case of yield optimization.

Q: According to the report, how much better did the top human traders perform compared to the top AI agents in trading competitions?

A: The top human traders outperformed the top AI agents by more than 5 times.

Q: What was the key factor that most significantly impacted trading performance among different AI agents, according to the Nof1 competition results?

A: Model selection and risk management were the factors that most significantly impacted trading performance among different AI agents.

Q: What is the name of the first on-chain registry, launched in January 2026, that enables autonomous agents to discover each other and build verifiable reputations?

A: The first on-chain registry is called ERC-8004.
