Slow Down: The Answer in the Age of Agents

marsbit · Published on 2026-03-29 · Last updated on 2026-03-29

Abstract

In the era of generative AI, the software industry is shifting from amazement to efficiency anxiety. However, as coding agents are increasingly used in production, issues like amplified errors, uncontrolled complexity, and reduced system reliability emerge. The author argues that agents lack human-like learning from mistakes and, without proper bottlenecks and feedback, minor issues quickly escalate. Their limited perspective and low recall in complex codebases further worsen structural chaos. The core problem isn’t the technology but the premature surrender of human judgment and control driven by anxiety. Instead of fully outsourcing work to agents, the author advocates for a balanced approach: assign agents localized, well-defined tasks while retaining human oversight for system design, quality assurance, and critical decisions. Slowing down becomes a strength—it ensures understanding, enables informed trade-offs, and maintains control. Ultimately, what’s scarce in the AI age isn’t faster code generation but the judgment to manage complexity and the discipline to choose quality over speed.

Editor's Note: As generative AI rapidly integrates into software engineering, industry sentiment is shifting from "awe at capabilities" to "anxiety about efficiency." Not coding fast enough, not using AI enough, not automating thoroughly enough: each creates pressure not to be left behind. But as coding Agents truly enter production environments, more practical issues emerge: errors are amplified, complexity spirals out of control, systems become increasingly incomprehensible, and efficiency gains do not translate proportionally into quality improvements.

Based on firsthand practice, this article offers a sober reflection on the current "agentic coding" frenzy. The author points out that Agents do not learn from mistakes like humans do; without bottlenecks and feedback mechanisms, minor issues are rapidly magnified. Furthermore, in complex codebases, their local perspective and limited recall capabilities exacerbate the chaos of the system structure. The essence of these problems lies not in the technology itself, but in humans, driven by anxiety, prematurely relinquishing judgment and control.

Therefore, rather than succumbing to the anxiety of "must we fully embrace AI," it's better to recalibrate the relationship between humans and tools: let Agents handle local, controllable tasks, while firmly keeping system architecture, quality control, and key decision-making in our own hands. In this process, "slowing down" becomes a capability—it means you still understand the system, can make trade-offs, and still retain a sense of control over your work.

In an era of constantly evolving tools, what is truly scarce might not be faster generation capabilities, but the judgment to handle complexity and the fortitude to make choices between efficiency and quality.

The original text follows:

About a year ago, coding Agents that could genuinely help you "complete entire projects from start to finish" began to appear. Earlier tools like Aider and the early Cursor existed, but they were more like assistants than "agents." The new generation of tools is extremely attractive, and many people spent a lot of their free time doing all those projects they always wanted to do but never had time for.

I think that's fine in itself. Working on things in your free time is inherently enjoyable, and most of the time you don't really need to worry about code quality and maintainability. It also gives you a path to learn new tech stacks.

During the Christmas holidays, Anthropic and OpenAI even gave out some "free credits," sucking people in like a slot machine. For many, this was the first real experience of the magic of "Agents writing code." More and more people got involved.

Now, coding Agents are also starting to enter production codebases. Twelve months on, we are beginning to see the consequences of this "progress." Here are my current thoughts.

Everything is Broken

While this is mostly anecdotal, software today feels fragile and ready to break. 98% availability is becoming the norm rather than the exception, even for large services. User interfaces are filled with outrageous bugs, the kind a QA team should catch at a glance.

I admit this situation existed before Agents appeared. But now, the problem is clearly accelerating.

We can't see what's happening inside companies, but occasionally information leaks out, like the rumored "AI-induced AWS outage." Amazon Web Services was quick to "correct" the story, but then immediately launched a 90-day remediation plan internally.

Satya Nadella, Microsoft's CEO, has also recently emphasized that more and more of the company's code is written by AI. While there's no direct evidence, there is a widespread feeling that Windows quality is declining, and even Microsoft's own blog posts seem to tacitly acknowledge it.

Companies that claim "100% of the product code is AI-generated" almost always ship the worst products you can imagine. No offense, but memory leaks measured in GB, chaotic UIs, incomplete features, frequent crashes... these are hardly the "quality endorsements" they think they are, let alone positive examples of "letting the Agent do everything for you."

Privately, you hear more and more people, from both large companies and small teams, saying the same thing: they have been backed into a corner by "Agent-written code." No code reviews, design decisions handed to Agents, features nobody needs piled on: the outcome is predictably bad.

Why We Shouldn't Use Agents This Way

We have almost abandoned all engineering discipline and subjective judgment, instead falling into an "addictive" way of working: the sole goal is to generate the most code in the shortest time, with no consideration for the consequences.

You're building an orchestration layer to command an army of automated Agents. You install Beads, completely unaware that it's essentially malware that is almost impossible to uninstall. Just because the internet says "everyone is doing it." If you don't, you're "not gonna make it" (ngmi).

You're consuming yourself in a constant "recursive loop."

Look—Anthropic used a group of Agents to make a C compiler. It has problems now, but the next-gen model will fix it, right?

Look again—Cursor used a large group of Agents to make a browser. It's basically unusable now and needs manual intervention from time to time, but the next-gen model will handle it, right?

"Distributed," "divide and conquer," "autonomous systems," "lights-out factory," "solving software in six months," "SaaS is dead, my grandma just built a Shopify with Claw"...

These narratives sound exciting.

Sure, this approach might "still work" for your side project that almost no one uses (including yourself). Maybe, just maybe, there exists a genius who can use this method to create a non-garbage, actually-used software product. If you are that person, I sincerely admire you.

But at least in my circle of developer acquaintances, I haven't seen a case where this method actually works. Of course, maybe we're all just too incompetent.

Errors Compound: No Learning, No Bottlenecks, and Delayed Explosions

The problem with Agents is that they make mistakes. That's fine in itself; humans make mistakes too. They might be correctness errors, easy to identify and fix, where adding a regression test keeps them from coming back. Or they might be code smells that linters can't catch: an unused method here, a questionable type there, some duplicate code, and so on. Individually, these are harmless; human developers make these minor mistakes too.

But "machines" are not people. After making the same mistake a few times, humans usually learn not to repeat it—either scolded into awareness or through genuine process improvement.

Agents lack this learning capability, at least by default. They will repeat the same mistakes over and over, and might even "create" wonderful combinations of different errors based on training data.

You can certainly try to "train" it: write rules in AGENTS.md telling it not to make this mistake; design a complex memory system for it to query historical errors and best practices. This can work for certain specific types of problems. But the prerequisite is—you must first observe it making this error.
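In practice, such rules often end up as short imperative entries in AGENTS.md. The excerpt below is invented purely to show the shape of the pattern; the rules and the file path they mention are hypothetical, not recommendations:

```markdown
<!-- AGENTS.md: illustrative excerpt, hypothetical rules -->

## Known failure modes (add an entry only after observing the mistake)

- Do not create a new HTTP client; reuse the existing wrapper in `utils/http.py`.
- Never widen parameter types to `Any` just to silence the type checker.
- Before writing a new helper, search for an existing implementation first.
```

Note the parenthetical in the heading: every entry in such a file is reactive, written only after the error has already been paid for at least once.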

The more critical difference is: humans are a bottleneck, Agents are not.

A human cannot spit out twenty thousand lines of code in a few hours. Even with a non-trivial error rate, only a limited number of errors can be introduced per day, and their accumulation is slow. Usually, when the "pain from errors" accumulates to a certain level, humans (instinctively averse to pain) will stop to fix them. Or the person is replaced, and someone else fixes it. In short, problems get handled.

But when you use a whole orchestrated army of Agents, there is no bottleneck and no "pain sensation." These originally trivial minor errors compound at an unsustainable rate. You have been removed from the loop, unaware that these seemingly harmless small issues have grown into a behemoth. By the time you truly feel the pain, it's often too late.
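To put illustrative numbers on it (the rates here are assumptions, not measurements): one developer writing 200 lines a day at one defect per 100 lines introduces roughly 2 defects daily, each felt soon after it lands. An orchestrated fleet emitting 20,000 lines a day at the same defect rate introduces roughly 200, two orders of magnitude more, while your capacity to notice them has not grown at all.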

Until one day, you want to add a new feature and discover that the current architecture (essentially a pile of errors) cannot support the change; or users start complaining frantically because the latest release is broken, or has even lost their data.

That's when you realize: you can no longer trust this code.

Worse, the thousands of unit tests, snapshot tests, and end-to-end tests you had the Agent generate are also no longer trustworthy. The only way left to determine if "the system is working properly" is manual testing.

Congratulations, you've screwed yourself (and the company).

Purveyors of Complexity

You have completely lost track of what's happening in the system because you handed control to the Agent. And Agents, by nature, are "purveyors of complexity." They have seen tons of terrible architectural decisions in their training data, and these patterns are reinforced during their RL process. Letting them design the system leads to predictable results.

What you end up with is: an extremely complex system, a mishmash of poor imitations of "industry best practices," which you failed to constrain before the problems got out of hand.

But the problem goes further. Your Agents do not share execution context with each other, cannot see the entire codebase, and do not understand the decisions you or other Agents made previously. Therefore, their decisions are always "local."

This directly leads to the problems mentioned earlier: massive code duplication, structures abstracted for abstraction's sake, various inconsistencies. These problems compound, eventually forming an irredeemably complex system.

This is actually very similar to human-written enterprise codebases. Except that kind of complexity is usually the result of years of accumulation: the pain is distributed across many people, no single person reaches the "must fix" breaking point, and the organization itself has high tolerance, so complexity "co-evolves" with the organization.

But in a human + Agent combination, this process is greatly accelerated. Two people, plus a bunch of Agents, can reach this level of complexity in weeks.

Agentic Search Has Low Recall

You might pin your hopes on the Agent to "clean up the mess," to help you refactor, optimize, and clean the system. But the problem is: they can't do it anymore.

Because the codebase is too large, the complexity too high, and they can only ever see locally. This isn't just about the context window being too small, or long-context mechanisms failing against millions of lines of code. The problem is more subtle.

Before the Agent attempts to fix the system, it must first find all the code that needs modification, as well as existing implementations that can be reused. This step is called agentic search.

How the Agent does this depends on the tools you give it: it could be Bash + ripgrep, a queryable code index, an LSP service, a vector database...

But no matter the tool, the essence is the same: the larger the codebase, the lower the recall. And low recall means: the Agent cannot find all relevant code, and therefore cannot make correct modifications.

This is also why those minor "code smell" errors appeared in the first place; it didn't find the existing implementation, so it reinvented the wheel, introducing inconsistency. Eventually, these problems spread and compound, blooming into an extremely complex "flower of rot."
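A toy model of the recall failure, with the mini "codebase" and the grep-style tool invented for illustration (no real agent works exactly this way):

```python
# Toy model of agentic search recall: three files implement the same
# "retry with backoff" concept, but only one contains the word the
# Agent is likely to search for.
import re

codebase = {
    "net/client.py":  "def retry_request(req): ...",
    "jobs/runner.py": "def run_with_backoff(job): ...",  # same concept, different name
    "utils/http.py":  "def resilient_get(url): ...",     # same concept, different name
}
relevant = set(codebase)  # ground truth: all three files matter for the change

def agentic_search(pattern: str) -> set[str]:
    """Stand-in for a Bash + ripgrep style tool call."""
    return {path for path, src in codebase.items() if re.search(pattern, src)}

found = agentic_search(r"retry")
recall = len(found & relevant) / len(relevant)
print(f"found={found}, recall={recall:.2f}")  # recall=0.33: two files are missed
```

Every independently named reimplementation both lowers recall for the next query and was itself caused by a previous low-recall query; that is the compounding loop in miniature.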

So how do we avoid all this?

How We Should Collaborate with Agents (For Now)

Coding Agents are like sirens, luring you in with extremely fast code generation speed and that "intermittent yet occasionally stunning" intelligence. They can often complete simple tasks with astonishing speed and high quality. The real problems start when you get the idea—"This is so powerful, computer, do my work for me!"

There's nothing wrong with assigning tasks to Agents per se. Good Agent tasks typically have several characteristics: the scope can be well-defined, not requiring understanding of the entire system; the task is closed-loop, meaning the Agent can evaluate the result itself; the output is not on the critical path, just some temporary tool or internal software, not affecting real users or revenue; or you just need a "rubber duck" to aid thinking—essentially taking your ideas and colliding them with the compressed knowledge of the internet and synthetic data.

If these conditions are met, then it's a task suitable for an Agent, provided that you, the human, remain the final quality gatekeeper.

For example, using Andrej Karpathy's auto-research method to optimize application startup time? Great. But you must be clear that the code it spits out is absolutely not production-ready. Auto-research works because you give it an evaluation function, allowing it to optimize around a specific metric (like startup time or loss). But that evaluation function only covers a very narrow dimension. The Agent will blithely ignore every metric not in the evaluation function: code quality, system complexity, even correctness in some cases, if the evaluation function itself is flawed.
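A minimal sketch of an eval-driven loop of this kind, with every name hypothetical; this shows the shape of the technique, not Karpathy's actual code:

```python
# Sketch of an eval-driven optimization loop. The loop keeps whatever
# scores best on the single metric and is blind to everything else.
import random

def startup_time(candidate: str) -> float:
    """The evaluation function: lower is better. This is ALL the loop measures."""
    return random.uniform(0.5, 2.0)  # stand-in for actually timing a build

def propose_patch(best: str) -> str:
    """Stand-in for an Agent proposing a variant of the current best version."""
    return best + "'"

best, best_score = "baseline", startup_time("baseline")
for _ in range(100):
    candidate = propose_patch(best)
    score = startup_time(candidate)
    # Nothing here checks quality, duplication, complexity, or even
    # correctness: if the candidate scores better, it wins. That
    # narrowness is exactly what makes the output unfit for production.
    if score < best_score:
        best, best_score = candidate, score
print(f"best startup time: {best_score:.2f}s")
```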

The core idea is simple: let Agents do the boring things that don't teach you anything new, or the exploratory work you never had time to try. Then you evaluate the results, pick out the parts that are actually reasonable and correct, and complete the final implementation. Of course, you can also use an Agent for this final step.

But what I want to emphasize more is: really, slow down a bit.

Give yourself time to think about what you are actually doing and why. Give yourself a chance to say no: "No, we don't need this." Set a clear upper limit for the Agent: how much code it is allowed to generate per day, an amount that matches your actual ability to review it (a sketch of one way to enforce this follows the next paragraph). All the parts that determine the "overall shape" of the system, like architecture and APIs, should be written by hand. You can use autocomplete to keep the feel of handwritten code, or pair program with an Agent, but the key is: you must be in the code.

Because, writing code yourself, or watching it being built step by step, brings a sense of "friction." It is precisely this friction that makes you clearer about what you want to do, how the system works, and the overall "feel." This is where experience and "taste" come into play, and this is precisely what the most advanced models currently cannot replace. Slowing down, enduring a bit of friction, is exactly how you learn and grow.
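Circling back to the daily limit: one possible way to make it mechanical rather than aspirational is a small pre-merge check. This is a sketch under assumptions; the 400-line budget and the git-reflog-based counting are illustrative, not a prescribed tool:

```python
# Fail the merge when more lines were added in 24h than you can review.
import subprocess
import sys

DAILY_LINE_BUDGET = 400  # pick a number you can genuinely review end to end

# Lines added on the current branch over the last 24 hours (uses the reflog).
diff = subprocess.run(
    ["git", "diff", "--numstat", "@{1.day.ago}", "HEAD"],
    capture_output=True, text=True, check=True,
).stdout

added = sum(
    int(line.split("\t")[0])
    for line in diff.splitlines()
    if line.split("\t")[0].isdigit()  # skip binary files, reported as "-"
)

if added > DAILY_LINE_BUDGET:
    sys.exit(f"{added} lines added in 24h exceeds the {DAILY_LINE_BUDGET}-line review budget")
print(f"{added} lines added in 24h: within budget")
```

Run it in CI or as a pre-merge hook; the exact number matters far less than the fact that you chose it deliberately to match your review capacity.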

In the end, what you get will be a system that is still maintainable—at least no worse than before Agents appeared. Yes, past systems weren't perfect either. But your users will thank you because your product is "usable," not a pile of slapped-together garbage.

You will do fewer features, but more correctly. Learning to say "no" is a capability in itself. You can also sleep soundly because you at least still know what's happening in the system; you still hold the initiative. It is this understanding that allows you to compensate for the recall problems of agentic search, making the Agent's output more reliable and requiring less patching.

When the system has problems, you can step in and fix it; when the design was flawed from the start, you can understand the issue and refactor it into a better form. Whether there's an Agent or not isn't really that important.

All of this requires discipline. All of this depends on people.

Related Questions

Q: What are the main risks of using AI coding agents in production environments without proper oversight?

A: The main risks include amplified errors due to lack of learning and feedback loops, uncontrolled complexity from poor architectural decisions, low recall in agentic search leading to inconsistencies, and eventual system unmaintainability. Without human bottlenecks, minor issues compound rapidly, making the codebase untrustworthy and difficult to modify.

Q: How does the author suggest humans should collaborate with coding agents effectively?

A: The author recommends using agents for bounded, non-critical tasks like exploratory work or automating tedious processes, while humans retain control over system design, architecture, and quality assurance. Humans should set limits on code generation, review all outputs, and maintain friction by writing core components themselves to ensure understanding and maintainability.

Q: Why do AI-generated systems often become overly complex and unmanageable?

A: Agents act as "purveyors of complexity," imitating poor architectural patterns from their training data and making localized decisions without a global view of the codebase. This results in redundant code, unnecessary abstractions, and inconsistencies. Without human intervention, these issues accumulate far more quickly than in human-driven development, creating an unmanageable system.

Q: What is the "agentic search" problem mentioned in the article?

A: Agentic search refers to an agent's ability to find and recall relevant code in a large codebase. As the system grows, recall drops significantly, causing agents to miss existing implementations, introduce duplicates, or make inconsistent changes. This low recall exacerbates system chaos and reduces the reliability of agent-generated code.

Q: What does the author mean by "slowing down" as a solution in the AI agent era?

A: "Slowing down" means prioritizing thoughtful decision-making, human oversight, and disciplined development over raw code generation speed. It involves saying "no" to unnecessary features, setting limits on agent output, and maintaining hands-on involvement in coding and design. This approach preserves system understanding, control, and quality, ultimately leading to more reliable and maintainable software.
