From IDE to Terminal: A Practical Guide to Agent Engineering

marsbit · Published on 2026-06-03 · Last updated on 2026-06-03

In March 2026, Matt Van Horn posted a long thread on platform X with a direct title: "Every Claude Code Trick I Know." The post's view count eventually settled at 913,000, with the comments section turning into a heated debate.

The spark of the debate wasn't a specific instruction or configuration parameter, but a statement he made at the beginning: No IDE. When describing his working process, he mentioned that he never once opened a graphical editor; all development operations occurred within the terminal command line and a file called plan.md. Some found this absurd and asked how code reading, debugging, and refactoring were handled; meanwhile, senior engineers commented, "This is the way I've been looking for."

Two months later, Chinese developer meng shao compiled and published this methodology along with community-derived practices, naming it "Comprehensive Record of Practical Tips for Agent Engineering." It is not a tool review, but a set of operating principles centered around the "Research → Plan → Work" cycle, underpinned by 22 specific, reproducible or discussable techniques.

This article compiles some of the most discussed AI workflow guides online, from which we might glean some common insights.

What You Actually Lose When You Close the IDE

A graphical IDE provides developers with more than just a code editor. It's a complete sensory system: syntax highlighting lets you instantly distinguish variables from keywords, real-time error hints tell you where problems lie as you type, breakpoint debugging allows you to observe variable state changes line by line, and file trees and breadcrumb navigation prevent you from getting lost in large projects. This entire visual feedback mechanism forms a default assumption: the person writing code needs to see the state of every line with their own eyes.

The terminal command line + Markdown file workflow strips away this visual protective shell. All that remains in front of you is a blinking cursor and a plan file you wrote yourself. No red squiggly lines marking errors, no auto-complete pop-ups, no file tree. Matt Van Horn used the term "No IDE" in his post. Its true meaning isn't to reject all graphical tools, but to shift the control logic of development from "manual line-by-line confirmation" to "batch delegation and execution."

Boris Cherny, head of Claude Code, shared his own Claude Code usage patterns via Threads and other channels from late 2025 to early 2026. His approach is CLI-first, using plan mode as the starting point for all tasks. This differs fundamentally from the IDE mindset: in a traditional IDE, the human is the active code producer, with AI completions merely assisting; in the plan-driven terminal workflow, the human is the direction setter and verifier, while the generation of code and the choice of implementation path are autonomously handled by the Agent.

Abandoning the IDE means giving up the sense of security that comes from "typing every line of code by hand." This sounds like a loss, but for developers who already endure constant context switching in large projects, it is also a form of offloading: you no longer need to jump between reading code, writing code, searching documentation, and switching files. You just need to articulate a requirement clearly, then delegate execution to a set of concurrently operating Agents.

The framework meng shao proposed in his summary is "humans lead the direction, Agents execute." In the IDE era, humans had to both lead the direction and handle the execution details; these two roles were intertwined. The new workflow attempts to separate these two roles, leaving only the former to humans.

plan.md Is Not Documentation, It's a Contract

The most frequently mentioned filename in this workflow is plan.md. It sounds like project documentation, but its actual function is completely different.

A project's README or development documentation is for humans to read, explaining architectural decisions, recording interface agreements, and helping new members get onboarded. The primary reader of plan.md is not a human, but an Agent. Its structure revolves around three things: problem definition, solution description, and a task list in the form of checkboxes. meng shao put this relationship bluntly in his post: the role of plan.md is to "constrain the Agent from slacking off."

LLMs have a known issue in long conversations, which the community calls "context corruption." As the conversation history grows longer, the model's attention to the initial objectives naturally decays. It might forget the original requirement boundaries midway, or skip certain verification steps out of laziness. A community project named "洁癖.skill" (the name roughly translates to "neat freak") specifically addresses this, providing methods to automatically organize session files and update persistent memory stores. Its core insight is that an Agent's long-term performance depends less on the quality of a single prompt than on whether it has a stable external memory mechanism.

plan.md is this external memory. It's persisted on the filesystem and doesn't disappear when a session ends. Every new Agent session can reload context from this file, rather than relying on the decaying chat history from the previous round.
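A minimal Python sketch of the reload idea, under stated assumptions: the plan layout (a task list using Markdown checkboxes) and the helper name `build_session_preamble` are hypothetical and are not part of Claude Code or Compound Engineering. It only illustrates that the opening context of a fresh session can be rebuilt from the file instead of from chat history.

```python
from pathlib import Path
import tempfile

def build_session_preamble(plan_path: Path) -> str:
    """Compose the opening message for a fresh agent session from plan.md."""
    plan_text = plan_path.read_text(encoding="utf-8")
    # Unfinished items start with "- [ ]" (Markdown task-list syntax).
    open_tasks = [
        line.strip()[len("- [ ]"):].strip()
        for line in plan_text.splitlines()
        if line.strip().startswith("- [ ]")
    ]
    header = "Resume work from the plan below. Do not rely on earlier chat history."
    todo = "\n".join(f"* {task}" for task in open_tasks) or "* (nothing left)"
    return f"{header}\n\nRemaining tasks:\n{todo}\n\n--- plan.md ---\n{plan_text}"

# Demo with a throwaway plan file (contents are made up for illustration).
with tempfile.TemporaryDirectory() as tmp:
    plan = Path(tmp) / "plan.md"
    plan.write_text(
        "# Problem\nExport is slow.\n\n## Tasks\n"
        "- [x] Profile the export job\n"
        "- [ ] Add a streaming writer\n",
        encoding="utf-8",
    )
    preamble = build_session_preamble(plan)
    print(preamble)
```

Because the preamble is derived from the file on every start, a session that ends abruptly loses nothing that was already written into the plan.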

Compound Engineering is the core open-source plugin supporting this workflow, developed by Kieran Klaassen of Every Inc. It provides a set of command-line instructions, among which /ce:plan can automatically generate a draft of plan.md based on the developer's input. After generation, the developer's job is to review and correct: the Agent might have wrong assumptions about technical choices, might underestimate the complexity of a module, or might completely misunderstand business logic. Human intervention at this point isn't about tweaking code, but injecting domain knowledge that the Agent lacks into the plan file, then handing the corrected plan back to the Agent for execution.

This design aligns closely with a point emphasized by Boris Cherny in his Claude Code usage principles: concentrate human expert judgment at the single node of plan review, rather than dispersing it across every step of execution. In his words, if every step requires human confirmation, it's essentially no different from manually writing code.

An effective plan.md isn't long. It usually contains clear acceptance criteria, each corresponding to a checkbox. These checkboxes are execution anchors for the Agent and acceptance criteria for the human. After completing the Work phase, developers don't read the code; they check whether these checkboxes have been ticked off one by one.
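The checkbox review described above can be sketched in Python. The sample plan, its three headings, and the function name `acceptance_report` are illustrative assumptions; the article does not prescribe an exact file format beyond problem definition, solution description, and a checkbox task list.

```python
import re

# Matches Markdown task-list items such as "- [ ] text" or "- [x] text".
CHECKBOX = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")

def acceptance_report(plan_text: str) -> tuple[list[str], list[str]]:
    """Return (done, pending) checkbox items found in a plan."""
    done, pending = [], []
    for line in plan_text.splitlines():
        match = CHECKBOX.match(line)
        if not match:
            continue
        mark, text = match.groups()
        (pending if mark == " " else done).append(text.strip())
    return done, pending

# Hypothetical plan content, used only to exercise the parser.
SAMPLE_PLAN = """\
# Problem
Users cannot reset their password from the mobile app.

# Solution
Add a reset endpoint and a confirmation email.

# Tasks
- [x] POST /reset returns 202 for a known address
- [x] Unknown addresses get the same response (no account enumeration)
- [ ] Reset tokens expire after 30 minutes
"""

done, pending = acceptance_report(SAMPLE_PLAN)
print(f"{len(done)} of {len(done) + len(pending)} criteria ticked")
for item in pending:
    print("still open:", item)
```

A ticked box only records that the Agent claims the criterion is met, so a report like this is a starting point for acceptance rather than proof of it.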

The Three Phases of a Requirement: Research → Plan → Work

The core skeleton of this workflow is a closed loop consisting of three phases: Research, Plan, and Work. It's not complex, but the tool support and the manner of human intervention in each stage have clear divisions of labor.

The goal of the Research phase is to allow the Agent to establish an information advantage before starting work. Common methods involve using the /ce:brainstorm instruction or loading specific Research skill packages. Matt Van Horn open-sourced a skill called last30days-skill, which had garnered over 10,000 GitHub stars by the end of March 2026. Its function is to let the Agent concurrently scrape content related to a specified topic from communities like Reddit, platform X, Hacker News, etc., from the past 30 days, and then output a structured analysis summary. Suppose you're starting a project involving a new tech stack; your Agent can pull back the latest community evaluations, known pitfalls, and recommended alternatives for this tech stack within minutes, instead of you manually opening a dozen browser tabs.
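The concurrent fan-out can be illustrated with a small Python sketch. This is not how last30days-skill is implemented; `fetch_source` is a hypothetical stub that performs no network access, and the example topic is arbitrary. The point is only the shape: query several sources at once, then merge the results into one structured digest.

```python
import asyncio

# Hypothetical stand-in for real source fetchers; a real skill would call
# each community's API or a scraper here. No network access is performed.
async def fetch_source(source: str, topic: str) -> dict:
    await asyncio.sleep(0.01)  # simulate I/O latency
    return {"source": source, "topic": topic, "posts": [f"{source} post about {topic}"]}

async def research(topic: str, sources: list[str]) -> list[dict]:
    """Query every source concurrently and return results in source order."""
    return await asyncio.gather(*(fetch_source(s, topic) for s in sources))

def summarize(results: list[dict]) -> str:
    """Flatten per-source results into a short structured digest."""
    lines = [f"Topic: {results[0]['topic']}"] if results else []
    for result in results:
        lines.append(f"- {result['source']}: {len(result['posts'])} post(s)")
    return "\n".join(lines)

results = asyncio.run(research("htmx", ["Reddit", "X", "Hacker News"]))
digest = summarize(results)
print(digest)
```

Since the fetches overlap, total wall time is close to the slowest source rather than the sum of all of them, which is what makes a "within minutes" survey plausible.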

The output of Research isn't the final answer; it's informational input. This information enters the Plan phase, becoming material for generating plan.md.

The Plan phase uses the /ce:plan instruction. Based on discoveries from the Research phase, the developer's initial requirements, and the project's existing code context, the Agent generates a draft plan.md. Compound Engineering's design philosophy is to spend 80% of the time on planning and review, and 20% on execution. This ratio seems radical at first glance, but the logic is straightforward: a clearly written plan with well-defined boundaries and specific acceptance criteria keeps rework in the execution phase to a minimum.

What developers need to do in the Plan phase includes: correcting the Agent's wrong assumptions about technical solutions, adding business constraints the Agent is unaware of, adjusting task breakdown granularity and execution order, and adding edge cases the Agent tends to skip into the checkboxes. This review process is itself a form of "external memory injection": at this point the human writes into the plan file the tacit knowledge that rarely surfaces naturally in conversation.

The Work phase uses the /ce:work instruction. The Agent reads plan.md, breaks down tasks into subtasks, and dispatches them to sub-Agents for parallel execution. Boris Cherny once shared a number: using this plan-driven workflow, he produced 259 PRs in a month. This number says nothing about code quality. What it shows is that once decision-making in the execution phase is delegated to the Agent, the human bottleneck is no longer typing speed.
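A rough Python sketch of the dispatch pattern, not of /ce:work itself, whose internals the article does not describe. `run_subagent` is a hypothetical worker with a fake success signal; a real setup would hand each task to an actual sub-Agent. The sketch assumes task descriptions are unique within the plan.

```python
from concurrent.futures import ThreadPoolExecutor

def run_subagent(task: str) -> tuple[str, bool]:
    """Hypothetical worker: a real setup would hand the task to a sub-agent."""
    succeeded = "flaky" not in task  # stand-in for a real success signal
    return task, succeeded

def work(plan_text: str, max_workers: int = 4) -> str:
    """Run every open checkbox task in parallel and tick the ones that succeed."""
    lines = plan_text.splitlines()
    # Map each open task's text to the index of its line in the plan.
    open_items = {
        line.split("- [ ]", 1)[1].strip(): index
        for index, line in enumerate(lines)
        if line.lstrip().startswith("- [ ]")
    }
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        outcomes = list(pool.map(run_subagent, open_items))
    for task, succeeded in outcomes:
        if succeeded:
            index = open_items[task]
            lines[index] = lines[index].replace("- [ ]", "- [x]", 1)
    return "\n".join(lines)

PLAN = "- [ ] Build settings page\n- [ ] Add flaky integration test\n- [x] Write schema"
updated = work(PLAN)
print(updated)
```

Tasks that fail stay unticked, so the plan file continues to show exactly what still needs human attention after the parallel run.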

A key point that is easily overlooked: the phases are not strictly sequential, and the Research and Plan phases can cycle multiple times. If the Agent exposes information gaps during the Plan phase, you can switch back to Research mode at any time to fill them. If errors are discovered after the Work phase begins, you can also return to the Plan phase for corrections. This loop isn't a linear waterfall flow; it's a ring that allows rollback and adjustment.

Six Actionable Techniques You Can Try Right Away

Among the currently public information, the following six techniques all have clear origins and enough detail for a developer to try directly. They are not abstract principles, but specific operations down to "what to type in the terminal," "what to write in a file," and "which tool to use."

Technique One: Generate a plan as soon as you have an idea; don't mentally work it out first.

meng shao placed this point quite early in the original post. His logic is that the human brain is good at reviewing, not at constructing the complete tree structure of complex logic from scratch. When receiving a new requirement, developers habitually rehearse implementation paths in their minds repeatedly, but this process is highly inefficient due to the limited capacity of working memory. The correct approach is to have the Agent generate the first draft of the plan as soon as the idea appears, even if it's crude, contains errors, and needs significant correction. Reviewing a flawed plan is much easier than writing a perfect plan from scratch.

Technique Two: Don't read the plan yourself; have the Agent summarize it in a few sentences.

Reading a long plan file is itself a cost. One of Boris Cherny's 22 Tips is to use natural language instructions to have the Agent provide a TLDR, or use "eli5" to explain the plan's core points in the simplest terms. If you have 5 minutes for review, spend the first 3 on the Agent's summary, and the last 2 checking only the parts you find risky. The essence of this technique is delegating "reading comprehension" as well, with humans only looking at what they absolutely must see.

Technique Three: Parallel multi-terminal use.

Matt Van Horn mentioned in his post that he would open four or more terminal windows simultaneously, having different Agent instances handle different subtasks. One working on a front-end page, one writing a back-end interface, one running tests, one scraping documentation. This practice turns traditional single-threaded development into multi-threaded scheduling. The cost is that you need to manage the output and status of different terminals yourself, with no graphical workbench to provide unified monitoring. For developers accustomed to having a global view within a single IDE window, there's an initial anxiety of "not knowing which one went wrong."
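One way to ease the "which one went wrong" anxiety is a small status collector. The Python sketch below is an assumption-laden illustration, not something from Matt Van Horn's post: the job names and commands are placeholders (tiny Python one-liners so it runs anywhere), and real Agent sessions would be long-lived rather than one-shot commands.

```python
import subprocess
import sys

# Each entry stands for one terminal's job. The commands are placeholders
# so the sketch runs anywhere; swap in real ones.
JOBS = {
    "frontend": [sys.executable, "-c", "print('page built')"],
    "backend": [sys.executable, "-c", "print('api ready')"],
    "tests": [sys.executable, "-c", "import sys; print('1 failed'); sys.exit(1)"],
}

def run_all(jobs: dict[str, list[str]]) -> dict[str, tuple[int, str]]:
    """Start all jobs at once, then collect each exit code and last output line."""
    procs = {
        name: subprocess.Popen(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
        )
        for name, cmd in jobs.items()
    }
    status = {}
    for name, proc in procs.items():
        output, _ = proc.communicate()  # waits for this job to finish
        stripped = output.strip()
        last_line = stripped.splitlines()[-1] if stripped else ""
        status[name] = (proc.returncode, last_line)
    return status

status = run_all(JOBS)
for name, (code, last_line) in status.items():
    flag = "ok" if code == 0 else "FAILED"
    print(f"{name:<10} {flag:<7} {last_line}")
```

All processes are started before any of them is waited on, so the jobs genuinely overlap; the table at the end plays the role of the unified monitor that the terminal setup lacks.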

Technique Four: Use voice instead of keyboard input for complex architecture instructions.

meng shao mentioned using tools like monologue for input. The specific practice is, when faced with a lengthy system design idea, instead of typing it out word by word, use voice to narrate the entire thought process, letting a speech-to-text tool feed the content into the terminal or plan file. Matt Van Horn termed this "Get Voice-Pilled." He believes voice is better than typing for maintaining the flow of complex architectural thinking because the speed of speech better matches the natural unfolding rhythm of logical flow. The actual effectiveness of this technique for Chinese developers currently lacks sufficient feedback, and the Chinese speech support capability of monologue also requires further confirmation.

Technique Five: Use email to trigger asynchronous tasks.

Matt Van Horn shared a specific scenario in an April 2026 post: while putting his child to sleep, he used the agentmail tool to send an email to his own Claude Code instance, triggering a remote code build task. When the child fell asleep, the build was already complete, with results waiting for him to verify in the terminal. This essentially liberates development activities from the physical constraint of "having to sit in front of a computer," allowing the Agent to work continuously in the background. The prerequisite is that your trust in the Agent is high enough to let it execute autonomously without seeing the screen.

Technique Six: Use the tool stack as a skills marketplace.

Projects like AgentSkills present a core idea: developers don't need to build every Agent capability from scratch. General capabilities like file management, news monitoring, and web scraping already have community-maintained skill packages available for loading. In the terminal workflow, loading a new skill is about as simple as installing a plugin; you just need to declare the source and parameters of the skill package in the configuration, and the Agent can automatically acquire the corresponding tool-calling capabilities. Claude Code Video Toolkit serves as one example of community practice, adding video content understanding capabilities to the CLI environment. Although its application scenarios are still relatively niche, it illustrates that the capability boundaries of terminal Agents can be continuously expanded through skill packages.

When This Workflow Will Bite Back

There's no shortage of opposition to this workflow. Synthesizing community discussions, problems mainly focus on three directions.

The first issue is the boundary of the applicable audience. Boris Cherny's 22 Tips themselves have an implicit prerequisite: users need sufficient architectural decomposition ability and prompt constraint ability. People who can independently complete system design know the boundaries of "what to let the Agent do and what not." For developers who still rely on IDE error hints, code highlighting, and breakpoint debugging to understand code logic, closing the graphical editor means closing their familiar information acquisition channels. This isn't a question of skill level, but a difference in the sensory channels their work style depends on. Beginner developers build mental models of program execution through line-by-line debugging; this learning process lacks a substitute in the terminal workflow.

The second issue is risk concentration. In a traditional IDE, errors are gradually exposed during execution: compilation errors, type mismatches, runtime exceptions. Humans have the opportunity to discover and correct problems at every step. In a plan-driven workflow, all decision review is compressed into the single node of the Plan phase. If the human's review at this node isn't thorough enough, the Agent will faithfully and efficiently amplify the errors during the Work phase. By the time you discover it, multiple files might already be tainted by faulty logic.

The third issue was raised by Matt Van Horn himself; he called it "AI Psychosis." This term doesn't refer to AI having problems, but to humans having problems: constructing Agent loops itself can bring a strong intellectual high, similar to the positive feedback of constantly optimizing in a strategy game. Developers might become obsessed with polishing the Agent workflow itself, constantly trying new techniques, adding new sub-Agents, optimizing the structure of the Plan, while forgetting a basic question: what value are you actually delivering to the user? The tool becomes the goal, and requirements are placed second.

These three issues point to the same conclusion: the plan.md-driven terminal workflow is an efficiency amplifier designed for people who "clearly know what they want," not a learning tool for those "still figuring out what they should do." Its applicability boundary lies not in the choice of tech stack, but in the user's stage. If you are the person who has already drawn the complete system architecture on paper, this workflow can speed up your execution by an order of magnitude. If you are still trying to understand the problem itself by writing code, then the visual feedback system of the IDE is a crutch you shouldn't discard yet.

Currently, the core components of this workflow include the runtime environment layer (Claude Code CLI or Codex CLI), the process management layer (the Compound Engineering plugin), and the skill extension layer (projects like last30days-skill and agentmail). They are all iterating rapidly; instruction formats, file conventions, and plugin systems are subject to change. Developers entering now need to accept that the pitfalls they encounter might not have community answers yet, because the entire chain is still in an early stage of practice and far from being a "stable toolchain." But this is precisely the window in which early adopters can build a head start.

Related Questions

Q: What is the main shift in developer workflow described in the article, and what tool is central to it?

A: The main shift is moving from a traditional IDE-centric workflow to a terminal-driven workflow centered around an AI agent. The central tool is a Markdown file named 'plan.md', which acts as a contract between the human developer and the AI agent to outline the problem, solution, and checklist for execution.

Q: According to the article, what are the three core phases of the workflow, and what is the primary goal of the first phase?

A: The three core phases are Research, Plan, and Work, forming a closed-loop cycle. The primary goal of the Research phase is to establish information advantage for the agent before execution begins, often by gathering and analyzing community feedback and documentation on a given topic.

Q: What is the 'context corruption' problem mentioned, and how does 'plan.md' address it?

A: 'Context corruption' refers to the known issue where Large Language Models (LLMs) naturally lose attention to early goals as a conversation history gets longer. The 'plan.md' file addresses this by serving as persistent, external memory on the filesystem. It provides a stable context that can be reloaded in new agent sessions, preventing the loss of initial requirements and constraints.

Q: Name one of the six actionable tips provided in the article and briefly explain its purpose.

A: Tip 1: Generate a plan immediately with an agent when you have an idea, rather than mentally working it out yourself. The purpose is to leverage the agent's ability to quickly build a detailed (if imperfect) plan, which is easier for the human to review and correct than creating a perfect plan from scratch.

Q: What are two potential downsides or risks of adopting this terminal and agent-driven workflow?

A: 1. It is unsuitable for novice developers who rely on IDE visual feedback (like syntax highlighting and step-by-step debugging) to build their mental model of code execution. 2. It introduces risk concentration. If the human's review during the Plan phase is insufficient, errors can be efficiently amplified by the agent during the Work phase, potentially corrupting multiple files before detection.
