Five Core Forms of AI Agent in YC's Eyes

marsbit | Published on 2026-05-20 | Last updated on 2026-05-20

Abstract

The article outlines five core architectural patterns for effective AI Agents, emerging from tools like Codex and Claude, that move beyond simple prompts towards reusable, process-based capabilities.

1. **Skills**: Reusable, parameterized workflows that function like method calls, allowing a single process (e.g., "/investigate") to handle various tasks based on input parameters.
2. **Thin Harness**: A lightweight execution framework (~200 lines) that manages the AI model's "hands and feet" (handling loops, file I/O, and context) without becoming bloated.
3. **Resolvers**: Routing tables that map tasks to specific Skills, preventing "context corruption" when managing dozens of Skills and ensuring outputs go to the correct locations.
4. **Latent vs. Deterministic Layer**: A critical separation where LLMs handle judgment, synthesis, and pattern recognition, while deterministic code handles tasks requiring precision, consistency, and low cost (like calculations).
5. **Memory**: A persistent, accumulating knowledge base (e.g., a markdown folder) with a "current trusted conclusion" section and an append-only timeline, enabling the system to learn and retain context over time.

Together, these patterns create "process power", a durable competitive advantage. Unlike one-off prompt-based applications whose value quickly commoditizes, a well-designed AI Agent system encodes experience into reusable, parameterized workflows, offloads stable rules to code, and continuously learns thr...

Editor's Note: As AI Agents move beyond one-off prompts and vibe coding into more complex workflow stages, the truly important question is no longer 'Can the model complete the task?' but rather 'Can we solidify AI capabilities into reusable, accumulable process assets?'

This article starts from Garry Tan's GBrain and summarizes five core forms that many are converging on when using Agent tools like Codex, Claude Code, Hermes: parameterizable Skills, lightweight execution frameworks (Thin Harness), routing Resolvers, an execution layer separating model judgment from deterministic code, and Memory for long-term context accumulation.

These modules, combined, point towards a new kind of 'process capability': codifying experience into workflows, abstracting tasks into parameters, handing stable rules to code, entrusting judgment and synthesis to models, and continuously depositing learnings through a memory layer. Compared to one-off generated applications or prompts, this kind of system is harder to replicate and more likely to become the foundation for individuals, small teams, and even companies to form a long-term competitive advantage in the AI era.

The original text follows:

I spent some time studying Garry Tan's GBrain. As someone without a technical background and not working in venture capital, I want to distill a few general structural forms I see in it, and where its real significance lies.

I believe many people are gradually converging on the same set of core structures. They can roughly be summarized into five forms, which also reflect how the use of agentic AI tools like Codex, Claude Code, Hermes, and OpenClaw is naturally evolving.

Related Reading: 'Thin Harness, Fat Skills: The Real Source of 100x AI Productivity'

Skills: From SOP to 'Method Call'

Skills are almost everyone's most natural starting point. Even without prompting, users intuitively start building them because their form feels very familiar. I initially understood it as a kind of SOP (Standard Operating Procedure), a documented process for getting something done. The user provides 'what to do,' and the Skill provides 'how to do it.'

Tan's understanding is that a Skill is more like a 'method call.' In programming, a method call means invoking a programmatic flow with parameters. The same code runs every time; what changes are the parameters: what data, what question, what objective. For example, the same process_invoice function can handle every invoice in the system, not just the one it was originally written for.

Skills are a similar structure. A Skill named /investigate might contain seven fixed steps; these steps don't change. What changes are the parameters: TARGET (who or what to investigate), QUESTION (what you want to figure out), DATASET (where to look for information). Point it at a whistleblower case in healthcare, and it works like a research analyst; point it at SEC filings, and it works like a legal investigator. The same file, the same seven steps; the difference is provided by the external world.
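To make the 'method call' idea concrete, here is a minimal sketch in Python. It is not how GBrain stores Skills: the `Skill` class, the `render` method, and the three step texts are all hypothetical (the article says /investigate has seven fixed steps but does not list them). Only the parameter names TARGET, QUESTION, and DATASET come from the text above.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class Skill:
    """A fixed procedure whose steps never change; only the parameters do."""
    name: str
    steps: tuple[str, ...]  # step templates with $PLACEHOLDER slots

    def render(self, **params: str) -> str:
        # substitute() raises KeyError when a parameter is missing,
        # so an incomplete call fails loudly instead of silently.
        lines = [
            f"{i}. {Template(step).substitute(params)}"
            for i, step in enumerate(self.steps, start=1)
        ]
        return "\n".join(lines)


# Hypothetical steps, for illustration only.
investigate = Skill(
    name="/investigate",
    steps=(
        "Collect every document in $DATASET that mentions $TARGET.",
        "Build a dated timeline of events involving $TARGET.",
        "Answer the question: $QUESTION. Cite the documents used.",
    ),
)

# Same Skill, two different "calls": only the parameters differ.
healthcare_prompt = investigate.render(
    TARGET="the whistleblower case",
    QUESTION="what was reported, and when",
    DATASET="the healthcare case files",
)
sec_prompt = investigate.render(
    TARGET="the company",
    QUESTION="which disclosures changed between filings",
    DATASET="the SEC filings",
)
```

The point of the design is that the case-specific details live only in the call, so the Skill file itself never needs editing to cover a new case.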

This is different from traditional SOPs. Most SOPs are written for a specific role or task, like 'Processing Accounts Payable.' Each use case gets its own set of documents. Skills are more abstract; the same process can handle a category of problems. A well-designed Skill can do the work of dozens of SOPs because the case-specific information is extracted from the document and moved into parameters. In practice, some Skills are closer to SOPs, others closer to method calls.

Thin Harness: Model is the Brains, Harness is the Hands and Feet

Models, like Opus, GPT-5.5, etc., are raw intelligence; Harnesses, like Claude Code, Codex CLI, Hermes, OpenClaw, are the execution frameworks that give models real 'hands and feet.' They handle loop execution, reading/writing files, managing context, enforcing safety constraints. Their core code is around 200 lines.
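As a rough illustration of how little a harness needs to do, the sketch below shows the loop in Python. It is not the code of Claude Code, Codex CLI, or any other real harness; `run_harness`, the action dictionary format, and `scripted_model` (a stand-in for a real model call) are assumptions made so the example runs on its own.

```python
import tempfile
from pathlib import Path
from typing import Callable

# An action is a plain dict such as {"tool": "read_file", "path": "note.md"}.
Action = dict
Model = Callable[[list], Action]


def run_harness(model: Model, workdir: Path, max_steps: int = 10) -> str:
    """Loop: ask the model for an action, execute it, feed the result back."""
    context: list = []
    root = workdir.resolve()

    def safe_path(name: str) -> Path:
        # Safety constraint: refuse paths that escape the working directory.
        path = (root / name).resolve()
        if not path.is_relative_to(root):
            raise PermissionError(f"path escapes workdir: {name}")
        return path

    for _ in range(max_steps):
        action = model(context)
        tool = action["tool"]
        if tool == "finish":
            return action["answer"]
        if tool == "read_file":
            result = safe_path(action["path"]).read_text(encoding="utf-8")
        elif tool == "write_file":
            safe_path(action["path"]).write_text(action["content"], encoding="utf-8")
            result = "ok"
        else:
            result = f"unknown tool: {tool}"
        # Context management: keep only the most recent observations.
        context.append(f"{tool} -> {result}")
        context[:] = context[-20:]
    return "stopped: step limit reached"


def scripted_model(context: list) -> Action:
    # Stand-in for an LLM: write a file, read it back, then finish.
    script = [
        {"tool": "write_file", "path": "note.md", "content": "hello"},
        {"tool": "read_file", "path": "note.md"},
    ]
    if len(context) < len(script):
        return script[len(context)]
    return {"tool": "finish", "answer": context[-1]}


with tempfile.TemporaryDirectory() as tmp:
    answer = run_harness(scripted_model, Path(tmp))
```

Everything domain-specific is absent from this loop on purpose: it knows two file tools and nothing about any particular task.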

Garry mentions a common mistake most people make: continuously stuffing more things into the Harness. I did the same. I ended up with 100 tool definitions and a pile of MCP servers. The result was the context window being filled with tool descriptions irrelevant to the current task. The model started confusing which tool to use, latency increased, accuracy dropped, eventually leading to so-called 'context corruption.'

Resolvers: Using Routing Tables to Solve Context Corruption

The solution to context corruption is to build a routing table. A Resolver's job is to map 'the incoming task type X' unambiguously to 'Skill Y should be called.' When you only have 5 Skills, you don't need a Resolver; but when you have 100 Skills, the descriptions become fuzzy, and the model easily fails to call the right Skill at the right time. Resolvers replace fuzzy pattern matching with explicit rules.
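A Resolver can be as simple as an ordered list of explicit rules. The sketch below is one possible shape in Python, not Tan's implementation: the `ROUTES` table, the `resolve` function, and the Skill names other than /investigate are hypothetical.

```python
import re
from typing import Optional

# Hypothetical routing table: each rule maps an explicit pattern to one Skill.
# Rules are checked in order, so more specific patterns should come first.
ROUTES = [
    (re.compile(r"\b(investigate|look into|dig into)\b", re.I), "/investigate"),
    (re.compile(r"\b(invoice|receipt)\b", re.I), "/process-invoice"),
    (re.compile(r"\b(meeting|call) notes\b", re.I), "/meeting-summary"),
]


def resolve(task: str) -> Optional[str]:
    """Return the Skill for a task, or None when no rule matches."""
    for pattern, skill in ROUTES:
        if pattern.search(task):
            return skill
    # No match: better to ask the user than to let the model guess.
    return None
```

Because the table is plain data, only the one matching Skill needs to be loaded into the context window, rather than a description of every Skill.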

Tan also runs a Resolver-like mechanism for files: a separate routing table that decides where the output of a Skill should land in the filesystem. This is the same 'audit–route' structure applied to a different problem, and it ensures outputs consistently land in the correct folders rather than wherever the model happens to guess.
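The same idea for files might look like the following Python sketch. The folder layout, `OUTPUT_ROUTES`, and `output_path` are assumptions for illustration; the article does not describe the actual table.

```python
from pathlib import PurePosixPath

# Hypothetical layout: one path template per Skill.
OUTPUT_ROUTES = {
    "/investigate": "research/{slug}.md",
    "/meeting-summary": "meetings/{date}-{slug}.md",
}


def output_path(skill: str, slug: str, date: str = "") -> PurePosixPath:
    """Look up where a Skill's output belongs; unknown Skills raise KeyError."""
    template = OUTPUT_ROUTES[skill]
    return PurePosixPath(template.format(slug=slug, date=date))
```

A Skill with no entry fails with a `KeyError` instead of writing somewhere arbitrary, which is the behavior the routing table exists to enforce.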

Skillify is another supporting idea of his: it's a quality loop to turn one-off Skills into long-term reusable infrastructure. Tan describes a 10-step process including: contract definition, using deterministic code where appropriate, unit testing, integration testing, LLM-as-judge evaluation, Resolver entry, audit script, checking which Skills have no call path, and end-to-end smoke tests. The test criterion is simple: if you have to ask the model the same question twice, it's a failure.
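One of those steps, checking which Skills have no call path, is easy to do deterministically. The Python sketch below is an assumed shape for such an audit, not Tan's script; the `audit` function and all Skill names except /investigate are made up.

```python
def audit(skills_on_disk: set, routes: dict) -> dict:
    """Compare the Skills that exist with the Skills the Resolver can reach."""
    routed = set(routes.values())
    return {
        # Skills no rule points to: they can never be called.
        "unreachable": skills_on_disk - routed,
        # Rules pointing at Skills that do not exist: calls would fail.
        "dangling": routed - skills_on_disk,
    }


report = audit(
    skills_on_disk={"/investigate", "/process-invoice", "/old-report"},
    routes={
        "investigate": "/investigate",
        "invoice": "/process-invoice",
        "summary": "/meeting-summary",
    },
)
```

Here `report` flags `/old-report` as unreachable and `/meeting-summary` as a dangling route.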

Latent vs. Deterministic: Judgment to the Model, Deterministic Tasks to Code

It's crucial to distinguish which work should be given to the LLM and which to a deterministic system. LLMs excel at judgment, synthesis, pattern recognition, and reading between the lines; but they are not good at arithmetic, combinatorial optimization, or anything requiring the same answer every time. LLMs are inherently probabilistic; when a deterministic solution can solve the problem, don't use an LLM.

Non-technical people often underestimate the value of the deterministic layer. The default reaction is to throw everything at the model. But if something can be done deterministically, you almost always should do it that way. And you don't need to be a programmer, because the model can write the code for you. What really needs training is a discipline: each time, ask yourself whether this can be done reliably and cheaply with code. If the answer is yes, have the model write that code.
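Arithmetic is the clearest case. The Python sketch below totals an invoice with exact decimal math; the `invoice_total` function, the line items, and the tax rate are made up for illustration. It returns the same answer on every run, which a model cannot promise.

```python
from decimal import Decimal, ROUND_HALF_UP


def invoice_total(line_items: list, tax_rate: str) -> Decimal:
    """Sum (description, amount) pairs and add tax, rounded to the cent."""
    # Amounts arrive as strings so no binary floating-point error creeps in.
    subtotal = sum((Decimal(amount) for _, amount in line_items), Decimal("0"))
    tax = (subtotal * Decimal(tax_rate)).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    return subtotal + tax


total = invoice_total([("hosting", "19.99"), ("support", "5.01")], tax_rate="0.08")
```

The model's job in this split is to read the invoice and decide what the line items are; the addition itself belongs to code like this.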

Memory: Making the System Truly Accumulative

For a system to be useful, it must have some form of memory. I'm not yet sure what the correct form is; many people are building it in different ways: vector embeddings, semantic similarity, knowledge graphs, hybrid storage, etc. Tan's approach is the same as mine: just a folder of markdown files.

His structure is: one page per person, one page per company, one page per concept. The top of each page is the 'Current Believed Conclusion,' a synthesized judgment that is continuously rewritten and updated as new evidence arrives; the bottom is an append-only timeline.
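A page with that shape can be maintained with a few string operations. The following Python sketch assumes specific heading names (`## Current Believed Conclusion`, `## Timeline`) and helper functions that are not from the article; it only illustrates the rule that the top is rewritten while the bottom is append-only.

```python
CONCLUSION_HEADING = "## Current Believed Conclusion"
TIMELINE_HEADING = "## Timeline"


def new_page(title: str, conclusion: str) -> str:
    return (
        f"# {title}\n\n{CONCLUSION_HEADING}\n\n{conclusion}\n\n"
        f"{TIMELINE_HEADING}\n"
    )


def append_event(page: str, date: str, note: str) -> str:
    """Append-only: existing timeline lines are never edited or reordered."""
    return page.rstrip("\n") + f"\n- {date}: {note}\n"


def rewrite_conclusion(page: str, conclusion: str) -> str:
    """Replace the synthesis at the top; keep the timeline exactly as it was."""
    # Assumes the conclusion text itself never contains the timeline heading.
    head, sep, timeline = page.partition(TIMELINE_HEADING)
    title_line = head.splitlines()[0]
    new_head = f"{title_line}\n\n{CONCLUSION_HEADING}\n\n{conclusion}\n\n"
    return new_head + sep + timeline


page = new_page("Acme Corp", "No funding known.")
page = append_event(page, "2026-05-01", "Met founder")
updated = rewrite_conclusion(page, "Seed-funded.")
```

After the rewrite, `updated` carries the new conclusion while the timeline entry is untouched.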

Choosing markdown yields several consequences. First, the files are the system's primary record, not some export. You can open it in VS Code, edit it manually, and the Agent will automatically read those changes. Second, typed relationships, like works_at, invested_in, founded, attended, advises, are automatically extracted via regex each time something is written, so the knowledge graph can connect itself without consuming tokens. This specific schema fits his work, but for others, it likely needs to be re-tailored based on one's own profession and business context.
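The regex extraction could look roughly like this in Python. The relationship names come from the article, but the line notation (`- works_at: Acme Corp`) and the `extract_edges` function are assumptions; the article does not show how the relationships are actually written in the files.

```python
import re

RELATION_TYPES = ("works_at", "invested_in", "founded", "attended", "advises")

# Matches bullet lines such as "- works_at: Acme Corp" (assumed notation).
RELATION_RE = re.compile(
    r"^[ \t]*-[ \t]*(" + "|".join(RELATION_TYPES) + r"):[ \t]*(.+?)[ \t]*$",
    re.MULTILINE,
)


def extract_edges(subject: str, page_text: str) -> list:
    """Return (subject, relation, object) triples found in a page."""
    return [(subject, rel, obj) for rel, obj in RELATION_RE.findall(page_text)]


sample = "# Jane Doe\n\n- works_at: Acme Corp\n- invested_in: Beta Labs \n- likes: coffee\n"
edges = extract_edges("Jane Doe", sample)
```

Lines with an unknown relation type, such as `likes` here, are simply ignored, and no tokens are spent because no model is involved.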

Furthermore, a signal detector runs in the background. If a person is mentioned once, a stub page is created; if they are mentioned three times across sources, a web lookup is triggered to fill in their details; after a meeting, a full pipeline runs. A nightly 'dream cycle' scans conversations, refreshes outdated entity information, and fixes broken references. The base layer is text; everything built on top is cheap and composable.
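The threshold logic of such a detector is small enough to sketch in Python. The `SignalDetector` class and the action names are hypothetical, and reading 'three times across sources' as three distinct sources is my interpretation, not something the article specifies.

```python
from collections import defaultdict


class SignalDetector:
    """Track which sources mention each entity and report what to do next."""

    def __init__(self, enrich_threshold: int = 3) -> None:
        self.enrich_threshold = enrich_threshold
        self._sources = defaultdict(set)  # entity -> set of source ids

    def observe(self, entity: str, source: str) -> str:
        """Record a mention and return the action it triggers."""
        seen = self._sources[entity]
        before = len(seen)
        seen.add(source)
        after = len(seen)
        if before == 0:
            return "create_stub"  # first mention anywhere
        if before < self.enrich_threshold <= after:
            return "enrich_from_web"  # threshold crossed exactly now
        return "none"


detector = SignalDetector()
actions = [
    detector.observe("Jane Doe", "email-1"),
    detector.observe("Jane Doe", "email-1"),  # repeat source, no new signal
    detector.observe("Jane Doe", "meeting-2"),
    detector.observe("Jane Doe", "doc-3"),
    detector.observe("Jane Doe", "doc-4"),
]
```

Each action fires once, at the moment its threshold is crossed, so re-reading the same sources does not trigger repeated work.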

There are, of course, more details underneath, but I believe these are the most important contours, and they are to a considerable degree universal.

Personally, I've already built about half of such an architecture. Previously, I hadn't reached the scale necessitating a true Resolver, but now I have, so I recently did a minor refactor to make my system model-agnostic and built a Resolver into it. The key part I haven't built yet is the background-running signal detector and the nightly dream cycle—the automatic info completion and organization mechanisms—which is what I want to try adding next.

I suspect that different builders converging on a similar structure is itself a signal: this form might not work for everyone, but it is likely to be broadly useful. Even though specific implementation details vary significantly, the overall structure is being independently discovered by more and more people.

A question I've been asking myself lately is: How do you build a sustainable competitive advantage with AI?

Everyone is excited about vibe-coded apps and one-off prompts, which are of course very cool. I myself started that way and got hooked. But anything that can be built with a one-off prompt will, in equilibrium, have its price driven down to the cost of the tokens required to build it—just a few cents.

For example, someone replicating MyFitnessPal, selling it at half the price, and making a million dollars is impressive. But soon, someone else will replicate it and sell it cheaper. The cycle continues until the profit margin is completely squeezed.

The truly sustainable thing is a kind of 'process capability.' Using the framework from Hamilton Helmer's 7 Powers, the architecture described above embodies process power.

7 Powers posits that companies can sustain above-average returns over the long term because they possess one of seven structural powers. Any advantage not rooted in these powers will eventually be competed away.

For small and medium businesses and early-stage companies, five of Helmer's seven powers are essentially closed doors. Economies of scale require scale; network effects and switching costs can be built but require a large user base first; cornered resources often mean patents or similar exclusive assets, which most companies don't have; brand typically takes a decade to build and can't be shortcut.

The remaining two are counter-positioning and process power.

Counter-positioning refers to a business model existing giants cannot copy because doing so would harm their own core business. Such opportunities exist sometimes but are not always available.

Thus, the most realistic path left is process power. And a well-designed AI system is precisely a tool for generating process power.

This is essentially the same work as building high-quality SOPs or proprietary software in-house: processes are encoded, cases are parameterized, the underlying deterministic systems are fast and reliable, and the memory layer continuously absorbs past learnings. It amplifies the 'productization of services': you can offer a service or product at lower cost or higher quality because the entire job has been structured.

Imagine an accountant building such a system. The memory layer is a folder; each client has a markdown file containing a current believed conclusion—like entity structure, annual tax positions, ongoing audits—and a timeline logging meetings, decisions, and changes.

She has Skills, like /year-end-review, /quarterly-estimate, /audit-prep. The same process can be executed parametrically for different clients.

She has a deterministic layer that includes tax forms, depreciation schedules, IRS documents, clients' historical returns, and so on.

Add a mechanism like log tidying or a dream cycle. For example, the system automatically notices at night that a partner's K-1 allocation dropped 40% without a stated strategy change; or it notices that one client's home office deduction structure could be applied to another client. The structure is reusable, but each client's identity and private details stay put.
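The numeric half of that nightly check belongs in the deterministic layer. Below is a minimal Python sketch; the `flag_allocation_drops` function, the partner names, and the figures are invented for illustration. Whether a flagged drop has a 'stated strategy change' behind it is the judgment left to the model and the memory layer.

```python
def flag_allocation_drops(prior: dict, current: dict, threshold_pct: int = 40) -> list:
    """Return partners whose allocation fell by at least threshold_pct percent.

    Amounts are whole cents, so the comparison uses exact integer arithmetic.
    """
    flagged = []
    for partner, before in sorted(prior.items()):
        after = current.get(partner, 0)
        if before > 0 and (before - after) * 100 >= threshold_pct * before:
            flagged.append(partner)
    return flagged


# Made-up figures in cents, for illustration only.
prior_year = {"partner_a": 10_000_000, "partner_b": 5_000_000}
this_year = {"partner_a": 6_000_000, "partner_b": 4_800_000}
flagged = flag_allocation_drops(prior_year, this_year)
```

In this example only `partner_a` is flagged, since its allocation fell by exactly 40% while `partner_b` fell by 4%.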

Thus, she can charge a small premium and serve more clients per year, and competitors find her hard to replicate because this structure didn't appear out of thin air after her success; it was accumulated from the start.

On the surface, the tool is just a folder of markdown files. But every line in every file comes from a great deal of intentional testing, building, and iteration. What forms the competitive moat isn't the files themselves, but the process capability they embody.
