The Right Way to Use Skills: 5 Reflections After Anthropic Publicly Shared Its Internal Methodology

marsbit · Published on 2026-06-08 · Last updated on 2026-06-08

Abstract

A deep dive into Anthropic's internal methodology for building effective AI "Skills" reveals five key insights for maximizing their value. First, Skills should focus on capturing "Gotchas" and tacit organizational knowledge—like common pitfalls and undocumented rules—rather than restating general information the AI already knows. Second, think of Skills as a form of "Context Engineering"; they are best structured as folders, not monolithic documents. A core `SKILL.md` file should act as a navigational index, progressively pulling in detailed references, examples, and assets only as needed to avoid overwhelming the model's context window. Third, whenever possible, automate repetitive tasks with scripts. This preserves the model's reasoning capacity for judgment and analysis, while scripts reliably handle the execution, saving tokens and improving accuracy. Instructions within a Skill provide the "why" and the expert judgment, while scripts provide the concrete "how." Fourth, a Skill's description is critical and often misunderstood. It should not be a list of features but a routing rule that clearly signals *when* the Skill should be triggered based on user intent and common phrasing. Finally, as Skills scale from personal tools to team-wide assets, management is crucial. Anthropic advocates for a lightweight, organic approach: let new Skills spread organically within small groups first. Those that prove genuinely useful through adoption naturally graduate to a formal marke...

Author: AI Product Aying

I read a blog post by the Anthropic team titled "Lessons from building Claude Code: How we use skills." This is probably the most in-depth practical summary I've seen about Skills so far.

Skills aren't that complicated, but doing them well isn't that easy either.

I remember that when Skills first became popular, everyone loved making all kinds of writing-style Skills and composition Skills. It seemed that as long as you stuffed your writing style into one, the model would consistently write in that style.

But later, after trying a bunch myself, I found it often just didn't work.

The reason is that a writing-style Skill can contain thousands or even tens of thousands of words. Once it loads, it eats up a big chunk of the context, and when the context gets heavy, the model's reasoning ability tends to drop.

You often end up with this situation: the style is learned, but the content becomes shallow, and the analytical ability weakens.

There's another common scenario.

When many people write Skills, they love stuffing them with various operation instructions. Step one do this, step two do that, step three do this. When you run it, you'll find the model's execution isn't stable.

Later I slowly understood that a lot of this repetitive execution work is better captured in a Script than written out as long Instructions.

After reading this Anthropic article, my biggest takeaway is that many people are actually using Skills, but they might not truly understand Skills.

A Skill is essentially Context Engineering. It takes experience to decide when knowledge should go into the Skill itself, when it should be split out into References, when it should be written as a Script, and when Gotchas should be used to constrain the model.

After understanding how Skills work, looking back at those excellent Skills, you'll find they're never solving prompt problems; they're solving problems related to context, experience accumulation, and capability reuse.

If you want to deeply research Skills, I highly recommend reading two articles:

https://claude.com/blog/lessons-from-building-claude-code-how-we-use-skills

https://research.perplexity.ai/articles/designing-refining-and-maintaining-agent-skills-at-perplexity

#01 Don't Write Nonsense

Skills are essentially about accumulating "tacit knowledge" within an organization. So, don't repeat common sense the model already knows in a Skill. What's truly valuable is the information the model fundamentally doesn't know.

Anthropic internally often emphasizes that what Skills really need to document are the Gotchas, the common pitfalls.

For example:

1. This table cannot be sorted by `created_at`

2. Staging returning 200 doesn't mean success

3. `request_id` and `trace_id` are the same thing

This kind of information usually lives only in employees' heads, so keep in mind what a Skill essentially is.

Skill = Writing down the experienced master's knowledge.

Through Skills, you accumulate the experience originally scattered in different people's minds.

#02 Skill is Actually Context Engineering

This might be one of Anthropic's most profound points.

A Skill is not a markdown file; it's a folder. For people who have used Skills, this sounds like stating the obvious.

But I've been mulling it over these past few days and slowly realized that the folder form is precisely how they express the concept of Context Engineering.

Let's look again at a typical Skill structure:

skill/
├── SKILL.md
├── references/ - place detailed instructions, API references, edge cases
├── scripts/ - place executable scripts
├── examples/ - place examples
└── assets/ - place templates, images, fixed materials

When a Skill is invoked, the model first reads SKILL.md. If we cram all information into this file, context will explode very quickly.

Assume this is a payment troubleshooting Skill, containing Stripe error code explanations, historical failure cases, troubleshooting scripts, and final report templates.

If all this content is piled into SKILL.md, every time the Skill is invoked, Claude has to read it all again.

Even if the user just wants to confirm the meaning of one error code, or to check why a payment status hasn't updated, a large amount of completely unnecessary information still gets shoved into the context.

Anthropic's approach is completely different.

SKILL.md is more like a navigation page. Its job is to tell the model where to look: when it encounters a Stripe error, go to `references` for the corresponding explanation.

When it needs historical cases, check `examples` for similar issues; when it needs to actually run troubleshooting actions, run the script in `scripts`; and when it finally generates the troubleshooting report, use the template in `assets`.

The whole process is one of progressive disclosure: detail is loaded only when it is needed.
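To make the difference concrete, here is a minimal sketch that compares how much text gets loaded when SKILL.md is only an index versus when everything is crammed into one file. The file names, the contents, and the use of character counts as a stand-in for context size are all my own assumptions for illustration; this is not how Claude Code measures or loads anything internally.

```python
import tempfile
from pathlib import Path

# Hypothetical file names and placeholder contents, invented for illustration.
SKILL_FILES = {
    "SKILL.md": (
        "# Payment troubleshooting\n"
        "- Stripe error codes: see references/stripe_errors.md\n"
        "- Similar past incidents: see examples/past_failures.md\n"
        "- Final report template: see assets/report_template.md\n"
    ),
    "references/stripe_errors.md": "error_code_placeholder: explanation\n" * 200,
    "examples/past_failures.md": "incident_placeholder: what happened\n" * 200,
    "assets/report_template.md": "## Summary\n## Root cause\n",
}


def build_skill(root: Path) -> None:
    """Write the hypothetical Skill folder to disk."""
    for rel_path, text in SKILL_FILES.items():
        target = root / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text, encoding="utf-8")


def context_size(root: Path, needed: list[str]) -> int:
    """Characters loaded when SKILL.md is read first, then only `needed` files."""
    loaded = ["SKILL.md", *needed]
    return sum(len((root / rel).read_text(encoding="utf-8")) for rel in loaded)


def monolithic_size(root: Path) -> int:
    """Characters loaded if every file were pasted into a single SKILL.md."""
    return sum(len(p.read_text(encoding="utf-8")) for p in root.rglob("*.md"))


with tempfile.TemporaryDirectory() as tmp:
    skill_root = Path(tmp) / "payment-troubleshooting"
    build_skill(skill_root)
    # The user only asked about one error code, so only one reference is needed.
    on_demand = context_size(skill_root, ["references/stripe_errors.md"])
    everything = monolithic_size(skill_root)
    print(f"index + one reference: {on_demand} chars")
    print(f"everything in one file: {everything} chars")
```

The exact numbers don't matter; the point is that the index-plus-one-file path stays small no matter how many references and examples the folder accumulates.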

I strongly suggest you save the image below.

#03 Use Scripts Whenever Possible

Don't let the model waste its limited context and reasoning power on repetitive labor. Hand these tasks over to scripts.

For example, many people write Skills like this:

1. Query registration data

2. Query payment data

3. Calculate conversion rate

4. Analyze root causes

This way of writing is fine, of course. The model can complete it. But every time it executes, it has to run through the entire analysis process from the beginning.

Querying data, organizing data, handling various edge cases — this work is all repetitive.

Since these capabilities have already been verified countless times, why make the model reinvent them each time? Just provide the concrete scripts directly.

And through scripts, Skill execution becomes more accurate and also saves tokens.
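As a rough sketch of what such a script could look like, the first three steps of the example above (query registrations, query payments, calculate the conversion rate) can be frozen into one function, leaving only the root-cause analysis to the model. The function name, the input shape (lists of user IDs), and the specific edge cases handled are my assumptions, not anything from Anthropic's post.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class FunnelReport:
    registrations: int
    payments: int
    conversion_rate: Optional[float]  # None when there were no registrations


def conversion_report(registered_ids: Iterable[str], paid_ids: Iterable[str]) -> FunnelReport:
    """Compute the registration-to-payment conversion rate.

    Edge cases handled here so the model never has to re-derive them:
    duplicate registration rows, payments from unknown users, and an
    empty registration set (which would otherwise divide by zero).
    """
    registered = set(registered_ids)       # de-duplicate repeated sign-up rows
    paid = set(paid_ids) & registered      # ignore payments from unknown users
    rate = len(paid) / len(registered) if registered else None
    return FunnelReport(len(registered), len(paid), rate)


# Hypothetical sample data standing in for the two queries.
report = conversion_report(
    registered_ids=["u1", "u2", "u2", "u3", "u4"],
    paid_ids=["u2", "u4", "u9"],
)
print(report)
```

With something like this in `scripts`, the Skill's instructions only need to say when to run it and how to interpret the result.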

From this perspective, the Scripts in a Skill are actually solidifying organizational capability. Behind each script is often the best practice summarized by the team after countless past pitfalls.

After solidifying these capabilities, Claude can work based on this accumulated experience every time, instead of starting from scratch again and again.

So I increasingly feel that within a Skill, Instructions and Scripts solve problems at two different levels.

Instructions provide experience and judgment; Scripts provide capability and execution.

For example, a payment troubleshooting Skill might have this line:

If Stripe returns 200, don't assume payment success directly; you need to further check the `payment_events` table.

This belongs to Instructions, because it's experience. `check_payment_events()`, on the other hand, belongs to Scripts, because it's execution capability.

If you only have the Script, the model knows *how* to check, but may not know *why* to check.

If you only have Instructions, the model knows it *should* check, but has to re-implement the check every time. Both are indispensable.
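Here is a minimal sketch of what the Script half might look like. Only the names `check_payment_events()` and `payment_events` come from the example above; the table schema, the `succeeded` status value, and the use of SQLite are assumptions I made so the snippet can run on its own.

```python
import sqlite3


def check_payment_events(conn: sqlite3.Connection, payment_id: str) -> bool:
    """Return True only if a 'succeeded' event is recorded for this payment.

    Assumed schema: payment_events(payment_id TEXT, status TEXT).
    """
    row = conn.execute(
        "SELECT 1 FROM payment_events "
        "WHERE payment_id = ? AND status = 'succeeded' LIMIT 1",
        (payment_id,),
    ).fetchone()
    return row is not None


# In-memory database with hypothetical rows, standing in for the real table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payment_events (payment_id TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO payment_events VALUES (?, ?)",
    [("pay_1", "pending"), ("pay_1", "succeeded"), ("pay_2", "pending")],
)

print(check_payment_events(conn, "pay_1"))  # True: a succeeded event exists
print(check_payment_events(conn, "pay_2"))  # False: only a pending event
```

The script answers "how to check"; the instruction line about not trusting a 200 response is what tells the model that this check is needed in the first place.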

#04 Description is More Like a Routing Rule

The way many people write Skill Descriptions is inherently wrong.

People are used to writing them as feature introductions. For example: PR Management Skill helps users monitor PR status, handle CI issues, automatically complete Merges.

But the problem is, the model doesn't find Skills by their functionality. When Claude Code starts up, it first scans the names and Descriptions of all Skills.

Then, based on the user's current question, it decides which Skill should be loaded.

So the most important information in the Description is not what this Skill can *do*, but under what circumstances it *should* be loaded.

The Description actually handles the routing work for the entire Skill.

In the real world, few people say "help me invoke a PR management tool." People are more likely to say: "help me keep an eye on this PR," "the CI is down again," and so on.

So a good Description should try to describe the user's *intent*, not list features.

I even think you can use a very simple method to check.

After writing the Description, delete the entire Skill and keep only this one-line Description. Then ask yourself: when the model sees the user's question, can it tell when to load this Skill?

If it can't, you probably need to keep revising.
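If you would rather not delete anything by hand, the same check can be approximated with a few lines of code that strip a SKILL.md down to what the routing step gets to see. This sketch assumes the file starts with a simple frontmatter block delimited by `---` lines and containing single-line `name` and `description` fields; the Skill name and the Description wording are hypothetical, and a real file may need a proper YAML parser.

```python
def routing_view(skill_md: str) -> dict[str, str]:
    """Keep only the `name` and `description` fields from the frontmatter.

    Assumes a leading block delimited by '---' lines with single-line
    `key: value` pairs. Everything after the block is ignored.
    """
    lines = skill_md.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep and key.strip() in {"name", "description"}:
            fields[key.strip()] = value.strip()
    return fields


# Hypothetical Skill file; the wording is illustrative only.
SKILL_MD = """---
name: pr-babysitter
description: Use when the user asks to keep an eye on a PR, says the CI is down again, or wants a PR merged once checks pass.
---
# Steps
(long body that the routing step never sees)
"""

view = routing_view(SKILL_MD)
print(view["name"])
print(view["description"])
```

Read the two printed lines next to a handful of things your teammates actually say, and judge whether the Description alone would make the connection.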

#05 Skill Management and Distribution

Another point is about Skill management.

When one person uses Skills, it's pretty simple. Write a few Skills yourself, maintain them yourself, upgrade them yourself. But I believe most teams will eventually face the same problem.

When Skills grow from a few to dozens, or even hundreds, how should these Skills be managed? How should they be upgraded? How should they be distributed to team members?

I think Anthropic's experience in this area is quite worth referencing.

When the team size is relatively small, Skills can travel directly with the code repository. Just put them in the project's `.claude/skills` directory. Everyone shares the same set of Skills and the same working methods.

But as the number of Skills increases, a new problem appears.

When Claude Code starts up, it scans the names and Descriptions of all Skills, then decides which Skill should be invoked for the current task. The more Skills there are, the higher the routing cost.
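One rough way to watch this cost grow is to add up how much name and Description text sits in front of the model across all Skills. The sketch below assumes each Skill lives in its own folder with a SKILL.md that opens with a `---` delimited frontmatter block, and it uses character counts as a crude proxy; both are my assumptions, and the 50 generated Skills are placeholders.

```python
import tempfile
from pathlib import Path


def frontmatter_chars(skill_md: Path) -> int:
    """Characters in the leading '---' ... '---' block of one SKILL.md."""
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return 0
    total = 0
    for line in lines[1:]:
        if line.strip() == "---":
            return total
        total += len(line) + 1  # +1 for the newline
    return 0  # no closing marker: treat the file as having no frontmatter


def routing_overhead(skills_dir: Path) -> tuple[int, int]:
    """Return (number of Skills, total frontmatter characters) under skills_dir."""
    files = sorted(skills_dir.glob("*/SKILL.md"))
    return len(files), sum(frontmatter_chars(f) for f in files)


with tempfile.TemporaryDirectory() as tmp:
    skills_dir = Path(tmp) / ".claude" / "skills"
    for i in range(50):  # 50 hypothetical Skills with placeholder descriptions
        folder = skills_dir / f"skill-{i:02d}"
        folder.mkdir(parents=True)
        (folder / "SKILL.md").write_text(
            f"---\nname: skill-{i:02d}\n"
            f"description: Use when the user asks about topic {i}.\n---\nbody\n",
            encoding="utf-8",
        )
    count, chars = routing_overhead(skills_dir)
    print(f"{count} Skills, {chars} characters of names and descriptions")
```

Pointed at a real repository, a number like this gives a team a simple signal for when the shared folder has outgrown the "everything travels with the repo" stage.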

This is also why Anthropic later started making a Marketplace. But what's even more interesting is how they manage the Marketplace.

When many companies encounter this problem, their first reaction is often to establish an approval process. Whoever writes a Skill submits an application first; after approval, it enters the official Skill library. We did this internally before too, but it's very heavy: management for the sake of management.

Anthropic's approach, I found, is very lightweight.

Let new Skills spread in a small scope first; let colleagues install and try them themselves.

If more and more people start using it, it shows this Skill truly solves a real problem. At this stage, the author can then submit it to the formal Marketplace.

So they don't first debate whether a Skill is valuable; they first let it be tested in real usage scenarios. If many people use it, it naturally enters the formal system. The Skills that remain this way are basically the ones the team truly needs.

Related Questions

Q: According to the article, what is the fundamental purpose of a Skill in AI systems like Claude?

A: The fundamental purpose of a Skill is to be a form of Context Engineering. It aims to capture and codify the 'tacit knowledge' or 'experienced master's knowledge' within an organization, such as gotchas, common pitfalls, and specific operational insights that the AI model wouldn't inherently know. It's about solving problems related to context, experience accumulation, and capability reuse, rather than just being a lengthy prompt or instruction set.

Q: Based on Anthropic's methodology, what is the key structural concept for organizing a Skill to avoid context overload?

A: The key structural concept is to treat a Skill not as a single markdown file, but as a folder with organized subdirectories. A typical Skill folder includes `SKILL.md` (acting as a navigation page), `references/` for detailed documentation, `scripts/` for executable scripts, `examples/` for case studies, and `assets/` for templates. This structure allows for progressive exposure of information, where only the necessary components are loaded into the context as needed, preventing 'context explosion' and preserving the model's reasoning capabilities.

Q: What is the recommended distinction between 'Instructions' and 'Scripts' within a Skill, and why is it important?

A: Instructions and Scripts solve problems at different levels. Instructions provide 'experience and judgment': they tell the AI *what* to do and *why*, based on accumulated knowledge (e.g., 'If Stripe returns 200, don't assume success; check the payment_events table'). Scripts provide 'capability and execution': they are concrete, reusable pieces of code that perform repetitive tasks (e.g., a `check_payment_events()` function). This distinction is important because scripts prevent the model from wasting context and reasoning power on re-implementing verified actions, making execution more accurate and token-efficient, while instructions ensure the model applies the correct logic and understanding.

Q: What is the primary function of a Skill's Description, and what common mistake do people make when writing it?

A: The primary function of a Skill's Description is to act as a routing rule. It should clearly indicate *when* the Skill should be loaded based on the user's intent or the problem context, not just list the Skill's features. The common mistake is writing it as a feature introduction (e.g., 'This Skill helps monitor PR status...'). Instead, it should describe user intent (e.g., phrases users might say like 'help me watch this PR' or 'the CI is broken again') so the AI can accurately decide which Skill to invoke for a given query.

Q: How does Anthropic manage the distribution and evolution of Skills within a team as their number grows, according to the article?

A: Anthropic employs a lightweight, usage-driven approach. Initially, Skills are shared within a project's `.claude/skills` directory. For broader distribution and management (like in a Marketplace), they avoid heavy approval processes. Instead, new Skills are first shared informally among colleagues for installation and trial. If a Skill gains organic adoption and proves useful by solving a real problem for many users, the author can then submit it to the official Marketplace. This method ensures that only genuinely valuable and tested Skills become part of the formal system.
