OpenClaw Token Saving Ultimate Guide: Use the Strongest Model, Spend the Least Money / Includes Prompts

marsbit · Published on 2026-02-11 · Last updated on 2026-02-11

Abstract

This guide provides strategies to reduce OpenClaw token usage by 60-85% when using expensive models like Claude Opus. The main costs come not just from your input and the model's output, but from hidden overhead: a fixed System Prompt (~3000-5000 tokens), injected context files like AGENTS.md and MEMORY.md (~3000-14000 tokens), and conversation history. Key strategies include:

1. **Model Tiering:** Use the cheaper Claude Sonnet for 80% of daily tasks (chat, simple Q&A, cron jobs) and reserve Opus for complex tasks like writing and deep analysis.

2. **Context Slimming:** Drastically reduce the token count in injected files (AGENTS.md, SOUL.md, MEMORY.md) and remove unnecessary files from `workspaceFiles`.

3. **Cron Optimization:** Lower the frequency, merge tasks, and downgrade non-critical cron jobs to Sonnet. Configure deliveries for notifications only when necessary.

4. **Heartbeat Tuning:** Increase the interval (e.g., 45-60 minutes), set a silent period overnight, and slim down the HEARTBEAT.md file.

5. **Precise Retrieval with QMD:** Implement the local, zero-cost qmd tool for semantic search. This allows the agent to retrieve only specific relevant paragraphs from documents instead of reading entire files, saving up to 90% of tokens per query.

6. **Memory Search Selection:** For small memory files, use local embedding; for larger or multi-language needs, consider Voyage AI's free tier.

By implementing these changes: model switching, context reduction, and smarter...

Author: xiyu

Want to use Claude Opus 4.6 but don't want the bill to explode at the end of the month? This guide will help you cut 60-85% of the cost.

1. Where do tokens go?

You think tokens are just "what you say + what the AI replies"? Actually, it's far more than that.

Hidden costs of each conversation:

  • System Prompt (~3000-5000 tokens): OpenClaw core instructions, cannot be changed

  • Context file injection (~3000-14000 tokens): AGENTS.md, SOUL.md, MEMORY.md, etc., included in every conversation – this is the biggest hidden cost

  • Message history: Gets longer the more you chat

  • Your input + AI output: This is what you thought was the "whole" thing

A simple "How's the weather today?" actually consumes 8000-15000 input tokens. Calculated with Opus, just the context costs $0.12-0.22.

Cron is even worse: Each trigger = a brand new conversation = re-injecting all context. A cron running every 15 minutes, 96 times a day, costs $10-20 per day under Opus.

Heartbeat is the same principle: Essentially also a conversation call, the shorter the interval, the more money it burns.
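The figures in this section can be reproduced with a few lines of arithmetic. The sketch below assumes an Opus input price of $15 per million tokens, which is inferred from the article's own numbers (8000 tokens costing about $0.12) and should be checked against current pricing. The function names are illustrative and not part of OpenClaw.

```python
# Assumption: $15 per million input tokens for Opus, inferred from the
# article's figures (8000 tokens -> about $0.12). Verify against current pricing.
OPUS_INPUT_USD_PER_MTOK = 15.0


def input_cost_usd(tokens: int, usd_per_mtok: float = OPUS_INPUT_USD_PER_MTOK) -> float:
    """Cost of sending `tokens` input tokens at the given per-million rate."""
    return tokens * usd_per_mtok / 1_000_000


def runs_per_day(interval_minutes: int) -> int:
    """How many times a job fires in 24 hours at a fixed interval."""
    return (24 * 60) // interval_minutes


if __name__ == "__main__":
    for tokens in (8_000, 15_000):
        per_call = input_cost_usd(tokens)
        per_day = per_call * runs_per_day(15)
        print(f"{tokens} input tokens: ${per_call:.3f} per call, "
              f"${per_day:.2f} per day with a 15-minute cron")
```

Under that assumed price, a 15-minute cron lands at roughly $11.50 to $21.60 per day on input tokens alone, which is in line with the $10-20 estimate above.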

2. Model Tiering: Sonnet for Daily, Opus for Critical

The first major money-saving trick, with the most dramatic effect. Sonnet is priced at about 1/5 of Opus, and is fully sufficient for 80% of daily tasks.

Prompt:

Please help me change OpenClaw's default model to Claude Sonnet,

and only use Opus when deep analysis or creation is needed.

Specific needs:

1) Set default model to Sonnet

2) cron tasks default to Sonnet

3) Specify Opus only for writing and deep-analysis tasks

Opus scenarios: Long-form writing, complex code, multi-step reasoning, creative tasks

Sonnet scenarios: Daily chat, simple Q&A, cron checks, heartbeat, file operations, translation

In practice: after switching, the monthly cost dropped by 65%, with almost no noticeable difference in experience.
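The reported saving is close to what the two ratios in this section predict. The sketch below is a back-of-the-envelope estimate, assuming that every call uses about the same number of tokens regardless of model; `blended_cost_ratio` is an illustrative name, not an OpenClaw function.

```python
def blended_cost_ratio(cheap_share: float, cheap_price_ratio: float) -> float:
    """Cost after tiering, as a fraction of the all-Opus cost.

    cheap_share: fraction of calls routed to the cheaper model (0..1)
    cheap_price_ratio: cheaper model's price divided by the expensive model's
    """
    if not 0.0 <= cheap_share <= 1.0:
        raise ValueError("cheap_share must be between 0 and 1")
    # Calls that stay on Opus cost full price; the rest cost the reduced price.
    return (1.0 - cheap_share) + cheap_share * cheap_price_ratio


# 80% of tasks on Sonnet, Sonnet priced at about 1/5 of Opus (figures from this section).
ratio = blended_cost_ratio(cheap_share=0.8, cheap_price_ratio=0.2)
print(f"cost after tiering: {ratio:.0%} of all-Opus, saving {1 - ratio:.0%}")
```

With those inputs the estimate comes out at about a 64% saving. Real savings depend on how token-heavy the tasks kept on Opus are.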

3. Context Slimming: Cut the Hidden Token Hogs

The "background noise" per call can be 3000-14000 tokens. Streamlining injected files is the optimization with the highest cost-performance ratio.

Prompt:

Help me streamline OpenClaw's context files to save tokens.

Specifically:

1) Delete unnecessary parts of AGENTS.md (group chat rules, TTS, unused features), compress to within 800 tokens

2) Simplify SOUL.md to concise key points, 300-500 tokens

3) Clean up expired information in MEMORY.md, control within 2000 tokens

4) Check workspaceFiles configuration, remove unnecessary injected files

Rule of thumb: For every 1000 tokens reduced in injection, calculated at 100 Opus calls per day, save about $45 per month.
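The rule of thumb can be checked, and applied to your own files, with a short script. This is a sketch under two assumptions: an Opus input price of $15 per million tokens (the price the $45 figure implies) and a crude estimate of about 4 characters per token, which is only approximate and is not the model's real tokenizer. The budgets are the upper bounds from the prompt above; all function names are illustrative.

```python
OPUS_INPUT_USD_PER_MTOK = 15.0  # assumption, implied by the $45 figure

# Token budgets from the prompt above (upper bounds).
BUDGETS = {"AGENTS.md": 800, "SOUL.md": 500, "MEMORY.md": 2000}


def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token for English text."""
    return len(text) // 4


def over_budget(files: dict[str, str]) -> dict[str, int]:
    """Return the estimated tokens over budget for each file that exceeds it."""
    excess = {}
    for name, text in files.items():
        budget = BUDGETS.get(name)
        if budget is not None:
            extra = estimate_tokens(text) - budget
            if extra > 0:
                excess[name] = extra
    return excess


def monthly_saving_usd(tokens_removed: int, calls_per_day: int = 100, days: int = 30,
                       usd_per_mtok: float = OPUS_INPUT_USD_PER_MTOK) -> float:
    """Money saved per month by removing tokens that are injected on every call."""
    return tokens_removed * calls_per_day * days * usd_per_mtok / 1_000_000


if __name__ == "__main__":
    # Stand-in content; in practice, read the real files from your workspace.
    files = {"AGENTS.md": "x" * 7200}
    for name, extra in over_budget(files).items():
        print(f"{name}: about {extra} tokens over budget, "
              f"worth ${monthly_saving_usd(extra):.2f} per month")
```

Removing 1000 tokens at 100 calls per day over 30 days means 3 million fewer input tokens, which is $45 at the assumed price.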

4. Cron Optimization: The Most Hidden Cost Killer

Prompt: Help me optimize OpenClaw's cron tasks to save tokens.

Please:

1) List all cron tasks, their frequency, and model

2) Downgrade all non-creative tasks to Sonnet

3) Merge tasks in the same time period (e.g., combine multiple checks into one)

4) Reduce unnecessary high frequency (system check from 10 minutes to 30 minutes, version check from 3 times/day to 1 time/day)

5) Configure delivery to notify only when needed, with no message when everything is normal

Core principle: more frequent is not always better, and most "real-time" requirements are not real requirements. Merging 5 independent checks into 1 call saves about 75% of the context injection cost.
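The effect of lowering frequency and merging jobs can be estimated as follows. This is a simplified sketch: it counts context injections only, so the merge figure is an upper bound (80% for five jobs), because the merged call still has to carry each check's own instructions and output. The article's 75% estimate sits just under that bound. Function names are illustrative.

```python
def runs_per_day(interval_minutes: int) -> int:
    """How many times a job fires in 24 hours at a fixed interval."""
    return (24 * 60) // interval_minutes


def injection_saving_from_merge(n_jobs: int) -> float:
    """Fraction of context injections removed by merging n jobs into one call."""
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    return (n_jobs - 1) / n_jobs


if __name__ == "__main__":
    before, after = runs_per_day(10), runs_per_day(30)
    print(f"system check: {before} -> {after} runs per day "
          f"({1 - after / before:.0%} fewer context injections)")
    print(f"merging 5 checks into 1: up to "
          f"{injection_saving_from_merge(5):.0%} fewer context injections")
```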

5. Heartbeat Optimization

Prompt: Help me optimize OpenClaw heartbeat configuration:

1) Set the interval during working hours to 45-60 minutes

2) Set 23:00-08:00 as a nightly silent period

3) Streamline HEARTBEAT.md to the minimum number of lines

4) Merge scattered check tasks into heartbeat for batch execution
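To see what these settings buy, count how many heartbeat calls actually fire per day. The sketch below is not OpenClaw's scheduler: it assumes ticks start at 00:00 and repeat at a fixed interval, and it treats the silent period as a half-open window that may wrap past midnight. The 30-minute, no-silence baseline in the demo is a hypothetical comparison point.

```python
from datetime import time


def in_silent_period(now: time, start: time = time(23, 0), end: time = time(8, 0)) -> bool:
    """True if `now` falls in [start, end), including windows that wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def heartbeats_per_day(interval_minutes: int, start: time = time(23, 0),
                       end: time = time(8, 0)) -> int:
    """Count ticks in 24 hours, skipping those that land in the silent period.

    Assumes ticks start at 00:00 and repeat every `interval_minutes`.
    """
    count = 0
    for minute in range(0, 24 * 60, interval_minutes):
        tick = time(minute // 60, minute % 60)
        if not in_silent_period(tick, start, end):
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical baseline: every 30 minutes, no silent period (empty window).
    baseline = heartbeats_per_day(30, time(0, 0), time(0, 0))
    for interval in (45, 60):
        tuned = heartbeats_per_day(interval)
        print(f"{interval}-minute interval with 23:00-08:00 silence: "
              f"{tuned} calls per day (baseline {baseline})")
```

Each heartbeat re-injects the full context, so the drop in call count translates almost directly into input-token savings.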

6. Precise Retrieval: Use qmd to Save 90% of Input Tokens

When the agent looks up information, it defaults to "reading the full text" – a 500-line file is 3000-5000 tokens, but it only needs 10 lines from it. 90% of input tokens are wasted.

qmd is a local semantic retrieval tool that builds a full-text + vector index, allowing the agent to pinpoint paragraphs instead of reading the entire file. All computed locally, zero API cost.

Use with mq (Mini Query): Preview directory structure, precise paragraph extraction, keyword search – only read the needed 10-30 lines each time.

Prompt:

Help me configure qmd knowledge base retrieval to save tokens.

GitHub address: https://github.com/tobi/qmd

Needs:

1) Install qmd

2) Build index for the working directory

3) Add retrieval rules in AGENTS.md that force the agent to prefer qmd/mq search over reading the full text directly

4) Set up scheduled index updates

Actual effect: Each information lookup dropped from 15000 tokens to 1500 tokens, a 90% reduction.

Difference from memorySearch: memorySearch manages "memories" (MEMORY.md), while qmd handles "looking up information" (a custom knowledge base). The two do not affect each other.
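The idea of reading only the needed lines can be illustrated without any particular tool. The sketch below is not qmd or mq and does not use their index or commands: it is plain keyword matching with a few lines of surrounding context, and the function names are hypothetical. It only shows why excerpt retrieval is so much cheaper than loading a whole file.

```python
def matching_excerpts(lines: list[str], keyword: str, context: int = 2) -> list[tuple[int, int]]:
    """Return merged (start, end) line ranges (0-based, end exclusive) around keyword hits."""
    ranges: list[tuple[int, int]] = []
    needle = keyword.lower()
    for i, line in enumerate(lines):
        if needle in line.lower():
            start = max(0, i - context)
            end = min(len(lines), i + context + 1)
            if ranges and start <= ranges[-1][1]:
                # Overlaps or touches the previous range: extend it instead.
                ranges[-1] = (ranges[-1][0], max(ranges[-1][1], end))
            else:
                ranges.append((start, end))
    return ranges


def excerpt_text(lines: list[str], ranges: list[tuple[int, int]]) -> str:
    """Join the selected ranges into the text that would be sent to the model."""
    return "\n".join("\n".join(lines[s:e]) for s, e in ranges)


if __name__ == "__main__":
    # Stand-in for a 500-line document with one relevant line.
    doc = [f"filler line {i}" for i in range(500)]
    doc[250] = "heartbeat interval is 45 minutes"
    ranges = matching_excerpts(doc, "heartbeat")
    kept = sum(e - s for s, e in ranges)
    print(f"sending {kept} of {len(doc)} lines")
    print(excerpt_text(doc, ranges))
```

A semantic index such as the one qmd builds goes further by also matching paragraphs that do not contain the literal keyword.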

7. Memory Search Choice

Prompt: Help me configure OpenClaw's memorySearch.

If I don't have many memory files (a few dozen .md files),

would you recommend local embedding or Voyage AI?

Please explain the cost and retrieval quality differences of each.

In short: with only a few memory files, use local embedding (zero cost); with heavy multilingual needs or many files, use Voyage AI (200 million free tokens per account).

8. Ultimate Configuration Checklist

Prompt:

Please help me optimize OpenClaw configuration in one go to save tokens to the maximum extent, execute according to the following checklist:

  • Change default model to Sonnet, only reserve Opus for creative/analysis tasks

  • Streamline AGENTS.md / SOUL.md / MEMORY.md

  • Downgrade all cron tasks to Sonnet + merge + reduce frequency

  • Heartbeat interval 45 minutes + nighttime silence

  • Configure qmd precise retrieval to replace full-text reading

  • workspaceFiles only keep necessary files

  • Regularly streamline memory files, control MEMORY.md within 2000 tokens

Configure once, benefit long-term:

1. Model Tiering: Sonnet for daily work, Opus for critical work, saves 60-80%

2. Context Slimming: Streamlined files + qmd precise retrieval, saves 30-90% of input tokens

3. Reduce Calls: Merge cron jobs, extend the heartbeat interval, enable a silent period

Sonnet 4 is already very strong; in daily use you can hardly feel the difference. Switch to Opus only when you really need it.

Based on practical experience running a multi-agent system; the figures are anonymized estimates.

Related Questions

Q: What are the main hidden costs of token usage in OpenClaw according to the article?

A: The main hidden costs include the System Prompt (~3000-5000 tokens), context file injections like AGENTS.md, SOUL.md, and MEMORY.md (~3000-14000 tokens), and the accumulation of historical messages in conversations.

Q: What is the primary strategy recommended for reducing costs with model selection?

A: The primary strategy is model tiering: using Claude Sonnet for daily tasks and reserving Claude Opus only for critical tasks like deep analysis or creative work, as Sonnet is about 1/5 the cost of Opus.

Q: How does using qmd help in reducing token consumption?

A: qmd is a local semantic retrieval tool that builds a full-text + vector index for precise paragraph retrieval instead of reading entire files, reducing input tokens by up to 90% for lookup tasks, as it only fetches the needed 10-30 lines.

Q: What optimizations are suggested for cron tasks to save tokens?

A: Optimizations include downgrading non-creative tasks to Sonnet, merging multiple tasks into single calls, reducing unnecessary high frequency (e.g., from 10 to 30 minutes), and configuring delivery to notify only when needed, with no message when everything is normal.

Q: What is the recommended approach for heartbeat configuration to minimize costs?

A: Set heartbeat intervals to 45-60 minutes during work hours, implement a silent period from 23:00 to 08:00, streamline HEARTBEAT.md to minimal lines, and consolidate scattered check tasks into batch executions within heartbeat.
