OpenClaw Token-Saving Ultimate Guide: Use the Strongest Model, Spend the Least Money (Prompts Included)

marsbit · Published 2026-02-11 · Last updated 2026-02-11


Author: xiyu

Want to use Claude Opus 4.6 but don't want the bill to explode at the end of the month? This guide will help you cut 60-85% of the cost.

1. Where do tokens go?

You think tokens are just "what you say + what the AI replies"? Actually, it's far more than that.

Hidden costs of each conversation:

  • System Prompt (~3000-5000 tokens): OpenClaw core instructions, cannot be changed

  • Context file injection (~3000-14000 tokens): AGENTS.md, SOUL.md, MEMORY.md, etc., included in every conversation – this is the biggest hidden cost

  • Message history: Gets longer the more you chat

  • Your input + AI output: This is what you thought was the "whole" thing

A simple "How's the weather today?" actually consumes 8000-15000 input tokens. At Opus pricing, the context alone costs $0.12-0.22.
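The arithmetic can be sanity-checked with a quick sketch. The per-component token counts and the ~$15 per 1M input tokens Opus price are assumptions for illustration; check current pricing:

```python
# Rough input cost of a single OpenClaw turn on Opus.
# Assumed price: ~$15 per 1M input tokens.
OPUS_INPUT_PRICE = 15.00 / 1_000_000  # dollars per input token

def turn_input_cost(system=4000, context=8000, history=2000, user=50):
    """Input cost of one conversation turn, in dollars."""
    return (system + context + history + user) * OPUS_INPUT_PRICE

# Even "How's the weather today?" pays for all the injected context:
print(f"${turn_input_cost():.2f}")  # about $0.21 with these assumptions
```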

Cron is even worse: each trigger is a brand-new conversation, which means re-injecting all of the context. A cron running every 15 minutes fires 96 times a day and costs $10-20 per day on Opus.
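A back-of-the-envelope check of the cron figure, assuming ~10,000 injected input tokens per trigger and the same ~$15/M Opus input price:

```python
# Every cron trigger is a brand-new conversation, so the full context
# (system prompt + injected files) is paid for again each time.
# Assumptions: ~10,000 input tokens per trigger, Opus at ~$15 per 1M input tokens.
TRIGGERS_PER_DAY = 24 * 60 // 15        # a 15-minute cron fires 96 times/day
TOKENS_PER_TRIGGER = 10_000
daily_input_cost = TRIGGERS_PER_DAY * TOKENS_PER_TRIGGER * 15 / 1_000_000
print(f"${daily_input_cost:.2f}/day")   # $14.40/day before any output tokens
```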

Heartbeat works on the same principle: each beat is essentially a conversation call, so the shorter the interval, the more money it burns.

2. Model Tiering: Sonnet for Daily, Opus for Critical

This is the first and most dramatic money-saving lever. Sonnet costs about 1/5 as much as Opus and is fully sufficient for 80% of daily tasks.

```markdown
Prompt:
Please help me change OpenClaw's default model to Claude Sonnet,
and only use Opus when deep analysis or creation is needed.
Specific needs:
1) Set the default model to Sonnet
2) Default cron tasks to Sonnet
3) Specify Opus only for writing and deep-analysis tasks
```

Opus scenarios: long-form writing, complex code, multi-step reasoning, creative tasks.

Sonnet scenarios: daily chat, simple Q&A, cron checks, heartbeat, file operations, translation.

Real-world result: after switching, monthly cost dropped 65%, with almost no noticeable difference in quality.
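The tiering rule can be expressed as a tiny router. The task labels below are illustrative, not OpenClaw identifiers; in OpenClaw itself this is done through configuration rather than code:

```python
# Hypothetical routing logic for model tiering (illustration only).
OPUS_TASKS = {"long_form_writing", "complex_code", "multi_step_reasoning", "creative"}

def pick_model(task: str) -> str:
    """Send critical work to Opus, everything else to Sonnet."""
    return "claude-opus" if task in OPUS_TASKS else "claude-sonnet"

print(pick_model("daily_chat"))    # claude-sonnet
print(pick_model("complex_code"))  # claude-opus
```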

3. Context Slimming: Cut the Hidden Token Hogs

The "background noise" per call can be 3000-14000 tokens. Streamlining injected files is the optimization with the highest cost-performance ratio.

```markdown
Prompt:
Help me streamline OpenClaw's context files to save tokens.
Specifically:
1) Delete unnecessary parts of AGENTS.md (group-chat rules, TTS, unused features); compress to within 800 tokens
2) Simplify SOUL.md to concise key points, 300-500 tokens
3) Clean expired information out of MEMORY.md; keep it within 2000 tokens
4) Check the workspaceFiles configuration and remove unnecessary injected files
```

Rule of thumb: every 1,000 tokens trimmed from injection saves about $45 per month at 100 Opus calls per day.
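The $45 figure checks out, assuming Opus input at ~$15 per 1M tokens:

```python
# Savings from trimming 1,000 tokens of injected context,
# at 100 Opus calls/day over a 30-day month (~$15 per 1M input tokens).
tokens_saved_per_month = 1_000 * 100 * 30         # 3,000,000 tokens
dollars_saved = tokens_saved_per_month * 15 / 1_000_000
print(f"${dollars_saved:.0f}/month")              # $45/month
```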

4. Cron Optimization: The Most Hidden Cost Killer

```markdown
Prompt: Help me optimize OpenClaw's cron tasks to save tokens.
Please:
1) List all cron tasks, their frequency, and their model
2) Downgrade all non-creative tasks to Sonnet
3) Merge tasks in the same time window (e.g., combine multiple checks into one)
4) Reduce unnecessarily high frequencies (system check from every 10 minutes to every 30; version check from 3 times/day to once/day)
5) Configure delivery to notify on demand only, with no message when everything is normal
```

Core principle: more frequent is not always better; most "real-time" requirements turn out to be false requirements. Merging 5 independent checks into 1 call saves about 75% of the context-injection cost.
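A quick check on the merge savings, assuming ~8,000 tokens of fixed context injected per conversation. The theoretical ceiling is 80%; the merged prompt itself grows a little, which is how a ~75% practical figure comes about:

```python
# Fixed-context injection cost before and after merging 5 cron checks into 1 call.
CONTEXT_TOKENS = 8_000        # assumed system prompt + injected files per call

before = 5 * CONTEXT_TOKENS   # five separate conversations
after = 1 * CONTEXT_TOKENS    # one merged conversation
print(f"injection tokens saved: {1 - after / before:.0%}")  # 80%
```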

5. Heartbeat Optimization

```markdown
Prompt: Help me optimize OpenClaw's heartbeat configuration:
1) Set the working-hours interval to 45-60 minutes
2) Make 23:00-08:00 a silent period
3) Streamline HEARTBEAT.md to the minimum number of lines
4) Merge scattered check tasks into the heartbeat for batch execution
```

6. Precise Retrieval: Use qmd to Save 90% of Input Tokens

When the agent looks up information, it defaults to reading the full text: a 500-line file is 3,000-5,000 tokens when only 10 lines of it are needed, so 90% of the input tokens are wasted.

qmd is a local semantic retrieval tool that builds a full-text + vector index, allowing the agent to pinpoint paragraphs instead of reading entire files. Everything runs locally, at zero API cost.

Use it together with mq (Mini Query) to preview directory structure, extract precise paragraphs, and run keyword searches, reading only the needed 10-30 lines each time.
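A minimal stand-in for the idea (not qmd's actual API; qmd does this properly with a full-text and vector index): return only the paragraphs that match, never the whole file.

```python
# Toy paragraph-level retrieval: the agent reads a few matching paragraphs
# instead of the entire file. This sketch only illustrates the token-saving
# pattern; qmd adds real indexing and semantic search on top.
def grep_paragraphs(text: str, query: str, limit: int = 3) -> list[str]:
    """Return up to `limit` paragraphs that contain the query (case-insensitive)."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return [p for p in paragraphs if query.lower() in p.lower()][:limit]

doc = "Setup notes.\n\nThe heartbeat interval is 45 minutes.\n\nUnrelated history."
print(grep_paragraphs(doc, "heartbeat"))  # ['The heartbeat interval is 45 minutes.']
```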

```markdown
Prompt:
Help me configure qmd knowledge-base retrieval to save tokens.
GitHub address: https://github.com/tobi/qmd
Needs:
1) Install qmd
2) Build an index for the working directory
3) Add retrieval rules to AGENTS.md that force the agent to prefer qmd/mq search over reading full files
4) Set up scheduled index updates
```

Actual effect: Each information lookup dropped from 15000 tokens to 1500 tokens, a 90% reduction.

Difference from memorySearch: memorySearch manages "memories" (MEMORY.md), while qmd handles "looking things up" (a custom knowledge base); the two do not interfere with each other.

7. Memory Search Choice

```markdown
Prompt: Help me configure OpenClaw's memorySearch.
I don't have many memory files (a few dozen .md files);
do you recommend local embedding or Voyage AI?
Please explain the cost and retrieval-quality differences of each.
```

Short answer: use local embedding when memory files are few (zero cost); use Voyage AI when you need strong multilingual support or have many files (its free tier covers 200 million tokens per account).

8. Ultimate Configuration Checklist

```markdown
Prompt:
Please optimize my OpenClaw configuration in one pass to save as many tokens as possible, following this checklist:
- Change the default model to Sonnet; reserve Opus for creative/analysis tasks only
- Streamline AGENTS.md / SOUL.md / MEMORY.md
- Downgrade all cron tasks to Sonnet, merge them, and reduce their frequency
- Heartbeat interval 45 minutes, plus nighttime silence
- Configure qmd precise retrieval to replace full-text reading
- Keep only necessary files in workspaceFiles
- Regularly prune memory files; keep MEMORY.md within 2000 tokens
```

Configure once, benefit long-term:

1. Model Tiering — Sonnet daily, Opus critical, save 60-80%

2. Context Slimming — Streamline files + qmd precise retrieval, save 30-90% of input tokens

3. Reduce Calls — Merge cron, extend heartbeat, enable silent period
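Stacking the three levers multiplicatively with illustrative mid-range values lands near this guide's upper bound (these percentages are assumptions, not measurements):

```python
# Illustrative combined savings from the three levers (assumed values).
model_tiering = 0.65   # most calls moved from Opus to Sonnet
context_slim  = 0.30   # slimmer injected files + qmd retrieval
fewer_calls   = 0.40   # merged cron + longer heartbeat + silent period

remaining = (1 - model_tiering) * (1 - context_slim) * (1 - fewer_calls)
print(f"total reduction: ~{1 - remaining:.0%}")  # ~85%
```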

Sonnet 4 is already very strong; in daily use you can barely tell the difference. Switch to Opus only when you really need it.

Figures are anonymized estimates based on hands-on experience with multi-agent systems.

