OpenClaw Token Saving Ultimate Guide: Use the Strongest Model, Spend the Least Money / Includes Prompts

marsbit · Published 2026-02-11 · Updated 2026-02-11

Summary

This guide provides strategies to reduce OpenClaw token usage by 60-85% when using expensive models like Claude Opus. The main costs come not just from your input and the model's output, but from hidden overhead: a fixed System Prompt (~3000-5000 tokens), injected context files like AGENTS.md and MEMORY.md (~3000-14000 tokens), and conversation history. Key strategies include:

1. **Model Tiering:** Use the cheaper Claude Sonnet for 80% of daily tasks (chat, simple Q&A, cron jobs) and reserve Opus for complex tasks like writing and deep analysis.

2. **Context Slimming:** Drastically reduce the token count in injected files (AGENTS.md, SOUL.md, MEMORY.md) and remove unnecessary files from `workspaceFiles`.

3. **Cron Optimization:** Lower the frequency, merge tasks, and downgrade non-critical cron jobs to Sonnet. Configure deliveries for notifications only when necessary.

4. **Heartbeat Tuning:** Increase the interval (e.g., 45-60 minutes), set a silent period overnight, and slim down the HEARTBEAT.md file.

5. **Precise Retrieval with QMD:** Implement the local, zero-cost qmd tool for semantic search. This allows the agent to retrieve only specific relevant paragraphs from documents instead of reading entire files, saving up to 90% of tokens per query.

6. **Memory Search Selection:** For small memory files, use local embedding; for larger or multi-language needs, consider Voyage AI's free tier.

By implementing these changes, including model switching, context reduction, and smarter...

Author: xiyu

Want to use Claude Opus 4.6 but don't want the bill to explode at the end of the month? This guide will help you cut 60-85% of the cost.

1. Where do tokens go?

You might think tokens are just "what you say + what the AI replies". In fact, there is far more to it.

Hidden costs of each conversation:

  • System Prompt (~3000-5000 tokens): OpenClaw core instructions, cannot be changed

  • Context file injection (~3000-14000 tokens): AGENTS.md, SOUL.md, MEMORY.md, etc., included in every conversation – this is the biggest hidden cost

  • Message history: Gets longer the more you chat

  • Your input + AI output: This is what you thought was the "whole" thing

Even a simple "How's the weather today?" consumes 8000-15000 input tokens. At Opus pricing, the context alone costs $0.12-0.22.

Cron is even worse: each trigger starts a brand-new conversation, so the full context is injected again. A cron job running every 15 minutes (96 times a day) costs $10-20 per day on Opus.

Heartbeat works on the same principle: each beat is essentially a conversation call, so the shorter the interval, the more money it burns.
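The per-call and per-day figures above can be reproduced with a few lines of arithmetic. The sketch below assumes an Opus input price of $15 per million tokens, which is the rate the article's own numbers imply rather than a quoted price, so check current pricing before relying on it. It counts input tokens only, and the function names are illustrative.

```python
# Assumed Opus input price in USD per million tokens. This is the rate implied
# by the article's figures ($0.12 for 8000 tokens), not an official quote.
OPUS_INPUT_USD_PER_MTOK = 15.0


def context_cost(input_tokens: int, usd_per_mtok: float = OPUS_INPUT_USD_PER_MTOK) -> float:
    """Cost in USD of the input side of a single call."""
    return input_tokens * usd_per_mtok / 1_000_000


def runs_per_day(interval_minutes: int) -> int:
    """How many times a job fires in 24 hours at a fixed interval."""
    return (24 * 60) // interval_minutes


def daily_cost(input_tokens: int, interval_minutes: int) -> float:
    """Daily input cost of a cron or heartbeat that re-injects the full context on every run."""
    return runs_per_day(interval_minutes) * context_cost(input_tokens)


if __name__ == "__main__":
    for tokens in (8_000, 15_000):
        print(f"{tokens} tokens: ${context_cost(tokens):.3f} per call, "
              f"${daily_cost(tokens, 15):.2f} per day at a 15-minute interval")
```

Under that assumed price, a 15-minute cron works out to about $11.52 per day at 8000 tokens and $21.60 at 15000 tokens, in line with the rough $10-20 range quoted above.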

2. Model Tiering: Sonnet for Daily, Opus for Critical

This is the first major money-saving trick, and the one with the most dramatic effect. Sonnet is priced at about 1/5 of Opus and is fully sufficient for 80% of daily tasks.

Prompt:

Please help me change OpenClaw's default model to Claude Sonnet, and only use Opus when deep analysis or creation is needed.

Specific needs:

1) Set default model to Sonnet
2) cron tasks default to Sonnet
3) Only specify Opus for writing, deep analysis tasks

Opus scenarios: Long-form writing, complex code, multi-step reasoning, creative tasks

Sonnet scenarios: Daily chat, simple Q&A, cron checks, heartbeat, file operations, translation

In practice: after switching, monthly cost dropped 65%, with almost no difference in experience.
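The tiering rule and the reported saving can be sketched together. The task categories below mirror the Opus and Sonnet scenarios listed above, but the category strings and the `"opus"`/`"sonnet"` labels are hypothetical placeholders, not OpenClaw configuration keys. The saving estimate assumes that 80% of spend moves to a model priced at 1/5 of Opus, as stated in this section.

```python
# Heavy task categories, following the article's "Opus scenarios".
# The strings are illustrative placeholders, not real OpenClaw settings.
OPUS_TASKS = {"long_form_writing", "complex_code", "multi_step_reasoning", "creative"}


def pick_model(task_type: str) -> str:
    """Route to Opus only for the listed heavy tasks; everything else goes to Sonnet."""
    return "opus" if task_type in OPUS_TASKS else "sonnet"


def blended_saving(sonnet_share: float, sonnet_price_ratio: float) -> float:
    """Fraction of cost saved versus running everything on Opus.

    sonnet_share: fraction of spend-weighted work moved to Sonnet (0..1)
    sonnet_price_ratio: Sonnet price divided by Opus price
    """
    remaining = (1 - sonnet_share) + sonnet_share * sonnet_price_ratio
    return 1 - remaining


if __name__ == "__main__":
    print(pick_model("cron_check"))            # falls through to the Sonnet default
    print(f"{blended_saving(0.8, 0.2):.0%}")   # estimated saving under the stated assumptions
```

With an 80% Sonnet share and a 1/5 price ratio, the estimate comes out at 64%, which is close to the 65% drop reported above.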

3. Context Slimming: Cut the Hidden Token Hogs

The "background noise" on every call can be 3000-14000 tokens. Trimming the injected files gives the best return for the effort of any optimization here.

Prompt:

Help me streamline OpenClaw's context files to save tokens.

Specifically include:

1) Delete unnecessary parts of AGENTS.md (group chat rules, TTS, unused features), compress to within 800 tokens
2) Simplify SOUL.md to concise key points, 300-500 tokens
3) Clean up expired information in MEMORY.md, control within 2000 tokens
4) Check workspaceFiles configuration, remove unnecessary injected files

Rule of thumb: every 1000 tokens removed from the injected context saves about $45 per month, assuming 100 Opus calls per day.
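A small budget check makes the targets in the prompt concrete, and the same arithmetic reproduces the rule of thumb. This is a sketch under two assumptions that are not from the article: token counts are approximated as about 4 characters per token (a rough heuristic for English text, not a real tokenizer), and Opus input is priced at $15 per million tokens, the rate the article's figures imply.

```python
# Upper token budgets taken from the prompt above.
BUDGETS = {"AGENTS.md": 800, "SOUL.md": 500, "MEMORY.md": 2000}


def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token (assumption, not a tokenizer)."""
    return (len(text) + 3) // 4


def over_budget(files: dict[str, str]) -> dict[str, int]:
    """Return {filename: estimated tokens over budget} for each file that exceeds its budget."""
    excess = {}
    for name, text in files.items():
        budget = BUDGETS.get(name)
        if budget is not None:
            extra = estimate_tokens(text) - budget
            if extra > 0:
                excess[name] = extra
    return excess


def monthly_saving(tokens_removed: int, calls_per_day: int,
                   usd_per_mtok: float = 15.0, days: int = 30) -> float:
    """USD saved per month by removing tokens that were injected on every call."""
    return tokens_removed * calls_per_day * days * usd_per_mtok / 1_000_000


if __name__ == "__main__":
    print(over_budget({"AGENTS.md": "x" * 4000, "SOUL.md": "x" * 400}))
    print(monthly_saving(1000, 100))
```

Removing 1000 tokens at 100 calls per day is 3 million tokens per month, which at the assumed price is the $45 quoted above.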

4. Cron Optimization: The Most Hidden Cost Killer

Prompt:

Help me optimize OpenClaw's cron tasks to save tokens.

Please:

1) List all cron tasks, their frequency, and model
2) Downgrade all non-creative tasks to Sonnet
3) Merge tasks in the same time period (e.g., combine multiple checks into one)
4) Reduce unnecessary high frequency (system check from 10 minutes to 30 minutes, version check from 3 times/day to 1 time/day)
5) Configure delivery to notify on demand, no message when normal

Core principle: more frequent is not always better, and most "real-time" demands are false demands. Merging 5 independent checks into 1 call removes four of the five context injections, saving roughly 75-80% of the context injection cost.
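The "merge, then notify only on demand" pattern can be sketched as a single batch runner. Everything here is illustrative: the check names and the `Check` type are hypothetical, and this is not OpenClaw's cron or delivery API. It shows several checks sharing one run, with a message produced only when something is wrong.

```python
from typing import Callable, Optional

# A check returns None when everything is normal, or a short problem description.
Check = Callable[[], Optional[str]]


def run_merged_checks(checks: dict[str, Check]) -> Optional[str]:
    """Run all checks in one pass and build a single message only if something is wrong."""
    problems = []
    for name, check in checks.items():
        result = check()
        if result is not None:
            problems.append(f"{name}: {result}")
    # Returning None means "stay silent": nothing is delivered when all is normal.
    return "\n".join(problems) if problems else None


def injection_saving(separate_calls: int) -> float:
    """Share of context-injection cost avoided by merging N separate calls into one."""
    return 1 - 1 / separate_calls


if __name__ == "__main__":
    message = run_merged_checks({
        "disk": lambda: "92% full",   # hypothetical failing check
        "version": lambda: None,      # hypothetical passing check
    })
    print(message)
    print(f"{injection_saving(5):.0%} of injection cost avoided by merging 5 calls")
```

Merging four calls into one avoids 75% of the injection cost and merging five avoids 80%. The merged call still carries slightly more task text, so the realized saving sits a little below those figures.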

5. Heartbeat Optimization

Prompt:

Help me optimize OpenClaw heartbeat configuration:

1) Set work hour interval to 45-60 minutes
2) Set 23:00-08:00 at night as silent period
3) Streamline HEARTBEAT.md to the minimum number of lines
4) Merge scattered check tasks into heartbeat for batch execution
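The effect of the interval and the silent period is easy to quantify. The sketch below is not OpenClaw's heartbeat configuration. It only models the schedule described in the prompt (a silent window from 23:00 to 08:00 and a fixed interval during the remaining hours), and the function names are illustrative.

```python
from datetime import time

SILENT_START = time(23, 0)  # silent period begins at 23:00 ...
SILENT_END = time(8, 0)     # ... and ends at 08:00 the next morning


def in_silent_period(now: time) -> bool:
    """True if `now` falls inside the overnight window, which wraps past midnight."""
    return now >= SILENT_START or now < SILENT_END


def heartbeats_per_day(interval_minutes: int) -> int:
    """Number of heartbeats that fit into the active hours (08:00-23:00)."""
    active_minutes = (SILENT_START.hour - SILENT_END.hour) * 60  # 15 hours
    return active_minutes // interval_minutes


if __name__ == "__main__":
    print(in_silent_period(time(23, 30)))   # inside the silent window
    print(heartbeats_per_day(45), heartbeats_per_day(60))
```

With these settings there are 20 heartbeats per day at a 45-minute interval and 15 at 60 minutes. Each one is a full conversation call that re-injects the context.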

6. Precise Retrieval: Use qmd to Save 90% of Input Tokens

When the agent looks up information, it defaults to reading the full text. A 500-line file is 3000-5000 tokens, yet the agent may need only 10 lines of it, so 90% or more of the input tokens are wasted.

qmd is a local semantic retrieval tool that builds a full-text + vector index, allowing the agent to pinpoint paragraphs instead of reading the entire file. Everything is computed locally, at zero API cost.

Use it together with mq (Mini Query) to preview directory structure, extract specific paragraphs, and search by keyword, reading only the 10-30 lines needed each time.

Prompt:

Help me configure qmd knowledge base retrieval to save tokens.

GitHub address: https://github.com/tobi/qmd

Needs:

1) Install qmd
2) Build index for the working directory
3) Add retrieval rules in AGENTS.md, force agent to prioritize qmd/mq search over direct read full text
4) Set up scheduled index updates

Actual effect: Each information lookup dropped from 15000 tokens to 1500 tokens, a 90% reduction.

Difference from memorySearch: memorySearch manages "memories" (MEMORY.md), while qmd handles "looking up information" (a custom knowledge base). The two do not affect each other.
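The idea behind paragraph-level retrieval can be illustrated with a toy example. This is not qmd's implementation or its command-line interface. It swaps the full-text + vector index for plain keyword overlap, purely to show why returning a few matching paragraphs costs far fewer tokens than returning the whole file. All function names are hypothetical.

```python
import re


def split_paragraphs(text: str) -> list[str]:
    """Split a document into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def words(text: str) -> set[str]:
    """Lowercased set of alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def top_paragraphs(text: str, query: str, k: int = 2) -> list[str]:
    """Return up to k paragraphs that share the most words with the query.

    Keyword overlap is a simple stand-in for a real full-text + vector index.
    """
    q = words(query)
    scored = [(len(q & words(p)), i, p) for i, p in enumerate(split_paragraphs(text))]
    hits = [(score, i, p) for score, i, p in scored if score > 0]
    hits.sort(key=lambda t: (-t[0], t[1]))  # best score first, then document order
    return [p for _, _, p in hits[:k]]


if __name__ == "__main__":
    doc = ("Deploy steps: run the build script.\n\n"
           "Billing: invoices are sent monthly.\n\n"
           "Rollback: redeploy the previous build.")
    print(top_paragraphs(doc, "when are invoices sent", k=1))
```

Only the matching paragraph is handed to the model. The rest of the file never enters the context, which is where the input-token saving comes from.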

7. Memory Search Choice

Prompt:

Help me configure OpenClaw's memorySearch. If I don't have many memory files (dozens of md), do you recommend local embedding or Voyage AI? Please explain the cost and retrieval quality differences of each.

Simple conclusion: use local embedding when you have few memory files (zero cost); use Voyage AI when you have strong multilingual needs or many files (200 million tokens free per account).

8. Ultimate Configuration Checklist

Prompt:

Please help me optimize OpenClaw configuration in one go to save tokens to the maximum extent, execute according to the following checklist:

  • Change default model to Sonnet, only reserve Opus for creative/analysis tasks

  • Streamline AGENTS.md / SOUL.md / MEMORY.md

  • Downgrade all cron tasks to Sonnet + merge + reduce frequency

  • Heartbeat interval 45 minutes + nighttime silence

  • Configure qmd precise retrieval to replace full-text reading

  • workspaceFiles only keep necessary files

  • Regularly streamline memory files, control MEMORY.md within 2000 tokens

Configure once, benefit long-term:

1. Model Tiering: Sonnet for daily work, Opus for critical tasks, saving 60-80%

2. Context Slimming: streamlined files plus qmd precise retrieval, saving 30-90% of input tokens

3. Reduce Calls: merge cron jobs, extend the heartbeat interval, enable a silent period

Sonnet 4 is already very strong; in daily use you can hardly feel the difference. Just switch to Opus when you really need it.

Based on practical experience running a multi-agent system. The figures are anonymized estimates.

Related Questions

Q: What are the main hidden costs of token usage in OpenClaw according to the article?

A: The main hidden costs include the System Prompt (~3000-5000 tokens), context file injections like AGENTS.md, SOUL.md, and MEMORY.md (~3000-14000 tokens), and the accumulation of historical messages in conversations.

Q: What is the primary strategy recommended for reducing costs with model selection?

A: The primary strategy is model tiering: using Claude Sonnet for daily tasks and reserving Claude Opus only for critical tasks like deep analysis or creative work, as Sonnet is about 1/5 the cost of Opus.

Q: How does using qmd help in reducing token consumption?

A: qmd is a local semantic retrieval tool that builds a full-text + vector index for precise paragraph retrieval instead of reading entire files, reducing input tokens by up to 90% for lookup tasks, as it only fetches the needed 10-30 lines.

Q: What optimizations are suggested for cron tasks to save tokens?

A: Optimizations include downgrading non-creative tasks to Sonnet, merging multiple tasks into single calls, reducing unnecessary high frequency (e.g., from every 10 minutes to every 30 minutes), and configuring delivery to notify on demand, with no message when everything is normal.

Q: What is the recommended approach for heartbeat configuration to minimize costs?

A: Set heartbeat intervals to 45-60 minutes during work hours, implement a silent period from 23:00 to 08:00, streamline HEARTBEAT.md to minimal lines, and consolidate scattered check tasks into batch executions within heartbeat.
