The Ultimate OpenClaw Token-Saving Guide: Use the Strongest Model, Spend the Least Money (Prompts Included)
This guide provides strategies to reduce OpenClaw token usage by 60-85% when using expensive models like Claude Opus. The main costs come not just from your input and the model's output but from hidden overhead sent with every request: a fixed System Prompt (~3,000-5,000 tokens), injected context files like AGENTS.md and MEMORY.md (~3,000-14,000 tokens), and the accumulated conversation history.
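To make the overhead concrete, here is a minimal sketch that adds up the ranges quoted above. The function names and the 200-token user message are illustrative assumptions, not part of OpenClaw.

```python
# Token ranges quoted in the text above (low, high), per request.
SYSTEM_PROMPT = (3000, 5000)
CONTEXT_FILES = (3000, 14000)


def hidden_overhead(history_tokens: int = 0) -> tuple[int, int]:
    """Return (low, high) tokens sent per request before the user's own message."""
    low = SYSTEM_PROMPT[0] + CONTEXT_FILES[0] + history_tokens
    high = SYSTEM_PROMPT[1] + CONTEXT_FILES[1] + history_tokens
    return low, high


def overhead_share(user_tokens: int, overhead_tokens: int) -> float:
    """Fraction of the input that is overhead rather than the user's message."""
    return overhead_tokens / (overhead_tokens + user_tokens)


low, high = hidden_overhead()
print(low, high)  # 6000 19000

# Assumption: a short 200-token user message, with no history yet.
print(round(overhead_share(200, low), 3))  # 0.968
```

Even at the low end of the quoted ranges, a short message is dwarfed by the fixed overhead, which is why the strategies below focus on the injected files rather than on the message itself.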
Key strategies include:
1. **Model Tiering:** Use the cheaper Claude Sonnet for 80% of daily tasks (chat, simple Q&A, cron jobs) and reserve Opus for complex tasks like writing and deep analysis.
2. **Context Slimming:** Drastically reduce the token count in injected files (AGENTS.md, SOUL.md, MEMORY.md) and remove unnecessary files from `workspaceFiles`.
3. **Cron Optimization:** Lower the frequency, merge tasks, and downgrade non-critical cron jobs to Sonnet. Configure delivery so that notifications are sent only when necessary.
4. **Heartbeat Tuning:** Increase the interval (e.g., 45-60 minutes), set a silent period overnight, and slim down the HEARTBEAT.md file.
5. **Precise Retrieval with QMD:** Use the local, zero-cost qmd tool for semantic search. This lets the agent retrieve only the relevant paragraphs from a document instead of reading the entire file, saving up to 90% of tokens per query.
6. **Memory Search Selection:** For small memory files, use local embeddings; for larger or multi-language needs, consider Voyage AI's free tier.
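The idea behind paragraph-level retrieval (strategy 5) can be sketched as follows. This is not qmd's API: plain keyword overlap stands in for semantic search, the sample text is made up, and the four-characters-per-token estimate is a rough assumption.

```python
import re


def split_paragraphs(text: str) -> list[str]:
    """Split a document on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def words(text: str) -> set[str]:
    """Lowercased alphanumeric words, used for overlap scoring."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def top_paragraphs(text: str, query: str, k: int = 1) -> list[str]:
    """Return up to k paragraphs sharing the most words with the query."""
    q = words(query)
    ranked = sorted(split_paragraphs(text),
                    key=lambda p: len(q & words(p)), reverse=True)
    return [p for p in ranked[:k] if q & words(p)]


def rough_tokens(text: str) -> int:
    """Crude estimate (assumption): about four characters per token."""
    return max(1, len(text) // 4)


# Hypothetical notes file, for illustration only.
doc = """Cron jobs run the daily digest and the weekly report.

The heartbeat interval is set to 60 minutes with a silent period overnight.

Model tiering sends routine chat to the cheaper model."""

hits = top_paragraphs(doc, "heartbeat interval")
print(hits[0])
print(rough_tokens(hits[0]), "of", rough_tokens(doc), "tokens sent")
```

A real semantic search would rank by embedding similarity rather than shared words, but the saving comes from the same place: only the selected paragraphs enter the context, not the whole file.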
By implementing these changes (model switching, context reduction, and smarter retrieval), users can significantly cut costs while maintaining performance for most tasks.
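As a rough illustration of how these changes compound, the sketch below estimates cost relative to an all-Opus, unslimmed baseline. Only the 80% routing share comes from the text; the 0.2 price ratio and the 50% context reduction are assumed inputs, and the model treats every request as the same size, which real workloads will not match.

```python
def relative_cost(cheap_share: float, price_ratio: float, context_kept: float) -> float:
    """Cost relative to a baseline of 1.0 (expensive model only, no slimming).

    cheap_share:  fraction of requests routed to the cheaper model
    price_ratio:  cheaper model's per-token price divided by the expensive model's
    context_kept: fraction of per-request tokens remaining after slimming
    """
    for name, value in (("cheap_share", cheap_share),
                        ("price_ratio", price_ratio),
                        ("context_kept", context_kept)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    blended_price = (1 - cheap_share) + cheap_share * price_ratio
    return blended_price * context_kept


# 80% routed to the cheaper model (from the text); price ratio 0.2 is an assumption.
tiering_only = relative_cost(0.8, 0.2, 1.0)
# Additionally assume context slimming halves the tokens per request.
tiering_and_slimming = relative_cost(0.8, 0.2, 0.5)

print(f"{1 - tiering_only:.0%} saved")          # 64% saved
print(f"{1 - tiering_and_slimming:.0%} saved")  # 82% saved
```

Under these assumed inputs the estimate lands inside the 60-85% range quoted above, though the guide's own accounting may differ; substitute your actual price ratio and measured context sizes before relying on the numbers.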
marsbit, 02/11 00:35