SoulForge tracks every token spent and prices it in real time. The status bar shows the running total in USD. /context opens a dashboard with the per-model breakdown.

What gets tracked

  • Prompt tokens (uncached input).
  • Completion tokens (model output).
  • Cache-write tokens (billed at a higher rate by most providers).
  • Cache-read tokens (billed at a discount).
  • Subagent tokens tracked separately from the main agent.
  • Per-model breakdown when the task router mixes providers.
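
The categories above can be sketched as a small cost calculation. This is an illustration only, not SoulForge's actual implementation: the names `Rates`, `Usage`, `cost_usd`, and `Ledger` are hypothetical, and any rates passed in are placeholders rather than real provider prices.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD per million tokens for each billing category (hypothetical type)."""
    prompt: float
    completion: float
    cache_write: float
    cache_read: float


@dataclass(frozen=True)
class Usage:
    """Token counts reported for one model call (hypothetical type)."""
    prompt: int = 0
    completion: int = 0
    cache_write: int = 0
    cache_read: int = 0


def cost_usd(usage: Usage, rates: Rates) -> float:
    """Price one call: each token category is billed at its own rate."""
    total = (
        usage.prompt * rates.prompt
        + usage.completion * rates.completion
        + usage.cache_write * rates.cache_write
        + usage.cache_read * rates.cache_read
    )
    return total / 1_000_000


class Ledger:
    """Running totals keyed by (model, is_subagent), so subagent spend
    and per-model spend can be reported separately."""

    def __init__(self) -> None:
        self.totals = defaultdict(float)

    def record(self, model: str, usage: Usage, rates: Rates,
               is_subagent: bool = False) -> None:
        self.totals[(model, is_subagent)] += cost_usd(usage, rates)

    def total(self) -> float:
        return sum(self.totals.values())
```

Keying the ledger by model and by agent type is one way to support both the per-model breakdown and the separate subagent figure from a single running total.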

Providers with built-in pricing

Pricing tables ship for the major providers, updated against their public price lists:
| Provider | Notes |
| --- | --- |
| Anthropic | Claude Opus/Sonnet/Haiku with cache-write and cache-read rates |
| OpenAI | GPT-5.4, GPT-4.1, o3, o4-mini |
| Google | Gemini 2.5 Pro/Flash, Gemini 3 Flash/Pro |
| DeepSeek | V3.2 (chat and reasoner) |
| Groq | Llama 3.3, Llama 4 Scout, Qwen3, GPT-OSS |
| Mistral | Mistral Large/Medium/Small, Codestral, Magistral, Ministral, Pixtral, Devstral |
| Fireworks | Tier-based pricing (Mixtral, Llama 70B+, DeepSeek) |
| GitHub Copilot | Premium-request multiplier-based estimation |
| OpenRouter | Live pricing from the catalog |
| GitHub Models | Per-token via multipliers |
| Ollama, LM Studio, OpenCode free models | $0.00 |
Custom providers default to a conservative estimate. Unknown models fall back to Sonnet-tier pricing as a safety floor.
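
The fallback rule can be pictured as a lookup with a default. The sketch below is an assumption about the shape of such a lookup, not SoulForge's real pricing code: `KNOWN`, `LOCAL_PROVIDERS`, `resolve_rates`, the provider and model keys, and every number in it are hypothetical placeholders. The separate conservative estimate for custom providers is not modeled here.

```python
from typing import NamedTuple


class Rates(NamedTuple):
    """USD per million tokens (hypothetical type)."""
    input_per_mtok: float
    output_per_mtok: float


# Placeholder entries for illustration only; not real prices.
KNOWN = {
    ("anthropic", "sonnet"): Rates(3.0, 15.0),
    ("anthropic", "haiku"): Rates(1.0, 5.0),
}

# Providers that run locally are priced at zero.
LOCAL_PROVIDERS = {"ollama", "lmstudio"}

# Unknown models are priced at the Sonnet tier so that cost is
# more likely to be over-reported than under-reported.
FALLBACK = KNOWN[("anthropic", "sonnet")]


def resolve_rates(provider: str, model: str) -> Rates:
    """Return rates for a model, falling back to the Sonnet-tier entry."""
    if provider in LOCAL_PROVIDERS:
        return Rates(0.0, 0.0)
    return KNOWN.get((provider, model), FALLBACK)
```

Falling back to a mid-to-high tier rather than zero means an unrecognized model shows a plausible, probably slightly high, dollar figure instead of silently appearing free.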

Why it matters

Two tactics cut cost dramatically:
  1. Mix models. Haiku for spark agents, Sonnet for ember agents, Flash for compaction. A task that would cost $0.25 on Sonnet often runs for $0.05 when the exploration phase routes through Haiku.
  2. Use caching. Cache reads cost about one-tenth of the normal input rate on Anthropic and are discounted by up to 50% on Groq/Fireworks. SoulForge structures the system prompt and the Soul Map to maximize cache hits; typical cache-hit rates exceed 60%.
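
To see what the caching figures imply, the arithmetic can be sketched as a blended input rate. This is a simplified model under stated assumptions: it ignores the higher cache-write rate and output tokens, and `effective_input_multiplier` is a hypothetical helper, not part of SoulForge.

```python
def effective_input_multiplier(hit_rate: float, read_multiplier: float) -> float:
    """Blended input cost relative to paying the full rate for every token.

    hit_rate:        fraction of input tokens served from cache (0.0 to 1.0)
    read_multiplier: cache-read price as a fraction of the normal input price
                     (0.1 for a one-tenth rate, 0.5 for a 50% discount)
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Uncached tokens pay the full rate; cached tokens pay the reduced rate.
    return (1.0 - hit_rate) + hit_rate * read_multiplier


# 60% cache hits with reads at one-tenth of the input rate
one_tenth_rate = effective_input_multiplier(0.6, 0.1)

# 60% cache hits with reads at a 50% discount
half_rate = effective_input_multiplier(0.6, 0.5)
```

Under these assumptions, a 60% hit rate brings input cost to roughly 46% of the uncached figure with one-tenth-rate reads, and to roughly 70% with a 50% discount.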

UI

The status bar shows the running total in USD. Compact mode shows tokens plus a dollar figure. /context opens the detailed view: per-model usage, cache ratio, subagent spend, and the compaction history. Use /router to assign cheap models to cheap tasks.