For bigger tasks, Forge (the main agent) dispatches sub-agents in parallel. Each one does a slice of the work, then reports back.

Two tiers

| Agent | Role | Model (router slot) |
| --- | --- | --- |
| Spark | Read-only research: navigate, read, analyze | spark |
| Ember | File edits, refactors | ember |

Plus a WebSearch agent for multi-step web research. In the task router, assign cheap models to the spark slot and strong models to the ember slot; that is where most of the cost savings come from.
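
As a minimal sketch of that model mix, assuming hypothetical slot names and model identifiers (ROUTER_SLOTS and pick_model are illustrative, not Soulforge's actual configuration or API):

```python
# Hypothetical sketch of a task-router model mix. The slot names come from the
# table above; the model identifiers and routing logic are assumptions.
ROUTER_SLOTS = {
    "spark": "cheap-model",   # read-only research: navigate, read, analyze
    "ember": "strong-model",  # file edits and refactors
}

def pick_model(task_kind: str) -> str:
    """Map a task kind to a model via its router slot (illustrative only)."""
    slot = "ember" if task_kind in {"edit", "refactor"} else "spark"
    return ROUTER_SLOTS[slot]

print(pick_model("research"))  # -> cheap-model
print(pick_model("refactor"))  # -> strong-model
```

The point is that only the ember slot pays for the strong model; the bulk of read-only work runs on the cheap one.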

What they share

  • I/O cache. When multiple agents run concurrently and one has already fetched a file, the others get the cached bytes instead of touching disk again. This is a speed win, not a token win: every agent still reads the content into its own context window and pays tokens for it.
  • Edit serialization. Concurrent writes to the same file are queued, not raced.
  • Findings channel. One agent’s discovery reaches the others at their next step. (All three mechanisms are sketched together below.)
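
A minimal sketch of how these mechanisms could fit together, assuming a hypothetical SharedWorkspace class (the names and the asyncio-based design are illustrative, not Soulforge's internals):

```python
# Hypothetical sketch of the shared infrastructure; class and method names
# are assumptions, not Soulforge's actual API.
import asyncio
from collections import defaultdict
from pathlib import Path

class SharedWorkspace:
    def __init__(self) -> None:
        self._read_cache: dict[str, bytes] = {}        # I/O cache: one disk read per file
        self._write_locks = defaultdict(asyncio.Lock)   # edit serialization per path
        self.findings: list[str] = []                   # findings channel, visible to all agents

    async def read(self, path: str) -> bytes:
        # Cached bytes are shared, but every agent still puts the content into
        # its own context window, so the saving is disk I/O, not tokens.
        if path not in self._read_cache:
            self._read_cache[path] = Path(path).read_bytes()
        return self._read_cache[path]

    async def write(self, path: str, data: bytes) -> None:
        # Concurrent writes to the same file queue on a per-path lock.
        async with self._write_locks[path]:
            Path(path).write_bytes(data)

    def report(self, finding: str) -> None:
        # An agent posts a discovery; others see it at their next step.
        self.findings.append(finding)
```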

How the cost savings actually work

Savings do not come from shared context. They come from:
  • Model mix. Spark agents run on a cheap model (Haiku, Flash) while Ember runs on a strong one. The task router decides per task.
  • Symbol-level access. Agents use LSP go-to-definition and surgical symbol reads, not grep + cat on whole files. See code intelligence; a rough sketch follows this list.
  • Parallelism hides latency. Three agents finishing in parallel beats one agent doing three things serially.
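
To make the symbol-level point concrete, here is a rough sketch, assuming a go-to-definition result has already supplied a file path and line range (DefinitionRange and read_symbol are hypothetical helpers, not an actual LSP client API):

```python
# Hypothetical sketch: read only the lines spanned by a symbol definition
# instead of reading the whole file into context.
from dataclasses import dataclass
from pathlib import Path

@dataclass
class DefinitionRange:
    path: str
    start_line: int  # 1-based, inclusive
    end_line: int    # inclusive

def read_symbol(defn: DefinitionRange) -> str:
    """Return just the definition's lines; far fewer tokens than the full file."""
    lines = Path(defn.path).read_text().splitlines()
    return "\n".join(lines[defn.start_line - 1 : defn.end_line])

# e.g. if go-to-definition reported that a function lives at lines 120-145:
# snippet = read_symbol(DefinitionRange("src/router.py", 120, 145))
```

Only the definition's lines enter the agent's context, rather than the entire file.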

When dispatch happens

You do not trigger it. Forge decides when a task benefits from parallelism: multi-file refactors, research questions spanning several modules, or plan-mode execution. For one-shot questions, Forge does the work directly.
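
A minimal sketch of what parallel dispatch could look like, assuming the task has already been split into per-module questions (run_spark and dispatch are illustrative stand-ins, not Forge's real interface):

```python
# Hypothetical sketch of parallel dispatch; the run_spark coroutine and the
# per-module slicing of the task are assumptions.
import asyncio

async def run_spark(question: str) -> str:
    """Stand-in for one read-only Spark sub-agent working on a slice."""
    await asyncio.sleep(0)  # the real agent would navigate, read, analyze here
    return f"findings for: {question}"

async def dispatch(questions: list[str]) -> list[str]:
    # Sub-agents run concurrently; total latency is bounded by the slowest one.
    return await asyncio.gather(*(run_spark(q) for q in questions))

reports = asyncio.run(dispatch([
    "how does module A call module B?",
    "where is the config parsed?",
    "which tests cover the router?",
]))
```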

Steering mid-flight

Type while the agents are running. Your message is queued and injected at the next step. See steering.
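
A minimal sketch of the queue-and-inject pattern, with hypothetical names (steering_queue, on_user_message, and next_step are assumptions, not Soulforge's implementation):

```python
# Hypothetical sketch of mid-flight steering: user messages are queued while
# agents run and drained at the next step boundary.
import queue

steering_queue: "queue.Queue[str]" = queue.Queue()

def on_user_message(text: str) -> None:
    # Called while agents are running; the message waits for the next step.
    steering_queue.put(text)

def next_step(context: list[str]) -> list[str]:
    # At each step boundary, queued steering messages are injected into the
    # agent's context before it continues.
    while not steering_queue.empty():
        context.append(f"[user steering] {steering_queue.get()}")
    return context
```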