# caw

> One interface for every coding agent — a Python library and CLI for orchestrating Claude Code, Codex, and opencode.

caw (Coding Agent Wrapper) is a Python library and CLI that wraps multiple coding-agent CLIs (Claude Code, Codex, opencode) behind one Agent / Session API, with auto-provider fallback, MCP tool servers, subagents, structured trajectories, and Docker credential management.

# Getting started


**caw** (Coding Agent Wrapper) is a Python library and CLI that wraps multiple coding-agent CLIs — [Claude Code](https://docs.claude.com/en/docs/claude-code), [Codex](https://github.com/openai/codex), and [opencode](https://github.com/sst/opencode) — behind a single `Agent` / `Session` API. Swap providers without changing your code, attach MCP tool servers, capture structured trajectories, and manage credentials for Docker containers.

caw aims at the common cases with a small, ergonomic API — if you need fine-grained control over agent behavior, reach for the underlying agent SDKs; caw isn't trying to replace them.

```python
from caw import Agent

agent = Agent()  # defaults to claude_code
traj = agent.completion("Explain what this repository does")
print(traj.result)
print(f"{traj.usage.total_tokens} tokens, ${traj.usage.cost_usd:.4f}")
```

[Get started](https://zzjas.github.io/caw/getting-started/installation/index.md) [Quickstart](https://zzjas.github.io/caw/getting-started/quickstart/index.md) [API reference](https://zzjas.github.io/caw/reference/api/agent/index.md)

## Why caw

- **One API, three backends.** Pin a provider, or give caw a fallback order and let it pick whatever is installed and healthy at runtime. See [Providers](https://zzjas.github.io/caw/guides/providers/index.md) and [Auto-provider mode](https://zzjas.github.io/caw/guides/auto-provider/index.md).
- **Portable model selection.** Use [`ModelTier`](https://zzjas.github.io/caw/guides/models-and-tiers/index.md) so "strongest" and "fast" resolve per provider — no hard-coded model strings.
- **Multi-turn sessions that resume across processes.** Grab a [`resume_handle`](https://zzjas.github.io/caw/guides/resuming/index.md) and continue the conversation later, even in a different process or without a `data_dir`.
- **Tools your way.** Attach [MCP servers](https://zzjas.github.io/caw/guides/mcp-servers/index.md), define tools declaratively with [`ToolKit`](https://zzjas.github.io/caw/guides/toolkit/index.md), or register [subagents](https://zzjas.github.io/caw/guides/subagents/index.md) the parent can call.
- **Structured trajectories.** Every interaction yields a [`Trajectory`](https://zzjas.github.io/caw/getting-started/concepts/index.md) with turns, content blocks, token usage, and cost — persisted to JSONL and viewable in the [trajectory viewer](https://zzjas.github.io/caw/guides/trajectory-viewer/index.md).
- **Health checks & credentials.** Probe provider [health](https://zzjas.github.io/caw/guides/health/index.md) and bind-mount OAuth credentials into [Docker containers](https://zzjas.github.io/caw/guides/docker-credentials/index.md) without touching host files.

## Install

```bash
pip install coding-agent-wrapper
```

Requires Python 3.10+. See [Installation](https://zzjas.github.io/caw/getting-started/installation/index.md) for the CLI prerequisites and dev setup.

## For agents

These docs are also published as machine-readable [`llms.txt`](https://zzjas.github.io/caw/llms.txt) (index) and [`llms-full.txt`](https://zzjas.github.io/caw/llms-full.txt) (flattened) — handy when caw's own users are agents.

# Concepts

caw has a small, consistent vocabulary. Understanding these five types is enough to use most of the library.

## Agent

`Agent` is the configuration object and factory. You set the provider, model, reasoning effort, system prompt, tool permissions, MCP servers, and subagents on it, then ask it to do work. An `Agent` is cheap and reusable: it holds configuration, not a live connection.

```python
from caw import Agent

agent = Agent(provider="claude_code", model="opus", reasoning="high")
agent.set_system_prompt("You are a security reviewer.")
```

Two ways to run work:

- **`agent.completion(message)`** — one-shot. Starts a session, sends one message, ends it, and returns the Trajectory.
- **`agent.start_session()`** — opens a multi-turn [`Session`](#session).

## Session

`Session` is a live, stateful conversation with a provider. Each `session.send(message)` returns a [`Turn`](#turn); context carries across turns. Use it as a context manager so it's finalized (and persisted) on exit:

```python
with agent.start_session() as session:
    session.send("Remember the number 42.")
    print(session.send("What number did I tell you?").result)
```

A session can be persisted and [resumed later](https://zzjas.github.io/caw/guides/resuming/index.md) — even in another process — via its `resume_handle`.

## Trajectory

`Trajectory` is the complete, structured record of a session. It's what you get back from `completion()` and `session.end()`, and what gets persisted to disk.

```text
Trajectory
├── agent, model, session_id, created_at, completed_at
├── turns: list[Turn]
│   ├── input: str
│   ├── output: list[TextBlock | ThinkingBlock | ToolUse]
│   │   └── ToolUse.subagent_trajectory: Trajectory | None
│   ├── usage: UsageStats
│   └── duration_ms: int
├── usage: UsageStats        # this agent's own usage
└── total_usage: UsageStats  # own + all nested subagents (recursive)
```

Handy properties: `traj.result` (final text), `traj.num_turns`, `traj.total_tool_calls`, `traj.subagent_trajectories`, `traj.is_complete`, and `traj.is_usage_limited`.

## Turn

`Turn` is one request/response exchange: the user `input`, a list of `output` content blocks, the `usage` for that turn, and `duration_ms`. Convenience accessors:

- `turn.result` — the last text block.
- `turn.tool_calls` — the ToolUse blocks in this turn.

Content blocks are one of `TextBlock`, `ThinkingBlock`, or `ToolUse` (which may carry a nested `subagent_trajectory`).
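As a rough mental model of the two accessors, the sketch below derives `result` and `tool_calls` from an `output` list. The `TextBlock`, `ToolUse`, and `Turn` classes here are simplified stand-ins defined in the snippet, not caw's real classes, and returning an empty string when a turn has no text is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class TextBlock:  # stand-in, not caw's class
    text: str


@dataclass
class ToolUse:  # stand-in, not caw's class
    name: str
    output: str = ""


@dataclass
class Turn:  # stand-in, not caw's class
    input: str
    output: list = field(default_factory=list)

    @property
    def result(self) -> str:
        """Text of the last TextBlock, or "" if the turn has no text (assumed)."""
        texts = [b.text for b in self.output if isinstance(b, TextBlock)]
        return texts[-1] if texts else ""

    @property
    def tool_calls(self) -> list:
        """Only the ToolUse blocks, in their original order."""
        return [b for b in self.output if isinstance(b, ToolUse)]


turn = Turn(
    input="List the files",
    output=[
        TextBlock("Let me look."),
        ToolUse("ls", "a.py\nb.py"),
        TextBlock("Two files."),
    ],
)
print(turn.result)                          # Two files.
print([tc.name for tc in turn.tool_calls])  # ['ls']
```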

## UsageStats

`UsageStats` holds `input_tokens`, `output_tokens`, cache token counts, and `cost_usd`. Instances add together (`a + b`), and `total_tokens` is the sum of input and output tokens.

The distinction that matters: `trajectory.usage` is *this* agent's own consumption, while `trajectory.total_usage` includes every nested [subagent](https://zzjas.github.io/caw/guides/subagents/index.md) recursively. For cost dashboards over many files, [`FastStats`](https://zzjas.github.io/caw/guides/display-and-logging/index.md) extracts these totals without parsing the whole trajectory.
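To make the own-versus-total distinction concrete, here is a small sketch with stand-in classes (`Usage` and `Traj` are illustrative names defined in the snippet, not caw's `UsageStats` and `Trajectory`). It shows addition of usage records and a recursive roll-up over nested subagents.

```python
from dataclasses import dataclass, field


@dataclass
class Usage:  # stand-in for UsageStats
    input_tokens: int = 0
    output_tokens: int = 0
    cost_usd: float = 0.0

    def __add__(self, other: "Usage") -> "Usage":
        return Usage(
            self.input_tokens + other.input_tokens,
            self.output_tokens + other.output_tokens,
            self.cost_usd + other.cost_usd,
        )

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens


@dataclass
class Traj:  # stand-in for Trajectory
    usage: Usage
    subagents: list = field(default_factory=list)

    @property
    def total_usage(self) -> Usage:
        """Own usage plus every nested subagent, recursively."""
        total = self.usage
        for child in self.subagents:
            total = total + child.total_usage
        return total


grandchild = Traj(Usage(8, 2, 0.125))
child = Traj(Usage(40, 10, 0.25), [grandchild])
parent = Traj(Usage(100, 20, 0.5), [child])

print(parent.usage.total_tokens)        # 120 (parent only)
print(parent.total_usage.total_tokens)  # 180 (parent + child + grandchild)
print(parent.total_usage.cost_usd)      # 0.875
```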

## Putting it together

```python
traj = agent.completion("List the Python files here and count them.")

print(traj.result)                       # final answer text
print(traj.num_turns, "turn(s)")
for tc in traj.turns[-1].tool_calls:
    print("called", tc.name, "→", tc.output[:50])
print(f"${traj.total_usage.cost_usd:.4f} total")
```

# Installation

## Install caw

caw is on PyPI as `coding-agent-wrapper`. The recommended path is [uv](https://docs.astral.sh/uv/) — it's fast, reproducible, and handles Python versions for you:

```bash
uv add coding-agent-wrapper            # use as a library in a uv-managed project
uv tool install coding-agent-wrapper   # install just the CLI (caw, caw-traj) globally
```

Plain `pip` works too:

```bash
pip install coding-agent-wrapper
```

caw requires **Python 3.10+**. Import it as `caw`:

```python
import caw
from caw import Agent
```

## CLI prerequisites

caw is a *wrapper* — it drives the real coding-agent CLIs, so you need at least one of them installed and authenticated on your `PATH`:

| Provider    | CLI binary | Install docs                                  |
| ----------- | ---------- | --------------------------------------------- |
| Claude Code | `claude`   | <https://docs.claude.com/en/docs/claude-code> |
| Codex       | `codex`    | <https://github.com/openai/codex>             |
| opencode    | `opencode` | <https://github.com/sst/opencode>             |

You only need the one(s) you intend to use. Check what caw can see with:

```bash
caw doctor
```

This prints whether each CLI is installed, where its binary lives, and what caw can tell about its credentials — see [Provider health](https://zzjas.github.io/caw/guides/health/index.md).

## Development install

From a checkout of the repo, sync with [uv](https://docs.astral.sh/uv/):

```bash
uv sync --extra dev
```

Run commands inside that environment with `uv run` (e.g. `uv run pytest`, `uv run ruff check .`). If you'd rather use `pip`:

```bash
pip install -e '.[dev]'
```

### Building the docs locally

The documentation site is built with [MkDocs](https://www.mkdocs.org/) + Material. Sync the `docs` extra and serve it:

```bash
uv sync --extra docs
uv run mkdocs serve
```

Then open <http://127.0.0.1:8000>. Use `uv run mkdocs build --strict` to catch broken links and missing references the way CI does.

## Next steps

- [Quickstart](https://zzjas.github.io/caw/getting-started/quickstart/index.md) — your first agent in 60 seconds.
- [Concepts](https://zzjas.github.io/caw/getting-started/concepts/index.md) — the `Agent` / `Session` / `Trajectory` mental model.

# Quickstart

This page gets you from install to a running agent in about a minute. If you haven't yet, [install caw](https://zzjas.github.io/caw/getting-started/installation/index.md) and make sure at least one provider CLI (e.g. `claude`) is authenticated — `caw doctor` will tell you.

## 60-second example

```python
from caw import Agent

agent = Agent()  # defaults to claude_code
traj = agent.completion("Explain what this repository does")

print(traj.result)
print(f"{traj.usage.total_tokens} tokens, ${traj.usage.cost_usd:.4f}")
```

`Agent.completion()` runs a single message and returns the complete [`Trajectory`](https://zzjas.github.io/caw/getting-started/concepts/index.md) — `traj.result` is the final text, and `traj.usage` carries the token counts and cost.

## Multi-turn session

When you need follow-up turns that share context, open a [session](https://zzjas.github.io/caw/guides/sessions/index.md):

```python
from caw import Agent

agent = Agent(provider="claude_code", model="opus", reasoning="high")
agent.set_system_prompt("You are a security reviewer.")

with agent.start_session() as session:
    print(session.send("Review src/auth.py for vulnerabilities").result)
    print(session.send("Now check src/api.py").result)
# session.end() runs on context-manager exit and returns the full Trajectory
```

This is the runnable [`examples/basic.py`](https://github.com/zzjas/caw/blob/main/examples/basic.py):

```python
"""Basic usage: single turn, tool use, and multi-turn sessions."""

import os

os.environ["CAW_LOG"] = "full"

from caw import Agent


def main():
    agent = Agent(data_dir="caw_data")

    print("=== Single turn ===")
    with agent.start_session() as session:
        session.send("What is 2 + 2? Answer in one sentence.")
        print()

    print("=== Tool use turn ===")
    with agent.start_session() as session:
        session.send("List files in the current directory.")
        print()

    print("=== Multi-turn ===")
    with agent.start_session() as session:
        session.send("Remember the number 42.")
        session.send("What number did I just tell you?")

        traj = session.trajectory
        print(f"\nTurns: {traj.num_turns}")
        print(f"Total tool calls: {traj.total_tool_calls}")
        print(f"Total tokens: {traj.usage.total_tokens}")


if __name__ == "__main__":
    main()
```

## Swap providers without changing code

The same code runs against any backend. Pin one explicitly:

```python
agent = Agent(provider="codex")
```

…or give caw a fallback order and let it use whatever is installed and healthy at runtime:

```python
agent = Agent(provider=["claude", "codex", "opencode"])
traj = agent.completion("Reply with a one-line hello.")
print(f"[{traj.agent}] {traj.result}")  # whichever provider handled it
```

See [Auto-provider mode](https://zzjas.github.io/caw/guides/auto-provider/index.md) for the full fallback semantics.

## Where to go next

- [Concepts](https://zzjas.github.io/caw/getting-started/concepts/index.md) — what a `Trajectory` contains and how usage rolls up.
- [Providers](https://zzjas.github.io/caw/guides/providers/index.md) — the three backends and how to switch.
- [Sessions](https://zzjas.github.io/caw/guides/sessions/index.md) and [Resuming](https://zzjas.github.io/caw/guides/resuming/index.md) — multi-turn and cross-process conversations.
- [ToolKit](https://zzjas.github.io/caw/guides/toolkit/index.md), [MCP servers](https://zzjas.github.io/caw/guides/mcp-servers/index.md), and [Subagents](https://zzjas.github.io/caw/guides/subagents/index.md) — give the agent tools.
# Guides

# Auto-provider mode

Don't want to hard-code one provider? Give caw a **fallback order** and let it use whatever is available at runtime. caw selects the first *installed* provider and, on the first send, transparently moves to the next one if that provider fails (CLI missing, auth expired) or is rate-limited — no exception handling or provider-picking on your side.

```python
import caw
from caw import Agent

caw.set_provider_order(["claude", "codex", "opencode"])  # set once, globally

agent = Agent(provider="auto")            # uses the global order
traj = agent.completion("Explain this repo")
print(f"[{traj.agent}] {traj.result}")    # whichever provider handled it
```

## Where the order comes from

In priority order (highest first):

```python
Agent(provider=["claude", "codex"])              # explicit per-agent order
caw.set_provider_order([...])                     # global default, used by provider="auto"
os.environ["CAW_PROVIDER"] = "claude,codex,opencode"  # env var, comma list
```

A single name (`provider="claude"`) stays **pinned** — no fallback. Use a list or `"auto"` to opt into fallback.
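The precedence above can be sketched as a small resolver. This is an illustration only: `resolve_order` is a hypothetical helper, not a caw function, and two details are assumptions, namely that a single name in `CAW_PROVIDER` is treated as pinned and that `claude_code` is used when nothing is configured.

```python
import os


def resolve_order(provider, global_order=None, env=None):
    """Return (order, fallback_enabled) for a given `provider` argument."""
    env = os.environ if env is None else env
    if isinstance(provider, (list, tuple)):
        return list(provider), True            # explicit per-agent order
    if provider not in (None, "auto"):
        return [provider], False               # single name: pinned, no fallback
    if global_order:
        return list(global_order), True        # global default
    raw = env.get("CAW_PROVIDER", "")
    names = [n.strip() for n in raw.split(",") if n.strip()]
    if names:
        return names, len(names) > 1           # env var, comma list
    return ["claude_code"], False              # assumed default


print(resolve_order(["claude", "codex"], global_order=["opencode"], env={}))
print(resolve_order("claude", global_order=["opencode", "codex"], env={}))
print(resolve_order("auto", env={"CAW_PROVIDER": "claude, codex"}))
```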

## How fallback works

1. **Selection** is a fast, no-network check: caw picks the first provider in the order whose CLI binary is installed. This is what `agent.provider` reports.
1. **On the first `send()`**, if the chosen provider raises (missing CLI, auth error) or reports a usage limit, caw silently builds the next provider's session and retries.
1. **Once a provider produces the first turn, the session is committed to it.** Conversation context can't move across CLIs mid-stream, so any later failure propagates normally.
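The three steps above can be modeled with a toy session. Everything here is a stand-in: `FallbackSession` and the `claude` / `codex` callables are hypothetical and only mimic the described behavior (select the first installed provider, fall back on the first send, then stay committed).

```python
class FallbackSession:
    """Toy model of first-send fallback; not caw's implementation."""

    def __init__(self, order, installed, backends):
        # Step 1: selection keeps installed providers, preserving the order.
        self.candidates = [name for name in order if name in installed]
        if not self.candidates:
            raise RuntimeError("no provider in the order is installed")
        self.backends = backends   # name -> callable(message) -> reply
        self.committed = None      # set after the first successful turn

    @property
    def provider(self):
        # Before the first send this is the fast, no-network selection.
        return self.committed or self.candidates[0]

    def send(self, message):
        if self.committed is not None:
            # Step 3: committed, so later failures propagate to the caller.
            return self.backends[self.committed](message)
        last_error = None
        for name in self.candidates:
            try:
                reply = self.backends[name](message)
            except Exception as exc:  # missing CLI, auth error, usage limit
                last_error = exc      # Step 2: try the next provider
                continue
            self.committed = name
            return reply
        raise RuntimeError("every provider failed on the first send") from last_error


def claude(message):
    raise PermissionError("auth expired")


def codex(message):
    if message == "boom":
        raise RuntimeError("codex crashed")
    return f"codex: {message}"


session = FallbackSession(
    order=["claude", "codex", "opencode"],
    installed={"claude", "codex"},
    backends={"claude": claude, "codex": codex},
)
selected_before = session.provider   # "claude": installed, not yet tried
first_reply = session.send("hello")  # claude fails, codex answers
print(selected_before, "->", session.provider, "|", first_reply)
```

A later `session.send("boom")` raises the backend's error directly, because the session is already committed to `codex`.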

**Prefer a `ModelTier` in auto mode.**

Use a [`ModelTier`](https://zzjas.github.io/caw/guides/models-and-tiers/index.md) (or no model) rather than a concrete model string. Tiers are re-resolved per provider, so model selection stays portable across the fallback. A bare concrete model string is **dropped** when falling back to a different provider — it would be meaningless to the others.

## A model per provider in the order

To pin a *specific* model to each provider in the order, attach it to `set_provider_order` — as `(name, model)` tuples or a `models=` mapping. Each value may be a concrete string or a `ModelTier`. Because the model is bound to its provider, it is honored even when that provider is reached as a fallback (unlike a bare Agent-level string):

```python
import caw
from caw import Agent, ModelTier

caw.set_provider_order([
    ("claude", ModelTier.STRONGEST),   # re-resolved via the claude tier config
    ("codex", "gpt-5.5"),              # concrete, bound to codex
    ("opencode", "openai/gpt-5.5"),
])
# Equivalent: caw.set_provider_order(["claude", "codex"], models={"codex": "gpt-5.5"})

caw.get_provider_models()             # {'claude': <ModelTier.STRONGEST>, 'codex': 'gpt-5.5', ...}

Agent(provider="auto")                # each provider uses its attached model
```

A provider's order-model applies only when the `Agent` sets no `model` of its own — an explicit `model=` (or `CAW_MODEL`) on the `Agent` always wins.
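One way to read these rules is as a small decision function. The sketch below is a hedged interpretation, not caw's code: `Tier`, `TIER_TABLE`, and `model_for` are hypothetical, the model names in the table are made up, and returning `None` (meaning "use the provider's default") for a dropped bare string is an assumption.

```python
from enum import Enum


class Tier(Enum):  # stand-in for ModelTier
    STRONGEST = "strongest"
    FAST = "fast"


# Made-up per-provider mapping; real values would come from caw's tier config.
TIER_TABLE = {
    "claude": {Tier.STRONGEST: "claude-strong", Tier.FAST: "claude-fast"},
    "codex": {Tier.STRONGEST: "codex-strong", Tier.FAST: "codex-fast"},
}


def model_for(provider, first_choice, agent_model=None, order_models=None):
    """Pick a model for `provider`; `first_choice` is the initially selected one."""
    order_models = order_models or {}
    if isinstance(agent_model, Tier):
        return TIER_TABLE[provider][agent_model]   # tier: re-resolved per provider
    if agent_model is not None:
        # Bare string: kept for the first choice, dropped on fallback (assumed None).
        return agent_model if provider == first_choice else None
    chosen = order_models.get(provider)            # model attached to the order
    if isinstance(chosen, Tier):
        return TIER_TABLE[provider][chosen]
    return chosen


print(model_for("codex", "claude", agent_model=Tier.STRONGEST))       # codex-strong
print(model_for("codex", "claude", agent_model="opus"))               # None
print(model_for("codex", "claude", order_models={"codex": "gpt-5.5"}))  # gpt-5.5
```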

## Inspecting the selection

Pair auto-provider with [`check_providers()`](https://zzjas.github.io/caw/guides/health/index.md) to see what's installed before you commit:

```python
from caw import Agent, check_providers

for h in check_providers(["claude", "codex", "opencode"]):
    print("✓" if h.installed else "✗", h.provider)

print("selected:", Agent(provider="auto").provider.name)
```

## Full example

[`examples/auto_provider.py`](https://github.com/zzjas/caw/blob/main/examples/auto_provider.py):

```python
"""Auto-provider mode: "use whatever provider is available right now".

Set a fallback order once (globally or per-agent). caw selects the first
*installed* provider and, on the first send, transparently moves to the next
one if that provider fails (CLI missing, auth expired) or is rate-limited —
your code never has to catch an exception or pick a provider by hand.

The order can come from (highest priority first):

* an explicit list on the Agent: ``Agent(provider=["claude", "codex"])``
* the global setting: ``caw.set_provider_order([...])`` with ``provider="auto"``
* the ``CAW_PROVIDER`` env var as a comma list: ``CAW_PROVIDER="claude,codex"``

Tip: in auto mode use a ``ModelTier`` (or no model) rather than a concrete model
string — tiers are re-resolved per provider, so model selection stays portable
across the fallback. A bare concrete model string is dropped when falling back to
a different provider (it would be meaningless to the others). If you want a
specific model *per provider* that survives fallback, attach it to the order
itself with ``set_provider_order([(name, model), ...])`` (see below).
"""

import caw
from caw import Agent, ModelTier, check_providers


def main():
    # --- Global order, used by provider="auto" (or by Agent() with no provider). ---
    caw.set_provider_order(["opencode", "codex", "claude"])

    agent = Agent(provider="auto", model=ModelTier.STRONGEST)

    # Selection is a fast, no-network check — the first installed provider wins.
    print("Installed providers, in order:")
    for h in check_providers(["claude", "codex", "opencode"]):
        mark = "✓" if h.installed else "✗"
        print(f"  [{mark}] {h.provider}")
    print(f"\nAuto-selected provider: {agent.provider.name}\n")

    # Just use it. If the selected provider errors or is rate-limited on the
    # first send, caw silently falls back to the next installed one in the order.
    traj = agent.completion("Reply with a one-line hello and include your model name and agent harness name.")
    print(f"Provider used: {traj.agent}")
    print(f"Agent reply: {traj.result}")

    # --- Per-agent order (overrides the global setting). ---
    other = Agent(provider=["codex", "claude"])
    print(f"\nPer-agent order selected: {other.provider.name}")

    # --- Per-provider models in the order. ---
    # Attach a model to each provider: a ModelTier (re-resolved per provider) or a
    # concrete string. Unlike a bare Agent-level model string, these are bound to
    # their provider, so each is used even when reached as a fallback. Applied only
    # when the Agent sets no model of its own.
    caw.set_provider_order([("opencode", "openai/gpt-5.5"), ("codex", ModelTier.STRONGEST), ("claude", "opus")])
    pinned = Agent(provider="auto")
    print(f"Per-provider models: {caw.get_provider_models()}")
    print(f"Pinned-order selected: {pinned.provider.name}")


if __name__ == "__main__":
    main()
```

# Display & logging

caw can stream a live, human-readable view of what the agent is doing, and/or emit a one-line summary of every event to a logger of your choice.

## Console display

The `Display` class renders agent events (user messages, thinking, text, tool calls and results, end-of-turn stats) to the terminal with Rich. It has four `DisplayMode` values:

| Mode     | Shows                             |
| -------- | --------------------------------- |
| `FULL`   | Everything, untruncated           |
| `SHORT`  | Truncated previews (the default)  |
| `RESULT` | Only the final result, in a panel |
| `OFF`    | Nothing                           |

The simplest way to control it is the `CAW_LOG` environment variable, which the global display reads on first use:

```python
import os
os.environ["CAW_LOG"] = "full"   # or "short", "result", "off"

from caw import Agent
Agent().completion("Hello")      # streams output per CAW_LOG
```

Programmatically:

```python
from caw import Display, DisplayMode, set_global_display

set_global_display(Display(mode=DisplayMode.RESULT))
```

`get_global_display()` returns the current instance (creating one from `CAW_LOG` on first call); `set_global_display(None)` silences it.
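Because the global display is created lazily, `CAW_LOG` only matters until the first call. The sketch below illustrates that pattern with hypothetical names (`Mode`, `mode_from_env`, `get_mode`); falling back to `SHORT` for an unset or unrecognized value is an assumption based on the table's stated default.

```python
import os
from enum import Enum


class Mode(Enum):  # stand-in for DisplayMode
    FULL = "full"
    SHORT = "short"
    RESULT = "result"
    OFF = "off"


def mode_from_env(env=None):
    """Parse CAW_LOG; unset or unknown values fall back to SHORT (assumed)."""
    env = os.environ if env is None else env
    raw = env.get("CAW_LOG", "").strip().lower()
    try:
        return Mode(raw)
    except ValueError:
        return Mode.SHORT


_cached_mode = None


def get_mode(env=None):
    """Read the environment once, on first use, then reuse the cached value."""
    global _cached_mode
    if _cached_mode is None:
        _cached_mode = mode_from_env(env)
    return _cached_mode


first = get_mode({"CAW_LOG": "full"})
second = get_mode({"CAW_LOG": "off"})  # too late: the mode is already cached
print(first, second)
```

This is why the earlier example sets `os.environ["CAW_LOG"]` before doing any work with the library.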

## Structured logging

Separately from the pretty console output, you can route a **one-line summary per event** to any logger via `AgentLogger`. Any object with `info` / `warn` / `error` methods that accept a string satisfies the protocol, including the stdlib `logging.Logger` (after a tiny `warn`→`warning` adapter) and your own sinks (e.g. a Redis logger).

```python
import logging
from caw import Agent

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")
log.warn = log.warning  # satisfy the AgentLogger protocol

agent = Agent(logger=log)         # or agent.start_session(logger=log)
agent.completion("List the files here")
```

Every major event — user message, tool call, tool result, assistant text, thinking, turn-end stats — is emitted as a compact line, in addition to the console `Display`.
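A custom sink only needs those three methods. The sketch below defines a hypothetical `LoggerLike` protocol that mirrors the shape described above (it is not imported from caw) and an in-memory `ListSink` that satisfies it.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class LoggerLike(Protocol):  # hypothetical mirror of the AgentLogger shape
    def info(self, msg: str) -> None: ...
    def warn(self, msg: str) -> None: ...
    def error(self, msg: str) -> None: ...


class ListSink:
    """Collects one formatted line per event in memory."""

    def __init__(self):
        self.lines = []

    def info(self, msg: str) -> None:
        self.lines.append(f"INFO  {msg}")

    def warn(self, msg: str) -> None:
        self.lines.append(f"WARN  {msg}")

    def error(self, msg: str) -> None:
        self.lines.append(f"ERROR {msg}")


sink = ListSink()
sink.info("tool call: ls")
sink.warn("usage limit approaching")
print(isinstance(sink, LoggerLike))  # True
print(sink.lines)
```

Such an object would be passed the same way as the stdlib logger in the example above, for instance `Agent(logger=sink)`.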

## FastStats

`FastStats` extracts the frequently-needed header/footer fields (cost, model, timestamps, token totals) from a trajectory file by reading only its head and tail. That is roughly 3× faster than a full parse on small files and 25×+ on multi-MB ones, and it falls back to a full parse when the fast path doesn't apply.

```python
from caw import FastStats

stats = FastStats.from_path("caw_data/sessions/<id>/trajectory.json")
print(stats.model, stats.cost_usd, stats.total_tokens)

# Aggregate across a directory:
total = FastStats.directory_total_cost("caw_data")
```
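To show why reading only the head and tail is cheap, here is a self-contained sketch of the technique. It assumes a line-oriented file whose first line is a header record and whose last line is a footer record; that layout, the field names, and the `head_tail_stats` helper are assumptions for illustration, not caw's documented format or API.

```python
import json
import os
import tempfile


def head_tail_stats(path, chunk=4096):
    """Read at most 2 * chunk bytes; fall back to a full parse if that fails."""
    size = os.path.getsize(path)
    try:
        with open(path, "rb") as f:
            head = f.read(chunk)
            f.seek(max(0, size - chunk))
            tail = f.read(chunk)
        body = tail.rstrip(b"\n")
        if b"\n" not in head:
            raise ValueError("first line is longer than the chunk")
        if size > chunk and b"\n" not in body:
            raise ValueError("last line is longer than the chunk")
        first = json.loads(head.split(b"\n", 1)[0])
        last = json.loads(body.rsplit(b"\n", 1)[-1])
    except ValueError:
        # Fast path did not apply: read every line instead.
        with open(path, "rb") as f:
            lines = [ln for ln in f.read().split(b"\n") if ln]
        first, last = json.loads(lines[0]), json.loads(lines[-1])
    return {"model": first.get("model"), "cost_usd": last.get("cost_usd")}


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "trajectory.jsonl")
    with open(path, "w") as f:
        f.write(json.dumps({"model": "demo-model"}) + "\n")
        for i in range(1000):
            f.write(json.dumps({"turn": i, "text": "x" * 50}) + "\n")
        f.write(json.dumps({"cost_usd": 0.25}) + "\n")
    fast = head_tail_stats(path)           # reads two 4 KiB chunks
    slow = head_tail_stats(path, chunk=8)  # chunk too small: full-parse fallback

print(fast, fast == slow)
```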

See the [Display & logging API reference](https://zzjas.github.io/caw/reference/api/display/index.md) for the full surface.

# Docker credentials

Coding agents store OAuth credentials in home-directory files (e.g. `~/.claude/.credentials.json`). When you run an agent **inside a Docker container** with a copy of those credentials, a token refresh in the container rotates the OAuth tokens and invalidates the ones still on the host. `caw auth` solves this **without modifying host files**: it bind-mounts the host credentials into the container and runs an inotify-based guard that keeps the container's copy and the bind-mounted host file in sync in both directions.

```bash
caw auth setup                        # snapshot configs, write mount manifest
caw auth status                       # token expiry, last modified, mount flags
docker run $(caw auth docker-flags) -v ./project:/work my-image
caw auth teardown                     # rm -rf ~/.caw/auth/  (host files untouched)
```

## How it works

```text
HOST:                                         CONTAINER:

~/.claude/.credentials.json  ←—docker bind—→  /tmp/caw_auth/claude/credentials.json
    (untouched, real file)                         ↓ copy + inotify sync
                                               /home/playground/.claude/.credentials.json
```

`~/.caw/auth/` only holds things caw legitimately owns: the manifest, the container setup script, and cleaned/stripped configs. Credentials stay at their original host paths.

## Commands

### `caw auth setup`

Reads credentials and configs from the host, validates them, writes cleaned configs and a credential snapshot into `~/.caw/auth/`, and generates `manifest.json` + `setup-container.sh`. Host credential files are **read but never modified**.

- **Credential files** (tokens, OAuth) — `strategy: bind`. Bind-mounted from the host into the container at run time; the container-side guard copies them to the user's home and keeps them in sync.
- **Config files** (`.claude.json`, `config.toml`) — `strategy: copy`. Cleaned/stripped for containers and shipped in the staging directory.

```bash
caw auth setup                        # all detected agents
caw auth setup --agents claude codex  # specific agents only
```

### `caw auth status`

Shows a table with each managed file, where its source of truth lives (host for bind, staged for copy), last-modified time, and token expiry for credential files. Credential freshness is read from the host file directly.

### `caw auth docker-flags`

Emits one directory mount for the staging area plus one file mount per credential:

```bash
$ caw auth docker-flags
-v /home/user/.caw/auth:/tmp/caw_auth:rw \
-v /home/user/.claude/.credentials.json:/tmp/caw_auth/claude/credentials.json:rw \
-v /home/user/.codex/auth.json:/tmp/caw_auth/codex/auth.json:rw
```

Command substitution (`$(caw auth docker-flags)`) expands these into separate `docker run` arguments.
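When launching Docker from Python with an argument list instead of a shell, the same splitting has to be done by hand. A minimal sketch, assuming the flags arrive as one whitespace-separated string like the output above (the `flags` value and the `my-image` name are placeholders):

```python
import shlex

# Placeholder standing in for the string returned by `caw auth docker-flags`.
flags = (
    "-v /home/user/.caw/auth:/tmp/caw_auth:rw "
    "-v /home/user/.claude/.credentials.json:/tmp/caw_auth/claude/credentials.json:rw"
)

# shlex.split applies shell-style word splitting without invoking a shell.
argv = ["docker", "run", *shlex.split(flags), "my-image"]
print(argv)
# The list can then be handed to subprocess.run(argv, check=True).
```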

### `caw auth teardown`

Removes `~/.caw/auth/`. Host credential files are never touched. It refuses to run if a host credential file is still a symlink into the auth dir (leftover from caw's old symlink-based design); pass `--force` to override, but you will then have to re-authenticate every agent.

## Container setup

The generated `setup-container.sh` runs inside the container (called from your entrypoint). It reads `manifest.json`, copies credentials and configs into the container user's home, and starts a bidirectional inotify guard for credential sync.

```bash
# In your entrypoint.sh:
if [ -f /tmp/caw_auth/setup-container.sh ]; then
    /tmp/caw_auth/setup-container.sh /tmp/caw_auth /home/playground playground
fi
```

The guard runs as root and uses plain `cp` (no `--preserve`, no `chown` on the mount side), so writes back to the host file preserve the host user's uid/gid/mode on the real inode. Requires `jq` in the container image; `inotify-tools` is installed automatically if not present.

## Supported agents

| Agent       | Credential files            | Config files                                |
| ----------- | --------------------------- | ------------------------------------------- |
| Claude Code | `.claude/.credentials.json` | `.claude.json` (stripped to essential keys) |
| Codex       | `.codex/auth.json`          | `.codex/config.toml` (local trust removed)  |

## Programmatic API

The same operations are available in Python:

```python
from caw.auth import setup, teardown, get_status, get_docker_flags

setup(agents=["claude"])
statuses = get_status()
flags = get_docker_flags()
teardown()
```

These are also re-exported at the top level as `auth_setup`, `auth_get_status`, and `auth_get_docker_flags`. Full example: [`examples/auth.py`](https://github.com/zzjas/caw/blob/main/examples/auth.py):

```python
"""Programmatic auth: set up credentials, check status, and get docker flags."""

from pathlib import Path

from caw.auth import setup, get_docker_flags, get_status


def main():
    # Set up credentials to a custom directory (instead of ~/.caw/auth)
    auth_dir = Path("./my_project_auth")
    print("=== Setting up credentials ===")
    setup(agents=["all"], dest_dir=auth_dir)

    # Check status of all collected auth files
    print("\n=== Auth file status ===")
    statuses = get_status(auth_dir=auth_dir)
    for s in statuses:
        print(f"  {s.agent}/{s.file}: type={s.type}, strategy={s.strategy}, exists={s.exists}")
        if s.token_expiry:
            print(f"    token: {s.token_expiry}")

    # Get docker volume flag for mounting auth into a container
    print("\n=== Docker flags ===")
    flags = get_docker_flags(auth_dir=auth_dir)
    print(f"  {flags}")

    # Use it to construct a docker command
    docker_cmd = f"docker run {flags} my-agent-image"
    print(f"\n  Full command: {docker_cmd}")


if __name__ == "__main__":
    main()
```

See the [Auth API reference](https://zzjas.github.io/caw/reference/api/auth/index.md) for full signatures.

## Known limitations

- **OAuth token rotation.** A refresh returns a new refresh token, invalidating the old one. If two processes refresh simultaneously, one gets an invalid token. Don't run the same agent identity in two places at once.
- **Atomic rewrites.** If an agent refreshes by writing a temp file and `rename(2)`-ing it over the credential, a single-file bind mount detaches from the new inode. If that becomes a problem for a given agent, switch its bind to a directory bind (mount the parent directory), which survives renames.
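The atomic-rewrite limitation comes down to inodes: a single-file bind mount tracks the inode that existed at mount time. The following sketch (plain Python on a POSIX filesystem, no Docker involved) compares an in-place write with a temp-file-plus-rename write.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    cred = os.path.join(d, "credentials.json")
    with open(cred, "w") as f:
        f.write("old-token")
    inode_before = os.stat(cred).st_ino

    # In-place write: the path keeps its inode, so a file bind mount still sees it.
    with open(cred, "w") as f:
        f.write("refreshed-in-place")
    inode_in_place = os.stat(cred).st_ino

    # Atomic rewrite: a new file is renamed over the path, bringing a new inode.
    tmp = cred + ".tmp"
    with open(tmp, "w") as f:
        f.write("refreshed-atomically")
    os.replace(tmp, cred)
    inode_after_rename = os.stat(cred).st_ino

same_after_write = inode_in_place == inode_before
same_after_rename = inode_after_rename == inode_before
print(same_after_write, same_after_rename)
```

A directory bind mount is unaffected because it follows the directory entry by name rather than one file's inode.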

# Provider health

Check whether a provider is set up correctly — without committing to using it. caw reports **raw signals** and forms no "available" verdict, so you write your own predicate. The default check is fast and free; a live probe round-trips a tiny request to confirm the provider responds and isn't rate-limited.

```python
from caw import Agent, check_providers

for h in check_providers():               # fast: no network, no token cost
    print(h.provider, h.installed, h.binary_path,
          h.auth.detail if h.auth else None)
```

## Compose your own "available"

Because caw gives you signals rather than a verdict, you decide what "usable" means:

```python
usable = [h.provider for h in check_providers()
          if h.installed and not (h.auth and h.auth.token_expired)]
```

## Two depths of check

- **Fast (default).** Is the CLI installed, where is its binary, and what can caw cheaply tell about its credentials? No network, no token cost — safe at startup.
- **Live (`live=True`).** Additionally round-trips a minimal prompt to confirm the provider actually responds and whether it's currently rate-limited. **Costs one probe request per provider.**

```python
h = Agent(provider="codex").check_health(live=True)
if h.rate_limited:
    print(f"codex rate-limited, ~{h.wait_minutes}m until reset")
```

## What `ProviderHealth` exposes

`ProviderHealth` carries:

| Field          | Meaning                                          |
| -------------- | ------------------------------------------------ |
| `provider`     | Canonical provider name                          |
| `installed`    | CLI binary found on `PATH` (or a known fallback) |
| `binary_path`  | Resolved path to the CLI, or `None`              |
| `auth`         | An AuthSignal, or `None` if not introspectable   |
| `probed`       | Whether a live round-trip was attempted          |
| `rate_limited` | From the probe; `None` if not probed             |
| `wait_minutes` | Estimated minutes until the limit resets         |
| `error`        | Exception text if the live probe failed          |

`AuthSignal` adds `present`, `detail`, `credentials_path`, `token_expires_at`, and `token_expired`. Every field is a *signal*: `None` means "couldn't determine", not a negative result, so treat a falsy `present` as a hint, not a verdict.
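Since `None` is a third state, it helps to decide explicitly how a predicate treats it. The sketch below uses stand-in dataclasses (`Auth` and `Health` are simplified illustrations, not caw's `AuthSignal` and `ProviderHealth`) to contrast a lenient predicate, where unknown passes, with a strict one that requires positive evidence.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Auth:  # stand-in for AuthSignal
    present: Optional[bool] = None
    token_expired: Optional[bool] = None


@dataclass
class Health:  # stand-in for ProviderHealth
    provider: str
    installed: bool
    auth: Optional[Auth] = None


def lenient(h: Health) -> bool:
    """Usable unless the token is known to be expired."""
    known_expired = h.auth is not None and h.auth.token_expired is True
    return h.installed and not known_expired


def strict(h: Health) -> bool:
    """Usable only when credentials are known present and known fresh."""
    return (
        h.installed
        and h.auth is not None
        and h.auth.present is True
        and h.auth.token_expired is False
    )


checks = [
    Health("claude", True, Auth(present=True, token_expired=False)),
    Health("codex", True, None),                             # not introspectable
    Health("opencode", True, Auth(present=True, token_expired=None)),
    Health("other", False, None),
]
print([h.provider for h in checks if lenient(h)])  # ['claude', 'codex', 'opencode']
print([h.provider for h in checks if strict(h)])   # ['claude']
```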

## From the CLI

`caw doctor` prints the same signals as a table:

```bash
caw doctor            # fast: installed + credential signals (no token cost)
caw doctor --live     # also probe each provider (costs one request each)
```

## Full example

[`examples/health.py`](https://github.com/zzjas/caw/blob/main/examples/health.py):

```python
"""Checking provider health / availability at runtime.

caw reports *raw signals* about each provider — it forms no verdict on what
counts as "available", so you compose your own predicate from the fields.

Two depths of check:

* **fast** (default): is the CLI installed, and what can we cheaply learn about
  its credentials?  No network, no token cost — safe to call at startup.
* **live** (``live=True``): also round-trips a tiny probe to confirm the
  provider responds and isn't rate-limited.  Costs one request per provider.

See also the ``caw doctor`` CLI command, which prints this as a table.
"""

from caw import Agent, check_providers


def main():
    # --- Fast sweep over every registered provider (no token cost). ---
    print("Provider health (fast check):\n")
    for h in check_providers():
        auth = h.auth.detail if h.auth else "unknown"
        print(f"  {h.provider:12} installed={h.installed!s:5} auth={auth}")

    # Compose your own "available" predicate from the raw signals. Here:
    # installed, and (if we can tell) the credential token isn't expired.
    def is_usable(h) -> bool:
        return h.installed and not (h.auth and h.auth.token_expired)

    usable = [h.provider for h in check_providers() if is_usable(h)]
    print(f"\nUsable right now (installed + non-expired creds): {usable}")

    # --- Health of a specific agent's provider. ---
    health = Agent(provider="claude_code").check_health()
    print(f"\nclaude_code → installed={health.installed} binary={health.binary_path}")

    # --- Live probe (uncomment to actually round-trip; costs a request each). ---
    # for h in check_providers(["claude", "codex"], live=True):
    #     if h.rate_limited:
    #         print(f"  {h.provider}: rate-limited, ~{h.wait_minutes}m until reset")
    #     elif h.error:
    #         print(f"  {h.provider}: probe failed: {h.error}")
    #     else:
    #         print(f"  {h.provider}: responds OK")


if __name__ == "__main__":
    main()
```

# MCP servers

[MCP](https://modelcontextprotocol.io/) servers let the agent call external tools. caw supports both stdio and HTTP transports via the `MCPServer` config.

## Attaching an MCP server

```python
from caw import Agent, MCPServer

agent = Agent()
agent.add_mcp_server(MCPServer(
    name="my_db",
    command="python",
    args=["-m", "my_mcp_server"],
))

traj = agent.completion("Use the my_db tools to count rows in the users table.")
```

## stdio vs. HTTP transport

`MCPServer` picks the transport from which fields you set:

- **stdio** — set `command`, `args`, and optionally `env`. caw launches the server as a subprocess and speaks MCP over its stdin/stdout.
- **HTTP** — set `url`. `command`/`args`/`env` are ignored.

```python
# stdio
MCPServer(name="fs", command="npx", args=["-y", "@modelcontextprotocol/server-filesystem", "/data"])

# HTTP
MCPServer(name="remote", url="http://localhost:9000/mcp")
```
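The selection rule described above (a `url` means HTTP and the command fields are ignored; otherwise a `command` means stdio) can be sketched with a plain dataclass. `ServerConfig` and `transport_of` are hypothetical names for illustration, and raising `ValueError` when neither field is set is an assumption, not documented caw behavior.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ServerConfig:
    """Hypothetical stand-in using the same field names as the text above."""
    name: str
    command: Optional[str] = None
    args: list[str] = field(default_factory=list)
    env: dict[str, str] = field(default_factory=dict)
    url: Optional[str] = None


def transport_of(cfg: ServerConfig) -> str:
    """Pick the transport from which fields are set; `url` takes precedence."""
    if cfg.url:
        return "http"
    if cfg.command:
        return "stdio"
    # Assumption: a config with neither field is treated as an error here.
    raise ValueError(f"{cfg.name}: set either `command` or `url`")
```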

You can register multiple servers; configured servers are recorded on the trajectory (`trajectory.mcp_servers`).

## Defining tools in Python instead

If you'd rather write tools as Python functions or classes than run a separate MCP server, caw spins one up for you:

- **[`ToolKit`](https://zzjas.github.io/caw/guides/toolkit/index.md)** — declarative tool classes with `@tool` methods, served over a managed HTTP MCP server.
- **Stateless functions** — pass plain `@tool`-decorated functions via `stateless_tools=`.

Both are MCP under the hood; the difference is you don't author or run the server yourself.

## API

See `MCPServer` and the lower-level [Tools & MCP reference](https://zzjas.github.io/caw/reference/api/toolkit-mcp/index.md) for `MCPServerHandle`, `create_mcp_http_server_bundle`, and the `mcp_tool` decorator.

# Models & tiers

You can pick a model two ways: a **concrete model string** (provider-specific) or an abstract `ModelTier` that each provider maps to its own model.

## Concrete model strings

Pass whatever the backend understands:

```python
from caw import Agent

agent = Agent(provider="claude_code", model="opus")
agent = Agent(provider="codex", model="gpt-5.5")
```

A concrete string is tied to one provider. In an [auto-provider](https://zzjas.github.io/caw/guides/auto-provider/index.md) order it is dropped on fallback (the next provider wouldn't recognize it).

## Model tiers (portable)

`ModelTier` expresses intent ("give me the strongest" or "give me the fast/cheap one"), and each provider resolves it to a concrete model:

```python
from caw import Agent, ModelTier

agent = Agent(model=ModelTier.STRONGEST)  # provider picks its best model
agent = Agent(model=ModelTier.FAST)       # provider picks its fast model
```

| Tier                  | Meaning              | Example (Claude Code) | Example (Codex)       |
| --------------------- | -------------------- | --------------------- | --------------------- |
| `ModelTier.STRONGEST` | Best available model | `opus`                | provider default      |
| `ModelTier.FAST`      | Cheapest / fastest   | `claude-haiku-4-5`    | `gpt-5.3-codex-spark` |

Because tiers re-resolve per provider, they're the right choice whenever you use a fallback order. See [`examples/model_tiers.py`](https://github.com/zzjas/caw/blob/main/examples/model_tiers.py):

```python
"""Model tiers: use provider-agnostic model selection."""

import os

os.environ["CAW_LOG"] = "full"

from caw import Agent, ModelTier


def main():
    # Use the fast model tier — each provider maps this to its cheapest/fastest model.
    # Claude Code: claude-haiku-4-5-20251001, Codex: gpt-5.3-codex-spark
    agent = Agent(model=ModelTier.FAST, data_dir="caw_data")

    traj = agent.completion("What model are you? Answer in one sentence.")
    print(traj.result)
    print(f"\nmodel: {traj.model}")
    print(f"tokens: {traj.usage.total_tokens}")


if __name__ == "__main__":
    main()
```

## Configuring tier defaults

The model each tier maps to is **not** hardcoded — it's resolved from config that lives under `~/.caw/` (override the base dir with `CAW_HOME`). Inspect and edit it with the `caw config` CLI:

```console
$ caw config list                              # effective model per provider/tier, with source
$ caw config set opencode strongest openai/gpt-5.5-turbo
$ caw config get opencode strongest
$ caw config unset opencode strongest          # revert to the shipped default
$ caw config path                              # where ~/.caw/config.json lives
$ caw config refresh                           # re-fetch the shipped defaults now
```

`caw config set` writes to `~/.caw/config.json`; only the keys you override are stored, and everything else falls through to the shipped defaults.

For a given provider/tier, the model is resolved in this order (highest priority first):

1. **Provider env vars** `ANTHROPIC_MODEL` / `ANTHROPIC_SMALL_FAST_MODEL`, `OPENCODE_MODEL` / `OPENCODE_SMALL_FAST_MODEL`
1. **User config** `~/.caw/config.json` (what `caw config set` edits)
1. **Remote defaults** fetched from `CAW_DEFAULTS_URL` and cached under `~/.caw/cache/`
1. **Baked-in defaults** shipped in the wheel (the offline floor)

An explicit `model=` on the `Agent` (or `CAW_MODEL`) bypasses tier resolution entirely.
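The layered lookup above can be pictured as "first layer with a value wins". This is a rough sketch under stated assumptions: `resolve_model`, the nested-dict layer shape, and the placeholder model names are all hypothetical, and caw's real resolver may be structured differently.

```python
import os
from typing import Mapping, Optional

# Assumed shape for each config layer: provider -> tier -> model string.
Layer = Mapping[str, Mapping[str, str]]


def resolve_model(
    provider: str,
    tier: str,
    env_var: Optional[str],
    user_config: Layer,
    remote_defaults: Layer,
    baked_in: Layer,
    environ: Mapping[str, str] = os.environ,
) -> tuple[str, str]:
    """Return (model, source) from the highest-priority layer that has a value."""
    # 1. Provider env var, if one applies and is set.
    if env_var and environ.get(env_var):
        return environ[env_var], f"env:{env_var}"
    # 2-4. User config, then remote defaults, then the baked-in floor.
    for source, layer in (
        ("user", user_config),
        ("remote", remote_defaults),
        ("baked-in", baked_in),
    ):
        model = layer.get(provider, {}).get(tier)
        if model:
            return model, source
    raise LookupError(f"no model configured for {provider}/{tier}")


baked = {"opencode": {"fast": "model-a"}}  # placeholder model names
user = {"opencode": {"fast": "model-b"}}
print(resolve_model("opencode", "fast", None, user, {}, baked, environ={}))
# ('model-b', 'user')
```

Returning the source next to the model mirrors what `caw config list` is described as showing: the effective model together with where it came from.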

### Updatable shipped defaults

The shipped defaults are served from a JSON file in the repo ([`caw/defaults/models.json`](https://github.com/zzjas/caw/blob/main/caw/defaults/models.json)), which is both bundled in the wheel and published at the default `CAW_DEFAULTS_URL`. caw fetches it lazily and caches it for `CAW_DEFAULTS_TTL` seconds (default 24h). That means the default models can be **updated without cutting a release** — edit that file on `main` and every install picks it up within the TTL (or immediately with `caw config refresh`). Set `CAW_DEFAULTS_URL=off` to disable network fetches and pin to the baked-in defaults.

## Reasoning effort

Set the reasoning budget with `reasoning=` (`"high"`, `"medium"`, `"low"`) at construction or later:

```python
agent = Agent(model="opus", reasoning="high")
agent.set_reasoning("medium")
```

Both model and reasoning have environment-variable fallbacks — `CAW_MODEL` and `CAW_EFFORT` — so you can configure them without touching code. See [Environment variables](https://zzjas.github.io/caw/reference/environment/index.md).

## Asking a sub-task to use a cheaper model

You can also steer the *agent itself* to use a cheaper model for exploratory sub-steps, as in [`examples/haiku.py`](https://github.com/zzjas/caw/blob/main/examples/haiku.py):

```python
import os

os.environ["CAW_LOG"] = "full"

from caw import Agent

if __name__ == "__main__":
    agent = Agent(
        system_prompt="You are a software engineer.",
        data_dir="caw_data",
    )

    with agent.start_session() as session:
        session.send(
            "Can you explore the codebase at caw/auth and tell me what it does in five sentences? If possible, use the Haiku model for the exploration."
        )
```

# Persistence

Pass `data_dir=` to an `Agent` and every session is persisted to disk — an incremental JSONL event log plus a full trajectory snapshot, updated after each turn.

```python
from caw import Agent

agent = Agent(data_dir="caw_data")
with agent.start_session() as session:
    session.send("Remember the number 42.")
    session.send("What number did I just tell you?")
```

Without a `data_dir`, sessions run in memory and nothing is written.

## On-disk layout

`SessionStore` writes one directory per session:

```text
<data_dir>/sessions/<session_id>/
    traj.jsonl          # incremental append-only event log
    trajectory.json     # full trajectory, overwritten after each turn
    turns/
        000_input.txt
        000_raw_output.jsonl
        ...
```

- `traj.jsonl` is a per-event stream (metadata, user, thinking, text, tool_call, tool_result, turn_end) written via `JsonlWriter`, with file locking for safe concurrent writes. Subagent events are tagged with their name.
- `trajectory.json` is the complete `Trajectory` as JSON, the same object you get from `session.end()`. It's overwritten after every turn so a crash still leaves the latest snapshot.
- `turns/NNN_*` keep the raw input and the backend's raw output per turn.
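Since `traj.jsonl` is one JSON object per line, it can be inspected with the standard library alone. The sketch below tallies events by kind. The `type` key is an assumption about the event schema, so check a real file for the actual field name before relying on it; `count_events` is a hypothetical helper, not part of caw.

```python
import json
from collections import Counter


def count_events(jsonl_path: str, type_key: str = "type") -> Counter:
    """Count events per kind in an append-only JSONL log.

    `type_key` is an assumed field name, not one confirmed by caw's docs.
    """
    counts: Counter = Counter()
    with open(jsonl_path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                continue  # e.g. a final line that is still being written
            kind = event.get(type_key, "unknown") if isinstance(event, dict) else "unknown"
            counts[kind] += 1
    return counts
```

Skipping undecodable lines keeps the reader usable while a session is still appending to the log.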

## Loading trajectories back

Read a saved trajectory into a read-only session:

```python
from caw import Session

session = Session.load_trajectory("caw_data/sessions/<id>/trajectory.json")
print(session.trajectory.result)
```

Or save the current one anywhere:

```python
session.save_trajectory("/tmp/run.json")
```

You can also point `start_session(traj_path=...)` at a file to have caw write the trajectory there after each step in addition to (or instead of) the `data_dir` layout.

## Fast stats over many trajectories

For dashboards, spend limiters, and list views you usually only want the header/footer fields (cost, model, timestamps, token totals). [`FastStats`](https://zzjas.github.io/caw/guides/display-and-logging/#faststats) reads just the head and tail of each file — far faster than a full parse:

```python
from caw import FastStats

total = FastStats.directory_total_cost("caw_data")
for s in FastStats.iter_directory("caw_data"):
    print(s.model, f"${s.cost_usd:.4f}", s.total_tokens)
```
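To see why reading only the head and tail is cheap, here is a generic sketch of the technique using plain file seeks. It is not `FastStats` itself, whose internals may differ; `read_head_tail` and the 4096-byte window are illustrative choices.

```python
import os


def read_head_tail(path: str, nbytes: int = 4096) -> tuple[bytes, bytes]:
    """Return the first and last `nbytes` of a file without reading the middle."""
    size = os.path.getsize(path)
    with open(path, "rb") as fh:
        head = fh.read(nbytes)
        # Jump straight to the final window; for small files this is offset 0.
        fh.seek(max(size - nbytes, 0))
        tail = fh.read(nbytes)
    return head, tail
```

The cost is bounded by `2 * nbytes` per file regardless of trajectory size, which is what makes scanning a large directory practical. For files shorter than `nbytes`, the two windows overlap and both contain the whole file.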

## `data_dir` and resuming

`data_dir` is also what lets a resumed session restore its prior trajectory and append new turns to the original directory — see [Resuming sessions](https://zzjas.github.io/caw/guides/resuming/index.md) for the full with/without-`data_dir` matrix.

# Providers

caw wraps three coding-agent CLIs behind one interface. A *provider* is the backend that actually runs the model.

| Provider    | CLI binary | Provider name(s)              |
| ----------- | ---------- | ----------------------------- |
| Claude Code | `claude`   | `claude_code`, `claude`, `cc` |
| Codex       | `codex`    | `codex`                       |
| opencode    | `opencode` | `opencode`                    |

The names (and aliases) are what you pass as `provider=`. All three Claude Code names resolve to the same backend.

## Choosing a provider

There are three places to set the provider, in priority order:

```python
import os
from caw import Agent

# 1. Constructor (highest priority)
agent = Agent(provider="codex")

# 2. Environment variable
os.environ["CAW_PROVIDER"] = "codex"

# 3. At runtime, before starting a session
agent.set_provider("codex")
```

With no provider set anywhere, caw defaults to `claude_code`.

## Pinned vs. fallback

The `provider` argument accepts more than a single name:

- **A single name** (`provider="claude"`): *pinned*. caw uses exactly that backend and surfaces its errors directly.
- **A list** (`provider=["claude", "codex", "opencode"]`): a *fallback order*. caw selects the first installed provider and, on the first send, transparently moves to the next if one fails or is rate-limited.
- **`"auto"`** (or omitted): use the global order from `set_provider_order()`, else `CAW_PROVIDER`, else the default.

The list and `"auto"` forms are covered in detail in [Auto-provider mode](https://zzjas.github.io/caw/guides/auto-provider/index.md).
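The fallback behavior for a list can be pictured as a loop that skips missing providers and moves on when one fails. This is only a sketch of the general pattern, not caw's implementation: `first_working`, `ProviderUnavailable`, and the injected `is_installed` / `send` callables are hypothetical.

```python
from typing import Callable, Sequence


class ProviderUnavailable(Exception):
    """Hypothetical error standing in for a provider failure or rate limit."""


def first_working(
    order: Sequence[str],
    is_installed: Callable[[str], bool],
    send: Callable[[str], str],
) -> tuple[str, str]:
    """Try providers in order; skip missing ones and fall through on failure."""
    errors: dict[str, str] = {}
    for name in order:
        if not is_installed(name):
            errors[name] = "not installed"
            continue
        try:
            return name, send(name)
        except ProviderUnavailable as exc:
            errors[name] = str(exc)  # remember why, then try the next one
    raise ProviderUnavailable(f"all providers failed: {errors}")
```

Collecting the per-provider reasons means the final error explains every skipped candidate, which is easier to debug than reporting only the last failure.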

## Model selection across providers

A concrete model string (`model="opus"`) is provider-specific. To stay portable when you might fall back to a different provider, prefer a [`ModelTier`](https://zzjas.github.io/caw/guides/models-and-tiers/index.md):

```python
from caw import Agent, ModelTier

agent = Agent(provider=["claude", "codex"], model=ModelTier.STRONGEST)
```

Tiers are re-resolved per provider, so `STRONGEST`/`FAST` always map to that backend's best or fastest model.

## Registering a custom provider

Providers are looked up in a registry. To add your own backend, subclass `Provider` / `ProviderSession` and register it:

```python
import caw

# MyProvider is your own Provider subclass, defined elsewhere.
caw.register_provider("my_backend", MyProvider)
agent = caw.Agent(provider="my_backend")
```

See the [Providers API reference](https://zzjas.github.io/caw/reference/api/providers/index.md) for the full abstract interface a provider must implement (`start_session`, `resolve_model`, `resolve_tool_restrictions`, health hooks, and resume support).

# Resuming sessions

caw can resume a conversation later — in the same process, after a restart, or in a completely different process. Grab a `resume_handle` (a string), store it anywhere (a database, a file, a queue), and resume from it.

```python
from caw import Agent

# Process 1: start, communicate, persist the handle.
agent = Agent(provider="claude_code")
session = agent.start_session()
session.send("My deploy target is staging-eu. Remember that.")
handle = session.resume_handle          # store this string
session.end()

# Process 2 (later, after a restart): resume by handle.
agent = Agent(provider="claude_code")
session = agent.resume_session(handle)
print(session.send("Where am I deploying?").result)   # -> "staging-eu"
session.end()
```

## The handle is self-contained

The handle is a JSON string carrying the backend's own resume key, so resuming works even with **no `data_dir`** — the underlying CLI still has the conversation:

```json
{"version": 1, "provider": "claude_code", "session_id": "bd260210-…", "resume_key": "bd260210-…"}
```

`resume_key` is Claude's session id, Codex's `thread_id`, or opencode's session id; for Codex and opencode it differs from `session_id`. Resuming works across all three providers.
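Because the handle is plain JSON, a stored handle can be inspected before resuming, for example to route it to the right provider. The sketch below relies only on the four fields shown in the sample above; `describe_handle` is a hypothetical helper, and future handle versions may carry different fields.

```python
import json


def describe_handle(handle: str) -> dict:
    """Decode a resume handle and report which backend it belongs to."""
    data = json.loads(handle)
    if not isinstance(data, dict):
        raise ValueError("handle is not a JSON object")
    missing = {"version", "provider", "session_id", "resume_key"} - data.keys()
    if missing:
        raise ValueError(f"handle is missing fields: {sorted(missing)}")
    return {
        "provider": data["provider"],
        # True for Claude Code per the text above; False for Codex/opencode.
        "same_ids": data["session_id"] == data["resume_key"],
    }
```

The helper deliberately returns a summary rather than echoing the handle, so the resume key itself does not end up in logs.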

**Treat the handle like a secret.** The handle grants resume access to the conversation; it is not an opaque random id. Store it with the same care as a credential.

**Send before reading the handle.** The backend assigns its resume key on the first exchange, so send at least one message before reading `session.resume_handle`. Reading it earlier raises an error.

## `data_dir` is optional and additive

Whether you resume with or without the original `data_dir` changes only how much *caw-side* history you get back — the backend conversation resumes either way:

|                      | without `data_dir` | with the original `data_dir`         |
| -------------------- | ------------------ | ------------------------------------ |
| backend conversation | resumed            | resumed                              |
| caw trajectory       | starts empty       | full history restored                |
| new turns            | not persisted      | appended to the original session dir |

A bare session id is also accepted in place of a full handle, but only when `data_dir` is set (the resume key is then read from disk).

## Full example

[`examples/resume.py`](https://github.com/zzjas/caw/blob/main/examples/resume.py) shows both the with- and without-`data_dir` paths:

```python
"""Resuming a session across processes.

Start a session, store its ``resume_handle`` (a string), then resume the
conversation later with a brand-new Agent — as if a new process picked it up.

The handle is self-contained, so resume works even without a ``data_dir``
(the backend CLI still has the conversation). With a ``data_dir`` you also get
the full trajectory restored and new turns appended. This example shows both.
"""

import os

os.environ["CAW_LOG"] = "full"

from caw import Agent

DATA_DIR = "caw_data"


def main():
    # --- "Process 1": start a session and grab a handle to store somewhere. ---
    agent = Agent(data_dir=DATA_DIR)
    session = agent.start_session()
    session.send("My favorite number is 42. Acknowledge it.")
    handle = session.resume_handle
    session.end()

    print(f"\nStored resume_handle: {handle}\n")
    # In a real app you'd persist `handle` (DB, file, queue) and exit here.

    # --- "Process 2a": a fresh Agent WITH the same data_dir restores history. ---
    agent2 = Agent(data_dir=DATA_DIR)
    resumed = agent2.resume_session(handle)
    turn = resumed.send("What is my favorite number?")
    resumed.end()
    print(f"\n[with data_dir]    recalled: {turn.result!r}")
    print(f"[with data_dir]    turns in trajectory: {resumed.trajectory.num_turns}")

    # --- "Process 2b": a fresh Agent with NO data_dir still resumes the chat. ---
    agent3 = Agent(data_dir=None)
    resumed2 = agent3.resume_session(handle)
    turn2 = resumed2.send("And what is my favorite number again?")
    resumed2.end()
    print(f"\n[without data_dir] recalled: {turn2.result!r}")
    # Trajectory starts empty here, so only this turn is recorded.
    print(f"[without data_dir] turns in trajectory: {resumed2.trajectory.num_turns}")


if __name__ == "__main__":
    main()
```

# Sessions

A `Session` is a live, multi-turn conversation. Open one with `agent.start_session()`, send messages, and the context carries across turns.

```python
from caw import Agent

agent = Agent(provider="claude_code", model="opus", reasoning="high")
agent.set_system_prompt("You are a security reviewer.")

with agent.start_session() as session:
    turn1 = session.send("Review src/auth.py for vulnerabilities")
    print(turn1.result)

    turn2 = session.send("Now check src/api.py")
    print(turn2.result)
# session.end() runs on exit and returns the full Trajectory
```

Using the session as a context manager is the easy path: `__exit__` calls `session.end()`, which finalizes the trajectory, persists it, and stops any tool servers. If you don't use `with`, call `session.end()` yourself.

## One-shot vs. session

For a single message, `agent.completion(message)` is a convenience wrapper that opens a session, sends once, and ends it:

```python
traj = agent.completion("Explain this code")
print(traj.result)
```

## Inspecting progress mid-session

`session.trajectory` is available during the session, not just after:

```python
with agent.start_session() as session:
    session.send("Remember the number 42.")
    session.send("What number did I just tell you?")

    traj = session.trajectory
    print(f"Turns: {traj.num_turns}")
    print(f"Total tool calls: {traj.total_tool_calls}")
    print(f"Total tokens: {traj.usage.total_tokens}")
```

## Async sends

`send_async()` runs the blocking send in a thread and processes overlapping calls in FIFO order, so you can do async work while a turn is in flight:

```python
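import asyncio


async def send_in_background(session, prompt):
    # Start the turn without blocking the event loop.
    task = asyncio.create_task(session.send_async(prompt))
    while not task.done():
        # ... do other async work ...
        await asyncio.sleep(0.5)
    return await task  # the finished turn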
```
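The "blocking send in a thread, overlapping calls in FIFO order" behavior can be approximated with `asyncio.to_thread` and an `asyncio.Lock`, whose waiters are served in the order they started waiting. This is a self-contained sketch of the pattern with a fake backend; `FifoSender` and `slow_echo` are hypothetical, and caw's own implementation may differ.

```python
import asyncio
import time


class FifoSender:
    """Illustrative wrapper: run a blocking send in a thread, one call at a time."""

    def __init__(self, blocking_send):
        self._send = blocking_send
        self._lock = asyncio.Lock()  # fair: waiters acquire in arrival order

    async def send_async(self, prompt: str) -> str:
        async with self._lock:
            # The event loop stays free while the worker thread blocks.
            return await asyncio.to_thread(self._send, prompt)


calls: list[str] = []  # records the order in which the backend saw prompts


def slow_echo(prompt: str) -> str:
    """Stand-in for a blocking backend call."""
    time.sleep(0.05)
    calls.append(prompt)
    return prompt.upper()


async def demo() -> list[str]:
    sender = FifoSender(slow_echo)
    results = await asyncio.gather(*(sender.send_async(p) for p in ["a", "b", "c"]))
    return list(results)
```

Running `asyncio.run(demo())` issues three overlapping sends; the lock ensures the backend handles them one after another in submission order rather than concurrently.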

## Interactive mode

`agent.interactive(prompt)` hands control to the user's terminal — stdin/stdout/stderr are inherited so the user talks to the agent directly, while caw captures a copy of the output. All three providers support it (Claude Code, Codex, and opencode), each launching its own full-screen TUI with your initial prompt.

Pass `select_provider=True` to choose which backend to launch at runtime: caw shows an arrow-key menu of the *installed* providers (↑/↓ to move, `Enter` to choose, `q`/`Esc` to cancel) and launches the one you pick, ignoring the agent's configured provider. Cancelling the menu returns an `InteractiveResult` with exit code `130` without launching anything. `caw.installed_providers()` exposes the same list (name + provider) for your own menus.

See [`examples/interactive.py`](https://github.com/zzjas/caw/blob/main/examples/interactive.py):

```python
"""Interactive mode — launch the agent and let the user take over.

Pass ``select_provider=True`` to pick which installed provider to launch from
an arrow-key menu (↑/↓ to move, Enter to choose, q/Esc to cancel) instead of
using the agent's configured provider.
"""

import sys

from caw import Agent


def main():
    # `select_provider` is taken from the first CLI arg: `python interactive.py pick`.
    pick = len(sys.argv) > 1 and sys.argv[1] in ("pick", "select", "--select-provider")

    agent = Agent()

    prompt = (
        "List the directories in the current directory, then wait for me to tell "
        "you which one to count the Python files in."
    )
    result = agent.interactive(prompt, capture_bytes=4096, select_provider=pick)

    print(f"\nExit code: {result.exit_code}")
    if result.session_id:
        print(f"Session ID: {result.session_id}")
    print(f"Captured {len(result.output)} chars of terminal output")


if __name__ == "__main__":
    main()
```

## Auto-wait on usage limits

By default, when a provider reports a usage limit mid-session, `send()` sleeps until the limit resets and then resumes automatically — transparently to you. Disable it per agent with `Agent(..., auto_wait=False)` or globally with `CAW_AUTOWAIT=0`.

## Persistence and resuming

Pass `data_dir=` to persist a session to disk, and grab a `resume_handle` to continue it later (even in another process). Those are covered in [Resuming sessions](https://zzjas.github.io/caw/guides/resuming/index.md) and [Persistence](https://zzjas.github.io/caw/guides/persistence/index.md).

# Subagents

A subagent is a child agent that the parent can invoke as a tool. Register an `AgentSpec` and caw exposes it to the parent automatically; when the parent calls it, the subagent runs its own session and its full trajectory is captured.

```python
from caw import Agent, AgentSpec

reviewer = AgentSpec(
    name="security_reviewer",
    description="Reviews code for security issues",
    system_prompt="You are a security expert. Review the given code.",
)

agent = Agent()
agent.add_subagent(reviewer)
traj = agent.completion("Review the auth module for vulnerabilities")
```

## Nested trajectories and usage roll-up

Each subagent invocation attaches a nested `Trajectory` to the parent's `ToolUse` block, so you can inspect what the child did:

```python
for sub in traj.subagent_trajectories:
    print(f"  subagent: {sub.agent}, {sub.num_turns} turns, ${sub.usage.cost_usd:.4f}")
```

Usage rolls up: `traj.usage` is the parent's own consumption, while `traj.total_usage` is the parent **plus all nested subagents** (recursively). This is the number to use for total cost.
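The difference between own usage and rolled-up usage is a recursive sum over the trajectory tree. Here is a toy model of that roll-up; `Node` and its fields are hypothetical stand-ins, not caw's `Trajectory`.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Stand-in for a trajectory: its own token count plus nested children."""
    own_tokens: int
    children: list["Node"] = field(default_factory=list)

    def total_tokens(self) -> int:
        """Own tokens plus every descendant's tokens, recursively."""
        return self.own_tokens + sum(c.total_tokens() for c in self.children)


# A parent with two subagents, one of which spawned its own child.
parent = Node(100, [Node(20, [Node(5)]), Node(30)])
print(parent.own_tokens, parent.total_tokens())  # 100 155
```

Reporting only the parent's own figure would miss a third of the consumption in this example, which is why the rolled-up number is the one to use for budgets.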

## Configuring a subagent

`AgentSpec` carries the same knobs as an `Agent`: `system_prompt`, `model`, `reasoning`, `tools`, plus its own `tool_servers`, `mcp_servers`, and even nested `subagents`. That means a subagent can have its own tools and its own children.

```python
from caw import AgentSpec, ToolGroup

AgentSpec(
    name="researcher",
    description="Searches the web and summarizes findings",
    system_prompt="You research topics thoroughly.",
    model="opus",
    tools=ToolGroup.READER | ToolGroup.WEB,
)
```

## Full example

[`examples/subagent.py`](https://github.com/zzjas/caw/blob/main/examples/subagent.py) shows a senior-engineer agent delegating code review to a subagent and inspecting the nested trajectory:

```python
"""Subagent demo: a parent agent delegates code review to a subagent."""

import os

os.environ["CAW_LOG"] = "full"

from caw import Agent, AgentSpec


def main():
    reviewer = AgentSpec(
        name="Code Reviewer",
        description="Review code for correctness and style issues.",
        system_prompt="You are a code reviewer. Given code, identify bugs and style issues. Be concise.",
    )

    agent = Agent(
        system_prompt="You are a senior engineer. Use the Code Reviewer tool to review code when asked.",
        data_dir="caw_data",
    )
    agent.add_subagent(reviewer)

    with agent.start_session() as session:
        turn = session.send("Review this Python function:\n\ndef add(a, b):\n    return a - b\n")

        traj = session.trajectory
        print(f"\nParent own usage: ${traj.usage.cost_usd:.4f}")
        print(f"Parent total usage (with subagents): ${traj.total_usage.cost_usd:.4f}")
        print(f"Parent total tokens: {traj.total_usage.total_tokens}")

        for tc in turn.tool_calls:
            if tc.subagent_trajectory:
                st = tc.subagent_trajectory
                print(f"\n  Subagent '{tc.name}':")
                print(f"    Model: {st.model}")
                print(f"    System prompt: {st.system_prompt[:60]}...")
                print(f"    Tool calls: {st.total_tool_calls}")
                print(f"    Usage: ${st.usage.cost_usd:.4f} ({st.usage.total_tokens} tokens)")


if __name__ == "__main__":
    main()
```

# ToolKit & tools

caw gives you two ways to hand the agent Python tools without writing or running an MCP server yourself: **stateless functions** and the declarative **`ToolKit`** class. Both are MCP HTTP servers under the hood, started and stopped automatically with the session.

## Stateless tools

Decorate plain functions with `@tool` and pass them via `stateless_tools=`:

```python
from caw import Agent, tool

@tool(description="Add two numbers")
def add(a: int, b: int) -> int:
    return a + b

@tool(description="Multiply two numbers")
def multiply(a: int, b: int) -> int:
    return a * b

agent = Agent(
    system_prompt="You have access to math tools. Use them to answer questions.",
    stateless_tools=[add, multiply],
)
```

Full example — [`examples/tools_simple.py`](https://github.com/zzjas/caw/blob/main/examples/tools_simple.py):

```python
"""Stateless tools demo: pass plain functions directly to an agent."""

import os

os.environ["CAW_LOG"] = "full"

from caw import Agent, tool


@tool(description="Add two numbers")
def add(a: int, b: int) -> int:
    return a + b


@tool(description="Multiply two numbers")
def multiply(a: int, b: int) -> int:
    return a * b


def main():
    agent = Agent(
        system_prompt="You have access to math tools. Use them to answer questions.",
        stateless_tools=[add, multiply],
        data_dir="caw_data",
    )

    with agent.start_session() as session:
        session.send("List every tool you have access to by name.")
        session.send("What is 3 + 4? Then multiply the result by 5.")

        traj = session.trajectory
        print(f"\nTurns: {traj.num_turns}, Tool calls: {traj.total_tool_calls}")


if __name__ == "__main__":
    main()
```

## ToolKit: stateful, declarative tool servers

Subclass `ToolKit`, decorate methods with `@tool`, and caw exposes them as a single MCP server. Instance state (`self`) persists across tool calls for the whole session:

```python
from caw import Agent, ToolKit, tool

class UserDB(ToolKit, server_name="user_db", display_name="User Database"):
    def __init__(self):
        self.users = ["Alice", "Bob"]

    @tool(description="List all users")
    async def list_users(self) -> str:
        return ", ".join(self.users)

    @tool(description="Add a user")
    async def add_user(self, name: str) -> str:
        self.users.append(name)
        return f"Added {name}"

db = UserDB()
agent = Agent(system_prompt="You have a user database.", tool_servers=[db])
traj = agent.completion("Add Eve to the user database, then list all users")
```

You can pass the `ToolKit` instance directly in `tool_servers=` (caw calls `as_server()` for you), or call `agent.add_tool_server(db)` later. Methods may be sync or async.

Full example — [`examples/toolkit.py`](https://github.com/zzjas/caw/blob/main/examples/toolkit.py):

```python
"""Custom tool server demo: a stateful user database exposed via ToolKit."""

import os

os.environ["CAW_LOG"] = "full"

from caw import Agent, ToolKit, tool


class UserDB(ToolKit, server_name="user_db", display_name="User Database"):
    def __init__(self):
        self.users = ["Alice", "Bob", "Charlie"]
        self.count = 0

    @tool(description="List all users in the database")
    async def list_users(self) -> str:
        self.count += 1
        return f"Users: {', '.join(self.users)} (queried {self.count} time(s))"

    @tool(description="Add a user to the database")
    async def add_user(self, name: str) -> str:
        self.users.append(name)
        return f"Added {name}. Total users: {len(self.users)}"


def main():
    db = UserDB()
    agent = Agent(
        system_prompt="You have access to a user database. Use the tools to answer questions about users.",
        tool_servers=[db],
        data_dir="caw_data",
    )

    with agent.start_session() as session:
        session.send("How many users are in the database? List them.")
        session.send("Add a user named Diana, then list all users again.")

        traj = session.trajectory
        print(f"\nTurns: {traj.num_turns}, Tool calls: {traj.total_tool_calls}")

    # State persists across turns (server stayed alive for the whole session)
    print(f"Final DB state: users={db.users}, count={db.count}")


if __name__ == "__main__":
    main()
```

### Thread safety

By default a `ToolKit`'s methods may run concurrently. If your state isn't safe for that, declare `thread_safe=True` in the subclass options and caw serializes calls with a lock:

```python
class Counter(ToolKit, server_name="counter", thread_safe=True):
    ...
```

## Tool permission groups

Independently of *which* tools you add, you can restrict the agent's **built-in** tools (read, write, exec, web, …) with `ToolGroup`:

```python
from caw import Agent, ToolGroup

# Read-only: Read/Glob/Grep, but no Bash/Write/Edit/WebSearch.
agent = Agent(tools=ToolGroup.READER)

# Everything except writes:
agent = Agent(tools=ToolGroup.ALL - ToolGroup.WRITER)
```

Groups combine with `|` (union) and `-` (subtract). The default for automated runs is `ToolGroup.ALL - ToolGroup.INTERACTION`. Full example — [`examples/tool_groups.py`](https://github.com/zzjas/caw/blob/main/examples/tool_groups.py):

```python
"""Tool groups demo: restrict an agent to read-only tools."""

import os

os.environ["CAW_LOG"] = "full"

from caw import Agent, ToolGroup


def main():
    # Only allow Read, Glob, Grep — no Bash, Write, Edit, WebSearch, etc.
    agent = Agent(tools=ToolGroup.READER, data_dir="caw_data")

    traj = agent.completion(
        "List every tool you have access to by name. "
        "Then answer: can you use the Bash tool? Can you use the Write tool? "
        "Can you use the Edit tool? Can you use the WebSearch tool?"
    )
    print(traj.result)
    print(f"\nis_complete: {traj.is_complete}")
    print(f"is_usage_limited: {traj.is_usage_limited}")


if __name__ == "__main__":
    main()
```
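The `|` and `-` operators on groups behave like set union and set difference over permission bits. The sketch below models that with the standard library's `enum.Flag`. `Group` is a hypothetical stand-in, its member list (in particular `EXEC`) is assumed for illustration, and caw's `ToolGroup` may be implemented differently.

```python
from enum import Flag, auto


class Group(Flag):
    """Hypothetical permission flags; member names are illustrative."""
    READER = auto()
    WRITER = auto()
    EXEC = auto()  # assumed member, not confirmed by the text above
    WEB = auto()
    INTERACTION = auto()
    ALL = READER | WRITER | EXEC | WEB | INTERACTION

    def __sub__(self, other: "Group") -> "Group":
        # Flag has no `-` by default; subtraction clears the bits in `other`.
        return self & ~other


no_writes = Group.ALL - Group.WRITER
research = Group.READER | Group.WEB

print(Group.WRITER in no_writes)  # False
print(Group.WEB in research)      # True
```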

# Trajectory viewer

caw ships a small web UI for browsing saved trajectories — turns, content blocks, tool calls, and nested subagent trajectories — plus a terminal inspector for agents.

## Web viewer

Start the server from the CLI:

```bash
caw viewer                 # auto host/port
caw viewer --port 8080     # fixed port
```

…or programmatically with `start_viewer_server()`, which returns a `ViewerServer` handle:

```python
from caw.viewer import start_viewer_server

server = start_viewer_server()          # auto host/port
print(server.url)                       # http://localhost:<port>
server.check_status()                   # True / False
server.stop()
```

The viewer loads a trajectory JSON file by absolute path, passed as a query parameter:

```text
http://localhost:<port>?path=/abs/path/to/trajectory.json
```
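Paths containing spaces or other reserved characters should be percent-encoded before they go into the query string. A small sketch using only the standard library follows; `viewer_url` is a hypothetical helper, and it assumes the viewer decodes the `path` parameter in the standard way.

```python
from urllib.parse import quote, urlencode


def viewer_url(base_url: str, trajectory_path: str) -> str:
    """Build a viewer link, percent-encoding the path for the query string."""
    # safe="/" keeps the path readable; everything else unsafe is escaped.
    query = urlencode({"path": trajectory_path}, safe="/", quote_via=quote)
    return f"{base_url}?{query}"


print(viewer_url("http://localhost:8080", "/tmp/my run.json"))
# http://localhost:8080?path=/tmp/my%20run.json
```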

Full example — [`examples/traj_viewer.py`](https://github.com/zzjas/caw/blob/main/examples/traj_viewer.py) runs a short session, saves the trajectory, and opens it in the viewer:

```python
"""Trajectory viewer: save a session trajectory and view it in the browser."""

import os
import tempfile

os.environ["CAW_LOG"] = "full"

from caw import Agent
from caw.viewer import start_viewer_server


def main():
    agent = Agent(data_dir="caw_data")

    # Run a short session and save the trajectory to a temp file
    traj_path = os.path.join(tempfile.gettempdir(), "caw_example_traj.json")

    with agent.start_session(traj_path=traj_path) as session:
        session.send("What is 2 + 2? Answer in one sentence.")

    print(f"\nTrajectory saved to {traj_path}")

    # Start the viewer server and print the URL
    server = start_viewer_server()
    print(f"Open in browser: {server.url}?path={traj_path}")

    print("Press Ctrl+C to stop the viewer.")
    try:
        import signal

        signal.pause()
    except KeyboardInterrupt:
        pass
    finally:
        server.stop()


if __name__ == "__main__":
    main()
```

## Terminal inspector

For a quick, scriptable look (or to let another agent read a trajectory), use the `caw-traj` CLI or the `caw traj` subcommand. It prints a compact, step-indexed view and can expand specific steps:

```bash
caw-traj run.json                 # compact, step-indexed overview
caw-traj run.json --recursive     # include nested subagent steps
caw-traj run.json --step 7        # full detail for step 7
caw-traj run.json --step 7-10     # a range
caw-traj run.json --step 12/3     # a nested step under step 12
```

See the [CLI reference](https://zzjas.github.io/caw/reference/cli/index.md) for every option.

# Reference

# Environment variables

caw can be configured globally via environment variables, so you can change behavior without touching code. Each is used as a fallback when the corresponding value isn't set explicitly on the `Agent` constructor or a method call.

| Variable                                         | Purpose                                                                                                                            | Example                                 |
| ------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------- |
| `CAW_PROVIDER`                                   | Default provider, or a comma-separated [fallback order](https://zzjas.github.io/caw/guides/auto-provider/index.md)                 | `claude_code` · `claude,codex,opencode` |
| `CAW_MODEL`                                      | Default model name                                                                                                                 | `opus` · `gpt-5.5`                      |
| `CAW_EFFORT`                                     | Default reasoning effort                                                                                                           | `high` · `medium` · `low`               |
| `CAW_AUTOWAIT`                                   | Auto-wait on usage limits (on by default)                                                                                          | `0` / `false` to disable                |
| `CAW_LOG`                                        | Console [display mode](https://zzjas.github.io/caw/guides/display-and-logging/index.md)                                            | `full` · `short` · `result` · `off`     |
| `CAW_HOME`                                       | Base dir for caw state — config & caches (default `~/.caw/`)                                                                       | `/path/to/caw`                          |
| `CAW_AUTH_DIR`                                   | Relocate the [auth staging dir](https://zzjas.github.io/caw/guides/docker-credentials/index.md) (default `~/.caw/auth/`)           | `/path/to/auth`                         |
| `ANTHROPIC_MODEL` / `ANTHROPIC_SMALL_FAST_MODEL` | Override the Claude Code [tier defaults](https://zzjas.github.io/caw/guides/models-and-tiers/index.md) (strongest / fast)          | `opus` · `claude-haiku-4-5`             |
| `OPENCODE_MODEL` / `OPENCODE_SMALL_FAST_MODEL`   | Override the opencode tier defaults (strongest / fast)                                                                             | `openai/gpt-5.5`                        |
| `CAW_DEFAULTS_URL`                               | Remote source for shipped [tier defaults](https://zzjas.github.io/caw/guides/models-and-tiers/index.md); `off` to disable fetching | URL · `off`                             |
| `CAW_DEFAULTS_TTL`                               | Cache window (seconds) for remote tier defaults (default `86400`)                                                                  | `3600`                                  |
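
As an illustration of the "on by default" and "default `86400`" wording in the table, the sketch below reads two of these variables from a mapping. It is an assumption about how such values could be interpreted, not caw's actual parsing code: the function names are hypothetical, and only the `0` / `false` spellings and the `86400` default come from the table.

```python
from collections.abc import Mapping


def autowait_enabled(environ: Mapping[str, str]) -> bool:
    """On unless CAW_AUTOWAIT is set to a disabling value (0 or false)."""
    raw = environ.get("CAW_AUTOWAIT")
    if raw is None:
        return True  # documented default: on
    return raw.strip().lower() not in {"0", "false"}


def defaults_ttl(environ: Mapping[str, str]) -> int:
    """Cache window in seconds, falling back to the documented 86400."""
    return int(environ.get("CAW_DEFAULTS_TTL", "86400"))


print(autowait_enabled({}), autowait_enabled({"CAW_AUTOWAIT": "false"}))
print(defaults_ttl({}), defaults_ttl({"CAW_DEFAULTS_TTL": "3600"}))
```

Passing a mapping instead of reading `os.environ` directly keeps the sketch easy to test; in a real script you would pass `os.environ`.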

## Precedence

For the provider, model, and reasoning-effort settings, the precedence order is:

1. Explicit argument to `Agent(...)` or a setter (`set_provider`, `set_model`, …)
1. The environment variable above
1. caw's built-in default (`claude_code`, no model, provider default effort)

`CAW_PROVIDER` may be a single name (pinned) or a comma-separated list (a fallback order used when `provider="auto"` or no provider is given). See [Auto-provider mode](https://zzjas.github.io/caw/guides/auto-provider/index.md).
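
The precedence rules above can be sketched in a few lines. This is a simplified model under stated assumptions, not caw's internal resolution logic: `resolve` and `provider_order` are hypothetical names, and the environment is passed in as a plain mapping so the example runs without caw installed.

```python
from __future__ import annotations

from collections.abc import Mapping


def resolve(
    explicit: str | None,
    env_name: str,
    default: str | None,
    environ: Mapping[str, str],
) -> str | None:
    """Explicit argument first, then the environment variable, then the default."""
    if explicit is not None:
        return explicit
    from_env = environ.get(env_name)
    if from_env:
        return from_env
    return default


def provider_order(value: str) -> list[str]:
    """Split a CAW_PROVIDER-style value into a list of provider names."""
    return [name.strip() for name in value.split(",") if name.strip()]


env = {"CAW_PROVIDER": "claude,codex,opencode", "CAW_MODEL": "opus"}
print(resolve("gpt-5.5", "CAW_MODEL", None, env))  # explicit argument wins
print(resolve(None, "CAW_MODEL", None, env))       # falls back to the variable
print(resolve(None, "CAW_EFFORT", None, env))      # nothing set: built-in default
print(provider_order(env["CAW_PROVIDER"]))
```

A single-name `CAW_PROVIDER` value yields a one-element list here, which matches the "pinned" case described above.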

# CLI reference

caw installs two console scripts: **`caw`** (the main multi-command app) and **`caw-traj`** (a standalone trajectory inspector, also available as `caw traj`).

## `caw`

Top-level commands: `doctor` ([provider health](https://zzjas.github.io/caw/guides/health/index.md)), `auth` ([Docker credentials](https://zzjas.github.io/caw/guides/docker-credentials/index.md)), `viewer` and `traj` ([trajectory viewer](https://zzjas.github.io/caw/guides/trajectory-viewer/index.md)), and `config` (per-provider model configuration).

# caw

Coding Agent Wrapper — tools for managing coding agents.

## Usage

`caw [OPTIONS] COMMAND [ARGS]...`

## Arguments

*No arguments available*

## Options

| Name                   | Description                                                                      | Required | Default |
| ---------------------- | -------------------------------------------------------------------------------- | -------- | ------- |
| `--install-completion` | Install completion for the current shell.                                        | No       | -       |
| `--show-completion`    | Show completion for the current shell, to copy it or customize the installation. | No       | -       |
| `--help`               | Show this message and exit.                                                      | No       | -       |

## Commands

| Name     | Description                                                                |
| -------- | -------------------------------------------------------------------------- |
| `viewer` | Launch the trajectory viewer web UI.                                       |
| `doctor` | Show health/availability signals for each provider's CLI.                  |
| `traj`   | Inspect a saved trajectory from the terminal.                              |
| `auth`   | Manage credentials for Docker containers.                                  |
| `config` | View and edit caw's per-provider model configuration (~/.caw/config.json). |

## Subcommands

### `caw viewer`

Launch the trajectory viewer web UI.

#### Usage

`caw viewer [OPTIONS]`

#### Arguments

*No arguments available*

#### Options

| Name                 | Description                 | Required | Default   |
| -------------------- | --------------------------- | -------- | --------- |
| `-h, --host TEXT`    | Host to bind to.            | No       | `0.0.0.0` |
| `-p, --port INTEGER` | Port to bind to (0 = auto). | No       | `0`       |
| `--help`             | Show this message and exit. | No       | -         |
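
A port of `0` conventionally means "let the operating system choose a free port". The sketch below shows that general mechanism with the standard `socket` module; whether `caw viewer` resolves its port this way internally is an assumption, and `pick_free_port` is a hypothetical helper.

```python
import socket


def pick_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        # After binding, the socket reports the port the OS actually assigned.
        return sock.getsockname()[1]


port = pick_free_port()
print(f"OS-assigned port: {port}")
```

The port is released when the socket closes, so another process could claim it before you reuse the number. Also note that a host of `0.0.0.0` listens on all IPv4 interfaces; pass `--host 127.0.0.1` if the viewer should only be reachable locally.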

### `caw doctor`

Show health/availability signals for each provider's CLI.

#### Usage

`caw doctor [OPTIONS]`

#### Arguments

*No arguments available*

#### Options

| Name     | Description                                                                                       | Required | Default |
| -------- | ------------------------------------------------------------------------------------------------- | -------- | ------- |
| `--live` | Round-trip a probe per provider to check it responds / isn't rate-limited (costs a request each). | No       | -       |
| `--help` | Show this message and exit.                                                                       | No       | -       |

### `caw traj`

Inspect a saved trajectory from the terminal.

#### Usage

`caw traj [OPTIONS] PATH`

#### Arguments

| Name   | Description                             | Required |
| ------ | --------------------------------------- | -------- |
| `PATH` | Path to the saved trajectory JSON file. | Yes      |

#### Options

| Name                          | Description                                                                                            | Required | Default     |
| ----------------------------- | ------------------------------------------------------------------------------------------------------ | -------- | ----------- |
| `-s, --step TEXT`             | Show full details for visible-step selectors like 7, 7-10, or 12/3-12/7. Repeat to add more selectors. | No       | -           |
| `-r, --recursive`             | Include nested visible subagent steps in the compressed listing.                                       | No       | -           |
| `--text-chars INTEGER RANGE`  | Maximum characters to show in user/assistant previews.                                                 | No       | `60; x>=10` |
| `--input-chars INTEGER RANGE` | Reserved for future tool-detail rendering; accepted for compatibility.                                 | No       | `60; x>=10` |
| `--help`                      | Show this message and exit.                                                                            | No       | -           |

### `caw auth`

Manage credentials for Docker containers.

#### Usage

`caw auth [OPTIONS] COMMAND [ARGS]...`

#### Arguments

*No arguments available*

#### Options

| Name     | Description                 | Required | Default |
| -------- | --------------------------- | -------- | ------- |
| `--help` | Show this message and exit. | No       | -       |

#### Subcommands

##### `caw auth setup`

Snapshot credentials and write the container setup bundle into ~/.caw/auth/.

Host credential files are not modified; they are bind-mounted into the container at run time via `caw auth docker-flags`.

###### Usage

`caw auth setup [OPTIONS]`

###### Arguments

*No arguments available*

###### Options

| Name                 | Description                                    | Required | Default        |
| -------------------- | ---------------------------------------------- | -------- | -------------- |
| `-a, --agents TEXT`  | Agents to include (claude, codex, or all)      | No       | -              |
| `--source-home TEXT` | Source home directory to read credentials from | No       | `~`            |
| `--help`             | Show this message and exit.                    | No       | -              |

##### `caw auth teardown`

Remove ~/.caw/auth/. Host credential files are untouched.

Refuses to run if host credentials are still symlinks into the auth directory (leftover from the old symlink-based design). Use `--force` to override — but you will have to re-authenticate every agent.

###### Usage

`caw auth teardown [OPTIONS]`

###### Arguments

*No arguments available*

###### Options

| Name            | Description                                          | Required | Default |
| --------------- | ---------------------------------------------------- | -------- | ------- |
| `-n, --dry-run` | Show what would be done                              | No       | -       |
| `-f, --force`   | Delete even if host symlinks point into the auth dir | No       | -       |
| `--help`        | Show this message and exit.                          | No       | -       |

##### `caw auth status`

Show token expiry, last modified, and docker mount flags.

###### Usage

`caw auth status [OPTIONS]`

###### Arguments

*No arguments available*

###### Options

| Name                | Description                 | Required | Default |
| ------------------- | --------------------------- | -------- | ------- |
| `-a, --agents TEXT` | Agents to show              | No       | -       |
| `--help`            | Show this message and exit. | No       | -       |

##### `caw auth docker-flags`

Output the -v flags for docker (one per bind mount, space-separated).

###### Usage

`caw auth docker-flags [OPTIONS]`

###### Arguments

*No arguments available*

###### Options

| Name     | Description                 | Required | Default |
| -------- | --------------------------- | -------- | ------- |
| `--help` | Show this message and exit. | No       | -       |
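
Because the flags are printed as one space-separated string, a script can split them and splice them into a `docker run` invocation. The sketch below is illustrative only: `docker_run_command` is a hypothetical helper, and the sample flag string, image name, and container paths are made up rather than real `caw auth docker-flags` output.

```python
import shlex


def docker_run_command(flags_output: str, image: str, *command: str) -> list[str]:
    """Build a docker run argv from space-separated -v flags (hypothetical helper)."""
    mount_flags = shlex.split(flags_output)
    return ["docker", "run", "--rm", *mount_flags, image, *command]


# Made-up sample; the real mounts come from `caw auth docker-flags`.
sample = (
    "-v /home/me/.caw/auth/claude.json:/root/.claude.json "
    "-v /home/me/.caw/auth/codex:/root/.codex"
)
argv = docker_run_command(sample, "my-image", "caw", "doctor")
print(shlex.join(argv))
```

In a real script, `flags_output` would be the captured stdout of `caw auth docker-flags`, for example via `subprocess.run(..., capture_output=True, text=True)`.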

### `caw config`

View and edit caw's per-provider model configuration (~/.caw/config.json).

#### Usage

`caw config [OPTIONS] COMMAND [ARGS]...`

#### Arguments

*No arguments available*

#### Options

| Name     | Description                 | Required | Default |
| -------- | --------------------------- | -------- | ------- |
| `--help` | Show this message and exit. | No       | -       |

#### Subcommands

##### `caw config list`

Show the effective model for every provider/tier, with its source.

###### Usage

`caw config list [OPTIONS]`

###### Arguments

*No arguments available*

###### Options

| Name     | Description                 | Required | Default |
| -------- | --------------------------- | -------- | ------- |
| `--help` | Show this message and exit. | No       | -       |

##### `caw config get`

Print the effective model for a single PROVIDER and TIER.

###### Usage

`caw config get [OPTIONS] PROVIDER TIER`

###### Arguments

| Name       | Description    | Required |
| ---------- | -------------- | -------- |
| `PROVIDER` | Provider name. | Yes      |
| `TIER`     | Model tier.    | Yes      |

###### Options

| Name     | Description                 | Required | Default |
| -------- | --------------------------- | -------- | ------- |
| `--help` | Show this message and exit. | No       | -       |

##### `caw config set`

Set the model for PROVIDER/TIER, saved to ~/.caw/config.json.

###### Usage

`caw config set [OPTIONS] PROVIDER TIER MODEL`

###### Arguments

| Name       | Description    | Required |
| ---------- | -------------- | -------- |
| `PROVIDER` | Provider name. | Yes      |
| `TIER`     | Model tier.    | Yes      |
| `MODEL`    | Model name.    | Yes      |

###### Options

| Name     | Description                 | Required | Default |
| -------- | --------------------------- | -------- | ------- |
| `--help` | Show this message and exit. | No       | -       |

##### `caw config unset`

Remove a user override for PROVIDER/TIER (revert to the default).

###### Usage

`caw config unset [OPTIONS] PROVIDER TIER`

###### Arguments

| Name       | Description    | Required |
| ---------- | -------------- | -------- |
| `PROVIDER` | Provider name. | Yes      |
| `TIER`     | Model tier.    | Yes      |

###### Options

| Name     | Description                 | Required | Default |
| -------- | --------------------------- | -------- | ------- |
| `--help` | Show this message and exit. | No       | -       |

##### `caw config path`

Print the path to the user config file.

###### Usage

`caw config path [OPTIONS]`

###### Arguments

*No arguments available*

###### Options

| Name     | Description                 | Required | Default |
| -------- | --------------------------- | -------- | ------- |
| `--help` | Show this message and exit. | No       | -       |

##### `caw config refresh`

Force a re-fetch of the remote default models into the local cache.

###### Usage

`caw config refresh [OPTIONS]`

###### Arguments

*No arguments available*

###### Options

| Name     | Description                 | Required | Default |
| -------- | --------------------------- | -------- | ------- |
| `--help` | Show this message and exit. | No       | -       |

## `caw-traj`

The standalone trajectory inspector. Prints a compact, step-indexed view of a saved trajectory and can expand individual steps.

# caw-traj

Inspect a saved trajectory.

Running `caw-traj PATH` prints a compact, step-indexed view of the conversational trajectory so another agent can see the structure quickly. Tool calls are omitted from the compressed view.

Use `--step` to retrieve full content for a specific visible step. Step paths use the same addresses shown in the compact output:

- `7` means top-level step 7
- `12/3` means nested visible step 3 under step 12
- `7-10` expands to steps 7, 8, 9, 10
- `12/3-12/7` expands to a nested range within the same parent
- you can copy the bracketed form directly, for example `--step [12/3]`
- multiple selectors can be combined in one flag, for example `--step 7,8,12/3-12/5`
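
The selector grammar in the list above can be sketched as a small parser. This is an independent illustration, not the parser `caw-traj` uses: the function names are hypothetical, and representing each step path as a tuple of integers is an assumption made for the example.

```python
def parse_path(text: str) -> tuple[int, ...]:
    """Turn '12/3' or '[12/3]' into a tuple such as (12, 3)."""
    text = text.strip()
    if text.startswith("[") and text.endswith("]"):
        text = text[1:-1]  # accept the bracketed form copied from the output
    return tuple(int(part) for part in text.split("/"))


def expand_selector(selector: str) -> list[tuple[int, ...]]:
    """Expand one selector: a single path or an inclusive range."""
    if "-" not in selector:
        return [parse_path(selector)]
    start_text, end_text = selector.split("-", 1)
    start, end = parse_path(start_text), parse_path(end_text)
    # A range must stay under one parent and must not run backwards.
    if start[:-1] != end[:-1] or end[-1] < start[-1]:
        raise ValueError(f"invalid range: {selector!r}")
    return [start[:-1] + (index,) for index in range(start[-1], end[-1] + 1)]


def expand_selectors(spec: str) -> list[tuple[int, ...]]:
    """Expand a comma-separated selector string into step paths."""
    paths: list[tuple[int, ...]] = []
    for piece in spec.split(","):
        paths.extend(expand_selector(piece))
    return paths


print(expand_selectors("7,8,12/3-12/5"))
```

Each resulting tuple addresses one visible step: `(7,)` is top-level step 7 and `(12, 3)` is nested step 3 under step 12.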

Each compressed step shows a 1-based raw JSON line range like `L41-L48`. LLM agents should feel free to inspect the raw JSON file directly with those line numbers when the compact view is not enough.
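
For scripts or agents that want to follow one of those line ranges, a short helper is enough. The sketch below is an assumption-labeled example: `read_line_range` is hypothetical and not part of caw, and the demo file stands in for a real trajectory.

```python
import re
import tempfile
from itertools import islice
from pathlib import Path


def read_line_range(path: str, line_range: str) -> list[str]:
    """Return the lines named by a 1-based inclusive range such as 'L41-L48'."""
    match = re.fullmatch(r"L(\d+)-L(\d+)", line_range)
    if match is None:
        raise ValueError(f"not a line range: {line_range!r}")
    start, end = int(match.group(1)), int(match.group(2))
    with open(path, encoding="utf-8") as handle:
        # islice is 0-based and end-exclusive, so only the start is shifted.
        return [line.rstrip("\n") for line in islice(handle, start - 1, end)]


# Demo on a throwaway file with ten numbered lines.
demo = Path(tempfile.mkdtemp()) / "run.json"
demo.write_text("\n".join(f"line {n}" for n in range(1, 11)), encoding="utf-8")
print(read_line_range(str(demo), "L3-L5"))
```

Reading lazily with `islice` avoids loading a large trajectory file into memory just to show a handful of lines.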

Examples:

- `caw-traj run.json`
- `caw-traj run.json --recursive`
- `caw-traj run.json --step 7`
- `caw-traj run.json --step 7-10`
- `caw-traj run.json --step 12/3`
- `caw-traj run.json --step 12/3-12/7`
- `caw-traj run.json --step 7,8,12/3-12/5`

**Usage:**

```text
caw-traj [OPTIONS] PATH
```

**Options:**

```text
  -s, --step TEXT              Show full details for visible-step selectors
                               like 7, 7-10, or 12/3-12/7.
  -r, --recursive              Include nested visible subagent steps in the
                               compressed listing.
  --text-chars INTEGER RANGE   Maximum characters to show in user/assistant
                               previews.  [default: 60; x>=10]
  --input-chars INTEGER RANGE  Reserved for future tool-detail rendering;
                               accepted for compatibility.  [default: 60;
                               x>=10]
  --help                       Show this message and exit.
```