Resume and Replay
How AGH resumes stopped sessions, reconstructs transcripts from events, and falls back when ACP session resources are gone.
- Audience
- Operators running durable agent work
- Focus
- Sessions guidance shaped for scanability, day-two clarity, and operator context.
Resume and replay are related, but they are not the same thing in AGH.
- Resume tries to restart the session as live agent work.
- Replay reconstructs durable history from the stored event log.
When resume is the right tool
Use resume when you want to keep working with the same AGH session ID, workspace, and stored history:
- after a clean manual stop
- after a daemon restart
- after an agent crash
- after inspecting old work and deciding the same session should continue
Use a new session instead when you want a clean slate, a different workspace, or a different agent definition.
Resuming a stopped session
Resume from the CLI:
agh session resume sess-1234
Resume over HTTP:
curl -X POST http://localhost:2123/api/sessions/sess-1234/resume
If the session is already active in memory, resume is idempotent: AGH returns the live session instead of starting a second subprocess.
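
The HTTP resume call can also be scripted. The following is a minimal sketch, not part of AGH: it assumes the daemon address from the curl example, and it assumes the endpoint answers with a JSON body, which this page does not spell out. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: same host and port as the curl example on this page.
BASE_URL = "http://localhost:2123"


def build_resume_request(session_id: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for the resume endpoint without sending it."""
    url = f"{base_url}/api/sessions/{session_id}/resume"
    return urllib.request.Request(url, method="POST")


def resume_session(session_id: str, base_url: str = BASE_URL) -> dict:
    """Send the resume request and decode the response.

    Assumption: the response body is JSON. Adjust if the daemon returns
    something else.
    """
    with urllib.request.urlopen(build_resume_request(session_id, base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Because resume is idempotent for a session that is already live, a script can call `resume_session("sess-1234")` again after a timeout without risking a second subprocess.
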
What AGH validates before resume
AGH refuses to resume if any of the durable prerequisites are gone:
- the stored session metadata is invalid
- the workspace can no longer be resolved
- the workspace path no longer exists
- the agent definition is no longer available in that workspace context
- the per-session events.db file is missing
- the per-session events.db file exists but is empty
This is why resume is stricter than just "spawn a new process with the same name".
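
Some of these prerequisites can be checked from the outside before calling resume. The sketch below covers only the filesystem checks (workspace path and events.db); the metadata and agent definition checks happen inside AGH and are not modeled here. The function name and the idea of passing both paths explicitly are assumptions for illustration, since this page does not say where events.db lives.

```python
import os
from typing import List


def resume_blockers(workspace_path: str, events_db_path: str) -> List[str]:
    """Return the filesystem reasons a resume would likely be refused.

    Hypothetical helper: mirrors three of the prerequisites listed above,
    not AGH's actual validation code.
    """
    problems: List[str] = []
    if not os.path.isdir(workspace_path):
        problems.append("workspace path no longer exists")
    if not os.path.isfile(events_db_path):
        problems.append("events.db is missing")
    elif os.path.getsize(events_db_path) == 0:
        problems.append("events.db exists but is empty")
    return problems
```

An empty list means none of the checked prerequisites is obviously broken; AGH may still refuse the resume for the reasons this sketch does not cover.
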
Resume flow
For an inactive session, AGH follows this sequence:
- Read and repair stored metadata if the daemon previously crashed mid-session.
- Validate the workspace, agent definition, and session event store.
- Start the normal session startup path with the same AGH session ID.
- Pass the stored ACP session ID to the driver as ResumeSessionID.
- Attempt ACP session/load.
- On success, transition the AGH session back to active.
Daemon boot also performs append-only transcript repair for sessions stopped with agent_crashed or
error. That pass closes interrupted final turns before replay, so resumed sessions do not inherit
permanently streaming assistant messages or dangling tool calls. Use
agh session repair <session-id> --dry-run when you want to inspect the planned repair before
writing events.
Native ACP resume
If the agent supports session/load, AGH asks the upstream ACP runtime to restore its own native
session state. This is the best case because the agent can recover its own model-facing session
context directly.
You can inspect that capability through the session payload:
curl http://localhost:2123/api/sessions/sess-1234 | jq '.session.acp_caps'
Missing-resource fallback
One fallback is built in today.
If ACP session/load fails with the specific ACP request error that means "resource not found",
AGH clears the stored ACP session ID, keeps the same AGH session ID, and retries the start as a
fresh ACP session/new.
That gives you:
- the same AGH session record
- the same events.db
- a new underlying ACP session ID
Any other session/load failure is returned as an error.
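
The load-then-fallback behavior can be sketched as follows. This is an illustration of the control flow only: `ResourceNotFound`, `StoredSession`, `FakeDriver`, and the `load`/`new` method names are hypothetical stand-ins, not AGH or ACP types.

```python
from dataclasses import dataclass
from typing import Set


class ResourceNotFound(Exception):
    """Stand-in for the ACP request error that means 'resource not found'."""


@dataclass
class StoredSession:
    agh_session_id: str
    acp_session_id: str


class FakeDriver:
    """Toy driver that only knows a fixed set of ACP session IDs."""

    def __init__(self, known_ids: Set[str]) -> None:
        self.known_ids = known_ids

    def load(self, acp_session_id: str) -> None:
        # Models ACP session/load.
        if acp_session_id not in self.known_ids:
            raise ResourceNotFound(acp_session_id)

    def new(self) -> str:
        # Models ACP session/new and returns the fresh ACP session ID.
        return "acp-new"


def resume_with_fallback(session: StoredSession, driver) -> StoredSession:
    """Try session/load; fall back to session/new only on ResourceNotFound."""
    try:
        driver.load(session.acp_session_id)
    except ResourceNotFound:
        # Same AGH session ID, new underlying ACP session ID.
        session.acp_session_id = driver.new()
    # Any other exception propagates to the caller as an error.
    return session
```

With `FakeDriver(set())`, resuming `StoredSession("sess-1234", "acp-old")` keeps `sess-1234` and swaps in the new ACP ID; with `FakeDriver({"acp-old"})` the stored ACP ID is reused unchanged.
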
Replay surfaces
AGH exposes two read paths for old work:
1. Raw or grouped event history
Read persisted events directly:
agh session events sess-1234 --last 20
Read grouped history by turn:
agh session history sess-1234
2. Canonical transcript reconstruction
Read the reconstructed transcript over HTTP:
curl http://localhost:2123/api/sessions/sess-1234/transcript
Example response shape:
{
  "messages": [
    {
      "id": "evt-user",
      "role": "user",
      "content": "Explain the stop path.",
      "timestamp": "2026-04-16T01:00:00Z"
    },
    {
      "id": "evt-agent",
      "role": "assistant",
      "content": "AGH first marks the session stopping...",
      "thinking": "Need the lifecycle code path.",
      "thinking_complete": true,
      "timestamp": "2026-04-16T01:00:02Z"
    }
  ]
}
The transcript assembler produces these canonical roles:
| Role | Derived from persisted events |
|---|---|
| user | user_message |
| assistant | agent_message and thought |
| tool_call | tool_call |
| tool_result | tool_result |
There is currently no dedicated agh session transcript CLI command.
What replay includes
Transcript assembly is deterministic and event-driven:
- events are sorted by sequence, timestamp, and ID
- agent_message chunks are concatenated into assistant content
- thought chunks are concatenated into assistant thinking
- tool_call and tool_result are stitched together by tool_call_id
- empty text chunks are dropped
- non-chat events such as plan, system, done, and error only flush in-progress assistant buffers
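
The rules above can be sketched as a small assembler. This is one possible reading, not AGH's implementation. The event field names (`seq`, `type`, `text`) are assumptions, as is the choice to take the assistant message `id` and `timestamp` from its first chunk, and the way "stitching" is modeled here (a `result_id` recorded on the matching tool call). The `thinking_complete` field from the response example is omitted.

```python
from typing import Any, Dict, List, Optional

Event = Dict[str, Any]
Message = Dict[str, Any]


def assemble_transcript(events: List[Event]) -> List[Message]:
    messages: List[Message] = []
    calls: Dict[str, Message] = {}     # tool_call_id -> tool_call message
    pending: Optional[Message] = None  # in-progress assistant message

    def flush() -> None:
        nonlocal pending
        if pending is not None:
            messages.append(pending)
            pending = None

    # Deterministic order: sequence, then timestamp, then ID.
    ordered = sorted(events, key=lambda e: (e["seq"], e["timestamp"], e["id"]))
    for ev in ordered:
        kind, text = ev["type"], ev.get("text", "")
        if kind in ("agent_message", "thought"):
            if not text:
                continue  # empty text chunks are dropped
            if pending is None:
                pending = {"id": ev["id"], "role": "assistant", "content": "",
                           "thinking": "", "timestamp": ev["timestamp"]}
            pending["content" if kind == "agent_message" else "thinking"] += text
            continue
        flush()  # every other event closes the assistant buffer
        if kind == "user_message":
            messages.append({"id": ev["id"], "role": "user", "content": text,
                             "timestamp": ev["timestamp"]})
        elif kind in ("tool_call", "tool_result"):
            msg = {"id": ev["id"], "role": kind, "content": text,
                   "tool_call_id": ev["tool_call_id"],
                   "timestamp": ev["timestamp"]}
            if kind == "tool_call":
                calls[ev["tool_call_id"]] = msg
            elif ev["tool_call_id"] in calls:
                calls[ev["tool_call_id"]]["result_id"] = ev["id"]
            messages.append(msg)
        # plan, system, done, error, ...: flush only, no message emitted
    flush()
    return messages
```

Because the input is sorted before assembly and nothing depends on wall-clock time, feeding the same stored events in any order yields the same transcript.
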
Common cases and gotchas
Resume after a clean stop
This is the simplest path. The stored ACP session ID is reused if the agent still has that native session resource.
Resume after a crash
AGH repairs stale metadata before it tries to resume. The repaired stop classification is
kept as agent_crashed so operators can still see what happened.
Fresh ACP session after fallback
After a missing-resource fallback, the outer AGH session still looks continuous, but the inner
canonical event payload may show a different ACP session_id in future events.
Resume does not rewrite history
The per-session store is append-only for the AGH session ID. A stop, a resume, and a second stop
create one continuous event history with multiple terminal session_stopped events.
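
When reading such a history, it can help to split it into one segment per run. The sketch below assumes each event is a dict with a `type` field, which is an illustrative shape rather than the documented payload, and `split_runs` is a hypothetical helper.

```python
from typing import Any, Dict, List

Event = Dict[str, Any]


def split_runs(events: List[Event]) -> List[List[Event]]:
    """Split an ordered event history into runs ending at session_stopped."""
    runs: List[List[Event]] = []
    current: List[Event] = []
    for ev in events:
        current.append(ev)
        if ev["type"] == "session_stopped":
            runs.append(current)
            current = []
    if current:
        # Trailing events belong to a run that has not stopped yet.
        runs.append(current)
    return runs
```

A session that was stopped, resumed, and stopped again yields two runs, each ending in its own session_stopped event, while the underlying history stays one continuous log.
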
Next steps
- Use Event Streaming to inspect the stored event log and live SSE stream.
- Use Permissions to understand what survives across resume and what still needs approval.