Production Checklist
Prepare AGH for persistent unattended operation with clear pass and fail checks.
- Audience: Operators running durable agent work
- Focus: Operations guidance shaped for scanability, day-two clarity, and operator context.
Use this checklist before running AGH as a persistent daemon for real work. It is written for local
or self-managed production-like environments where one service user owns one AGH_HOME.
1. Pin the daemon identity and home
| Check | Pass condition |
|---|---|
| Service user | A dedicated OS user owns the daemon process. |
| Home directory | AGH_HOME is explicit, stable, and owned by the service user. |
| CLI operations | Operators use the same AGH_HOME when running agh daemon status, agh session list, and related commands. |
| File permissions | The home directory is not world-writable; socket access is limited to the daemon user. |
Example:
sudo install -d -o agh -g agh -m 0750 /var/lib/agh
AGH creates its standard subdirectories with normal directory permissions, and the live UDS socket is
chmodded to 0600.
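The ownership and permission checks in the table above can be scripted. The following is a minimal sketch, not part of AGH: `check_home` is a hypothetical helper, and the `agh` user and `/var/lib/agh` path are taken from the example above.

```shell
# check_home DIR OWNER (hypothetical helper, not an agh command)
# Succeeds only if DIR is a directory, is owned by OWNER, and is not world-writable.
check_home() {
  ch_dir=$1
  ch_owner=$2
  if [ ! -d "$ch_dir" ]; then
    echo "FAIL: $ch_dir is not a directory"
    return 1
  fi
  # find prints the path only when the ownership test matches.
  if [ -z "$(find "$ch_dir" -maxdepth 0 -user "$ch_owner" 2>/dev/null)" ]; then
    echo "FAIL: $ch_dir is not owned by $ch_owner"
    return 1
  fi
  # -perm -0002 matches when the "other" write bit is set.
  if [ -n "$(find "$ch_dir" -maxdepth 0 -perm -0002)" ]; then
    echo "FAIL: $ch_dir is world-writable"
    return 1
  fi
  echo "PASS: $ch_dir is owned by $ch_owner and is not world-writable"
}
```

A typical call would be `check_home /var/lib/agh agh`. The function only inspects the top-level directory; it does not walk subdirectories or inspect the socket.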
2. Harden configuration
Review the home config that the daemon loads:
export AGH_HOME="${AGH_HOME:-$HOME/.agh}"
sed -n '1,220p' "$AGH_HOME/config.toml"
After changing it, run agh daemon start --foreground during a maintenance window or in a staging
AGH_HOME to surface config validation errors directly.
Use explicit daemon and HTTP settings:
[daemon]
socket = "/var/lib/agh/daemon.sock"
[http]
host = "localhost"
port = 2123
[log]
level = "info"
[limits]
max_sessions = 10
max_concurrent_agents = 20
| Check | Pass condition |
|---|---|
| HTTP bind | [http].host is localhost unless AGH is intentionally protected by a reverse proxy or host firewall. |
| UDS path | [daemon].socket is inside a directory owned by the daemon user. |
| Log level | [log].level is info or warn for unattended operation; use debug only for short investigations. |
| Limits | Session and agent concurrency limits match the host capacity. |
| Provider environment | Required provider API keys are set in the service environment, not only in an interactive shell. |
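The HTTP bind check can be automated with a small script. This is a rough sketch that assumes config.toml is as flat as the example above: one key per line, no inline comments, and quoted string values. `http_host` and `check_http_bind` are hypothetical helpers, not agh commands, and the awk parsing is not a full TOML parser.

```shell
# http_host FILE: print the value of host under the [http] table.
# Assumes simple "key = value" lines as in the example config.
http_host() {
  awk '
    /^\[/ { section = $1 }
    section == "[http]" && /^[ \t]*host[ \t]*=/ {
      sub(/^[^=]*=[ \t]*/, "")   # drop everything up to the value
      gsub(/"/, "")              # strip the surrounding quotes
      print
      exit
    }
  ' "$1"
}

# check_http_bind FILE: pass only when the bind address is a loopback name or address.
check_http_bind() {
  hb_host=$(http_host "$1")
  case $hb_host in
    localhost|127.0.0.1|::1)
      echo "PASS: [http].host is $hb_host"
      ;;
    *)
      echo "WARN: [http].host is '$hb_host'; confirm a reverse proxy or host firewall protects it"
      return 1
      ;;
  esac
}
```

A non-loopback value is reported as a warning rather than a hard failure in spirit, because the checklist allows it when AGH is intentionally protected. Run it as `check_http_bind "$AGH_HOME/config.toml"`.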
3. Run under a service manager
The service manager should:
- start the daemon with agh daemon start --foreground
- send SIGTERM during stop
- restart on unexpected failure
- provide the provider environment used by agent subprocesses
- keep stdout and stderr in a known log location
For concrete service files, see Daemon Operations.
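One way to satisfy the provider-environment requirement is a small wrapper that the service manager runs instead of calling agh directly. The sketch below is an illustration only: `require_env` and `run_daemon` are hypothetical names, the variable names you pass in depend on your providers, and `/var/lib/agh` is the example path used earlier.

```shell
# require_env NAME...: fail if any named variable is missing or empty
# in the exported environment (the one child processes will inherit).
require_env() {
  re_missing=0
  for re_name in "$@"; do
    if [ -z "$(printenv "$re_name")" ]; then
      echo "missing environment variable: $re_name" >&2
      re_missing=1
    fi
  done
  return "$re_missing"
}

# run_daemon NAME...: verify the environment, then replace this shell with the
# foreground daemon so the service manager signals agh directly.
run_daemon() {
  export AGH_HOME="${AGH_HOME:-/var/lib/agh}"
  require_env "$@" || return 1
  exec agh daemon start --foreground
}
```

A wrapper script would end with a call such as `run_daemon PROVIDER_API_KEY`, where `PROVIDER_API_KEY` is a placeholder for whatever keys your agents need. Using `exec` means SIGTERM from the service manager reaches the daemon rather than an intermediate shell.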
4. Configure log retention
AGH writes structured logs to $AGH_HOME/logs/agh.log. Detached daemon startup also appends child
stdout and stderr there.
If your host uses logrotate, use a rule like this and adjust user, group, and path:
/var/lib/agh/logs/agh.log {
daily
rotate 14
compress
missingok
notifempty
# copytruncate truncates the live file in place, so logrotate ignores "create" while it is set.
copytruncate
create 0640 agh agh
}
| Check | Pass condition |
|---|---|
| Retention | Logs rotate before filling the filesystem. |
| Access | Only operators who need runtime logs can read them. |
| Error review | Recent error lines are reviewed during incident response and before upgrades. |
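To confirm that logs are not about to fill the filesystem, a scheduled check can compare disk usage against a threshold. This is a sketch under assumptions: the helper names are hypothetical, the 80 percent default is arbitrary, and the parsing relies on the POSIX output format of `df -P`.

```shell
# fs_used_pct DIR: print the used percentage (without the % sign)
# of the filesystem that holds DIR.
fs_used_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# check_log_space DIR [MAX_PCT]: fail when usage is at or above MAX_PCT (default 80).
check_log_space() {
  cl_dir=$1
  cl_max=${2:-80}
  cl_used=$(fs_used_pct "$cl_dir")
  case $cl_used in
    ''|*[!0-9]*)
      echo "FAIL: could not read disk usage for $cl_dir"
      return 1
      ;;
  esac
  if [ "$cl_used" -ge "$cl_max" ]; then
    echo "FAIL: filesystem for $cl_dir is ${cl_used}% full (limit ${cl_max}%)"
    return 1
  fi
  echo "PASS: filesystem for $cl_dir is ${cl_used}% full"
}
```

For the layout in this checklist the call would be `check_log_space "$AGH_HOME/logs"`. The rotation rule itself can be dry-run with `logrotate -d` against your rule file before relying on it.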
5. Monitor daemon and runtime health
Use both daemon status and observe health:
agh daemon status --output json
agh observe health --output json
If HTTP is available locally:
curl -fsS http://localhost:2123/api/daemon/status >/dev/null
curl -fsS http://localhost:2123/api/observe/health >/dev/null
Alert on:
| Signal | Failing condition |
|---|---|
| Daemon status | Status is not running, or PID is absent. |
| HTTP status | /api/daemon/status or /api/observe/health cannot be reached from the host. |
| Active sessions | Count exceeds the expected operating range. |
| Database size | global_db_size_bytes or session_db_size_bytes grows faster than planned. |
| Logs | Repeated startup, socket, database, or ACP spawn errors. |
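The HTTP status signal can be turned into a probe that a cron job or monitoring agent runs. The sketch below checks reachability only, using the two endpoints shown above; it deliberately does not parse the JSON bodies, because their field layout is not described here. `check_endpoints` is a hypothetical helper and the 5-second timeout is an arbitrary choice.

```shell
# check_endpoints BASE_URL: probe both health endpoints and
# return non-zero if either one cannot be reached.
check_endpoints() {
  ce_base=$1
  ce_failed=0
  for ce_path in /api/daemon/status /api/observe/health; do
    if curl -fsS --max-time 5 "$ce_base$ce_path" >/dev/null 2>&1; then
      echo "ok $ce_path"
    else
      echo "FAIL $ce_path"
      ce_failed=1
    fi
  done
  return "$ce_failed"
}
```

With the example configuration the call is `check_endpoints http://localhost:2123`. A non-zero exit status is the alert condition; the printed lines show which endpoint failed.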
6. Back up state
Back up at least:
- $AGH_HOME/agh.db and SQLite sidecars
- $AGH_HOME/sessions/
- $AGH_HOME/config.toml
- $AGH_HOME/agents/
- $AGH_HOME/skills/
- $AGH_HOME/memory/
Use one of the backup procedures in Database Operations. For
unattended hosts, prefer a scheduled cold backup when the daemon can be stopped. If it cannot be
stopped, use SQLite .backup instead of copying only the main database files.
| Check | Pass condition |
|---|---|
| Frequency | Backup frequency matches the amount of session history you can afford to lose. |
| Coverage | Backups include global and per-session databases plus config and content directories. |
| Restore drill | A restore has been tested on a separate AGH_HOME. |
| Retention | Old backups expire according to your storage and compliance needs. |
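A cold backup and a restore drill can be sketched with tar. This is an illustration, not the procedure from Database Operations: it assumes the daemon is already stopped, that the items to archive are the ones listed in this section, and that `cold_backup` and `restore_backup` are hypothetical helper names. The `agh.db*` glob is used so that any SQLite sidecar files next to the main database are included.

```shell
# cold_backup HOME DEST: archive the state listed in this section.
# DEST must be an absolute path because the function changes directory.
# Only safe when the daemon is stopped.
cold_backup() {
  cb_home=$1
  cb_dest=$2
  (
    cd "$cb_home" || exit 1
    set --
    # Collect only the items that exist so tar does not fail on a fresh home.
    for cb_item in agh.db* sessions config.toml agents skills memory; do
      if [ -e "$cb_item" ]; then
        set -- "$@" "$cb_item"
      fi
    done
    if [ "$#" -eq 0 ]; then
      echo "nothing to back up in $cb_home" >&2
      exit 1
    fi
    tar -czf "$cb_dest" "$@"
  )
}

# restore_backup ARCHIVE NEW_HOME: unpack into a separate AGH_HOME for a restore drill.
restore_backup() {
  rb_archive=$1
  rb_home=$2
  mkdir -p "$rb_home" && tar -xzf "$rb_archive" -C "$rb_home"
}
```

A drill would look like `cold_backup /var/lib/agh /backups/agh.tar.gz` followed by `restore_backup /backups/agh.tar.gz /tmp/agh-restore`, where both destination paths are placeholders. If the daemon cannot be stopped, this sketch does not apply; use the SQLite .backup route instead.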
7. Reserve host resources
AGH starts real ACP-compatible agent CLIs as child processes. Size the host for the agent binaries you run, not only the daemon.
| Check | Pass condition |
|---|---|
| Disk | AGH_HOME, logs, and session event databases have room to grow. |
| File descriptors | The service limit is high enough for concurrent sessions, sockets, logs, and SQLite handles. |
| Process count | The service user can run the daemon plus expected agent child processes. |
| PATH | Provider commands such as npx, codex, or gemini are available to the service environment. |
| Shutdown | The service manager gives AGH time to stop sessions and close databases before killing it. |
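The file descriptor and PATH rows can be checked from inside the service environment. The sketch below uses hypothetical helper names; the minimum limit you pass in and the command list are yours to choose.

```shell
# check_fd_limit MIN: pass when the soft open-file limit is at least MIN.
check_fd_limit() {
  cf_min=$1
  cf_cur=$(ulimit -n)
  if [ "$cf_cur" = "unlimited" ] || [ "$cf_cur" -ge "$cf_min" ]; then
    echo "PASS: open-file limit is $cf_cur"
  else
    echo "FAIL: open-file limit $cf_cur is below $cf_min"
    return 1
  fi
}

# check_commands CMD...: fail if any command cannot be resolved on PATH.
check_commands() {
  cc_missing=0
  for cc_cmd in "$@"; do
    if ! command -v "$cc_cmd" >/dev/null 2>&1; then
      echo "FAIL: $cc_cmd not found on PATH"
      cc_missing=1
    fi
  done
  return "$cc_missing"
}
```

For example, `check_fd_limit 8192` and `check_commands npx codex gemini`. Both checks describe the shell they run in, so they are only meaningful when started by the service manager with the same user, limits, and PATH as the daemon; an interactive login shell can give different answers.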
For systemd, set resource limits in the service file when needed:
[Service]
LimitNOFILE=8192
TimeoutStopSec=30
Restart=on-failure
8. Upgrade deliberately
Use this flow for binary upgrades:
export AGH_HOME=/var/lib/agh
agh daemon status
agh daemon stop
# Back up AGH_HOME here.
# Install the new agh binary here.
agh daemon start
agh daemon status
agh observe health
Do not rely on old daemon state after replacing the binary. Stop, back up, replace, start, and then confirm status and health.
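The same flow can be wrapped so that a failed step stops the sequence instead of continuing. This sketch assumes that each agh command exits non-zero on failure, which is not confirmed here. `upgrade_agh` is a hypothetical helper, and the backup and install steps are passed in as the names of your own commands or shell functions.

```shell
# upgrade_agh BACKUP_CMD INSTALL_CMD
# Runs the upgrade steps in order and aborts at the first failure.
# Assumption: agh subcommands return a non-zero exit status when they fail.
upgrade_agh() {
  ua_backup=$1
  ua_install=$2
  if [ -z "${AGH_HOME:-}" ]; then
    echo "AGH_HOME must be set" >&2
    return 1
  fi
  agh daemon status || return 1
  agh daemon stop || return 1
  "$ua_backup" || return 1    # back up AGH_HOME while the daemon is stopped
  "$ua_install" || return 1   # install the new agh binary
  agh daemon start || return 1
  agh daemon status || return 1
  agh observe health
}
```

If the backup or install step fails, the daemon is left stopped on purpose so that an operator can inspect the host before anything restarts on a half-upgraded binary.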
Final readiness gate
| Area | Ready when |
|---|---|
| Daemon lifecycle | agh daemon start, agh daemon status, and agh daemon stop work under the service manager. |
| Socket | The CLI can reach the configured UDS socket as the intended operator user. |
| HTTP | HTTP is bound only where intended and health endpoints are reachable. |
| Logs | Logs rotate and recent errors are actionable. |
| Databases | Backups include agh.db, per-session events.db files, metadata, and sidecars. |
| Sessions | Test session creation, stop, list, and resume work with the production service environment. |
| Recovery | Operators know how to restore a backup into a separate AGH_HOME before touching production state. |
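The readiness gate can be recorded as a script that prints one line per area and counts failures. The runner below is a minimal sketch with hypothetical names; which commands you attach to each area is up to you.

```shell
# Number of failed gates so far.
gate_failures=0

# gate LABEL COMMAND [ARGS...]: run the command quietly, print PASS or FAIL
# for the label, and count failures without aborting the run.
gate() {
  g_label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    printf 'PASS  %s\n' "$g_label"
  else
    printf 'FAIL  %s\n' "$g_label"
    gate_failures=$((gate_failures + 1))
  fi
}

# Example wiring, using commands that appear in this checklist:
#   gate "Daemon lifecycle" agh daemon status
#   gate "HTTP" curl -fsS http://localhost:2123/api/observe/health
#   [ "$gate_failures" -eq 0 ] || exit 1
```

Running every gate before reporting, rather than stopping at the first failure, gives operators the full list of areas that still need work in one pass.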
Related pages
- Daemon Operations shows service manager setup.
- Database Operations gives backup and inspection commands.
- Troubleshooting maps common failures to fixes.