May 25, 2026 4 min read

Observing Claude Code: Lifecycle Hooks and Session Transcripts

How to build an observability stack for Claude Code by combining real-time lifecycle hooks with session transcript analysis.

On this page

When you submit a prompt to Claude Code, it enters an agent loop with two distinct phases:

Reasoning: Claude evaluates the prompt, conversation history, available tools, and system instructions to determine whether to provide a final answer or execute a tool call.
Action: If a tool is called, the result is appended to the context, and the loop repeats.

A file edit typically requires 3-4 iterations, while complex refactors can exceed 30. To observe this loop, we have two primary data sources: Lifecycle Hooks, which fire during execution to capture real-time events, and Session Transcripts, which store the complete state and token usage on disk.

The agent loop follows the Reason+Act pattern from ReAct (Yao et al., 2022). Lifecycle documentation for Claude Code v0.2.x is available in the official docs.

Lifecycle Hooks

Hooks are plain executables configured in .claude/settings.json. They receive a JSON payload via stdin and must exit cleanly. A failing hook will log an error but won’t block the agent loop.

{
  "hooks": {
    "PreToolUse":   [{ "matcher": "", "hooks": [{ "type": "command", "command": "~/hooks/pre_tool_use.sh" }] }],
    "PostToolUse":  [{ "matcher": "", "hooks": [{ "type": "command", "command": "~/hooks/post_tool_use.sh" }] }],
    "Stop":         [{ "type": "command", "command": "~/hooks/stop.sh" }],
    "SubagentStop": [{ "type": "command", "command": "~/hooks/subagent_stop.sh" }],
    "PreCompact":   [{ "type": "command", "command": "~/hooks/pre_compact.sh" }],
    "Notification": [{ "type": "command", "command": "~/hooks/notification.sh" }]
  }
}

Event Telemetry

The combination of PostToolUse and Stop provides the necessary data to track several key metrics:

Tool Call Density: Volume of calls per turn and tool distribution (e.g., Bash vs. Grep).
Performance: Latency (duration_ms) p50/p95 across sessions and tools.
Reliability: Error rates tracked via the is_error flag.
Turn Latency: Wall-clock time from the first PreToolUse to the final Stop in a turn.

Tracking Peak Context Usage

Compaction occurs when the context window nears its limit. Claude summarizes the preceding turns, discards the raw history, and resumes from the summary.

The PreCompact hook captures the session state just before this reset. It provides the transcript_path for the current session, which you can read to calculate the peak context usage:

usage_json=$(jq -sc '[.[] | select(.type=="assistant") | .message.usage] | {
  li: (last.input_tokens                // 0),
  lr: (last.cache_read_input_tokens     // 0),
  lw: (last.cache_creation_input_tokens // 0)
}' "$transcript_path")

li + lr + lw represents the peak context usage. Tracking this helps identify “heavy” sessions driven by large file reads or excessive tool output.

Compaction history showing context window size at the moment of reset

Session Transcripts

Understanding the scale and cost of a session requires tracking token usage. This data is stored as JSONL in ~/.claude/projects/<slug>/<session_id>.jsonl.

Each assistant turn in the transcript contains a usage block:

{
  "input_tokens": 3,
  "cache_read_input_tokens": 54845,
  "cache_creation_input_tokens": 406,
  "output_tokens": 382
}

Context Window vs. Cumulative Billing

There are two distinct metrics to track when analyzing token usage:

Context Window Depth: The total tokens Claude “sees” in a given turn (the sum of input + cache_read + cache_creation). This value determines how close the session is to the 200k token limit and is what eventually triggers compaction.
Cumulative Billing: The running sum of all input_tokens across the entire session. A long session might bill 10M+ tokens while the context window remains at a steady 80K.

Dashboard panels showing Context Window (36.16K) vs Cumulative Input Billed (29.00)

The Observability Stack

The architecture is designed for low-overhead local collection, with the lifecycle hooks acting as the primary collector:

flowchart TD
    subgraph Local[Local Machine]
        CC[Claude Code]
        TR[(Session Transcript)]
        H[Bash Hooks]
        
        CC -->|JSON on stdin| H
        CC -.->|writes| TR
        H -.->|reads| TR
    end
    
    H -->|HTTP| CH[(ClickHouse)]
    H -->|HTTP| LK[(Loki)]
    CH --> GF[Grafana]
    LK --> GF

I route hook output to ClickHouse (via JSONEachRow HTTP inserts) for structured telemetry and Loki for raw logs. Grafana sits on top for visualization.

Aggregated telemetry showing tool latency distribution and call volume

ClickHouse handles append-only writes with high throughput, and the HTTP interface eliminates the need for complex client libraries in Bash. A single schema with 19 columns captures the entire session lifecycle.

Execution Trace Log

For deep dives, we can reconstruct the exact sequence of tool calls. Since hooks capture the raw tool_input and tool_response, the resulting log provides a high-fidelity trace of the agent’s actions:

TIMESTAMP            TOOL      CATEGORY   STATUS   DURATION   DETAIL
2026-05-20 12:30:01  Bash      shell      ok       1.75s      ls ./src/components/
2026-05-20 12:30:05  Read      file       ok       6ms        cat ./src/components/Header.tsx
2026-05-20 12:30:12  Edit      file       ok       415ms      replace 'Header' with 'StickyHeader'
2026-05-20 12:30:18  Bash      shell      ok       2.45s      npm run lint

The hooks are implemented in ~80-100 lines of Bash with no external dependencies beyond jq and curl.

Code: github.com/vrajat/agent-stats