Interactive Diagnostic

Are You Babysitting Your Agent?

Most people who say their AI coding agent is "right but slow" have a trust gap, not a skill gap. They talk to it every two minutes, verify each step, keep control, which caps the whole thing at human speed and kills every long autonomous run before it starts.

This tool reads your Claude Code session logs and measures it. How often you interrupt, how long your agent runs unattended, whether you have given it a way to check its own work. Drop your logs below. Everything is parsed in your browser. Nothing is uploaded.

Drop your session logs here

Drag in one or more .jsonl files (or a whole folder of them)

You can also paste raw JSONL text below

100% local, parsed in your browser, never uploaded

Where are my logs? Claude Code stores them as .jsonl under ~/.claude/projects/<your-project>/ (macOS/Linux) or C:\Users\<you>\.claude\projects\<your-project>\ (Windows). Grab the most recent few.

01Right but slow has a cause

The tell is usually self-reported: "I talk to it every couple of minutes; the inputs that trigger a longer run aren't so often." That is not a lack of ability. It is treating the agent like pair-programming or autocomplete, staying in the loop, verifying each step, keeping control.

Every individual step in that pattern is correct. Understand the implementation, then implement, then test. That is exactly why it feels right. But serializing yourself into the agent's inner loop caps throughput at human speed and ends every long autonomous arc before it can start.

02The missing fundamental is a way for the agent to check itself

People sense something is missing: "if the tech is getting hard, I'm probably missing a previous step." They are right, and it is rarely a second-brain repo or a fancier framework. The missing piece is a test / QA gate the agent can run on its own.

Without a way to verify, babysitting every two minutes is the rational behavior. You have no way to trust a fifty-minute run, so you never take one.

Give the agent a verification layer it hits by itself and the babysitting stops being necessary. That is the unlock. It is the same thing a long, self-correcting run demonstrates: the agent kept fixing and re-checking because it had something to check against.

03The shift

The move is from "prompt, watch, tweak" to "scope + define done, release, review the diff." Git is the honest signal here, by the way. Commit shape tells you whether someone is running autonomous arcs or babysitting, and it cannot be gamed. But your own session logs tell you first. That is what this tool reads.

Want the full argument, with the real runs that taught me this? Read the essay: Are You Babysitting Your AI Agent?

Read the post