Run a real multi-step task, read the tool-use loop, and catch problems early.
Running your first real agentic task is different from chatting with Claude. Claude will read files, run commands, make changes, and report back — all without you typing each step. This lesson teaches you how to phrase tasks, read the tool-use loop as it runs, intervene correctly, and verify the output.
⬡ What you'll build
An agentic task is a request Claude executes using a sequence of tool calls — Read, Write, Edit, Bash, Glob, Grep — until it completes the goal or gets stuck.
The loop:
Prompt → Plan → Tool call → Read result → Plan next step → Tool call → ...
Claude doesn't run all tools at once. It plans one step, executes it, reads the result, and plans the next step based on what it found. This is why agentic tasks can diverge from your intent — small misunderstandings compound across steps.
The difference between a task Claude can run autonomously and one that stalls every three steps is how it's phrased.
What a good agentic task includes:
Failure Pattern — Underspecified task
✕ Before (broken pattern)
Add dark mode to the app.
✓ After (production pattern)
Add a dark mode toggle to the site. The theme state should live in a React context at lib/theme-context.tsx. The toggle component goes in components/layout/theme-toggle.tsx and should be added to the header in components/layout/header.tsx next to the existing nav. Constraints: don't touch app/layout.tsx — the dark class is already applied there. Don't add any new npm packages. Verify by: TypeScript passes (tsc --noEmit) and the toggle renders in the header without breaking the nav layout.
Lesson: Underspecified tasks cause Claude to make assumptions. Those assumptions compound across 10 tool calls. The extra 2 minutes you spend specifying saves 20 minutes of correction.
When Claude runs an agentic task, you'll see tool calls in real-time. Here's what a healthy loop looks like:
Each ⬡ line is a tool call. You're watching Claude's actual work in real-time.
What healthy looks like:
Red flags to watch for:
Claude runs autonomously, but you have full control. Use Ctrl+C to stop Claude mid-task.
Interrupt immediately when:
Do not interrupt when:
After interrupting: Correct the assumption directly: "Stop — don't touch auth.ts. The issue is in middleware.ts only." Then let Claude re-plan from that correction.
Run this exact task in a real project:
This task is good because:
Watch the tool calls as Claude works. Note which files it reads, in what order, and why.
Never accept Claude's summary as verification. Claude summarizes what it intended to do — not always what it actually did.
The verification checklist for any agentic task:
✕The trust-but-verify discipline
This isn't about distrust — it's about system discipline. Claude's summaries are accurate most of the time. But agentic tasks that touch 10+ files have surface area for errors. One undetected wrong edit compounds into a broken build. The 60 seconds of verification is always worth it.
Claude edits the wrong file: Usually from ambiguous scope. Fix by specifying the exact file path in your task.
Claude finishes but something's broken: Claude verified against its mental model, not actual execution. Always run tsc --noEmit and your test suite after code-changing tasks.
Claude gets stuck asking questions mid-task: Your task didn't have enough context. Claude hit an ambiguity and stopped. Re-phrase with the missing information included upfront.
Claude runs more tool calls than expected: Not necessarily bad — complex tasks have hidden complexity. Watch what it finds, not how many steps it takes.
You're ready to run agentic tasks when:
Milestone 3
You've run your first real agentic task and know how to read, guide, and verify it. This is the core skill that everything else in this track builds on.