The 6-part framework for prompts that work reliably in production codebases.
Most Claude Code failures aren't model failures — they're prompt failures. Claude produced exactly what was asked for. The problem was what was asked. This lesson gives you the framework for writing prompts that get correct, reliable output from complex production codebases.
⬡ What you'll build
In a simple coding assistant use case, Claude sees one file, fixes one thing, and you verify in seconds. In production agentic work:
A vague prompt that works fine for a 2-minute task will reliably fail for a 25-minute agentic task. The framework scales with complexity.
Every reliable production prompt has these six elements. Not all need to be long — but all need to be present.
1. Role context — What expertise should Claude operate with?
Not "you are an expert developer" (filler). Meaningful role context tells Claude the domain, the constraints of that domain, and the level of caution expected:
"You're operating as a production systems engineer on a live e-commerce platform. Cautious approach. Any changes to checkout or payment flows require explicit confirmation before applying."
2. Codebase context — What's the current state of the system?
Claude knows nothing about your codebase unless you tell it. Even with CLAUDE.md, specific tasks need specific context:
"The auth system uses Next.js middleware at
/middleware.ts. Session tokens are validated vialib/auth.ts:validateSession(). We switched to JWT 3 days ago — there may be old cookie-based session validation code still in routes."
3. Task definition — What exactly needs to happen?
The single most important element. One sentence minimum. State the end state, not just the action:
"Audit all route files under
app/api/for places that callgetUserFromCookie()— that function was deprecated when we switched to JWT. Replace each call withgetUserFromToken(req)fromlib/auth.ts."
4. Scope constraints — What must not be changed?
Explicit scope limits prevent Claude from "helpfully" touching adjacent code that you didn't intend to change:
"Only modify files under
app/api/. Don't touchmiddleware.ts,lib/auth.ts, or any test files. Don't change any function signatures."
5. Output format — How should results be structured?
Especially important for review tasks or when you need to process Claude's output further:
"Before making any changes, list every file you'll touch and what change you'll make. Show the list and wait for my approval."
6. Verification criteria — How will you know it's done?
Gives Claude a concrete success condition:
"After changes: run
tsc --noEmitand report the result. If there are TypeScript errors, fix them before reporting completion."
Bad prompt:
Fix the auth bug.This will fail. Claude doesn't know which auth file, which bug, what "fixed" looks like, or what not to touch.
Good prompt — same task:
You're working in a Next.js 15 App Router codebase with JWT auth.
Context:
- Auth middleware: middleware.ts (validates JWT in Authorization header)
- Session helper: lib/auth.ts — getUserFromToken(req: Request): User | null
- Bug: API routes under app/api/ still call getUserFromCookie() which was removed 3 days ago when we switched to JWT
Task:
Find every file under app/api/ that calls getUserFromCookie() and replace each call with the correct getUserFromToken(req) pattern from lib/auth.ts.
Constraints:
- Only touch files under app/api/
- Don't modify middleware.ts or lib/auth.ts
- Don't change any function signatures or return types
Before making changes: list the files you'll modify and what you'll change in each. Wait for confirmation.
Verification: after changes, run tsc --noEmit and report the result.This is 3× longer, but the task is 10× more likely to complete correctly in one pass.
Role context is most useful when it changes Claude's behavior — not just its confidence.
Failure Pattern — Filler role context
✕ Before (broken pattern)
You are an expert senior software engineer with 15 years of experience. Please help me with the following task...
✓ After (production pattern)
You're working on a production WordPress site with 50,000 active users. Treat every content operation as if failure would impact live traffic. Do not apply any changes without a dry-run pass first.
Lesson: Generic seniority framing does nothing. Role context that describes domain-specific caution and specific behavioral constraints actually changes Claude's approach.
The most valuable single element of the framework — and the most commonly omitted.
Without a scope constraint, Claude extends fixes to adjacent code. This is usually helpful in simple cases and often destructive in complex ones:
Add this to every task: "Only modify [specific scope]. Do not touch [adjacent things]."
Without a verification step, Claude declares "done" based on its mental model of correctness. That model is good but not perfect, especially for runtime behavior.
The minimum viable verification:
"After changes, run:
node node_modules/typescript/bin/tsc --noEmitand report the result."
For anything that touches data or APIs:
"After changes, run the test suite with
npm testand report any failures."
Your prompt framework is calibrated when:
Milestone 4
You have the framework that makes every task you give Claude more likely to succeed on the first pass. Apply it consistently — especially for complex, multi-file tasks where the cost of failure is highest.