Design specification for the evidence layer — how screenshots, deployment logs, command histories, debugging records, and operational timelines integrate into tracks, failures, playbooks, case studies, and labs.
This document defines the artifact system for AI Execution Lab. An artifact is any durable, specific piece of evidence produced by real execution. The platform's authority depends on artifacts — without them, content is theory. With them, content is operational record.
The artifact architecture answers: what types of evidence exist, where they live, what format they take, and how they connect to the content layer.
The platform's claim — "operational record of real AI-native systems work" — is only as credible as the evidence it produces. Text descriptions of what happened are not evidence. The evidence is:
Every piece of published content should link to or embed at least one artifact. Content without artifacts is content that could have been written without doing the thing.
Definition: A PNG image capturing a specific UI state, error, or before/after comparison.
Required fields:
alt text: describes the specific element, not just the page context
File naming convention:
public/evidence/[content-slug]/[descriptor]-[YYYY-MM-DD].png
Examples:
public/evidence/env-vars-secrets/vercel-dashboard-env-scope-2026-05-18.png
public/evidence/build-failure-diagnosis/typescript-error-edge-runtime-2026-04-12.png
public/evidence/adsense-approval-reality/adsense-rpm-screenshot-2026-03-15.png
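The naming convention above can be enforced mechanically. The following is a minimal sketch, not part of the platform: `toKebab`, `evidencePath`, and `isEvidencePath` are hypothetical helper names, and the lowercase kebab-case restriction is an assumption inferred from the examples rather than a stated rule.

```typescript
// Hypothetical helpers for the convention
// public/evidence/[content-slug]/[descriptor]-[YYYY-MM-DD].png

// Assumption: slugs and descriptors are lowercase kebab-case, as in the examples.
const EVIDENCE_PATH =
  /^public\/evidence\/[a-z0-9]+(?:-[a-z0-9]+)*\/[a-z0-9]+(?:-[a-z0-9]+)*-\d{4}-\d{2}-\d{2}\.png$/;

// Lowercases the text and collapses every run of other characters into one hyphen.
function toKebab(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Builds a screenshot path; the date part is the UTC calendar day.
function evidencePath(contentSlug: string, descriptor: string, date: Date): string {
  const day = date.toISOString().slice(0, 10); // YYYY-MM-DD
  return `public/evidence/${toKebab(contentSlug)}/${toKebab(descriptor)}-${day}.png`;
}

// Checks an existing path against the convention (shape only, not calendar validity).
function isEvidencePath(path: string): boolean {
  return EVIDENCE_PATH.test(path);
}
```

A check like `isEvidencePath` could run in a pre-commit or CI step so that undated screenshots are caught before they are published.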
When required:
When not sufficient:
Definition: The actual terminal output from running a command sequence, preserved in a code block.
Format: Always a fenced code block with the appropriate language hint:
```bash
$ git revert HEAD --no-edit
[main a3f92b1] Revert "add broken deployment config"
1 file changed, 3 insertions(+), 5 deletions(-)
$ git push origin main
Enumerating objects: 5, done.
...
To github.com:org/repo.git
b8a3d2c..a3f92b1 main -> main
```
Authenticity rule: Command history must be real output, not reconstructed. If you have to reconstruct it (for example, because the output was never copied), mark the block with a `# reconstructed from session notes` comment.
Common uses:
tsc --noEmit error listings
Definition: A verifiable record that a deployment happened, what it contained, and whether it succeeded.
Components:
Deployment record:
- Date and time (UTC)
- Commit SHA: [40-char hash]
- Deployment URL: [Vercel deployment URL, not production alias]
- Build time: [n] seconds
- Result: Success / Failed
- Build log excerpt: [relevant lines if the story requires them]
Where it appears: In Failure Archive entries (documenting what was deployed when a failure occurred), in deployment-related playbooks (showing what a successful deployment record looks like), and in case studies (as part of the before/after operational timeline).
Vercel-specific evidence: The Vercel dashboard URL for a specific deployment stays stable for as long as the deployment is retained, so it can be linked to directly. When documenting a specific deployment event, include the full Vercel deployment URL, not just the production alias.
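The deployment record components can be captured as a typed object with a small validator. This is a sketch under assumptions: the spec lists the components but defines no schema, so the field names and the `validateDeploymentRecord` function are illustrative only.

```typescript
// Illustrative shape for the deployment record; field names are assumptions.
interface DeploymentRecord {
  timestampUtc: string;      // ISO 8601 in UTC, e.g. "2026-04-12T14:03:00Z"
  commitSha: string;         // full 40-character hash
  deploymentUrl: string;     // deployment-specific URL, not the production alias
  buildSeconds: number;
  result: "Success" | "Failed";
  buildLogExcerpt?: string;  // only when the story requires it
}

// Returns a list of problems; an empty list means the record is well-formed.
function validateDeploymentRecord(record: DeploymentRecord): string[] {
  const problems: string[] = [];
  if (!/^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}(:\d{2})?Z$/.test(record.timestampUtc)) {
    problems.push("timestampUtc must be an ISO 8601 UTC timestamp ending in Z");
  }
  if (!/^[0-9a-f]{40}$/.test(record.commitSha)) {
    problems.push("commitSha must be the full 40-character hex hash");
  }
  if (!record.deploymentUrl.startsWith("https://")) {
    problems.push("deploymentUrl must be an https URL");
  }
  if (!Number.isFinite(record.buildSeconds) || record.buildSeconds < 0) {
    problems.push("buildSeconds must be a non-negative number");
  }
  return problems;
}
```

The validator deliberately does not try to tell a deployment URL from a production alias, since that depends on project-specific hostnames the spec does not define.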
Definition: A structured record of a work session — what was attempted, what tools were used, what happened, and what was produced.
Format:
## Execution Log: [Operation name]
**Date:** YYYY-MM-DD
**Duration:** [n] hours
**Environment:** [OS, tool versions, project context]
**Objective:** [One sentence: what was being accomplished]
### Session sequence
1. [Action]: [Tool/command] → [Result]
2. [Action]: [Tool/command] → [Result]
...
### Output artifacts produced
- [List of files created or modified]
- [Screenshots taken]
- [Measurements recorded]
### Failures encountered
- [Failure description] → [Root cause] → [Resolution]
### What would be done differently
- [Specific change to approach or tooling]
Where it appears: As standalone type: 'log' content items, embedded in case studies to provide session-level evidence, and in Lab content where the log is the primary deliverable.
Relationship to case studies: A case study synthesizes evidence from one or more execution logs. The execution log is the raw record; the case study is the structured analysis. Both can exist as separate published content items.
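The execution log format can be generated from structured data so that every log has the same sections in the same order. The sketch below assumes a hypothetical `ExecutionLog` type and `renderExecutionLog` function; neither exists in the platform as described.

```typescript
// Hypothetical data shape mirroring the execution log template.
interface ExecutionLog {
  operation: string;
  date: string;            // YYYY-MM-DD
  durationHours: number;
  environment: string;     // OS, tool versions, project context
  objective: string;       // one sentence
  steps: { action: string; tool: string; result: string }[];
  outputs: string[];       // files, screenshots, measurements
  failures: { description: string; rootCause: string; resolution: string }[];
  changes: string[];       // what would be done differently
}

// Renders the log as markdown, one section per heading in the template.
function renderExecutionLog(log: ExecutionLog): string {
  const lines: string[] = [
    `## Execution Log: ${log.operation}`,
    `**Date:** ${log.date}`,
    `**Duration:** ${log.durationHours} hours`,
    `**Environment:** ${log.environment}`,
    `**Objective:** ${log.objective}`,
    "### Session sequence",
    ...log.steps.map((s, i) => `${i + 1}. ${s.action}: ${s.tool} → ${s.result}`),
    "### Output artifacts produced",
    ...log.outputs.map((o) => `- ${o}`),
    "### Failures encountered",
    ...log.failures.map((f) => `- ${f.description} → ${f.rootCause} → ${f.resolution}`),
    "### What would be done differently",
    ...log.changes.map((c) => `- ${c}`),
  ];
  return lines.join("\n");
}
```

Keeping the log as data first also makes it easier for a case study to pull individual steps or failures from it later.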
Definition: The complete evidence record from a debugging session, structured for reproduction and future reference.
Required components:
Error record:
- Exact error message (verbatim, in a code block)
- Error class: Build / Runtime / Type / Logic / Authentication / Rate-limit / Other
- Stack trace (if available)
- First occurrence: date, environment, operation being performed
Reproduction conditions:
- Exact state required to trigger: file content, env var values (redacted), command sequence
- Whether it reproduces consistently or intermittently
- Whether it reproduces only in specific environments (local / preview / production)
Diagnosis sequence:
- What was checked first (and why)
- What was ruled out (and the evidence that ruled it out)
- What the root cause turned out to be
Fix:
- Exact file changes (diff or code block)
- Command sequence to verify fix applied
- Verification that the error no longer occurs
Time-to-diagnose: [n] hours / minutes — honest estimate
Where it appears: As the primary structure for type: 'failure' content in the Failure Archive. Referenced from lesson content when a lesson covers a topic where a specific failure case study exists.
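One way to keep failure reports complete is a check that lists which required components are still missing. This is a minimal sketch: the `DebuggingEvidence` field names and the `missingComponents` function are assumptions for illustration, with only the error classes taken from the list above.

```typescript
// Error classes as listed in the required components.
type ErrorClass =
  | "Build" | "Runtime" | "Type" | "Logic"
  | "Authentication" | "Rate-limit" | "Other";

// Hypothetical shape; the stack trace is optional ("if available").
interface DebuggingEvidence {
  errorMessage: string;      // verbatim
  errorClass: ErrorClass;
  stackTrace?: string;
  firstOccurrence: string;   // date, environment, operation
  reproduction: string;      // state required to trigger, with secrets redacted
  diagnosis: string[];       // ordered: checked first, ruled out, root cause
  fix: string;               // diff or code block
  verification: string;      // how the fix was confirmed
  timeToDiagnose: string;    // honest estimate
}

// Returns the names of required components that are absent or empty.
function missingComponents(evidence: Partial<DebuggingEvidence>): string[] {
  const required: (keyof DebuggingEvidence)[] = [
    "errorMessage", "errorClass", "firstOccurrence", "reproduction",
    "diagnosis", "fix", "verification", "timeToDiagnose",
  ];
  return required.filter((key) => {
    const value = evidence[key];
    if (value === undefined) return true;
    if (typeof value === "string") return value.trim() === "";
    return value.length === 0; // empty diagnosis sequence
  });
}
```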
Definition: A structured comparison of system state before and after an operation, showing exactly what changed.
Format options:
For code changes:
**Before** (`src/components/nav.tsx`, line 23–31):
```tsx
// original code block
```
**After**:
```tsx
// modified code block
```
For configuration changes:
Before (vercel.json):
{
"framework": "nextjs"
}
After (vercel.json):
{
"framework": "nextjs",
"headers": [...]
}
For metric changes (use a table):
| Metric | Before | After | Date of change |
|---|---|---|---|
| Lighthouse Performance | 61 | 89 | 2026-04-15 |
| Build time | 47s | 23s | 2026-04-15 |
| TypeScript error count | 14 | 0 | 2026-04-14 |
Where it appears: In failure reports (the fix), in playbooks (the expected state change), in case studies (the outcome measurement), and in deployment-related content where configuration changes need illustration.
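The metric table can be rendered from recorded values rather than typed by hand, which keeps the column order consistent across content. The sketch below uses a hypothetical `MetricChange` type and `renderMetricTable` function; values are kept as strings so units such as `47s` survive unchanged.

```typescript
// Hypothetical row shape for the metric change table.
interface MetricChange {
  metric: string;
  before: string;     // kept as text so units are preserved
  after: string;
  changedOn: string;  // YYYY-MM-DD
}

// Renders the rows as a markdown table with the columns used in this spec.
function renderMetricTable(rows: MetricChange[]): string {
  const header = [
    "| Metric | Before | After | Date of change |",
    "|---|---|---|---|",
  ];
  const body = rows.map(
    (r) => `| ${r.metric} | ${r.before} | ${r.after} | ${r.changedOn} |`,
  );
  return [...header, ...body].join("\n");
}
```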
Definition: A dated screenshot or data export from an analytics system (GA4, GSC, AdSense, Vercel Analytics) used as evidence for a specific claim.
Requirements:
Data claims that require snapshot evidence:
Where it appears: In the AI Business Zero Budget track (confirming operational results), in case studies (outcome evidence), and in the Failure Archive where performance regressions need documentation.
Definition: A chronological record of events in a system's history, used to establish what happened and when.
Format:
## Timeline: [System or operation name]
| Date | Event | Evidence |
|---|---|---|
| 2026-03-01 | Property created on Vercel | Deployment log: [URL] |
| 2026-03-03 | Custom domain configured | Screenshot: custom-domain-setup-2026-03-03.png |
| 2026-03-15 | First organic session | GA4 snapshot: first-organic-session-2026-03-15.png |
| 2026-04-12 | Edge runtime failure | Failure report: edge-runtime-crypto-failure |
| 2026-04-12 | Failure resolved | Deployment log: fix-commit SHA |
| 2026-05-01 | 1,000 sessions milestone | GA4 snapshot: 1000-sessions-milestone-2026-05-01.png |
Where it appears: In case studies that cover multi-month operations, in the platform's own operational documentation (for transparency about when content was built vs. documented), and in failure reports where the timeline of the failure matters to understanding the root cause.
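A timeline is easiest to trust when it is sorted by code instead of by hand. The following sketch assumes a hypothetical `TimelineEvent` type and `renderTimeline` function. It relies on `YYYY-MM-DD` dates sorting correctly as plain strings and on the stable sort guaranteed since ES2019, so events on the same day keep the order they were entered in.

```typescript
// Hypothetical row shape for an operational timeline.
interface TimelineEvent {
  date: string;      // YYYY-MM-DD
  event: string;
  evidence: string;  // e.g. "Screenshot: custom-domain-setup-2026-03-03.png"
}

// Sorts events chronologically and renders the timeline table.
function renderTimeline(title: string, events: TimelineEvent[]): string {
  // Copy before sorting so the caller's array is not mutated.
  const sorted = [...events].sort((a, b) => a.date.localeCompare(b.date));
  return [
    `## Timeline: ${title}`,
    "| Date | Event | Evidence |",
    "|---|---|---|",
    ...sorted.map((e) => `| ${e.date} | ${e.event} | ${e.evidence} |`),
  ].join("\n");
}
```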
Lessons reference artifacts; they don't contain raw evidence. A lesson about environment variable management should link to or embed a specific screenshot showing the Vercel env scope UI, but the lesson's job is to explain the procedure, not to present raw session logs.
Integration pattern:
LessonMeta.evidence field: cites the specific evidence base for the lesson's claims
public/evidence/[lesson-id]/
Failure reports are the primary home of Type 5 (Debugging Evidence). Every failure report is structured debugging evidence.
Required artifacts:
Playbooks document procedures. Artifacts in playbooks show what correct execution looks like.
Required artifacts:
Case studies require the most artifact density. A case study without evidence artifacts is a narrative, not a case study.
Required artifacts:
Labs produce artifacts as their primary output. The completion criterion for a lab is having produced specific artifacts.
Required artifacts:
public/
evidence/
[content-slug]/ ← one directory per piece of content
[descriptor]-[date].png
[descriptor]-[date].png
shared/ ← evidence used by multiple pieces
[descriptor]-[date].png
Naming rules:
Source preservation:
.txt file documents the URL and context:
Source: https://vercel.com/dashboard/[project]/deployments/[id]
Date: 2026-04-12
Context: Deployment failure in env-vars-secrets lesson evidence
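The three-field source note shown above is simple enough to generate alongside each screenshot. This is a sketch only: `SourceNote` and `renderSourceNote` are hypothetical names, and where the resulting `.txt` file is saved is left to the caller because the spec does not state a naming rule for it here.

```typescript
// Hypothetical shape for the source note that accompanies an artifact.
interface SourceNote {
  source: string;   // URL the evidence was captured from
  date: string;     // YYYY-MM-DD
  context: string;  // why the evidence was captured
}

// Renders the note in the Source / Date / Context layout used above.
function renderSourceNote(note: SourceNote): string {
  return [
    `Source: ${note.source}`,
    `Date: ${note.date}`,
    `Context: ${note.context}`,
  ].join("\n");
}
```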
Every piece of published content must include at least one artifact. The artifact standard by content type:
| Content Type | Minimum Artifact | Preferred |
|---|---|---|
| Lesson | One real command block with actual output | + One screenshot of UI step |
| Failure Report | Verbatim error message in code block + fix steps | + Stack trace + reproduction conditions |
| Playbook | Command sequence with real output | + Before/after state |
| Case Study | Before/after comparison + one measurement | + Operational timeline + analytics snapshot |
| Lab | Execution log + terminal output | + Screenshot of completed state |
| Project | Evidence of completion artifact | + Linked to relevant track content |
The no-decoration rule: An artifact that doesn't add evidence doesn't belong. A screenshot of a UI state that could have been described in one sentence is decoration. Include it only if the visual state is more informative than prose.
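The minimum-artifact column of the table can be expressed as a lookup that a publishing check consults. The sketch below is an assumption-heavy illustration: the `ArtifactKind` labels and the `missingArtifacts` function are invented here to stand in for the prose descriptions in the table, and they cover only the Minimum Artifact column, not the Preferred one.

```typescript
type ContentType =
  | "Lesson" | "Failure Report" | "Playbook" | "Case Study" | "Lab" | "Project";

// Hypothetical labels for the artifact descriptions in the table.
type ArtifactKind =
  | "command-output" | "error-message" | "fix-steps" | "before-after"
  | "measurement" | "execution-log" | "completion-evidence";

// Minimum artifacts per content type, transcribed from the table above.
const MINIMUM_ARTIFACTS: Record<ContentType, ArtifactKind[]> = {
  "Lesson": ["command-output"],
  "Failure Report": ["error-message", "fix-steps"],
  "Playbook": ["command-output"],
  "Case Study": ["before-after", "measurement"],
  "Lab": ["execution-log", "command-output"],
  "Project": ["completion-evidence"],
};

// Returns the minimum artifacts a piece of content still lacks.
function missingArtifacts(type: ContentType, present: ArtifactKind[]): ArtifactKind[] {
  return MINIMUM_ARTIFACTS[type].filter((kind) => !present.includes(kind));
}
```

A check like this can confirm that an artifact is present, but it cannot apply the no-decoration rule; whether a screenshot adds evidence remains an editorial judgment.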
Phase 1 (current): Artifacts live as static files in public/evidence/, referenced from MDX content. No database, no upload system. This scales to ~500 pieces of content with a manageable directory structure.
Phase 2 (when auth is stable): Artifacts become their own content objects with metadata — date, tool, operation, related content. An operator can submit their own execution evidence to a lesson. Analytics snapshots can be verified via OAuth-linked GA4 queries. This requires the Supabase infrastructure outlined in platform-vision-architecture.mdx.
The Phase 1 static approach is intentional: it keeps the content system simple and the artifacts tied directly to the content that references them. Phase 2 adds queryability; it is not a prerequisite for production-readiness.
Execution artifacts architecture v1.0 — 2026-05-18.