Spec Driven Development
SPEC DRIVEN
DEVELOPMENT
A methodology for AI-native software engineering.
One spec. Two agents. Complete traceability.
MARKDOWN · NO FRAMEWORK · NO SDK · OPEN PROCESS
Kevin Ryan
01
Context
WHAT IS SDD?
Spec Driven Development has roots in formal specification, design-by-contract, and the broader shift toward treating specifications as executable artefacts. StrongDM, Anthropic, and others are already building production software from specs using AI agents.
This presentation introduces a practical method for putting SDD into practice.
The unique contribution is provenance — the mechanism that makes specification-driven workflows auditable, traceable, and verifiable. How agents communicate through documentation. How reasoning is captured as a byproduct of building. How that creates compliance-ready evidence without additional effort.
SDD EXISTS
The concept of building software from specifications using AI agents is established and in production today.
WHAT'S NEW HERE
A concrete, open method — the provenance chain, the builder-tester separation, and the templates to implement it today.
WHAT YOU'LL LEAVE WITH
A workflow, three templates, and the understanding of why provenance is the core innovation.
The Problem
NOT ALL AI CODING IS EQUAL.
Dan Shapiro's Five Levels of Vibe Coding maps where teams actually operate versus where they think they are.
← 90% of developers are here
L0: Spicy Autocomplete
L1: Coding Intern
L2: Junior Developer
L3: Developer as Manager
L4: Developer as Product Owner
L5: Dark Factory
The Opportunity
SDD GETS YOU
TO LEVEL 3–4.
Level 5, the fully autonomous dark factory, is a different problem set: different tooling, different organisational structure, different economics.
But the method described in this deck can move you from L0–2 to L3–4 today. Write specs. Delegate to agents. Verify through provenance. That's the transition from writing code to directing agents.
L0–2 — WHERE YOU ARE
AI suggests, you accept or reject. You write code alongside an AI assistant. The human is still the bottleneck.
L3–4 — WHERE SDD TAKES YOU
You specify, the agent builds, a separate agent verifies. Provenance captures the reasoning. The human writes specs and reviews results.
L5 — DARK FACTORY
Fully autonomous. No humans in the loop. Digital twins, external scenarios, thousand-dollar daily compute budgets. A different talk.
The Insight
IT'S ALL
ABOUT
THE SPEC.
With agentic AI, the spec you hand the agent is the implementation instruction. The agent reads your spec and builds the software. If the spec is structured well enough, it also defines the verification scenarios — not as a side effect, but inherently.
The spec is the single source of truth for what the software does, how it gets built, and how it gets verified.
THE OLD BOTTLENECK
Implementation speed. Can we build it fast enough? Can we hire enough engineers?
THE NEW BOTTLENECK
Specification quality. Can we describe what needs to exist precisely enough that agents can build it?
The skill that matters now is the ability to specify — clearly, completely, and verifiably.
The Architecture
THE INFINITY LOOP
The spec sits at the top. Scenarios sit at the bottom. Provenance is the crossing point where the two agents communicate. Code is the canonical context — the reality everything else orbits.
SPEC
The authority
PROVENANCE
The reasoning
SCENARIOS
The evidence
LEFT LOOP — BUILDER
Reads spec → Writes code → Produces provenance
CROSSING — PROVENANCE
Both agents read it and write to it. Neither talks to the other directly. The document is the interface.
RIGHT LOOP — TESTER
Reads spec + provenance → Writes scenarios → Runs tests
CODE — CANONICAL CONTEXT
Self-describing. Any agent reads provenance to understand why, reads code to understand what.
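The loop can be pictured as a table of which artefacts each role may read and write. The sketch below is an illustration only: the role and artefact names come from the slide, while `ROLE_IO`, `can_read`, and `shared_channel` are hypothetical names, and the exact permission sets are an assumption.

```python
# Hypothetical read/write permissions for the two SDD roles.
# Artefact names follow the slide; the structure itself is an assumption.
ROLE_IO = {
    "builder": {
        "reads": {"spec", "code", "provenance"},
        "writes": {"code", "provenance"},
    },
    "tester": {
        "reads": {"spec", "provenance"},  # never the code
        "writes": {"scenarios", "tests", "provenance"},
    },
}


def can_read(role: str, artefact: str) -> bool:
    """Return True if the role is allowed to read the artefact."""
    return artefact in ROLE_IO[role]["reads"]


def shared_channel() -> set[str]:
    """Artefacts that both roles read and write: the crossing point."""
    builder, tester = ROLE_IO["builder"], ROLE_IO["tester"]
    return (builder["reads"] & builder["writes"]) & (tester["reads"] & tester["writes"])


print(shared_channel())  # the artefacts both roles read and write
```

Under these assumed permissions, provenance is the only artefact both roles read and write, which is what makes it the interface.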
The Spec — What It Contains
THE SPEC SPECIFIES
EVERYTHING.
The spec defines both the product and the process that produces the product. Not just what to build — but who does the work, what each worker is responsible for, and how they communicate.
REQUIREMENTS
Functional, non-functional, constraints, assumptions. Specific enough that an agent can implement them and a separate agent can verify them.
ARCHITECTURE
Component boundaries, data flow, interfaces. Enough for the builder to make decisions and enough for the tester to know what to probe.
AGENT ROLES
Builder and tester roles defined in the spec itself. At prompt time: "You are the builder. Here is the spec. Do your job."
PROCESS
What each agent reads, what it produces, where it writes. The spec is the orchestrator. No LangGraph. No supervisor agent. The workflow is the spec.
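A minimal sketch of "the spec is the orchestrator", assuming the only glue code needed is a function that prepends a role statement to the spec. The prompt wording is taken from the slide; `role_prompt`, `SPEC`, and the spec's section names are hypothetical.

```python
def role_prompt(role: str, spec_markdown: str) -> str:
    """Build the prompt handed to an agent: a role statement plus the full spec."""
    if role not in ("builder", "tester"):
        raise ValueError(f"unknown role: {role}")
    return f"You are the {role}. Here is the spec. Do your job.\n\n{spec_markdown}"


# Hypothetical spec fragment; the section names and wording are illustrative only.
SPEC = """# Spec: example service
## Requirements
- FR-007: all protected routes require authentication.
## Agent roles
- builder: reads the spec, writes code and provenance. Does not write tests.
- tester: reads the spec and provenance, never the code.
"""

print(role_prompt("builder", SPEC))
```

Both agents receive the same spec. Only the first line of the prompt differs, so the role definitions inside the spec do the routing.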
The Builder Agent
BUILD AND
SHOW YOUR
WORKING.
The builder doesn't just write code. It produces evidence of how it interpreted the spec — every assumption, every ambiguity, every decision. It's not marking its own homework. It's handing in its homework with its working shown.
01
Read the spec
Full specification, prerequisites, current state
02
Build the software
Implement as specified
03
Write provenance as you go
Assumptions, ambiguities, decisions — not after the fact
04
Commit spec + code + provenance
One atomic unit. Never separated.
Do not write tests
That is not your role
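The builder's steps could be sketched as follows, assuming provenance entries are small records rendered to markdown and that a commit is checked for all three artefacts. Everything here is hypothetical: the entry fields, the IDs `A1` and `FR-007` used as sample data, and the `spec/`, `src/`, `provenance/` directory layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvenanceEntry:
    entry_id: str     # e.g. "A1"
    kind: str         # "assumption", "ambiguity", or "decision"
    requirement: str  # spec requirement the entry relates to
    note: str         # the builder's reasoning, written at the time


def render(entries: list[ProvenanceEntry]) -> str:
    """Render entries as a markdown provenance document."""
    lines = ["# Provenance"]
    for e in entries:
        lines.append(f"- {e.entry_id} [{e.kind}] ({e.requirement}): {e.note}")
    return "\n".join(lines)


def is_atomic_commit(paths: list[str]) -> bool:
    """True when spec, code, and provenance all appear in the same commit."""
    prefixes = ("spec/", "src/", "provenance/")  # assumed repository layout
    return all(any(p.startswith(prefix) for p in paths) for prefix in prefixes)


entries = [
    ProvenanceEntry("A1", "assumption", "FR-007",
                    "Spec is silent on token expiry; chose 30 minutes."),
]
print(render(entries))
print(is_atomic_commit(["spec/auth.md", "src/auth.py", "provenance/auth.md"]))
```

A check like `is_atomic_commit` could run as a pre-commit hook, so that code never lands without the reasoning that produced it.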
The Testing Agent
CHALLENGE
EVERY
ASSUMPTION.
A separate agent is needed because the builder has blind spots. If the builder misunderstood the spec and also wrote the tests, the code would reflect the misunderstanding and the tests would confirm it. Everything passes. Everything's wrong. The testing agent reads the spec and provenance, never the code, and finds the daylight between them.
01
Read the spec
Full specification — the authority
02
Read the provenance
What the builder claims it did and why
03
Find the daylight
Gaps, assumptions, ambiguities, silences
04
Write prose scenarios
Plain language — what's being tested and why
05
Implement tests from scenarios
Executable code, derived from the prose
06
Update the provenance
Findings, results, recommendations — append, never overwrite
Provenance
THE REASONING
RECORD.
NOT A MAP.
Code is self-describing — any agent can read a codebase and understand its structure, patterns, and dependencies. What the code can't tell you is why. Why is the timeout 30 minutes? Why this library and not that one? Why does this module exist at all? That's the provenance. The reasoning layer that answers the questions code can't answer about itself.
CODE — THE WHAT
Self-describing context
Any agent can read and navigate it
The canonical reality of the system
What exists right now
PROVENANCE — THE WHY
Decisions made, assumptions held
Ambiguities interpreted
Layered: builder writes, tester appends
Why it exists this way
TOGETHER
Agent reads code → understands what
Agent reads provenance → understands why
Context window rebuilt from scratch each session
Provenance is pre-loaded understanding
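A minimal sketch of layered provenance, assuming an append-only log that can be rendered as text and loaded into a fresh session. The `ProvenanceLog` class and the sample entry text are hypothetical.

```python
class ProvenanceLog:
    """Append-only log: entries can be added but never edited or removed."""

    def __init__(self) -> None:
        self._entries: list[tuple[str, str]] = []  # (author, text)

    def append(self, author: str, text: str) -> None:
        if author not in ("builder", "tester"):
            raise ValueError(f"unknown author: {author}")
        self._entries.append((author, text))

    def entries(self) -> tuple[tuple[str, str], ...]:
        # Return an immutable copy so callers cannot rewrite history.
        return tuple(self._entries)

    def as_context(self) -> str:
        """Render the log as text to preload into a new agent session."""
        return "\n".join(f"[{author}] {text}" for author, text in self._entries)


log = ProvenanceLog()
log.append("builder", "A1: spec silent on token expiry; chose 30 minutes.")
log.append("tester", "S-003: expiry not validated; recommend rejecting expired tokens.")
print(log.as_context())
```

Because the class exposes no update or delete method, the tester's findings sit on top of the builder's entries rather than replacing them.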
The Testing Agent
THE CROSS-
EXAMINATION.
The testing agent never sees the code. It has two inputs: the spec (what was intended) and the provenance (what the builder says it did). Its job is to find the daylight between those two documents.
Gaps
Requirements the provenance doesn't address
Assumptions
Decisions the builder made where the spec was silent — primary targets
Ambiguities
Places the builder interpreted unclear requirements
Silences
Things the builder didn't mention at all — red flags
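Part of the cross-examination can be sketched mechanically, assuming requirements and provenance entries carry IDs. The function below finds gaps and lists assumptions and ambiguities. Silences are left out because spotting what was never mentioned needs the agent's judgement. `find_daylight`, the entry shape, and the ID `FR-006` are hypothetical.

```python
def find_daylight(spec_requirements: set[str], provenance: list[dict]) -> dict[str, list[str]]:
    """Compare spec requirement IDs with provenance entries and list test targets."""
    addressed = {e["requirement"] for e in provenance if e.get("requirement")}
    return {
        # Requirements the provenance does not address at all.
        "gaps": sorted(spec_requirements - addressed),
        # Decisions made where the spec was silent: primary targets.
        "assumptions": sorted(e["id"] for e in provenance if e["kind"] == "assumption"),
        # Places the builder interpreted unclear requirements.
        "ambiguities": sorted(e["id"] for e in provenance if e["kind"] == "ambiguity"),
    }


requirements = {"FR-006", "FR-007"}  # FR-006 is a made-up second requirement
provenance = [{"id": "A1", "kind": "assumption", "requirement": "FR-007"}]
print(find_daylight(requirements, provenance))
```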
Scenarios — Prose First
PROSE FIRST.
CODE SECOND.
The testing agent writes a markdown scenario — plain language explaining what's being tested and why — before it writes a single line of test code. The code is derived from the prose, not the other way around.
A product owner can read the scenario. A regulator can read it. A client who knows nothing about code can say "yes, that's the right question to ask" or "actually, don't test for that — update the spec."
S-003: TOKEN EXPIRY HANDLING
TRIGGERED BY: ASSUMPTION A1
The spec requires authentication on all protected routes. The provenance states the builder implemented JWT validation with a 30-minute expiry, but the spec is silent on expiry duration.
EXPECTS:
Expired tokens return 401
Expiry period is documented
FAILS IF:
Expired token returns 200
No expiry validation exists
TEST: tests/auth/token-expiry.test.ts#L12
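Scenario S-003 might translate into tests along these lines. This is a Python sketch of the same checks, not the contents of the TypeScript file referenced above. `status_for_token` is a hypothetical stand-in for the protected route, and the 30-minute lifetime is the builder's assumption A1, not a spec requirement.

```python
from datetime import datetime, timedelta, timezone

TOKEN_LIFETIME = timedelta(minutes=30)  # builder's assumption A1; the spec is silent


def status_for_token(issued_at: datetime, now: datetime) -> int:
    """Hypothetical protected route: 200 while the token is live, 401 once expired."""
    return 200 if now - issued_at < TOKEN_LIFETIME else 401


def test_expired_token_returns_401() -> None:
    # EXPECTS: expired tokens return 401. FAILS IF: an expired token returns 200.
    issued = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
    assert status_for_token(issued, issued + timedelta(minutes=31)) == 401


def test_live_token_is_accepted() -> None:
    # Guards against a validator that rejects every token.
    issued = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
    assert status_for_token(issued, issued + timedelta(minutes=29)) == 200


test_expired_token_returns_401()
test_live_token_is_accepted()
```

Each test carries the scenario's EXPECTS and FAILS IF wording in a comment, so the code stays readable against the prose it was derived from.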
The Loop — Closing It
FAILING TESTS ARE
WORK ORDERS,
NOT BUG REPORTS.
A failing test with provenance is a diagnosis. The builder doesn't get "line 47 assertion failed." It gets the prose scenario, the gap between spec and provenance, and a recommendation for what to fix.
FAIL
Scenario S-003 fails. Token expiry not validated.
READ
Builder reads provenance. Tester's findings explain what and why.
FIX
Builder fixes code. Updates provenance with new entry.
PASS
Tester re-runs. Scenario passes. Loop continues.
No human touched the code. No human wrote a test. No human triaged a bug. A human wrote a spec. Everything else is derived.
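The difference between a bug report and a work order can be sketched as a message format, assuming the tester bundles the scenario prose, the gap, and a recommendation for the builder. `work_order` and its field labels are hypothetical; the sample text paraphrases S-003.

```python
def work_order(scenario_id: str, prose: str, gap: str, recommendation: str) -> str:
    """Format a failing scenario as a markdown work order for the builder."""
    return "\n".join([
        f"## Work order for {scenario_id}",
        f"Scenario: {prose}",
        f"Gap between spec and provenance: {gap}",
        f"Recommendation: {recommendation}",
    ])


order = work_order(
    "S-003",
    "Expired tokens must be rejected on protected routes.",
    "Provenance claims a 30-minute expiry, but expiry is not validated.",
    "Reject expired tokens with 401 and document the expiry period.",
)
print(order)
```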
The Provenance Chain
FIVE ARTEFACTS.
FIVE PURPOSES.
COMPLETE LINEAGE.
SPEC
Intent
What should exist
and why.
CODE
Reality
Self-describing.
Canonical context.
PROVENANCE
Reasoning
Why it's this way.
The decisions made.
SCENARIOS
Challenge
Plain language.
What's being verified.
TESTS
Execution
Derived from scenarios.
Pass or fail.
The spec describes what the code should be. The code describes itself. The provenance explains why the code is the way it is. The scenarios challenge the code based on the gap between spec and provenance. The tests execute against the code.
Code is the reality every other artefact exists in relation to. Provenance is the reasoning that makes the code navigable.
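The lineage can be sketched as a chain of lookups, assuming each artefact records the ID of the one it derives from. The link tables and `trace` are hypothetical, and the requirement ID `FR-007` and its link to `A1` are illustrative assumptions.

```python
# Hypothetical link tables; in practice these IDs would live in the markdown artefacts.
TEST_TO_SCENARIO = {"tests/auth/token-expiry.test.ts": "S-003"}
SCENARIO_TO_PROVENANCE = {"S-003": "A1"}
PROVENANCE_TO_REQUIREMENT = {"A1": "FR-007"}


def trace(test_path: str) -> list[str]:
    """Walk test -> scenario -> provenance entry -> spec requirement."""
    scenario = TEST_TO_SCENARIO[test_path]
    entry = SCENARIO_TO_PROVENANCE[scenario]
    requirement = PROVENANCE_TO_REQUIREMENT[entry]
    return [test_path, scenario, entry, requirement]


print(" -> ".join(trace("tests/auth/token-expiry.test.ts")))
```

A missing link raises `KeyError`, which is the useful failure mode here: a test that cannot be traced back to a requirement is itself a finding.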
Why It's Different
MARKDOWN AND
A PROCESS.
THAT'S IT.
No SDK. No framework. No vendor lock-in. No orchestration engine. Anyone with access to an AI agent and a markdown editor can implement SDD today. The entire methodology fits in three templates.
THE INDUSTRY SAYS
Install our agent framework
Use our evaluation platform
Buy our orchestration tool
Adopt our scoring rubrics
SDD SAYS
Write a spec in markdown
Tell one agent to build
Tell another agent to verify
Let the documents do the rest
The spec is the orchestrator.
The provenance is the protocol.
The tools are whatever you have.
The Value
WHAT SDD
GIVES YOU.
TRACEABILITY
Every test traces to a scenario, every scenario to a provenance entry, every entry to a spec requirement. End to end.
COMPLIANCE
Provenance is an audit trail. EU AI Act, SOC 2, ISO 27001 — the chain of evidence already exists.
SEPARATION
Builder and tester can't share blind spots. Same principle as code review, enforced architecturally.
SIMPLICITY
Three markdown templates. No framework. No vendor. The human maintains one artefact: the spec.
ACCESSIBILITY
Anyone with an AI agent and a text editor. No specialised tooling. The methodology is the tool.
SPEC QUALITY
Forces rigorous specification. Vague specs produce vague provenance, vague scenarios, unjudgeable outcomes.
The Landscape
WHAT SDD DOES
NOT GIVE YOU.
MULTI-AGENT ORCHESTRATION
SDD uses two roles talking through documents. If you need dozens of agents coordinating dynamically in real time, LangGraph, CrewAI, and AutoGen solve that problem. Most teams aren't there yet.
REAL-TIME AGENT MONITORING
SDD tests the software, not the agent. If you need continuous evaluation of agent behaviour in production — drift, hallucination, alignment — Arize, Braintrust, and Bloom address that.
DYNAMIC TOOL DISCOVERY
SDD specs are static documents. If your agents need to discover and compose tools at runtime, MCP servers and tool registries are built for that.
LONG-TERM AGENT MEMORY
SDD uses provenance as persistent context, but it doesn't give you vector stores, RAG pipelines, or cross-session memory systems.
These technologies solve real problems. But they solve advanced problems. SDD sharpens the thinking that makes every other tool more effective.
Start with the spec. Graduate to complexity when the problem demands it.
Live Demo
LIVE
DEMO
SDD in action. One spec. Two agents. Provenance, scenarios, and tests — generated live.
01 — THE SPEC
A real spec with agent roles, requirements, and constraints. Markdown. Nothing else.
02 — THE BUILDER
Hand the spec to the builder agent. Watch it build and produce provenance — assumptions, decisions, ambiguities — in real time.
03 — THE TESTER
Hand the spec and provenance to a separate agent. Watch it find the daylight, write prose scenarios, and generate executable tests.
04 — THE LOOP
Failing scenario → builder reads provenance → fixes → tester re-runs. The infinity loop, live.
THE BOTTLENECK HAS
MOVED FROM CODE
TO SPECIFICATION.
SDD IS THE METHOD.
One spec. Two agents. Five artefacts. Complete traceability from intent to verification.
No framework required. Start today.
Kevin Ryan & Associates
kevinryan.io
sddbook.com
Bonus Content
BONUS
CONTENT
Provenance, audit, and regulation: why SDD is a compliance asset for SOC 2, ISO 27001, and the EU AI Act.
SOC 2 · ISO 27001 · EU AI ACT · AUGUST 2026
Provenance & Compliance
YOUR AUDITOR'S AI
IS GOING TO ASK
HOW THIS WAS BUILT.
When a human builds software, the reasoning exists in their head, in Slack threads, in PR comments. You can reconstruct it — badly — after the fact. When an agent builds software, the reasoning exists in the context window. The session ends. The reasoning evaporates. Unless you capture it.
You cannot retrospectively create provenance for decisions that were never documented. SDD means you never have to.
SOC 2
Change Management Controls
Auditors require evidence that changes are authorised, documented, and traceable. SDD's provenance chain is that evidence — spec to code to verification.
ISO 27001
Annex A — Secure Development
Requires documented development procedures, separation of duties, and design review records. SDD's builder-tester separation and layered provenance satisfy this structurally.
EU AI ACT
Articles 11, 12, 19 — Full Force August 2026
Technical documentation, automatic record-keeping, and retained logs for high-risk AI systems. Provenance is all three — generated in real time, not written after the fact.
Provenance & Compliance
COMPLIANCE
AS A BYPRODUCT.
NOT A PROJECT.
Most organisations will try to bolt compliance documentation onto AI-built systems after the fact. They will hire consultants to retrospectively construct the design history that auditors and regulators demand. SDD produces that documentation as an inherent part of the build. You don't retrofit it. It already exists.
AUDITOR ASKS
Why does this system behave this way?
SCENARIO
This test exists because scenario S-003 challenged assumption A1.
PROVENANCE
A1 was assumed because the spec was silent on expiry. Builder chose 30 minutes.
SPEC
Requirement FR-007: all protected routes require authentication.
€35M
EU AI Act: or 7% of worldwide annual turnover, if higher
Prohibited practices
SOC 2
FAIL
Lost contracts, lost clients
Unrecoverable trust damage
ISO 27001
NC
Non-conformity finding
Certification at risk
SDD solves the documentation and traceability problem across all three frameworks — the one that requires you to have been recording decisions from the start.