Zero-Ambiguity Specs: The Skill That Makes Agentic Development Actually Work
I pulled 704 closed issues from my own GitHub repos. The data revealed the single biggest predictor of agent success — and it's not the prompt.
Everyone talks about prompting. "Get better at prompting and the AI will give you better output." That's true — the way "get better at driving and you'll win more races" is true. It's correct and useless. It skips the part that actually matters.
The part that actually matters is the spec.
Not the prompt you give the agent in the moment. The spec — the GitHub issue, the task definition, the document that tells the agent exactly what to build, exactly where to build it, exactly what "done" looks like, and exactly what not to touch. The thing you write before you open a terminal.
I've now built 10+ production projects with agentic development. And the single biggest predictor of whether an agent succeeds or fails on a task isn't the model, the temperature, or the system prompt. It's the quality of the spec I wrote before the agent started.
I call them zero-ambiguity specs. And the data from my own repos proves they work.
The Data: 704 Issues Across 5 Projects
I pulled every closed issue from my five most active GitHub repos — PlayTrack, Client Project 1, GoodSitterGreatSitter, Client Project 2, and PaintTheWorld — and analyzed them for structural quality markers: word count, file lists, acceptance criteria, test checklists, and effort estimates.
The results tell a clear story about how my spec-writing evolved over 10 weeks.
[Chart: per-project spec quality metrics for PlayTrack, Client Project 1, Client Project 2, GoodSitterGreatSitter, and PaintTheWorld]
The trend is clear: my specs got 4x longer, 4x more structured, and dramatically more explicit over 10 weeks. This wasn't a conscious decision at first — it was the natural result of encoding lessons from failures.
What a Zero-Ambiguity Spec Looks Like
Let me show you an actual before and after. These are real issues from my repos — anonymized slightly, but structurally identical to what I wrote.
The difference isn't just word count. It's the elimination of ambiguity at every decision point.
The Five Components of a Zero-Ambiguity Spec
After writing 700+ issues and watching agents succeed or fail on each one, I've distilled the pattern into five components. Every spec needs all five. Skip one and you're rolling the dice.
1. Files to Change (and Files to Leave Alone)
This is the single most impactful thing you can add to a spec. Agents that know exactly which files to modify produce dramatically better results than agents told to "build a comment system."
The "files to leave alone" section is equally important. Without it, agents will helpfully refactor code you didn't ask them to touch, introduce changes across unrelated modules, and create integration problems that take longer to fix than the original feature took to build.
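In a GitHub issue, both halves are just lists. A minimal sketch of what this section looks like; the file paths below are hypothetical placeholders, not from a real repo:

```markdown
## Files to Change
- `src/components/CommentList.tsx` (new component)
- `src/api/comments.ts` (new endpoint wrappers)
- `src/pages/FeedItem.tsx` (mount CommentList below the post body)

## Files to Leave Alone
- `src/api/auth.ts` (no auth changes in this task)
- `src/components/Feed.tsx` (comment counts come in a later issue)
```

Noting *why* a file is off-limits helps too: it tells the agent the boundary is deliberate, not an oversight.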
2. Acceptance Criteria That Are Testable
Not "make sure it works." Not "should be good quality." Specific, binary, testable statements:
- Bad: "Comments should work properly"
- Good: "Post comment clears input and appears via optimistic update. Empty state shows 'No comments yet' message. Comment count in feed card matches actual count."
If you can't test it with a yes/no answer, it's not an acceptance criterion. It's a wish.
3. User Story With Context
Not just what to build, but why. Agents make better implementation decisions when they understand the user's goal. "As a user viewing a feed item, I want to read and post comments so I can engage with other members" tells the agent that this is a social feature — which influences everything from the UI treatment to the data model.
4. Test Checklist
A specific list of manual verification steps. Not "write tests" (the agent will do that anyway if your patterns doc says so). A list of things a human should check after the feature is built:
- Post a comment, verify it appears in the database
- Reload the page — comment persists
- Check the console — no errors
This is your quality gate. It tells the agent what "done" actually means in practice.
5. Effort Estimate and Dependencies
Not for sprint planning. For scope control. An "M (2-3 hours)" estimate tells the agent: this is a medium-complexity task. If you find yourself building something that would take 8 hours, you've gone off course. Come back and ask.
Dependencies prevent agents from building features that rely on code that doesn't exist yet. "Depends on #391" means: check that #391 is merged before you start.
What the Failure Data Reveals
I also searched all 704 closed issues for failure indicators — keywords like "revert," "broke," "missed," "try again," and "re-open" appearing in the issue body. These are issues where something went wrong during implementation.
| Keyword | Occurrences | Most Common In |
|---|---|---|
| broke | 19 | GoodSitterGreatSitter |
| try again | 14 | Client Project 2, Client Project 1 |
| missed | 11 | Client Project 2, Client Project 1 |
| revert | 2 | Client Project 1 |
| re-open | 2 | Client Project 2 |
| wrong file | 0 | — |
Three things jumped out:
1. "Broke" clusters in the middle of project timelines, not at the start. The early issues were simple enough that vague specs still worked. The failures happened when complexity increased but spec quality hadn't caught up yet. There's a danger zone — projects complex enough to fail but not yet disciplined enough to prevent it.
2. "Try again" and "missed" don't decline as specs get longer. Some of the longest issues (6,000-8,000 words) still triggered "try again." Length doesn't equal clarity. A 600-word spec with all five components outperforms a 6,000-word spec that's really just a brain dump.
3. Every single failure issue had acceptance criteria. The failures weren't from missing AC — they were from AC that was too vague, or from missing file lists that let the agent wander into the wrong part of the codebase. The lesson: AC is necessary but not sufficient. You need the full five-component spec.
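The keyword scan itself is easy to reproduce. A minimal sketch, assuming you've exported closed issues to a list of dicts (for example via `gh issue list --state closed --json number,body`, parsed from JSON); the keyword list matches the table above:

```python
FAILURE_KEYWORDS = ["revert", "broke", "missed", "try again", "re-open", "wrong file"]

def count_failure_keywords(issues):
    """Count how many issue bodies contain each failure keyword.

    `issues` is a list of dicts with a "body" field, e.g. parsed output
    of `gh issue list --state closed --json number,body`.
    """
    counts = {kw: 0 for kw in FAILURE_KEYWORDS}
    for issue in issues:
        body = (issue.get("body") or "").lower()
        for kw in FAILURE_KEYWORDS:
            if kw in body:
                counts[kw] += 1
    return counts

# Tiny illustrative sample, not the real dataset:
sample = [
    {"number": 1, "body": "Feature shipped cleanly."},
    {"number": 2, "body": "Agent missed the empty state; try again."},
    {"number": 3, "body": "This broke the feed. Revert and retry."},
]
print(count_failure_keywords(sample))
```

This counts issues containing each keyword, not total occurrences, which is what you want for failure-rate analysis.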
The Zero-Ambiguity Principle
Here's the principle I now operate by:
If an agent has to make a judgment call that I didn't explicitly address in the spec, the spec failed — not the agent.
This flips the mental model. Most developers blame the AI when it produces the wrong output. But in my experience, 90% of the time the problem is that the spec left room for interpretation. The agent made a reasonable choice — just not the one I wanted.
The fix isn't better prompting. It's better specs.
Zero-ambiguity doesn't mean zero judgment. Agents still make hundreds of micro-decisions during implementation — variable names, error handling patterns, component structure. Those decisions are fine because they're governed by your PATTERNS.md and CLAUDE.md files.
Zero-ambiguity means: every architectural decision, every scope decision, every what goes where decision is made by the human, written into the spec, before the agent starts. The agent executes. The human designs.
The ROI of Spec Quality
Here's the math that convinced me to spend 20 minutes writing a spec instead of 5:
5-minute spec (early style):
- Agent builds something: 30 minutes
- It's wrong or incomplete: 50% chance
- Rework: another full 30-minute attempt, which itself fails half the time
- Expected total: 65 minutes (5 + 30 ÷ 0.5), unpredictable quality
20-minute spec (zero-ambiguity):
- Agent builds exactly what I specified: 30 minutes
- It's right on first try: 80% chance
- Minor adjustment: 10 minutes (20% of the time)
- Total: 52 minutes average (20 + 30 + 0.2 × 10), predictable quality
The zero-ambiguity spec takes 4x longer to write but delivers 20% faster total cycle time, 3x fewer iterations, and — crucially — predictable outcomes. When you're running 5 agents in parallel across multiple projects, predictability is worth more than speed.
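The arithmetic is simple enough to sanity-check in a few lines. The numbers are the article's estimates; the model is my reading of them, where a vague spec forces full rebuilds (a geometric series of attempts) while a tight spec needs at most a one-shot minor fix:

```python
def expected_cycle_time(spec_min, build_min, first_try_rate, fix_min, full_rebuild):
    """Expected minutes per task under a simple success-rate model."""
    if full_rebuild:
        # Each attempt succeeds with probability first_try_rate,
        # so expected attempts = 1 / first_try_rate.
        return spec_min + build_min / first_try_rate
    # Otherwise failures only need a minor fix, not a rebuild.
    return spec_min + build_min + (1 - first_try_rate) * fix_min

vague = expected_cycle_time(5, 30, 0.5, 30, full_rebuild=True)
tight = expected_cycle_time(20, 30, 0.8, 10, full_rebuild=False)
print(vague, tight)  # 65.0 52.0
```

The spread between the two widens further if vague-spec retries sometimes require a re-spec as well, which they usually do.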
Start Here
If you're building with agents and your success rate feels random, start by adding these to your next 10 issues:
- Files to change — be explicit, list every file
- Files to leave alone — prevent scope creep
- Testable acceptance criteria — binary yes/no checks
- Test checklist — what a human verifies after
- Effort estimate — scope control, not scheduling
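Put together, the five items form a short issue template. Everything below is an illustrative skeleton; the file paths and issue number are placeholders:

```markdown
## User Story
As a user viewing a feed item, I want to read and post comments
so I can engage with other members.

## Files to Change
- `src/components/CommentList.tsx` (new)
- `src/pages/FeedItem.tsx`

## Files to Leave Alone
- `src/components/Feed.tsx`

## Acceptance Criteria
- [ ] Post comment clears input and appears via optimistic update
- [ ] Empty state shows "No comments yet"

## Test Checklist
- [ ] Post a comment, verify it appears in the database
- [ ] Reload the page, verify the comment persists

## Effort / Dependencies
Estimate: M (2-3 hours). Depends on #391 being merged.
```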
You'll feel like you're over-specifying. That feeling is wrong. You're removing the decisions that agents shouldn't be making.
The specs get faster to write as you develop the habit. After 700+ issues, my zero-ambiguity specs take me about 15 minutes. They save me hours on every task. And they compound — because the patterns you encode in your specs feed into the patterns your agents learn to follow.
Zero ambiguity. Maximum leverage. That's the skill.
Dave Shak is a Fractional CTO and Agentic Development consultant based in Kitchener-Waterloo. He has 20+ years of experience scaling engineering organizations and now helps teams adopt agentic development workflows.