
Agile Practice

Writing tickets for Spec Driven Development: a practical guide for teams and product owners

Something has shifted in how development teams use tickets. For two decades, the user story worked well enough: a brief statement of intent written from the user's perspective, designed to prompt a conversation between the person with the problem and the team solving it. The story was never supposed to be the full specification. It was the beginning of one. The understanding would fill in during planning, during pairing, during the natural back-and-forth of a team working closely together. That model assumed ambiguity would be resolved in conversation. Spec Driven Development changes the equation. When an AI is doing a significant share of the implementation, the conversation that filled in the gaps does not happen. The AI implements exactly what the ticket says. If the ticket is a user story, the AI implements a guess.

April 2026 · 26 min read
Agile · Specification · Product Ownership · AI · BDD

The user story was never the specification

The user story format, 'As a [role], I want [feature], so that [benefit]', grew out of Extreme Programming, where Kent Beck and Ron Jeffries framed the story as a placeholder for a conversation, not a specification document; the three-part template itself was popularised by the Connextra team in the early 2000s. The three parts were meant to capture who, what, and why at a level of detail sufficient to schedule the work and open a dialogue. The actual details would come out in the conversation between the developer and the person requesting the feature. Story points would estimate the effort. Sprints would create the deadline. The team would figure out the rest.

This worked because development teams were built around human collaboration. A developer could walk over to a product owner, ask what exactly should happen when the user submits an empty form, and get an answer in thirty seconds. The ambiguity in the story was a feature, not a bug: it invited conversation rather than pretending that all questions had been answered in advance.

The challenge Spec Driven Development teams are discovering is that this implicit contract no longer holds. When a developer uses AI tools to implement a feature, the AI does not walk over and ask the PO about edge cases. It implements. It makes assumptions. It produces coherent-looking output that may answer a different question than the one the team actually had. The story card that was once a conversation starter becomes a specification, whether you intended it to be one or not.

This is not a criticism of user stories. They were fit for purpose in the context they were designed for. The context has changed. A ticket that was sufficient to initiate a human conversation is not sufficient to drive an AI implementation. The gap between story intent and implemented behaviour, which used to be bridged by dialogue, now has to be bridged by writing.

The story was never the specification. It was the beginning of one. The conversation that filled in the gaps no longer happens when an AI is doing the implementation.

What Spec Driven Development actually asks of your tickets

Spec Driven Development, in its simplest form, inverts the traditional relationship between specification and implementation. Instead of beginning with an incomplete story and letting implementation decisions emerge through conversation, you write the specification first, in enough detail that the implementation is a predictable consequence. The developer or the AI implements against the spec. The review validates against the spec. The spec is the single source of truth.

A development-ready SDD ticket has to answer several questions that a user story intentionally leaves open. What are all the possible outcomes of this feature, not just the happy path? What happens when the user provides invalid input, when a dependency is unavailable, when concurrent requests arrive simultaneously? What are the explicit constraints on performance, security, and data handling? What is explicitly out of scope, so that the implementation does not creep into adjacent functionality?

This does not mean tickets have to be long. It means they have to be complete. The difference matters. A bloated ticket full of implementation details the team has not agreed on is not a good spec. A concise ticket that names every significant scenario, every error state, and every constraint is. The goal is not volume. It is the absence of assumptions that the developer or the AI would otherwise have to make silently.

There is a useful test for whether a ticket is SDD-ready: can you give it to an AI coding assistant with no additional context and get an implementation that, after review, would meet your definition of done? If the AI has to make significant choices you did not specify, the ticket is not ready. Those choices will either be made incorrectly, or they will require a round of review and correction that the spec should have prevented.

A development-ready spec is not a long ticket. It is a complete one. The goal is the absence of assumptions the implementer would have to make silently.

SDD in 2025: from emerging practice to industry consensus

In 2025, Spec-Driven Development moved from an emerging practice to an industry recommendation. Thoughtworks placed it at 'Adopt' level on its Technology Radar, its highest endorsement for engineering techniques. The same year, GitHub released Spec Kit, an open-source toolkit that formalises the specify-plan-build workflow for use with AI coding assistants, and Amazon Web Services launched Kiro, a development environment built on the premise that specifications, not conversations, should drive AI implementation. Martin Fowler at Thoughtworks and Addy Osmani at Google are among the senior engineering voices who have published extensively on why structured specifications matter for teams working with AI tools.

The contrast these practitioners draw is with 'vibe coding': informal, conversational AI-assisted development where the developer describes what they want and iterates without a written spec. Vibe coding produces output quickly. It also produces output whose behaviour depends on assumptions neither the developer nor the AI has made explicit. For weekend projects and throwaway prototypes, that trade-off may be acceptable. For production software with error handling, security constraints, and team-wide readability requirements, it is not. The structured ticket approach described in this article is the alternative that the current tooling ecosystem is built around.

Practitioners writing about SDD in 2025 also describe three levels of engagement, which offer a useful map for where your team sits today and where to aim next.

Level 1

Spec-First

Write a spec to think through the problem. Implement against it. Archive or discard the spec once the feature ships.

Best for: early exploration, prototypes, and features where the spec is primarily a thinking tool rather than a permanent artefact.

Limitation: the next developer working on this code has no spec to reason against. Good starting point, not a destination.

Level 2

Spec-Anchored

Write a spec before implementation. Keep it as a reference after shipping. Update it when the feature changes.

Best for: team features, public APIs, and anything a new team member needs to understand without asking.

Most teams that adopt SDD settle here. The spec becomes the answer to 'why does this work this way?' and the starting point for every future change.

Level 3

Spec-as-Source

The spec is the permanent source of truth. The implementation is derived from it. When the spec changes, the code is regenerated or updated to match.

Best for: teams with mature AI tooling workflows where code generation from spec is reliable and reviewable.

The destination for teams that invest in SDD long-term. Requires discipline and tooling support to maintain, but makes large-scale change tractable.

Most teams working with AI tools today sit between Level 1 and Level 2. The exercises at the end of this article are designed to build the muscle for Level 2 work: tickets precise enough to drive implementation and meaningful enough to keep. The progression to Level 3 is a longer journey, and the tooling is still maturing. The thinking skills that get you there are the same at every level.

The tooling has caught up with the practice. The gap most teams face now is not tools. It is the thinking skill to write specifications that those tools can actually use.

Gherkin and BDD: what they get right, and where they stop short

Behaviour Driven Development (BDD), developed by Dan North in the early 2000s, established Given-When-Then as a structured, human-readable way of writing behavioural specifications; the Gherkin language later formalised it in tools such as Cucumber. The Given-When-Then format describes system behaviour in terms a non-technical stakeholder can read and a test automation framework can execute.

This is a genuine contribution to specification quality. Gherkin forces the team to describe behaviour in observable, testable terms. It makes the happy path explicit. It creates executable documentation that can double as automated test criteria. For teams practising BDD well, the scenarios written before implementation serve as both the specification and the test suite.

Gherkin Example

Password Reset: happy-path scenario

Given: a registered user with email user@example.com

When: they request a password reset

Then: they receive a reset link by email within 60 seconds

And: the link expires after 30 minutes

This scenario is precise and testable. It says nothing about what happens when the email is not registered or when the link is clicked twice, nothing about rate limiting or session invalidation, and nothing about what is explicitly out of scope.
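To see what 'observable, testable terms' means in practice, here is a minimal sketch of how the happy-path scenario above could translate into a plain test. FakeResetService and its method names are hypothetical stand-ins, not a real framework; note that the identical-response behaviour, which Gherkin leaves unstated, has to be decided somewhere.

```python
from dataclasses import dataclass, field

@dataclass
class FakeResetService:
    # Toy stand-in for a real reset service; all names are illustrative.
    registered: set = field(default_factory=set)
    sent: list = field(default_factory=list)

    def request_reset(self, email: str) -> str:
        # Only registered users actually get a reset email recorded...
        if email in self.registered:
            self.sent.append(email)
        # ...but the response is identical either way, so the endpoint
        # does not reveal whether an email is registered.
        return "If an account exists, a reset link has been sent."

def test_happy_path():
    # Given: a registered user with email user@example.com
    svc = FakeResetService(registered={"user@example.com"})
    # When: they request a password reset
    message = svc.request_reset("user@example.com")
    # Then: a reset email is recorded for delivery
    assert svc.sent == ["user@example.com"]
    assert "reset link" in message

test_happy_path()
```

The test encodes exactly one path. Every behaviour outside it, the unregistered email, the second click, the rate limit, still has to be specified somewhere else.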

Where Gherkin falls short as a complete SDD ticket is in scope and completeness. A Gherkin scenario describes one path through the system. Non-functional requirements such as response time limits, security constraints, and concurrency handling are awkward to express in Given-When-Then and are frequently omitted. The out-of-scope declaration, which is one of the most valuable elements of a good spec, has no natural home in Gherkin. The business context that helps implementers make judgment calls at the boundaries of the spec is typically absent.

SDD tickets and Gherkin are not in competition. A well-written SDD ticket can include Gherkin scenarios for the key behavioural paths while surrounding them with the context, constraints, and explicit exclusions that Gherkin does not natively provide. Think of Gherkin as a precise tool for expressing individual scenarios, and the SDD ticket as the broader container that makes those scenarios meaningful.

Gherkin is precise about individual scenarios. A complete SDD ticket is precise about the whole feature, including the edges, the constraints, and what is explicitly not in scope.

What a development-ready SDD ticket looks like

The best way to understand the format is through a side-by-side comparison. Both cards below describe the same feature: password reset via email token.

User Story

Password Reset

"As a registered user, I want to reset my password so that I can regain access to my account if I forget it."

Communicates intent clearly. Opens a useful conversation.

Leaves unspecified: what happens when the email is not registered, how long the link is valid, whether the link is single-use, whether other sessions are invalidated, how many attempts are allowed, what the password rules are.

SDD Ticket

Password Reset via Email Token

Context: Users have no self-service recovery path. Support handles ~30 password tickets per week. System must not reveal whether an email is registered (regulatory).

Acceptance Criteria

User can request reset using registered email.

Response identical whether email is found or not.

Single-use link sent within 60 seconds.

Link expires after 30 minutes.

Using the link immediately invalidates it.

Successful reset invalidates all other active sessions.

Max 3 requests per email per hour.

New password minimum 12 characters.

Error States: Expired link, show error + option to request new. Used link, show confirmation. Rate limit, show wait time. Invalid token, redirect to login silently.

Constraints: Token hashes only in database. Email via existing notification service. Response under 2 seconds.

Out of Scope: SMS recovery, social login recovery, admin-initiated reset.
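To show how directly the criteria above constrain the code, here is an illustrative sketch, not production code, of three of them: hashed token storage, single-use invalidation, and the per-email rate limit. The class and its storage are hypothetical in-memory stand-ins for the real database.

```python
import hashlib
import secrets
import time
from collections import defaultdict

TOKEN_TTL_SECONDS = 30 * 60    # "Link expires after 30 minutes"
MAX_REQUESTS_PER_HOUR = 3      # "Max 3 requests per email per hour"

class ResetTokens:
    def __init__(self):
        self._tokens = {}                  # token hash -> (email, expiry)
        self._requests = defaultdict(list) # email -> request timestamps

    def request(self, email, now=None):
        now = time.time() if now is None else now
        recent = [t for t in self._requests[email] if now - t < 3600]
        if len(recent) >= MAX_REQUESTS_PER_HOUR:
            raise RuntimeError("rate limit reached")
        self._requests[email] = recent + [now]
        token = secrets.token_urlsafe(32)
        # "Token hashes only in database": store the hash, return the raw token.
        digest = hashlib.sha256(token.encode()).hexdigest()
        self._tokens[digest] = (email, now + TOKEN_TTL_SECONDS)
        return token

    def redeem(self, token, now=None):
        now = time.time() if now is None else now
        digest = hashlib.sha256(token.encode()).hexdigest()
        # pop(): "Using the link immediately invalidates it."
        entry = self._tokens.pop(digest, None)
        if entry is None:
            return None
        email, expiry = entry
        return email if now <= expiry else None
```

Each line of the sketch traces back to a named criterion. That traceability, not the code itself, is the point: a complete ticket makes the implementation a consequence rather than a guess.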

The user story can be written in thirty seconds. The SDD ticket requires thought. But the thought it requires is thought the team would have to do anyway. The only question is whether it happens before implementation or after, in a review cycle or in a production incident. The SDD ticket front-loads clarity. The user story defers it.

The thought a good SDD ticket requires is thought the team would have to do anyway. The only question is whether it happens before implementation, or during a production incident.

The anatomy of an SDD ticket: what goes in each part

The password reset example above illustrates the format. Here is a closer look at what each part actually asks of the writer, why it exists, and the most common mistakes teams make when filling it in.

Part 1

Title

A specific verb-and-noun phrase describing the deliverable, not the user intent.

Write: 'Filter customer list by status, date range, and name.' Not: 'Search functionality' or 'As a user I want to search.'

A good title tells any team member exactly what is being built before they open the ticket.

Part 2

Context

One to three sentences: what problem exists, who has it, and what it costs the business. The PO writes this.

Write: 'Support handles 40 filter-related calls per week. Current list has no filtering. SLA requires lookup under 2 minutes.' Not: 'Users need better search.'

Good context lets developers make sensible judgment calls at the edges of the spec.

Part 3

Acceptance Criteria

A numbered list of every outcome the feature must produce. One criterion per line. Write testable outcomes only.

Write: 'Filtering by status returns only records with that status.' Not: 'The filter should work correctly.'

Include the happy path, all meaningful variations, and every edge case you can foresee. Vague criteria become bugs.

Part 4

Error States

Every way the feature can fail, and what the system does in each case. The most frequently omitted section, and the most common source of production incidents.

For each error: what triggers it, what the user sees, and whether state is persisted or rolled back.

If an error is silent, say so explicitly. Do not leave it to developer judgment.

Part 5

Constraints

The non-functional requirements that bound the implementation: performance limits, security rules, architectural boundaries, and data-handling policies.

Write: 'Response under 500ms for 10,000 records. No raw SQL. Use existing auth middleware.' Not: 'Should be fast and secure.'

Constraints tell the developer what they cannot do, which is often more useful than what they can.

Part 6

Out of Scope

An explicit list of what this ticket does not cover. Not a wishlist of future features. A boundary declaration.

Write it for the cases adjacent enough that a developer might reasonably include them. If they build something listed here, that is a conversation to have.

This section also protects the PO: an Out of Scope item cannot later be claimed as an implicit requirement.

These six parts are not a rigid template. Some tickets need many error states; some need almost none. Constraints may be obvious from context. What matters is not the format but the intent: every question a developer or AI would otherwise have to answer silently should be answered explicitly in the ticket before work begins.

Example: filtering a data table

Table filters appear in almost every backlog. The user story typically reads: 'As a user, I want to filter the orders list so I can find what I need.' Here is what a development-ready ticket looks like for the same feature.

SDD Ticket

Filter Order List by Status, Date, and Customer Name

Context: Support agents spend 6+ minutes per call locating orders. The current list shows all orders with no filtering. The team handles 80 order-related calls per day. Goal: bring average lookup time under 45 seconds.

Acceptance Criteria

Filter panel shows three controls: Status (multi-select dropdown, values from OrderStatus enum), Date Range (from/to date pickers), Customer Name (text input).

Filters apply on submit, not on keystroke.

Active filters shown as dismissible chips above the results.

Removing a chip clears that filter and re-runs the query.

Filter state persists across browser refresh via URL query string.

Empty results: show 'No orders match your filters' with a 'Clear all filters' link.

No filters active: show all orders, sorted by created date descending.

Error States: End date before start date: inline validation error, submit disabled. Name input over 100 characters: silently truncate. Query timeout over 3 seconds: show 'Results are taking longer than expected' with a retry button.

Constraints: Filtering applied server-side only. No client-side filtering of already-fetched data. Status values from existing OrderStatus enum (no hardcoded strings). Response under 500ms for up to 10,000 records. Filter logic in service layer, not controller.

Out of Scope: Saving named filter presets. Bulk actions on filtered results. Exporting filtered list to CSV.
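The constraints section above shapes the service-layer signature directly. Here is a hedged in-memory sketch of that filter logic, assuming hypothetical Order and OrderStatus types; a real implementation would push these predicates into a server-side query rather than filtering Python lists.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum

class OrderStatus(Enum):   # "values from OrderStatus enum (no hardcoded strings)"
    OPEN = "open"
    SHIPPED = "shipped"
    CANCELLED = "cancelled"

@dataclass
class Order:
    number: int
    status: OrderStatus
    created: date
    customer: str

def filter_orders(orders, statuses=None, date_from=None, date_to=None, name=None):
    # "End date before start date: inline validation error"
    if date_from and date_to and date_to < date_from:
        raise ValueError("end date before start date")
    # "Name input over 100 characters: silently truncate"
    name = (name or "")[:100]
    result = [
        o for o in orders
        if (not statuses or o.status in statuses)
        and (not date_from or o.created >= date_from)
        and (not date_to or o.created <= date_to)
        and (not name or name.lower() in o.customer.lower())
    ]
    # "No filters active: show all orders, sorted by created date descending"
    return sorted(result, key=lambda o: o.created, reverse=True)
```

Notice that every branch in the sketch answers a line in the ticket. With the user story alone, each of those branches would have been a silent assumption.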

Example: file upload

File upload hides a surprising number of decisions behind a short user story. 'As a user, I want to upload a profile photo' typically masks at least fifteen open questions about accepted formats, size limits, processing, and failure handling. Here is the spec that closes them.

SDD Ticket

Upload and Replace Profile Photo

Context: Users cannot personalise their profile. Internal research shows profiles with photos receive 3x more connection requests. Upload accessed from profile settings page.

Acceptance Criteria

Accepted formats: JPG, PNG, WebP.

Maximum file size: 5 MB.

Image resized and cropped to 400x400 pixels server-side.

Thumbnail (80x80) generated and stored alongside full size.

New photo replaces previous immediately on profile page.

Upload button shows percentage progress during transfer.

On success: toast 'Photo updated' visible for 3 seconds.

Previous photo deleted from storage within 24 hours via scheduled cleanup job.

Error States: File over 5 MB: 'File too large. Maximum size is 5 MB.' Unsupported type: 'Only JPG, PNG, and WebP files are accepted.' Upload interrupted: 'Upload failed. Please try again.' Server-side processing failure: retain existing photo, show generic error, log failure with file metadata but not file content.

Constraints: Storage via existing S3-compatible service. Keys are UUID-based (no original filenames in storage). Image processing via existing Sharp pipeline. EXIF data stripped on ingest. No PII in storage keys or logs.

Out of Scope: Video avatars. Photo galleries or multiple images per profile. Client-side cropping UI. Content moderation queue.
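The format and size criteria above can be sketched as a validation step. This is an illustrative fragment under stated assumptions: the error strings mirror the ticket, the magic-byte signatures are the standard ones for these formats, and everything around the function (storage, resizing, the Sharp pipeline) is omitted.

```python
MAX_BYTES = 5 * 1024 * 1024   # "Maximum file size: 5 MB"

# Minimal magic-byte signatures for the three accepted formats.
SIGNATURES = {
    b"\xff\xd8\xff": "jpg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"RIFF": "webp",          # full WebP check also needs "WEBP" at offset 8
}

def validate_upload(data: bytes) -> str:
    # Error strings come straight from the ticket's Error States section.
    if len(data) > MAX_BYTES:
        raise ValueError("File too large. Maximum size is 5 MB.")
    for magic, fmt in SIGNATURES.items():
        if data.startswith(magic):
            if fmt == "webp" and data[8:12] != b"WEBP":
                continue
            return fmt
    raise ValueError("Only JPG, PNG, and WebP files are accepted.")
```

Checking bytes rather than trusting the file extension is a deliberate choice the ticket enables: because the accepted formats are enumerated, the implementer can validate content instead of names.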

Example: automated email notification

Notification triggers look simple until you consider idempotency, opt-out handling, and failure recovery. A story that reads 'As a user, I want to be notified when my order ships' opens questions about timing, deduplication, and what happens when the email service is unavailable. Here is the spec.

SDD Ticket

Send Order Shipped Email Notification

Context: Customers receive no automated communication when an order ships. Support receives 50+ calls per week asking for shipping status. Goal: reduce these contacts by 60% within 30 days of launch.

Acceptance Criteria

Email triggered when order status changes to Shipped.

Email contains: order number, item summary (name and quantity per line), tracking number, carrier name, and estimated delivery date.

Email sent within 5 minutes of the status change.

User opted out of transactional email: no email sent, opt-out event logged.

Idempotent: if order status reverts and then returns to Shipped, only one email is sent per fulfilled shipment.

Error States: Tracking number missing at status change time: retry every 15 minutes for up to 2 hours. If still missing after 2 hours: send email without tracking number and flag the order in the admin panel. Email service unavailable: retry 3 times with exponential backoff, then log failure and page on-call.

Constraints: Triggered via existing order events queue (not polling). Template rendered via existing template engine (no hardcoded HTML in business logic). All sends logged with order ID, timestamp, and hashed recipient email. Use existing notification service.

Out of Scope: SMS notifications. In-app notifications. Order delivered confirmation. Notification preference management UI (existing opt-out flag used as-is).
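The idempotency criterion above is the kind of requirement that silently disappears from a user story. A minimal sketch of what it asks for, with a hypothetical event handler standing in for the real queue consumer and a set standing in for the sent-log table:

```python
class ShippedNotifier:
    def __init__(self, send_email):
        self._send = send_email
        self._sent = set()   # (order_id, shipment_id) pairs already notified

    def on_status_change(self, order_id, shipment_id, new_status):
        # Triggered by the order events queue; returns True if an email went out.
        if new_status != "shipped":
            return False
        key = (order_id, shipment_id)
        # "Only one email is sent per fulfilled shipment": dedupe on the
        # shipment, so a status that flaps back to Shipped sends nothing new.
        if key in self._sent:
            return False
        self._sent.add(key)
        self._send(order_id)
        return True
```

In production the dedupe set would live in the database, not in memory, but the decision it encodes, keyed on shipment rather than on status transition, comes straight from the acceptance criteria.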

Example: soft-deleting a record

Whether 'delete' means permanent removal, reversible archival, or a two-stage soft-delete determines almost every implementation decision, yet the user story rarely says which. Here is a spec for the most common case: deletion that needs to be reversible.

SDD Ticket

Archive (Soft-Delete) a Project

Context: Teams accidentally delete projects and request restoration via support, averaging 8 tickets per month. True hard-delete is required in fewer than 5% of cases. Goal: make deletion reversible by default, with permanent deletion available to admins only.

Acceptance Criteria

'Archive project' option available to project owners and admins in project settings.

Archived project removed from all standard lists and search results.

Archived project accessible at its direct URL, showing an 'Archived' banner with a restore option.

All project members retain read-only access to archived content.

Owner can restore from the banner or from Account Settings, Archived Projects.

Restored project returns to lists at its original sort position.

Admin-only: permanently delete an archived project after typing the project name to confirm.

Error States: Archiving the last active project in an account: show a warning with explicit confirm step before proceeding. Restoring when the account plan limit is already reached: 'Your plan supports X active projects. Upgrade or archive another project to restore this one.' Permanent delete confirmation text mismatch: keep modal open, apply shake animation to the input.

Constraints: Soft-delete implemented as an archived_at timestamp column (not a boolean deleted flag). Archived projects excluded from all queries via existing default query scope. No cascade-delete of project data at archive time. Permanent delete runs asynchronously to avoid timeout on large projects. All archive, restore, and permanent-delete events written to audit trail with actor ID and timestamp.

Out of Scope: Bulk archive. Scheduled auto-archive after inactivity. Independent archival of sub-resources such as tasks and comments. Data export before permanent deletion.
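The first two constraints above, a timestamp column rather than a boolean flag and exclusion via a default query scope, can be sketched in a few lines. Project and the repository here are illustrative stand-ins for the real ORM model and its scoping mechanism.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class Project:
    name: str
    # Timestamp, not a boolean: records *when* archiving happened,
    # which the audit trail and cleanup logic both need.
    archived_at: Optional[datetime] = None

class ProjectRepo:
    def __init__(self, projects):
        self._projects = projects

    def all_active(self):
        # Default scope: archived projects never appear in standard lists.
        return [p for p in self._projects if p.archived_at is None]

    def archive(self, project):
        project.archived_at = datetime.now(timezone.utc)

    def restore(self, project):
        project.archived_at = None
```

Restore is just clearing the timestamp, which is exactly why the ticket forbids cascade-deleting project data at archive time: the data must still be there when the timestamp comes off.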

What the non-technical PO writes, and what they do not

A common objection to SDD ticket writing is that it places an unfair burden on the product owner. If the PO is non-technical, how can they be expected to specify error handling, token expiry windows, and security constraints? The short answer is: they do not have to. The SDD ticket has a clear division of labour, and most of what a non-technical PO does well maps directly onto the parts of the ticket that matter most.

What a non-technical PO writes is the problem statement and user context. This is the why behind the feature: the frequency of the problem, the population it affects, the cost of not solving it. No one in the room knows this better than the PO. For the password reset example, the PO is the person who knows that support handles 30 password tickets a week, that there is a regulatory requirement around email enumeration, and that mobile users are the primary affected segment. This context shapes every implementation decision that follows.

The PO also writes the business rules they already know. In the password reset case, the PO might know that legal has said the system must not reveal registered email addresses, that the link must not be usable more than once, and that the support team's SLA requires the email to arrive within a minute. These are business rules, not technical specifications. They belong in the ticket. The PO owns them.

What the PO does not write is the translation of those business rules into technical constraints. The requirement 'links must not be reusable' translates to 'using the reset link immediately invalidates it.' The requirement 'the link must not remain valid indefinitely' translates to 'expires after 30 minutes.' The requirement 'do not reveal registered emails' translates to 'identical response regardless of whether the email is found.' A developer or a technical lead does that translation. The PO reviews it to confirm it is faithful.

The collaborative model that works well is this: the PO writes the problem statement, user context, and business rules on the ticket in plain language. In a focused session of no more than 30 minutes, a developer or tech lead reads those notes and writes the acceptance criteria, error states, and constraints. The PO then reviews the resulting spec and confirms that it captures their intent. This is not more work than a traditional refinement session. It is the same work, done more explicitly, producing an artefact that outlasts the conversation.

This model also has an important quality-checking property. When the PO reads the translated spec, they frequently catch something they had not thought through. The acceptance criterion 'identical response regardless of whether the email is found' might prompt them to confirm the regulatory interpretation with legal. The error state 'rate limit reached, show how many minutes remain' might prompt them to question whether that information should be visible at all. The spec reveals the implications of the requirement in a way the original business rule does not. That is not a failure of the process. It is the process working.

The PO writes the problem, the user context, and the business rules. The team translates them into precise acceptance criteria. The PO reviews the translation. That is the complete division of labour.

Three exercises to get your team started

The shift from user stories to SDD tickets is not primarily a writing skill. It is a thinking skill: the ability to make implicit assumptions explicit before implementation. The following three exercises are designed to build that skill in a team context, using work you are already doing. Each can be run in a single session with no preparation beyond choosing a real ticket from your backlog.

Exercise 1

The Ambiguity Hunt

Time: 15 minutes

Take a user story scheduled for the next sprint. Set a timer for five minutes. As a group, write down every question a developer would need answered to implement it fully, including questions about empty states, error conditions, edge cases, concurrent actions, and what happens when a dependency is slow or unavailable.

Most teams find 15 to 25 unanswered questions in a typical story. When the timer ends, spend the remaining ten minutes answering all of them as a group. That list of answers is your first SDD ticket draft.

Goal: make visible how much implicit knowledge lives in a story and has never been written down.

Exercise 2

The AI Prompt Test

Time: 20 minutes

Take a story from your backlog that has been waiting more than two sprints. Write your best SDD ticket for it, taking no more than 10 minutes. Then open an AI assistant and paste your ticket with this prompt: 'Implement this specification. Before you begin, ask me about anything that is unclear.'

Count the clarifying questions the AI asks. Each one is a gap in your spec. Work as a team to answer them all, then rewrite the ticket with those gaps closed.

Goal: use AI as an ambiguity detector. If the AI cannot implement without asking, neither can a developer.

Exercise 3

The Swap Test

Time: 30 minutes

Split into pairs. Each person takes one upcoming story and writes an SDD ticket, working alone for 20 minutes. Then swap tickets. Without discussing the feature, the reviewer asks one question: could I implement this without asking anything? They note every gap and unstated assumption they find.

Swap back. Each writer revises their ticket to close the gaps. Swap again. The ticket is done when the reviewer cannot find a single remaining assumption.

Goal: build a shared standard for completeness through repetition. Run the exercise once per sprint for four sprints; by sprint four, the quality of tickets in refinement will be noticeably different.

A note on expectations: the first SDD ticket your team writes will not be a good one, and neither will the second. Writing precise specifications is a skill, and it improves with deliberate practice. What changes quickly is the team's ability to see ambiguity, to notice when a requirement leaves open a question the implementation will have to answer. That awareness changes how stories are written, how they are discussed in refinement, and how developers interact with AI tools when they pick the ticket up.

The teams that make this transition well are not the ones that adopted a rigid template and enforced it from day one. They are the ones that started with these exercises, built a shared vocabulary for what a complete specification looks like, and let their own format emerge from that vocabulary. The goal is not a better template. It is a team that thinks differently about what it means to be ready to build.

Completeness is a thinking skill before it is a writing skill. These exercises train the ability to see ambiguity before implementation reveals it at the worst possible moment.