pb-spec — Plan-Build Spec

pb-spec is a set of workflow skill packages for AI coding assistants that comply with the Agent Skills specification. It provides a structured plan → build process that turns natural-language requirements into well-architected, BDD-driven, tested code.

Design Philosophy

pb-spec implements the Plan-Build pattern: a Planner Agent generates design specs, and a Builder Agent executes code with verification. The core innovation is making .feature files the absolute source of truth — all design and tasks derive FROM scenarios, not the other way around.

Core Principles

| Principle | Description |
| --- | --- |
| BDD-First | Feature scenarios are the source of truth. Design and tasks derive FROM scenarios. |
| RFC 2119 Constraints | Architectural constraints use MUST/SHOULD/MAY keywords and are binding for the Builder. |
| DAG Execution | Tasks include DependsOn metadata for parallel execution of independent tasks. |
| Adaptive Steering | Tasks with Complexity=High route to reasoning models; Low to fast models. |
| Escalation Protocol | Repeated failures auto-escalate to stronger models for root-cause analysis. |
| Generator/Evaluator Isolation | Generator builds; Evaluator judges with fresh context and never inherits Generator state. |

Design Standards

design.md artifacts conform to industry-standard specification formats:

| Standard | Purpose | pb-spec Application |
| --- | --- | --- |
| EARS Notation | Eliminate ambiguous requirements with 5 sentence patterns | Every requirement uses EARS syntax with `[REQ-XX]` IDs |
| C4 Model + Mermaid | Architecture topology in parseable text | Architecture sections use fenced `mermaid` blocks |
| DBML / Prisma Schema | Structured data models with strict types | Data model sections use DBML or Prisma Schema DSL |
| MADR (ADR Records) | Architecture decision records | Every AD has `[Context]`, `[Decision]`, `[Consequences]` |
| RFC 2119 Constraints | Binding behavioral constraints for agents | `§Architectural Constraints` with MUST/SHOULD/MAY |
| Behavior Traceability Matrix | Every component maps to a Feature scenario | No scenario = remove from design |

Best-Practice Alignment

| Source | Core Idea | How pb-spec Applies It |
| --- | --- | --- |
| RPI Strategy | Separate research, planning, and implementation | `/pb-init` + `/pb-plan` precede `/pb-build` |
| Plan-and-Solve Prompting | Plan first to reduce missing-step errors | `design.md` + `tasks.md` are mandatory artifacts |
| ReAct | Interleave reasoning and actions with environment feedback | `/pb-build` executes task-by-task with test/tool feedback loops |
| Reflexion | Learn from failure signals via iterative retries | Escalation protocol + DCR flow in `pb-build` |
| Harness Engineering (OpenAI, 2026-02-11) | Treat runtime signals and checklists as first-class harness inputs | `pb-plan` requires runtime verification hooks; `pb-build` validates logs/health evidence |
| openai/symphony | Long-running agents need explicit observability and deterministic escalation | `pb-build` enforces bounded retries and emits standardized DCR packets |
| Effective Harnesses for Long-Running Agents | Grounding, context hygiene, recovery, observability | State checks, minimal context handoff, task-local rollback |
| Building Effective Agents | Prefer simple composable workflows over framework complexity | Small adapter-based CLI + explicit workflow prompts |
| Harness Design for Long-Running Application Development | Generator/Evaluator separation; adversarial evaluation | `pb-build` dual-persona with adaptive evaluation by task complexity |
| shadcn/improve | Audit codebase, write self-contained plans for cheaper executors | `/pb-improve` surveys codebase, produces prioritized specs |
| Agent-SOP | RFC 2119 constraints, DAG tasks, adaptive model routing | RFC 2119 in `design.md`, DAG metadata in `tasks.md`, Escalation protocol |
| Superpowers | Composable skills, evidence-based claims, systematic debugging | `using-pb-spec` bootstrap; supporting skills ecosystem |

Installation

No manual file configuration needed. As long as your AI assistant supports the standard Agent Skills specification (Claude Code, Cursor, GitHub Copilot, OpenCode, etc.), you can install with one command.

Run in your project root:

# Install all pb-spec workflow skills at once
npx skills add longcipher/pb-spec

# Or install only the ones you need
npx skills add longcipher/pb-spec --skill pb-init
npx skills add longcipher/pb-spec --skill pb-plan
npx skills add longcipher/pb-spec --skill pb-build
npx skills add longcipher/pb-spec --skill pb-improve

(After installation, skills will be placed in .agents/skills/ or the compatible local directory for your environment, and automatically indexed by your AI.)

Quick Start

Invoke these skills directly in your IDE / Agent terminal:

/pb-init: Audit the current project and non-destructively create or update the AGENTS.md architecture snapshot.
/pb-plan "requirement description": Generate design.md, tasks.md, and .feature test specs with Architecture Decisions and RFC 2119 constraints. For example: /pb-plan Add WebSocket auth.
/pb-build "feature-name": Read tasks.md and start the outer BDD and inner TDD dual loop with subagents, closing and verifying tasks one by one.
/pb-refine "feature-name": (Optional) When a build is blocked (Build Blocked) or the design has architectural flaws, iteratively revise the specs based on feedback.
/pb-improve: Audit the codebase and generate prioritized implementation specs for other agents to execute.

Skills Overview

Workflow Skills

| Skill | Trigger | Output | Description |
| --- | --- | --- | --- |
| `pb-init` | `/pb-init` | `AGENTS.md` | Audit repo and safely update/append a managed snapshot block without rewriting user-authored constraints |
| `pb-plan` | `/pb-plan <requirement>` | `specs/<spec-dir>/design.md` + `tasks.md` + `features/*.feature` | Design proposal + Gherkin scenarios + ordered task breakdown with RFC 2119 constraints and DAG metadata |
| `pb-refine` | `/pb-refine <feature>` | Revised spec files | Apply feedback or Design Change Requests |
| `pb-build` | `/pb-build <feature-name>` | Code + tests | BDD+TDD via Generator (builds) + Evaluator (adversarial review) with Escalation protocol |
| `pb-improve` | `/pb-improve` | `specs/<spec-dir>/` + `specs/context.md` | Codebase audit → prioritized findings → pb-plan-compatible specs |

Supporting Skills

| Skill | Trigger | Description |
| --- | --- | --- |
| `using-pb-spec` | Session start | Bootstrap: establishes skill invocation discipline |
| `pb-brainstorming` | Before creative work | Explores intent, requirements, and design before implementation |
| `pb-test-driven-development` | Before writing code | Red → Green → Refactor cycle, non-negotiable TDD |
| `pb-systematic-debugging` | Any bug or failure | Find root cause before attempting fixes |
| `pb-verification-before-completion` | Before claiming done | Evidence before assertions, always |
| `pb-requesting-code-review` | Before merge | Dispatch independent reviewer subagent |
| `pb-receiving-code-review` | Processing feedback | Technical evaluation, not performative agreement |
| `pb-dispatching-parallel-agents` | Multiple independent tasks | One agent per problem domain, concurrent execution |
| `pb-subagent-driven-development` | Executing plans | Fresh context per task, two-stage review |
| `pb-finishing-a-development-branch` | Work complete | Verify tests → Present options → Execute choice |
| `pb-writing-skills` | Creating/editing skills | Skills are code, not prose; test them |

Supported AI Tools

This skill library uses the standard SKILL.md specification and is compatible with:

Cursor
Claude Code
GitHub Copilot / GitHub Spark
OpenCode
Gemini CLI & Codex
Any tool that supports agentskills.io specification or reads .agents/skills/.

Workflow

Five agent skills that chain together:

/pb-init → /pb-plan → [/pb-refine] → /pb-build
                    ↘
                    /pb-improve → specs/ → /pb-build

Supporting skills activate automatically during the workflow:

pb-brainstorming — before /pb-plan when requirements are unclear
pb-test-driven-development — during /pb-build for every task
pb-systematic-debugging — when tasks fail repeatedly
pb-verification-before-completion — before marking any task DONE
pb-requesting-code-review / pb-receiving-code-review — review cycles
pb-dispatching-parallel-agents — parallel audit in /pb-improve
pb-finishing-a-development-branch — finalization after /pb-build

1. `/pb-init` — AGENTS.md Snapshot & Safe Merge

Audits your project and writes a pb-init snapshot into AGENTS.md, delimited by managed markers.

Merge behavior is non-destructive:

If markers exist, only that managed block is replaced.
If markers do not exist, the managed block is appended.
All existing content outside the managed block is preserved verbatim.
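
The merge rules above can be sketched as a small function. This is a minimal illustration, not pb-init's actual implementation: the marker strings `BEGIN` and `END` and the function name `merge_snapshot` are hypothetical, since the real markers are not shown here.

```python
# Hypothetical marker strings; the real pb-init markers are not shown in this README.
BEGIN = "<!-- BEGIN PB-INIT MANAGED BLOCK -->"
END = "<!-- END PB-INIT MANAGED BLOCK -->"


def merge_snapshot(existing: str, snapshot: str) -> str:
    """Replace the managed block if its markers exist, otherwise append it."""
    block = f"{BEGIN}\n{snapshot.strip()}\n{END}"
    start = existing.find(BEGIN)
    end = existing.find(END, start + len(BEGIN)) if start != -1 else -1
    if start != -1 and end != -1:
        # Markers exist: swap only the managed block, keep everything else verbatim.
        return existing[:start] + block + existing[end + len(END):]
    if not existing:
        return block + "\n"
    # Markers missing: append the block after the untouched existing content.
    separator = "" if existing.endswith("\n") else "\n"
    return existing + separator + "\n" + block + "\n"


existing = "# Rules\nNever push to main.\n"
merged = merge_snapshot(existing, "stack: python")
merged_again = merge_snapshot(merged, "stack: rust")
```

Running the merge twice leaves a single managed block, and the user-authored text before it is unchanged in both results.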

The managed snapshot includes an Architecture Decision Snapshot so later agents inherit repo-level conventions instead of re-deciding them every run.

2. `/pb-plan <requirement>` — Design & Task Planning

Takes source material in arbitrary format and produces a complete feature spec:

specs/<YYYY-MM-DD-NO-feature-name>/
├── design.md    # Architecture, API contracts, data models, RFC 2119 constraints
├── tasks.md     # Ordered implementation tasks with DAG metadata
└── features/    # Gherkin acceptance artifacts (Source of Truth)

Key capabilities:

BDD-First: Feature scenarios are written FIRST; design and tasks derive FROM scenarios
EARS Requirements: All acceptance criteria use 5 sentence patterns for machine-checkable verification
RFC 2119 Constraints: §Architectural Constraints section with MUST/SHOULD/MAY keywords — binding for Builder
Behavior Traceability Matrix: Every design component maps to a Feature scenario (no scenario = remove from design)
C4 + Mermaid Architecture: Architecture diagrams in parseable Mermaid syntax
MADR Decisions: Architecture decisions with Context/Decision/Consequences
DBML Data Models: Structured data models in DSL (natural language forbidden)
API-First Contracts: Type signatures before implementation (OpenAPI, Protocol, Trait)
DAG-Enabled Tasks: TaskID, DependsOn, Complexity, Required Skills, EvalRule metadata
Risk-based testing: Property tests by default for broad input domains; fuzzing and benchmarks conditional
Template identity alignment: Renames generic scaffold names to project-matching identifiers
Source requirement normalization: Converts arbitrary-format input into a structured requirement ledger
Self-reconciliation: Verifies all requirements are covered across design, tasks, and scenarios before finalizing

3. `/pb-refine <feature-name>` — Design Iteration (Optional)

Reads user feedback or Design Change Requests and updates design.md and tasks.md. Maintains a revision history and cascades changes without overwriting completed work.

Validates 🛑 Build Blocked and 🔄 Design Change Request packets for required sections before modifying spec artifacts.

4. `/pb-build <feature-name>` — Subagent-Driven Implementation

Implements tasks sequentially using a Generator/Evaluator dual-persona workflow with Escalation protocol:

Generator (subagent) → READY_FOR_EVAL → Evaluator (independent context) → PASS / FAIL
                                                ├── Diff Audit (git diff + scope + architecture)
                                                ├── BDD Evidence Verification (independent re-run)
                                                ├── MCP Live Verification (Playwright / HTTP / CLI)
                                                └── Edge Case Probing (boundaries, errors, security)

On PASS  → Orchestrator marks task DONE in tasks.md
On FAIL  → Evaluator feedback → fresh Generator subagent → retry loop
On 2nd FAIL → Escalation: auto-upgrade to stronger model for root-cause analysis
On 3rd FAIL → DCR packet to /pb-refine

Key principles:

BDD-First: Feature scenarios are the source of truth; all business code must satisfy scenarios
RFC 2119 Constraints: All constraints from design.md §Architectural Constraints are BINDING
TDD is non-negotiable: Every task starts with a failing test (Red → Green → Refactor)
Fresh context per subagent: No inherited assumptions; Evaluator never inherits Generator context
Architecture decisions are binding: Executes the approved design; does not invent a different architecture
Escalation over thrashing: 2nd failure auto-escalates to stronger model; 3rd failure → DCR packet
Mode Behavior: Interactive mode (default) or Auto mode (--auto flag)

5. `/pb-improve` — Codebase Audit & Plan Generation

Audits any codebase and writes pb-plan-compatible specs for other agents to execute. The skill never modifies source code — only produces specs under specs/.

/pb-improve                        full audit → prioritized findings → specs
/pb-improve quick                  cheap pass: hotspots, top findings only
/pb-improve deep                   exhaustive: every package, every category
/pb-improve security               focused audit (also: perf, tests, bugs, ...)
/pb-improve branch                 audit only what the current branch changes
/pb-improve next                   feature suggestions — where to take the project
/pb-improve plan <description>     skip the audit, spec one thing
/pb-improve review-spec <feature>  critique and tighten an existing spec
/pb-improve reconcile              refresh the backlog: verify, unblock, retire
/pb-improve ... --issues           also publish specs as GitHub issues

How it works:

Recon — Maps the repo: stack, conventions, build/test/lint commands (verification gates). Generates specs/context.md with project context.
Audit — Fans out parallel subagents across 9 categories: correctness, security, performance, test coverage, tech debt, dependencies, DX, docs, direction.
Vet — Re-reads every cited location to drop false positives and correct mis-attributions.
Prioritize — Findings ordered by leverage (impact ÷ effort, weighted by confidence).
Spec — One spec per selected finding in specs/<spec-dir>/ with design.md (RFC 2119 constraints), tasks.md (DAG metadata), and features/*.feature (Source of Truth).
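
As a rough illustration of the Prioritize step, the sketch below ranks findings by leverage. The exact scoring pb-improve uses is not specified here, so the formula `(impact / effort) * confidence`, the `Finding` class, its numeric scales, and the sample findings are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    title: str
    impact: float      # hypothetical 1-5 scale
    effort: float      # hypothetical 1-5 scale, must be greater than 0
    confidence: float  # 0.0-1.0


def leverage(finding: Finding) -> float:
    # Assumed formula: impact divided by effort, scaled by confidence.
    return (finding.impact / finding.effort) * finding.confidence


def prioritize(findings: list[Finding]) -> list[Finding]:
    # Highest leverage first.
    return sorted(findings, key=leverage, reverse=True)


findings = [
    Finding("Unbounded retry loop", impact=4, effort=2, confidence=0.9),
    Finding("Missing index on lookups", impact=5, effort=1, confidence=0.5),
    Finding("Outdated README section", impact=1, effort=1, confidence=1.0),
]
ranked = [f.title for f in prioritize(findings)]
```

Under this scoring, a high-impact finding with low confidence can still outrank a well-confirmed one when its effort is small enough.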

BDD-First Integration

The core innovation: .feature files are the absolute source of truth. All design and tasks derive FROM scenarios.

Feature-Driven Workflow

features/*.feature (Source of Truth)
        ↓
design.md (derives FROM features)
        ↓
tasks.md (driven BY scenarios)
        ↓
/pb-build (executes with RED→GREEN→REFACTOR)

Behavior Traceability Matrix

Every design component MUST map to a Feature scenario:

| Domain Module | Core Component | Driven by Feature | BDD Tags |
| --- | --- | --- | --- |
| Auth | `JwtAuthGuard` | `features/auth/login.feature` | `@auth`, `@security` |
| Payment | `StripeWebhookHandler` | `features/billing/checkout.feature` | `@billing`, `@webhook` |

Rule: If a design component cannot be mapped to a scenario, remove it from the design.
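
A check for this rule might look like the sketch below. It assumes the matrix has already been loaded into a list of dictionaries; the `AuditLogger` row is a hypothetical unmapped component added for illustration, and `unmapped_components` is not a pb-spec function.

```python
# Rows mirror the matrix above; "AuditLogger" is a hypothetical unmapped component.
matrix = [
    {"component": "JwtAuthGuard", "feature": "features/auth/login.feature"},
    {"component": "StripeWebhookHandler", "feature": "features/billing/checkout.feature"},
    {"component": "AuditLogger", "feature": None},
]


def unmapped_components(rows):
    """Return components that no Feature scenario drives."""
    return [row["component"] for row in rows if not row["feature"]]


orphans = unmapped_components(matrix)
```

Any name in `orphans` is a candidate for removal from the design.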

RFC 2119 Constraints

Design constraints use RFC 2119 keywords — binding for the Builder:

## Architectural Constraints (RFC 2119)

- **[C-01]** Database interactions **MUST** use the existing ORM layer; raw SQL **MUST NOT** be introduced.
- **[C-02]** API responses **SHOULD** maintain <200ms p99 latency.
- **[C-03]** If an unhandled edge case is encountered, the Builder **MUST** halt and file a DCR.
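
Because the constraints follow a regular shape, a tool could extract them mechanically. The sketch below is one possible approach, assuming each constraint sits on one line with a bold `[C-NN]` ID and bold keywords; the `parse_constraints` helper is hypothetical, and the sample lines are adapted from the example above.

```python
import re

# Patterns assume bold Markdown IDs like **[C-01]** and bold RFC 2119 keywords.
CONSTRAINT_ID = re.compile(r"\*\*\[(C-\d+)\]\*\*")
KEYWORD = re.compile(r"\*\*(MUST NOT|SHOULD NOT|MUST|SHOULD|MAY)\*\*")


def parse_constraints(markdown: str) -> dict[str, list[str]]:
    """Map each constraint ID to the RFC 2119 keywords used on its line."""
    constraints = {}
    for line in markdown.splitlines():
        match = CONSTRAINT_ID.search(line)
        if match:
            constraints[match.group(1)] = KEYWORD.findall(line)
    return constraints


sample = """\
- **[C-01]** Database interactions **MUST** use the existing ORM layer; raw SQL **MUST NOT** be introduced.
- **[C-02]** API responses **SHOULD** stay under 200ms p99 latency.
- **[C-03]** If an unhandled edge case is encountered, the Builder **MUST** halt and file a DCR.
"""
parsed = parse_constraints(sample)
```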

DAG-Enabled Tasks

Tasks include metadata for parallel execution and adaptive model routing:

### Task 2.1: "Successful login" — User authenticates

- **TaskID:** `T1`
- **DependsOn:** `None`
- **Complexity:** `High`
- **Required Skills:** Python, JWT, SQLAlchemy
- **EvalRule:** `behave --tags=@login_success` must pass

| Field | Purpose |
| --- | --- |
| `TaskID` | Unique identifier for DAG resolution |
| `DependsOn` | Lists prerequisite TaskIDs; `None` = can run in parallel |
| `Complexity` | `Low` → fast model, `High` → reasoning model |
| `Required Skills` | Skills the Builder Agent needs |
| `EvalRule` | Explicit pass/fail criteria |
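
To show how `DependsOn` metadata can drive parallel execution, the sketch below groups tasks into batches using Python's standard-library `graphlib`. The task graph and the `parallel_batches` helper are hypothetical; this is one way a scheduler could resolve the DAG, not a description of how pb-build does it.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: TaskID -> set of prerequisite TaskIDs (DependsOn).
depends_on = {
    "T1": set(),
    "T2": set(),
    "T3": {"T1", "T2"},
    "T4": {"T3"},
}


def parallel_batches(graph):
    """Group tasks into batches whose members can run concurrently."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks with all prerequisites done
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = parallel_batches(depends_on)
```

Here `T1` and `T2` have no prerequisites, so they land in the first batch; `T3` and `T4` follow in their own batches.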

Escalation Protocol

| Failure Count | Action | Model Strategy |
| --- | --- | --- |
| 1 | Retry with same model | Same model |
| 2 | Escalate: auto-upgrade to stronger model for root-cause analysis | +1 tier (Haiku→Sonnet, Sonnet→Opus) |
| 3 | File DCR, stop build | N/A |
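
The table can be read as a small decision function. The sketch below is illustrative only: `next_action` and `MODEL_TIERS` are hypothetical names, the tier order comes from the table's example, and the behavior when the task is already on the top tier is an assumption.

```python
from typing import Optional

MODEL_TIERS = ["haiku", "sonnet", "opus"]  # tier order taken from the table's example


def next_action(failure_count: int, model: str) -> tuple[str, Optional[str]]:
    """Map a task's failure count to (action, model to use next)."""
    if failure_count <= 1:
        # First failure: retry with the same model.
        return ("retry", model)
    if failure_count == 2:
        tier = MODEL_TIERS.index(model)
        # Assumption: the top tier stays where it is, since nothing stronger exists.
        stronger = MODEL_TIERS[min(tier + 1, len(MODEL_TIERS) - 1)]
        return ("escalate", stronger)
    # Third failure: stop the build and hand off a DCR.
    return ("file_dcr", None)
```

For example, `next_action(2, "haiku")` returns `("escalate", "sonnet")`, and a third failure returns `("file_dcr", None)` regardless of the model.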

Design Philosophy: Agent Harness

pb-spec's design is inspired by Anthropic's research on Effective Harnesses for Long-Running Agents and the Superpowers methodology. Core idea: place AI agents inside a strict, observable, recoverable execution environment.

| Principle | How pb-spec Implements It |
| --- | --- |
| State Grounding | Subagents verify workspace state before writing code |
| Architecture Continuity | `pb-init` records Architecture Decisions; `pb-build` verifies conformance |
| Error Quoting | Subagents quote specific error messages before attempting fixes |
| Context Hygiene | Only minimal, relevant context passed to each subagent |
| Recovery Loop | Pre-task snapshots + file-scoped recovery |
| Verification Harness | Design docs define explicit verification commands |
| Observability as Context | Task verification includes runtime signals (logs/health) |
| Escalation Loop | 2nd failure auto-escalates; 3rd failure → DCR handoff to `pb-refine` |
| Generator/Evaluator Isolation | Generator builds; Evaluator judges with fresh context |
| Evidence Before Claims | `pb-verification-before-completion`: no success claims without fresh verification |
| RFC 2119 Constraints | Binding behavioral constraints prevent hallucination and scope creep |
| DAG Execution | Parallel task execution for independent tasks |
| Adaptive Steering | Complexity-based model routing for cost/speed optimization |
| Systematic Debugging | `pb-systematic-debugging`: root cause before fixes, scientific method |
| Skill Auto-Triggering | `using-pb-spec` bootstrap ensures skills activate at the right moments |

License

Apache-2.0
