Introduce eight canonical JSON schemas under internal/contract/schemas/shared/ that become the lingua franca for cross-pipeline handover: issue_ref, pr_ref, branch_ref, spec_ref, findings_report, plan_ref, workspace_ref, scope_result. The schemas are embedded via go:embed and exposed through a small registry API (Lookup, Exists, Names). The sentinel "string" type is accepted as a no-schema free-text fallback so pre-existing untyped pipelines keep working. This is the foundation layer of the pipeline I/O protocol; consumers land in follow-up commits. See docs/adr/010-pipeline-io-protocol.md.
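The registry surface described above might look roughly like the following sketch. It is an illustration under assumptions: the real registry loads its schemas via go:embed, whereas this version uses an in-memory map with placeholder schema bodies so it runs on its own, and the exact signatures of Lookup, Exists and Names are guesses rather than the package's actual API.

```go
package main

import (
	"fmt"
	"sort"
)

// StringType is the sentinel for free-text I/O; it has no backing schema.
const StringType = "string"

// schemas stands in for the embedded schema files (assumption: the real
// registry fills this from go:embed). The bodies are placeholders, not the
// real schema contents.
var schemas = map[string][]byte{
	"issue_ref":       []byte(`{"type":"object"}`),
	"pr_ref":          []byte(`{"type":"object"}`),
	"branch_ref":      []byte(`{"type":"object"}`),
	"spec_ref":        []byte(`{"type":"object"}`),
	"findings_report": []byte(`{"type":"object"}`),
	"plan_ref":        []byte(`{"type":"object"}`),
	"workspace_ref":   []byte(`{"type":"object"}`),
	"scope_result":    []byte(`{"type":"object"}`),
}

// Lookup returns the raw schema for name. The "string" sentinel has no
// schema, so it reports false here.
func Lookup(name string) ([]byte, bool) {
	raw, ok := schemas[name]
	return raw, ok
}

// Exists reports whether name is an acceptable type name: either a
// registered schema or the "string" sentinel.
func Exists(name string) bool {
	if name == StringType {
		return true
	}
	_, ok := schemas[name]
	return ok
}

// Names returns the registered schema names in sorted order.
func Names() []string {
	names := make([]string, 0, len(schemas))
	for name := range schemas {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	fmt.Println(Names())
	fmt.Println(Exists("string"), Exists("ticket_ref"))
}
```

Keeping the sentinel out of the schema map means Exists accepts "string" while Lookup has nothing to return for it, which is one way to model "no-schema free-text fallback".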
Extend the pipeline schema with optional type annotations:
- InputConfig.Type — declares the canonical I/O type of the pipeline's
input (issue_ref, pr_ref, ..., or "string").
- PipelineOutput.Type — declares the canonical type of a named output.
- Step.InputRef — typed wiring for composition steps, replacing the
ad-hoc string template form. Exactly one of
'from: <step>.<output>' or 'literal: <text>'.
All new fields are optional. Unset or empty == "string", preserving
backward compatibility with every existing pipeline. See
docs/adr/010-pipeline-io-protocol.md.
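The "exactly one of from or literal" rule and the empty-means-string default can be sketched as below. The struct layout, method name, helper name and error wording are hypothetical; only the field names From and Literal and the two rules come from the description above. Note that this sketch cannot tell an unset literal apart from an intentionally empty one.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// InputRef mirrors the described block: exactly one field may be set.
type InputRef struct {
	From    string // "<step>.<output>"
	Literal string // free text
}

// Validate enforces exclusivity and the "<step>.<output>" shape of From.
// (Hypothetical helper; the real check lives in the loader's validation.)
func (r InputRef) Validate() error {
	hasFrom, hasLiteral := r.From != "", r.Literal != ""
	switch {
	case hasFrom && hasLiteral:
		return errors.New("input_ref: from and literal are mutually exclusive")
	case !hasFrom && !hasLiteral:
		return errors.New("input_ref: one of from or literal is required")
	case hasFrom:
		step, output, ok := strings.Cut(r.From, ".")
		if !ok || step == "" || output == "" {
			return fmt.Errorf("input_ref: from %q must have the form <step>.<output>", r.From)
		}
	}
	return nil
}

// EffectiveType applies the backward-compatibility rule: unset or empty
// type annotations are treated as "string".
func EffectiveType(declared string) string {
	if declared == "" {
		return "string"
	}
	return declared
}

func main() {
	fmt.Println(InputRef{From: "scope.child_issues"}.Validate())
	fmt.Println(InputRef{From: "scope.child_issues", Literal: "x"}.Validate())
	fmt.Println(EffectiveType(""))
}
```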
Wire the shared schema registry into YAMLPipelineLoader so malformed
pipelines fail at parse time rather than at runtime. Adds:
- ValidatePipelineIOTypes — catches unknown type names on input/outputs,
malformed or conflicting input_ref blocks, and pipeline_outputs that
point at non-existent steps.
- TypedWiringCheck — cross-pipeline compatibility check (producer output
type must match consumer input type). Invoked by higher layers that
have access to a sub-pipeline loader.
YAMLPipelineLoader.Unmarshal now runs ValidatePipelineIOTypes immediately
after defaulting, so every code path that loads a pipeline (CLI, TUI,
WebUI, composition executor) gets load-time type validation for free.
Tests cover known/unknown types, string sentinel behaviour, empty-type
legacy fallback, InputRef exclusivity and the loader's rejection path.
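The producer/consumer compatibility rule can be illustrated with a small sketch. It assumes strict equality after defaulting empty types to "string"; whether the actual TypedWiringCheck allows any coercion (for example element types of array outputs, as in the iterate case) is not stated here, so the function name and behaviour below are illustrative only.

```go
package main

import "fmt"

// effectiveType treats an unset type as the "string" sentinel.
func effectiveType(t string) string {
	if t == "" {
		return "string"
	}
	return t
}

// CheckWiring reports an error when a producer's declared output type does
// not match the consumer's declared input type. Assumption: types must be
// identical after defaulting; no implicit conversions are modelled.
func CheckWiring(producerOutputType, consumerInputType string) error {
	out, in := effectiveType(producerOutputType), effectiveType(consumerInputType)
	if out != in {
		return fmt.Errorf("typed wiring mismatch: producer emits %q, consumer expects %q", out, in)
	}
	return nil
}

func main() {
	fmt.Println(CheckWiring("pr_ref", "pr_ref"))
	fmt.Println(CheckWiring("plan_ref", "pr_ref"))
	fmt.Println(CheckWiring("", "string"))
}
```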
Teach CompositionExecutor.resolveStepInput about the new Step.InputRef
block. Resolution order is now:
1. InputRef.From -> raw JSON from tmplCtx.StepOutputs[srcStep]
2. InputRef.Literal -> template-resolved string
3. Legacy SubInput -> template-resolved string
4. Parent input -> tmplCtx.Input
Legacy string-templated sub-pipeline inputs keep working unchanged, so
every existing composition pipeline continues to run.
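The four-level resolution order can be sketched as a single function. The types, the render callback (a stand-in for template resolution) and the error text are assumptions made for illustration; only the precedence and the names InputRef.From, InputRef.Literal, SubInput, StepOutputs and Input come from the description.

```go
package main

import (
	"fmt"
	"strings"
)

// Simplified stand-ins for the executor's types (hypothetical shapes).
type InputRef struct{ From, Literal string }

type Step struct {
	InputRef *InputRef
	SubInput string // legacy string-templated input
}

type TemplateContext struct {
	Input       string            // parent pipeline input
	StepOutputs map[string][]byte // raw JSON keyed by source step
}

// resolveStepInput applies the precedence: From, Literal, legacy SubInput,
// then the parent input. render stands in for template resolution.
func resolveStepInput(step Step, ctx TemplateContext, render func(string) string) (string, error) {
	if ref := step.InputRef; ref != nil {
		if ref.From != "" {
			// From has the form "<step>.<output>"; outputs are keyed by step.
			srcStep, _, _ := strings.Cut(ref.From, ".")
			raw, ok := ctx.StepOutputs[srcStep]
			if !ok {
				return "", fmt.Errorf("input_ref: no output recorded for step %q", srcStep)
			}
			return string(raw), nil // raw JSON, not template-resolved
		}
		if ref.Literal != "" {
			return render(ref.Literal), nil
		}
	}
	if step.SubInput != "" {
		return render(step.SubInput), nil
	}
	return ctx.Input, nil
}

func main() {
	ctx := TemplateContext{
		Input:       "parent input",
		StepOutputs: map[string][]byte{"scope": []byte(`{"child_issues":[]}`)},
	}
	identity := func(s string) string { return s }
	out, err := resolveStepInput(Step{InputRef: &InputRef{From: "scope.result"}}, ctx, identity)
	fmt.Println(out, err)
}
```

Because a step without an InputRef falls through to the legacy branches, existing compositions take the same path as before.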
Prove out the typed I/O protocol on four pipelines that exercise every
branch of the design:
- impl-issue — input: issue_ref, output: pr_ref.
- impl-speckit — input: string (natural-language feature description
is the right model), output: pr_ref.
- plan-scope — input: string (fetch-epic step parses it), adds a
new typed 'scope' output bound to scope_result.
- ops-epic-runner — shows scope -> iterate over child_issues -> impl-issue
wiring. Iterate binds each element as {{item}}; the
child's declared input.type (issue_ref) matches the
element schema of scope_result.child_issues.
Both the .agents/pipelines/ and internal/defaults/pipelines/ copies are
updated. Step-level prompts, artifacts, contracts, and retry policies are
unchanged. Remaining 53 pipelines keep their implicit 'string' typing —
see the ADR for the phase-2 migration TODO list.
Record the design of the typed pipeline I/O protocol: shared schema registry, typed inputs/outputs, typed composition wiring, load-time validation, and the phase-2 migration plan for the remaining pipelines.
Add explicit input.type and pipeline_outputs[*].type declarations to all audit-* pipelines per ADR-010. Inputs default to string (free-text audit scope hints); outputs mapped to findings_report. audit-dead-code-review declares input.type: pr_ref since it scans PR diffs. Load-validated via TestAllShippedPipelinesLoad. Live smoke skipped: audit-dead-code-issue and audit-doc create GitHub issues (destructive); audit-dead-code creates branches (destructive); remainder deferred to batch end.
Declare input.type and pipeline_outputs[*].type on all impl-* pipelines per ADR-010. Inputs: issue_ref for pipelines taking a GitHub issue/PR reference (impl-issue-core, impl-research, impl-review-loop, impl-smart-route); string for free-text feature/fix descriptions (impl-feature, impl-hotfix, impl-improve, impl-prototype, impl-recinq, impl-refactor, impl-speckit). Outputs: pr_ref where the artifact genuinely references a produced PR; findings_report where the artifact is an analysis/verdict/verification report (impl-hotfix.verdict, impl-improve.verification, impl-refactor.verification, impl-smart-route.assessment, impl-issue-core.assessment). Load-validated only. Live smoke skipped — all impl-* pipelines are destructive (open PRs, create branches, push code).
Declare input.type and pipeline_outputs[*].type on all plan-* pipelines per ADR-010. plan-research takes issue_ref; plan-adr, plan-approve-implement, plan-task take string. Outputs typed plan_ref except plan-scope.report (scope_report is a bespoke verification shape, left as string). plan-scope.scope already typed scope_result by prior migration. Load-validated only. Live smoke skipped — plan-scope creates issues (destructive); plan-research posts plans to issues (destructive); plan-adr writes ADR docs (could run locally but out of scope).
Declare input.type and pipeline_outputs[*].type on all ops-* pipelines per ADR-010. Inputs: pr_ref for PR-review pipelines (ops-pr-review, ops-pr-review-core, ops-pr-fix-review); issue_ref for issue-driven ops (ops-implement-epic, ops-refresh, ops-rewrite); string for the rest (ops-bootstrap, ops-debug, ops-issue-quality, ops-parallel-audit, ops-release-harden, ops-supervise, ops-hello-world, ops-epic-runner already string). Outputs: findings_report for review/audit/debug reports; pr_ref for ops-rewrite (enhancement_results retyped to findings_report since it is a verification report, not a PR); workspace_ref for ops-bootstrap; issue_ref for ops-refresh; scope_result for ops-epic-runner.scope (matches the plan-scope sub-pipeline output). Load-validated only. All ops-* pipelines either open PRs, post comments, or modify issues — destructive, live smoke skipped.
Declare input.type: string and pipeline_outputs[*].type: findings_report on doc-changelog, doc-explain, doc-fix, doc-onboard per ADR-010. All four accept free-text scope hints and produce finding-style reports about documentation drift or generated content. Load-validated only. doc-fix and doc-onboard may open PRs (destructive); doc-changelog and doc-explain are read-only but live smoke deferred to batch-end.
Declare input.type and pipeline_outputs[*].type on all wave-* pipelines per ADR-010. All wave-* inputs are string (free-text audit/smoke focus areas, PR refs as text, branch names). Outputs are findings_report for the self-evolution pipelines (wave-audit, wave-evolve, wave-review, wave-security-audit, wave-test-forge, wave-test-hardening, wave-validate, wave-smoke-test); pr_ref for wave-land (ships a PR); findings_report for wave-review, wave-scope-audit, wave-orchestrate (retyped from incorrect pr_ref default — classification/review/issues are analysis outputs, not PR references). wave-smoke-* and wave-ontology-* pipelines declare input.type: string and keep empty pipeline_outputs by design — they exercise executor paths and need free shape per ADR-010 phase 2 notes. Load-validated only. Live smoke on read-only wave-smoke variants deferred (depends on bubblewrap + mount state).
Declare input.type and pipeline_outputs[*].type on the remaining non-categorised pipelines per ADR-010:
- bench-solve: input string (problem statement), output findings_report
- test-gen: input string (target package), output findings_report
- full-impl-cycle: input issue_ref (GitHub issue), output pr_ref (composes impl-issue-core → test-gen → audit-* → wave-land → ops-pr-review-core, all now typed)
Load-validated only. full-impl-cycle opens PRs (destructive); test-gen may create PRs; bench-solve runs the SWE-bench-style loop and is safe to smoke but deferred pending BENCH_HOME state.
Add TestAllShippedPipelinesLoad as a regression guard for the typed I/O protocol. Walks .agents/pipelines and internal/defaults/pipelines, invokes YAMLPipelineLoader.Load (which runs ValidatePipelineIOTypes) on every *.yaml file. Any future pipeline with an unknown type name, broken step reference, or malformed input_ref now fails CI instead of breaking silently at runtime.
Deletes audit pipelines subsumed by core fleet (scan, architecture, security, tests, duplicates). Per .agents/output/consolidation-map.md §8.
- audit-closed, audit-consolidate, audit-correctness, audit-coverage
- audit-dead-code, audit-dead-code-issue, audit-dead-code-review
- audit-doc, audit-dual, audit-dx, audit-junk-code, audit-quality-loop
- audit-unwired, audit-ux
Core impl set remains: impl-issue, impl-issue-core, impl-speckit, impl-recinq. Deleted variants are parameterized duplicates or composition wrappers replaceable by inception stacks. Per consolidation-map.md §8.
- impl-feature, impl-hotfix, impl-improve, impl-prototype
- impl-refactor, impl-research, impl-review-loop, impl-smart-route
Core ops fleet remains: ops-pr-review-core, ops-pr-review, ops-epic-runner, ops-bootstrap, ops-issue-quality, ops-parallel-audit, ops-hello-world. Deleted wrappers and specific one-offs:
- ops-debug, ops-implement-epic, ops-pr-fix-review, ops-refresh
- ops-release-harden, ops-rewrite, ops-supervise
Core plan fleet: plan-scope, plan-research, plan-task. Deleted wrapper variants replaceable by compositions with gates.
- plan-adr (merge into plan-research)
- plan-approve-implement (wrapper; replace with plan-* + gate + impl composition)
Core doc fleet: doc-onboard, doc-explain. Deleted composition wrappers and specific one-offs.
- doc-changelog (release-specific; compose on demand)
- doc-fix (replace with audit-doc-scan + edit composition)
- bench-solve (SWE-bench specific; no active benchmarking)
- test-gen (orphaned after full-impl-cycle removal)
Core wave-self fleet: wave-audit, wave-security-audit, wave-test-hardening, wave-scope-audit, wave-validate. Deleted composition wrappers that will be rebuilt as typed-I/O inception stacks.
- full-impl-cycle (big wrapper; rebuild as composition)
- wave-bugfix (merge into impl-issue with wave-mode flag)
- wave-evolve (low-use; unclear purpose)
- wave-land (wrapper; compose impl + ops-pr-review)
- wave-orchestrate (orchestration is CLI, not pipeline)
- wave-review (duplicate of ops-pr-review)
Executor validation moves to Go unit tests; these YAML smoke fixtures
were never part of the core fleet and were not used by end users.
Keep only: wave-smoke-contracts, wave-smoke-gates as reference fixtures
for contract + gate executor behavior.
Removed wave-smoke-*: classify, git-forensics, hooks, llm-judge, mount,
skills-{claude,codex,gemini,opencode}, test, watchdog.
Removed wave-ontology-*: empty, inherit, warn.
Removed: wave-stress-test, wave-test-forge.
- Makefile: coverage target
- internal/fileutil: copy_test
- internal/humanize: humanize_test
- internal/timeouts: timeouts_test
- internal/tools: check_test
- navigator: add missing field
- reviewer: add reference criteria
- improvement-assessment schema: tighten constraint
… model
Audits are the leanest, most frequently-run pipelines. Routing the deep-dive step to 'balanced' instead of 'strongest' respects the project-wide cheapest-first model policy and keeps costs bounded when an audit fan-out runs across many packages. The step still performs verification and data-flow tracing; balanced is adequate.
The other five audit pipelines (architecture, dead-code-scan, doc-scan, duplicates, tests) already use cheapest throughout, declare typed findings_report outputs, and route through the navigator for final quality review via agent_review contract. No structural changes needed in this commit.
…t model
The four implementation pipelines (impl-issue, impl-issue-core, impl-recinq, impl-speckit) previously routed multiple steps to the 'strongest' model tier, which is both expensive and against the project-wide cheapest-first policy. Swap every 'strongest' to 'balanced' across all eight files (both .agents/pipelines/ and internal/defaults/pipelines/ copies).
Also introduce a plan-review handoff between the plan step and the implement step in impl-issue and impl-issue-core. The handover now declares two contracts: the existing schema validation, plus a new agent_review contract that runs the reviewer persona over the generated impl-plan.json against plan-review-criteria.md. Failing the review triggers a retry on the plan step, which prevents poor plans from corrupting the implement phase.
Fix a typed-I/O bug in impl-issue-core: the 'plan' pipeline output was incorrectly typed as pr_ref (it is an implementation plan, not a PR). Retype to plan_ref, which matches the canonical shared schema for implementation plans.
The three planning pipelines (plan-research, plan-scope, plan-task) each had 1-2 steps routed to the 'strongest' model tier. Planning benefits from a capable model but rarely needs the flagship; route every 'strongest' to 'balanced' across both pipeline copies.
Sync plan-scope and plan-task between .agents/pipelines/ and internal/defaults/pipelines/. The copies had drifted: the defaults version was already cleaner (no prompt-level schema duplication per feedback_prompt_no_schema_duplication, richer git diagnostics in plan-task exploration). Promote the defaults version as canonical and mirror it into .agents.
Seven ops pipelines (bootstrap, epic-runner, hello-world, issue-quality, parallel-audit, pr-review, pr-review-core) previously routed several steps to 'strongest'. PR review alone had four strongest steps across initial review, deep-dive, verdict synthesis, and publish. Swap every 'strongest' to 'balanced' across both pipeline copies.
Sync ops-bootstrap between .agents/pipelines/ and internal/defaults/pipelines/. The .agents copy had a richer retry policy (max_attempts: 3) and a dependency edge the defaults version was missing (publish depends on both scaffold and assess). Promote the .agents copy as canonical.
…copies
doc-explain and doc-onboard each had their final synthesis step routed to 'strongest'. Both produce narrative documents where 'balanced' is adequate and the cost difference matters when these run as part of larger composition flows.
Sync doc-explain between .agents/pipelines/ and internal/defaults/pipelines/. The copies differed by one dependency edge (defaults version missed the 'explore' dependency on the final document step). Promote the .agents copy as canonical.
…ing output
The five wave-self pipelines (wave-audit, wave-scope-audit, wave-security-audit, wave-test-hardening, wave-validate) each previously routed 1-3 steps to 'strongest'. Swap every 'strongest' to 'balanced' so self-audit runs stay within reasonable cost.
Declare a typed pipeline_output on wave-test-hardening; it was the only wave-self pipeline missing a pipeline_outputs block. The loop produces a coverage-analysis.json artifact on every analyze-coverage iteration; expose it as a findings_report so downstream composition pipelines can consume it.
Wave-self pipelines live only in .agents/pipelines/ per the language-agnostic defaults policy (feedback_defaults_agnostic): internal/defaults/pipelines/ is for shipped, project-agnostic pipelines and must not embed Wave-specific audits.
Introduce five composition pipelines that stack the surviving core Lego blocks end-to-end. Each wires typed inputs to typed outputs via shared schemas so load-time validation catches mismatches:
- inception-audit: parallel fan-out over audit-architecture, audit-tests, audit-duplicates, audit-doc-scan; aggregate their findings_report outputs; reviewer persona triages the merged list into block-release / fix-this-sprint / backlog tiers.
- inception-feature: full feature delivery flow. plan-scope decomposes an epic (issue_ref → scope_result); plan-research attaches research; impl-issue iterates per child (issue_ref → pr_ref); ops-pr-review fans out over produced PRs.
- inception-bugfix: audit-security scan narrows the scope for impl-issue-core, whose branch is handed to ops-pr-review-core for verdict. Skips PR creation so the operator controls landing.
- inception-doc: audit-doc-scan → doc-explain self-healing loop for documentation that has drifted from code reality.
- inception-harden: self-referential hardening stack for Wave itself: wave-security-audit → wave-test-hardening → wave-audit in sequence. Project-internal; lives only in .agents/.
Mirror the four project-agnostic wrappers into internal/defaults/pipelines/; inception-harden remains Wave-only per feedback_defaults_agnostic.
'retry' is not a valid on_failure value for agent_review contracts. Valid values: fail, skip, continue, rework, warn.
Discovered during phase-4 real-run validation where impl-issue and impl-issue-core failed immediately at DAG validation.
Fixes runtime error:
agent_review contract has invalid on_failure value "retry"
…letions
Replace deleted audit-dead-code / audit-dx refs with survivors:
- audit-dead-code -> audit-dead-code-scan
- audit-dx -> audit-duplicates
Discovered during phase-4 validation: parallel-audit failed at load-sub-pipeline step because referenced audit-dead-code.yaml was deleted in phase-1 consolidation.
audit-unwired was deleted in phase-1 consolidation. Swap for survivors: architecture, dead-code-scan, duplicates, doc-scan.
Prior fix changed retry -> rework but rework requires a rework_step configuration we don't provide. warn is advisory-only and lets the pipeline proceed while surfacing the reviewer's concerns in the run event log.
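The two failures described in these fixes (an unknown on_failure value, then rework without a rework_step) can be captured by a check like the following sketch. The Contract struct, the function name and the second error message are hypothetical; the list of valid values and the first error message are taken from the commit text. Treating an empty on_failure as invalid is an assumption, since any default is not described here.

```go
package main

import "fmt"

// Contract is a simplified stand-in for a step contract (hypothetical shape).
type Contract struct {
	Type       string // e.g. "agent_review"
	OnFailure  string
	ReworkStep string // required when OnFailure is "rework"
}

// validOnFailure lists the values named as valid for agent_review contracts.
var validOnFailure = map[string]bool{
	"fail":     true,
	"skip":     true,
	"continue": true,
	"rework":   true,
	"warn":     true,
}

// ValidateOnFailure rejects unknown on_failure values and "rework" without
// a configured rework step.
func ValidateOnFailure(c Contract) error {
	if !validOnFailure[c.OnFailure] {
		return fmt.Errorf("%s contract has invalid on_failure value %q", c.Type, c.OnFailure)
	}
	if c.OnFailure == "rework" && c.ReworkStep == "" {
		return fmt.Errorf("%s contract uses on_failure %q but sets no rework_step", c.Type, c.OnFailure)
	}
	return nil
}

func main() {
	fmt.Println(ValidateOnFailure(Contract{Type: "agent_review", OnFailure: "retry"}))
	fmt.Println(ValidateOnFailure(Contract{Type: "agent_review", OnFailure: "rework"}))
	fmt.Println(ValidateOnFailure(Contract{Type: "agent_review", OnFailure: "warn"}))
}
```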
- embed_test: drop ops-rewrite.yaml tests (pipeline removed in c1b93e9), trim known-release-pipelines expected list to actually-embedded set
- init_test: supervisor persona no longer referenced by any release pipeline, use craftsman (used by impl-recinq / impl-speckit) for transitive filter check
- inception-doc.yaml: drop scan step referencing deleted audit-doc-scan sub-pipeline
- sync improvement-assessment schema to authoritative .agents/ copy
Summary
- Clamp[T] generic utility in cmd/wave/util.go constraining value to [min, max]
- cmd/wave/util_test.go
- specs/007-trivial-util/
Related to #7
Changes
- cmd/wave/util.go: new Clamp generic function
- cmd/wave/util_test.go: unit tests for int/float, min>max edge case
- specs/007-trivial-util/{spec,plan,tasks}.md: planning docs
Test Plan
- go test ./cmd/wave/...
- wave-validation-sandbox; pipeline end-to-end run
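For orientation, a generic clamp of the kind summarised above could be written as follows. This is a sketch, not the code in cmd/wave/util.go: the constraint definition is local to the example, and the handling of min > max (swapping the bounds) is an assumption, since the summary only says that case is tested, not how it behaves.

```go
package main

import "fmt"

// ordered is a local constraint covering the numeric types used here
// (hypothetical; the real function may use a different constraint).
type ordered interface {
	~int | ~int8 | ~int16 | ~int32 | ~int64 |
		~uint | ~uint8 | ~uint16 | ~uint32 | ~uint64 |
		~float32 | ~float64
}

// Clamp constrains v to the closed interval [lo, hi].
func Clamp[T ordered](v, lo, hi T) T {
	if lo > hi {
		// Assumption: inverted bounds are swapped rather than rejected.
		lo, hi = hi, lo
	}
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

func main() {
	fmt.Println(Clamp(5, 0, 3))
	fmt.Println(Clamp(0.25, 0.5, 1.0))
}
```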