Add --dry-run flag to agentworkforce agent by willwashburn · Pull Request #57 · AgentWorkforce/workforce

willwashburn · 2026-05-08T14:32:03Z

Summary

New agentworkforce agent <persona>@<tier> --dry-run flag validates a persona without spawning the harness or burning tier-model tokens. Three checks, fast-fail in order: (1) sidecar resolution; (2) buildInteractiveSpec shape; (3) real skill install in a temp dir.
Persona-maker's agentsMdContent now requires running the dry-run before handoff. The output contract gained a line for the dry-run outcome.
Closes the gap that lets a persona ship with a hallucinated skill name (source repo exists but the named skill is not in it). Today that bricks every launch with … failed (exit 1). Aborting.; the author has no way to catch it short of attempting a real launch.

Behavior

$ agentworkforce agent capability-discoverer@minimum --dry-run
→ capability-discoverer [minimum] via opencode (opencode/minimax-m2.5-free) [DRY-RUN]
✓ sidecar: (none)
✓ harness spec: opencode (2 args)
• temp dir: /var/folders/.../agentworkforce-dryrun-capability-discoverer-XXXX
• installing 2 skill(s)
• skill.sh/find-skills (skill.sh) → https://github.com/vercel-labs/skills#find-skills
  ✓ skill.sh/find-skills
• prpm/self-improving (prpm) → @prpm/self-improving
  ✓ prpm/self-improving
✓ dry-run ok: 2 skill(s) installed cleanly

On failure, exit is non-zero, the temp dir is kept and printed for inspection, and each failing skill is listed by id + exit code. On success the temp dir is removed.

What's deliberately out of scope (Tier 2/3 — not in this PR)

Harness binary check (claude / codex / opencode on PATH) — environment-dependent; author's machine doesn't prove the consumer's. Could land behind a --check-harness flag later.
MCP env-ref strict mode — resolveMcpServersLenient already drops servers whose ${VAR} refs are unset and warns. Same author/consumer-environment caveat. Could land behind --strict.
Mount setup / live MCP connectivity / actual harness boot — runtime concerns, not authoring. Skipped.

Test plan

parseAgentArgs accepts --dry-run (new tests; 126/126 CLI tests pass)
corepack pnpm run check green across the workspace
Smoke: agent debugger@minimum --dry-run → ✓ dry-run ok (no skills)
Smoke: agent capability-discoverer@minimum --dry-run → installs skill.sh + prpm skills cleanly in a temp dir, exits 0
Smoke: synthetic persona with hallucinated nextjs-anti-patterns skill → ✗ dry-run failed, exit 1, temp dir preserved
Manual: confirm persona-maker agent runs the dry-run during a real authoring session and surfaces failures back to the user

Walkthrough

This PR adds a new --dry-run flag to the CLI that validates a selected persona without launching the interactive harness. The feature resolves persona inputs, checks sidecar readability, builds the harness spec, tests skill installers in a temporary directory, and reports per-skill outcomes with optional cleanup.

Changes

Dry-run CLI validation

Layer / File(s)	Summary
Data contract `packages/cli/src/cli.ts`	`AgentFlags` interface extended with `dryRun: boolean` field.
Argument parsing `packages/cli/src/cli.ts`	Adds `mkdtempSync` import; initializes `dryRun: false` defaults in `parseAgentArgs` and `parseCreateArgs`; extends both parsers to recognize and set `--dry-run` flag.
Dry-run validation `packages/cli/src/cli.ts`	New `runDryRun(selection)` function resolves persona inputs and sidecars, validates sidecar readability, builds interactive harness spec, executes each skill installer in a fresh temp directory, reports per-skill pass/fail, and deletes temp dir on success or retains on failure.
CLI integration `packages/cli/src/cli.ts`	Updates `agent` help text to document `--dry-run` checks and temp-dir behavior; adds early branch in `runAgentSelector` to call `runDryRun` when `flags.dryRun` is set and exit with its status code; updates a create-path call site to pass `dryRun: false`.
Test coverage `packages/cli/src/cli.test.ts`	Updates existing "no flags" test to assert `dryRun: false` default; adds new test for `--dry-run` flag setting and positional arg preservation; adds new test for `--dry-run` composing with `--install-in-repo`.
Specification `personas/persona-maker.json`	Updates `agentsMdContent` anti-goals and output contract to document dry-run command invocation and outcome reporting requirements, including failing skill identifiers.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

AgentWorkforce/workforce#46: Related prior work adding sidecar support used by the new dry-run validation path.

Poem

🐰 A dry run before the leap so high,
Skills spin up in a tempdir nest,
Each installer proves its little try,
Successes cleaned, failures rest,
A careful hop before the quest.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 28.57% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Add --dry-run flag to agentworkforce agent' directly and accurately describes the main change: adding a new CLI flag feature.
Description check	✅ Passed	The description is comprehensively related to the changeset, detailing the dry-run feature implementation, behavior, test coverage, and scope boundaries.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch worktree-persona-validation

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/cli/src/cli.ts`:
- Around line 925-945: The current dry-run uses one shared tempDir for all
installs (tempDir, loop over plan.installs), causing state from one skill to
affect others; change the loop in which you destructure inst.installCommand and
call spawnSync/subprocessExitCode so that for each inst you create a unique
subdirectory (e.g., join(tempDir, inst.skillId) or a per-install mkdtemp) and
run spawnSync with cwd set to that per-skill dir, ensuring you still keep the
top-level preserved dry-run root but isolate each installer to avoid
cross-contamination and preserve per-skill artifacts for inspection.

In `@personas/persona-maker.json`:
- Line 29: Update the dry-run validation rules so authors must run
agentworkforce agent <id>@<tier> --dry-run from a resolvable source directory
(either TARGET_DIR that matches the persona cascade or by adding the persona to
the cascade first) and must run the dry-run for every tier that has any
tier-specific runtime config (not just differing skills[])—specifically detect
per-tier overrides of claudeMd/agentsMd sidecars, permissions, mcpServers, or
harnessSettings and require a dry-run for each such tier; if CREATE_MODE is
built-in, require the author to append the intent to PERSONA_INTENTS and add the
persona to the workspace before running the tiered dry-runs so the agent CLI can
resolve <id>@<tier>.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 812bc13c-d4ec-4519-b0cf-8154611e575a

📥 Commits

Reviewing files that changed from the base of the PR and between 970787d and ca4c7aa.

⛔ Files ignored due to path filters (1)

packages/workload-router/src/generated/personas.ts is excluded by !**/generated/**

📒 Files selected for processing (3)

packages/cli/src/cli.test.ts
packages/cli/src/cli.ts
personas/persona-maker.json

coderabbitai · 2026-05-08T14:37:13Z

+  const tempDir = mkdtempSync(join(tmpdir(), `agentworkforce-dryrun-${personaId}-`));
+  process.stderr.write(`• temp dir: ${tempDir}\n`);
+  process.stderr.write(`• installing ${plan.installs.length} skill(s)\n`);
+
+  const failures: { skillId: string; exitCode: number }[] = [];
+  for (const inst of plan.installs) {
+    const [bin, ...args] = inst.installCommand;
+    if (!bin) {
+      process.stderr.write(`  skipped ${inst.skillId}: empty install command\n`);
+      continue;
+    }
+    process.stderr.write(`• ${inst.skillId} (${inst.sourceKind}) → ${inst.packageRef}\n`);
+    const res = spawnSync(bin, args, { stdio: 'inherit', shell: false, cwd: tempDir });
+    const code = subprocessExitCode(res);
+    if (code === 0) {
+      process.stderr.write(`  ✓ ${inst.skillId}\n`);
+    } else {
+      process.stderr.write(`  ✗ ${inst.skillId} (exit ${code})\n`);
+      failures.push({ skillId: inst.skillId, exitCode: code });
+    }
+  }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Isolate each skill install instead of reusing one temp workspace.

All installers run in the same tempDir, so a partial failure can leave behind skills-lock.json, .skills/, .agents/, or other state that poisons later installs. That makes the per-skill verdicts unreliable: one bad skill can cascade into multiple false failures.

A fresh subdir per skill under the preserved dry-run root avoids cross-contamination while still leaving artifacts behind for inspection.

Suggested change

const tempDir = mkdtempSync(join(tmpdir(), `agentworkforce-dryrun-${personaId}-`)); process.stderr.write(`• temp dir: ${tempDir}\n`); process.stderr.write(`• installing ${plan.installs.length} skill(s)\n`); const failures: { skillId: string; exitCode: number }[] = []; for (const inst of plan.installs) { + const skillDir = mkdtempSync( + join(tempDir, `${inst.skillId.replace(/[^a-zA-Z0-9._-]/g, '_')}-`) + ); const [bin, ...args] = inst.installCommand; if (!bin) { process.stderr.write(` skipped ${inst.skillId}: empty install command\n`); continue; } process.stderr.write(`• ${inst.skillId} (${inst.sourceKind}) → ${inst.packageRef}\n`); - const res = spawnSync(bin, args, { stdio: 'inherit', shell: false, cwd: tempDir }); + const res = spawnSync(bin, args, { stdio: 'inherit', shell: false, cwd: skillDir }); const code = subprocessExitCode(res);

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/cli/src/cli.ts` around lines 925 - 945, The current dry-run uses one shared tempDir for all installs (tempDir, loop over plan.installs), causing state from one skill to affect others; change the loop in which you destructure inst.installCommand and call spawnSync/subprocessExitCode so that for each inst you create a unique subdirectory (e.g., join(tempDir, inst.skillId) or a per-install mkdtemp) and run spawnSync with cwd set to that per-skill dir, ensuring you still keep the top-level preserved dry-run root but isolate each installer to avoid cross-contamination and preserve per-skill artifacts for inspection.

Validates a persona without spawning the harness or burning tier-model tokens. Three checks, fast-fail in order: 1. Sidecar resolution — `claudeMd` / `agentsMd` filename refs are readable. 2. Harness-spec build — `buildInteractiveSpec` rejects malformed `permissions` patterns, `mcpServers` shape errors, and missing required harness fields. 3. Skill install — runs every `skills[].source` through its real installer (`npx -y skills add` / `npx -y prpm install`) inside a fresh temp dir, with per-skill pass/fail. Temp dir is removed on success and kept on failure for inspection. Motivation: a persona authored with a hallucinated skill name (source repo exists but the named skill is not in it) bricks every launch with "failed (exit 1). Aborting." The author has no way to catch this short of attempting a launch. Dry-run gives them a deterministic signal. Persona-maker's `agentsMdContent` now requires running the dry-run before handoff and lists its outcome in the output contract. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

devin-ai-integration Bot reviewed May 8, 2026

View reviewed changes

coderabbitai Bot reviewed May 8, 2026

View reviewed changes

willwashburn force-pushed the worktree-persona-validation branch from ca4c7aa to dfef3a7 Compare May 8, 2026 14:41

willwashburn merged commit 8238cc5 into main May 8, 2026
2 checks passed

willwashburn deleted the worktree-persona-validation branch May 8, 2026 14:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add --dry-run flag to agentworkforce agent#57

Add --dry-run flag to agentworkforce agent#57
willwashburn merged 1 commit intomainfrom
worktree-persona-validation

willwashburn commented May 8, 2026

Uh oh!

coderabbitai Bot commented May 8, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

devin-ai-integration Bot left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 8, 2026

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

willwashburn commented May 8, 2026

Summary

Behavior

What's deliberately out of scope (Tier 2/3 — not in this PR)

Test plan

Related

Uh oh!

coderabbitai Bot commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

✅ Devin Review: No Issues Found

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 8, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

coderabbitai Bot commented May 8, 2026 •

edited

Loading