
RFC: Plugin lifecycle management in ToolHive #77

Merged: JAORMX merged 8 commits into main from rfc-plugins-lifecycle-management on Jun 29, 2026

@JAORMX JAORMX commented Jun 15, 2026

Contributor

Summary

Proposes adding plugin lifecycle management to ToolHive, mirroring the existing skills system (RFC-0030). A plugin is the cross-vendor "bundle of primitives" unit (slash commands, subagents, Agent Skills, hooks, MCP server configs, LSP servers) declared by .claude-plugin/plugin.json.

ToolHive will let users build a plugin directory into a reproducible OCI artifact, push it to any OCI registry, install it (from registry name / OCI ref / git://), and list/info/uninstall it — reusing the registry, OCI, groups, and SQLite entries infrastructure that already serves skills. As a bridge to native client tooling, it can also generate a marketplace.json.

Why

  • Plugins are the industry-convergent bundle unit (Claude Code, Cursor, Codex, Copilot, Gemini, Kiro), but no one packages a multi-primitive plugin bundle as an immutable, content-addressable, signable OCI artifact — an open niche ToolHive is positioned to fill.
  • The skills system was deliberately built on a generic foundation (entry_type discriminator, x/dev.toolhive/<type> registry namespace, shared OCI primitives) that anticipates exactly this extension.

What's genuinely new vs. skills

  1. Install target: plugins normally need marketplace-cache + settings.json mutation. The RFC recommends the in-place skills-directory-plugin mechanism for v1 (pure filesystem, no settings mutation), with marketplace-cache deferred as a per-client PathResolver strategy. (Open Question #1)
  2. Executable surface — unlike inert skills, plugins bundle hooks and MCP servers (code). The security section centers on this: a pre-install executable-surface inventory in thv plugin info, --require-signature via cosign + the OCI Referrers API, and install-time audit events.

Notes for reviewers

  • File is named THV-XXXX-... per the RFC convention; will rename to match this PR number.
  • Touches toolhive (primary), toolhive-core (new oci/plugins + shared-primitive refactor), and toolhive-registry-server (plugin catalog).
  • Key product decisions to weigh: Open Question #1 (install target) and #4 (running bundled MCP servers through thv with isolation as the v2 headline).

🤖 Generated with Claude Code

JAORMX and others added 6 commits June 15, 2026 08:04
Proposes adding plugin (multi-primitive bundle) lifecycle management to
ToolHive, mirroring the existing skills system: build, push, install,
uninstall, list, info, validate, and marketplace generation. Plugins are
packaged as reproducible OCI artifacts (dev.toolhive.plugins.v1), reusing
the shared toolhive-core OCI primitives, the SQLite entries table, the
registry provider seam, groups, the git resolver, and the multi-client
PathResolver.

Centers on what makes plugins different from skills: an executable surface
(hooks + bundled MCP servers), addressed via a pre-install component
inventory, signature verification over the OCI Referrers API, and
install-time audit events.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Dotted-edge label collided with the .-> arrow syntax; angle brackets and
special characters in flowchart labels broke the lexer. Use pipe-label form
and quote labels containing special characters.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Make explicit (verified against current Claude Code docs) that a
skills-directory plugin activates all components — commands, agents, hooks,
MCP, LSP, skills — not just skills. Add project-scope trust caveats
(MCP per-server approval, LSP trust, monitors don't load, no repo-root
walk-up) and note the session-only --plugin-dir/--plugin-url alternatives.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…alization

Plugin bundle formats never converged (unlike skills' SKILL.md standard), so
a single tree does not install across clients. Reframe the design into two
layers: a fully client-agnostic distribution layer (build/OCI/push/catalog/
pull/verify/inventory) and a per-client materialization layer behind a
MaterializationAdapter seam. Scope v1 materialization to the .claude-plugin
family (Claude Code + Codex), with Cursor/Copilot/Gemini as future adapters.
Add the "plugin formats did not converge" design constraint, the adapter
interface, a multi-manifest alternative, and corrected goals/non-goals/
open-questions/forward-compatibility.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…x source)

Codex does NOT auto-discover in-place: loading requires a cache install plus
a [plugins.*] enabled entry in ~/.codex/config.toml, and Codex activates only
a subset of components (skills/MCP/apps/hooks — no commands or subagents).
Model Codex as a distinct v1 adapter: cache-install + surgical config.toml
round-trip edit, SupportedComponents + install-time warning on dropped
commands/agents, revert-on-uninstall. Add an adapter comparison table, a
client-config-mutation security note, and update goals/summary/open-questions.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@JAORMX

JAORMX commented Jun 15, 2026

Contributor Author

So I went and did the cross-client research to sanity-check this (Claude Code, Cursor, Codex, Copilot, Gemini, Kiro, Continue, Goose, Zed, OpenCode, plus MCPB / Agent Skills / the OCI side). Two things came out of it: the factual claims mostly hold up, and a bigger structural problem that I think we got backwards. The structural one first, because it matters more.

This RFC is written through the skills lens: distribute an artifact, place the files where the client loads them, done. That works for skills because a skill is inert Markdown with no runtime. But a plugin isn't inert. Its .mcp.json is a set of MCP servers, and MCP servers are exactly the thing ToolHive exists to run... in a container, behind the proxy, with a permission profile, network isolation, secrets and audit.

So if we "install" a plugin by dropping its .mcp.json into the client's load path (skills-dir, marketplace cache, whatever materialization we land on), the client spawns those servers itself, unsandboxed, and we've bypassed the entire reason ToolHive exists. That's the opposite of what we want.

Right now the RFC files "managed MCP servers from plugins" under Non-Goals and Open Question #4, as a forward-looking v2 thing. I think that's inverted. For ToolHive specifically, running the bundled MCP servers through the runtime is the v1 thesis. It's the only part of a plugin install that a git clone + native install doesn't already give you. Everything else (commands, agents, skills, hooks-as-files) is supporting cast.

Concrete recommendation, and to be clear up front: this is an install-time thing, not a packaging one. The OCI artifact stays exactly what the RFC already says, one opaque, verbatim, single-layer, signable tree. I'm not touching that, and per-primitive layers (Alternative 2) are still the wrong call for v1. What changes is materialization: instead of placing the whole tree verbatim into the client's load path, treat components differently by their nature.

  • Inert components (commands, agents, skills, hooks, LSP config): materialize into the client's load path as-is. The skills model genuinely fits here, and this is where the placement-mechanism choice lives... which is honestly the low-stakes half.
  • MCP servers (.mcp.json): don't hand the raw spawn config to the client. Register each as a ToolHive workload, then rewrite the materialized client config to point at our proxy URL, exactly like pkg/client/converter.go already does for every other MCP server. The plugin's .mcp.json becomes a source of workload definitions, nothing more.
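
A minimal Go sketch of the rewrite proposed in the second bullet, assuming the plugin's .mcp.json uses a top-level `mcpServers` object with `command`/`args` entries. The `proxyURLFor` callback and the proxy address are hypothetical stand-ins for registering a ToolHive workload; this is not the logic in pkg/client/converter.go.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// bundledServer is the local-spawn shape assumed for entries in a plugin's .mcp.json.
type bundledServer struct {
	Command string   `json:"command"`
	Args    []string `json:"args,omitempty"`
}

// proxiedServer is the entry written to the materialized client config instead.
type proxiedServer struct {
	URL string `json:"url"`
}

// rewriteMCPServers swaps every spawn entry for a URL entry.
// proxyURLFor is hypothetical: it stands in for "register a workload and
// return the URL of its proxy".
func rewriteMCPServers(raw []byte, proxyURLFor func(name string) string) ([]byte, error) {
	var in struct {
		MCPServers map[string]bundledServer `json:"mcpServers"`
	}
	if err := json.Unmarshal(raw, &in); err != nil {
		return nil, fmt.Errorf("parse .mcp.json: %w", err)
	}
	out := struct {
		MCPServers map[string]proxiedServer `json:"mcpServers"`
	}{MCPServers: make(map[string]proxiedServer, len(in.MCPServers))}
	for name := range in.MCPServers {
		// The command and args are deliberately not copied to the client.
		out.MCPServers[name] = proxiedServer{URL: proxyURLFor(name)}
	}
	return json.Marshal(out)
}

func main() {
	raw := []byte(`{"mcpServers":{"db":{"command":"node","args":["${CLAUDE_PLUGIN_ROOT}/mcp/index.js"]}}}`)
	out, err := rewriteMCPServers(raw, func(name string) string {
		return "http://127.0.0.1:8080/" + name // made-up proxy address
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The inert components would be copied unchanged; only this one file differs from what was packaged, which is the intentional trade described in the next paragraph.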

One honest consequence: this breaks the RFC's "byte-identical to a native install" compatibility goal, but only for .mcp.json. The inert components stay byte-identical to what you packaged; the MCP entries deliberately get rewritten to route through ToolHive. That's the whole point, so we should call it out as an intentional trade rather than pretend the install is a verbatim copy.

The genuinely hard v1 question, and the one the RFC should center on, is the gap between how plugins define servers and how we run them. Plugins use local command/args (node ${CLAUDE_PLUGIN_ROOT}/mcp/index.js). ToolHive runs containerized/remote workloads. So: do we run it as a proxied stdio local-process workload (middleware and audit, but weaker isolation)? Repackage into a container at build time? Offer both with a per-server policy? Plus the secrets angle (userConfig sensitive fields feeding the workload) and what default permission profile an untrusted bundled server gets.

On the materialization mechanism for the inert half: I'd drop the in-place skills-directory approach entirely. It's a discovery side-door meant for authoring (claude plugin init), it produces name@skills-dir rather than managed installs, it overlaps the skills directory, and project scope degrades in confusing ways (monitors don't load, no walk-up to the repo root, MCP needs per-server approval).

Go native instead. And the cleanest way to do that without standing up a daemon: we're already pulling and extracting the OCI artifact, so stage the bundle to a local ToolHive-managed dir and register that dir as a local-path marketplace source for the client. No URL server to keep alive, works offline, and we still own the supply chain because we verify the digest/signature before staging. The client then installs from the local path through its own native lifecycle (cache + enable). The "that's a lot of client state to own" worry doesn't really apply... registration + enabledPlugins is the same class of thing we already write for MCP servers and skills.

And note the bundle we stage isn't the raw one. The .mcp.json has already been rewritten to point at our proxy URLs (per the split above), so what the client's marketplace installs is the inert components plus MCP entries that route back through ToolHive. The local-path staging and the workload-registration aren't two separate features, they're the same install doing both halves.
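
The "verify before staging" ordering can be sketched as follows. This is an illustration only: `verifyDigest`, `stageVerified`, the `bundle.tar.gz` file name, and the staging layout are all hypothetical, and signature verification (cosign) is left out.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

var errDigestMismatch = errors.New("digest mismatch")

// verifyDigest checks the pulled bytes against an OCI-style "sha256:<hex>" digest.
func verifyDigest(blob []byte, want string) error {
	sum := sha256.Sum256(blob)
	got := "sha256:" + hex.EncodeToString(sum[:])
	if got != want {
		return fmt.Errorf("%w: got %s, want %s", errDigestMismatch, got, want)
	}
	return nil
}

// stageVerified writes the bundle under stagingRoot only after the digest
// check passes, so unverified bytes never reach the directory the client reads.
func stageVerified(blob []byte, wantDigest, stagingRoot, name string) (string, error) {
	if err := verifyDigest(blob, wantDigest); err != nil {
		return "", err
	}
	dir := filepath.Join(stagingRoot, name)
	if err := os.MkdirAll(dir, 0o700); err != nil {
		return "", err
	}
	path := filepath.Join(dir, "bundle.tar.gz") // hypothetical file name
	if err := os.WriteFile(path, blob, 0o600); err != nil {
		return "", err
	}
	return path, nil
}

func main() {
	root, err := os.MkdirTemp("", "thv-staging-")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(root)

	const digest = "sha256:2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"
	path, err := stageVerified([]byte("hello"), digest, root, "demo-plugin")
	fmt.Println(path, err)
}
```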

Smaller factual fixes from the research, all verified against current docs:

  • Copilot's manifest path is .github/plugin/plugin.json (nested under plugin/), not .github/plugin.json. It's repeated in a few spots.
  • The Codex fallback to .claude-plugin/plugin.json is real, but only in the source (codex-rs/utils/plugins/src/plugin_namespace.rs), not the prose docs. Worth citing the code, otherwise a reviewer checking docs alone will mark it unverifiable.
  • cagent packages agents (it can bundle multi-agent teams), not "single agents"; the official MCP Registry points at npm/PyPI/MCPB/OCI, not just container images.
  • "distributed exclusively through Git/GitHub" misses npm as a native source type. The OCI-absence point still stands though, and it held up under a hard counterexample hunt: the MCP "Skills Over MCP" charter itself lists bundle-as-a-single-OCI-artifact as out of scope / not yet built.
  • Tekton doesn't key layers with org.opencontainers.image.title, that's Conftest. Tekton uses dev.tekton.image.{name,kind,apiVersion}.

The central differentiation claim (nobody packages a multi-primitive bundle as a signed OCI artifact) is solid, for what it's worth.

Happy to restructure the RFC around the bundle-decomposition thesis if we agree on it. That's a bigger rewrite than the factual patches, so flagging it before I touch anything.

JAORMX and others added 2 commits June 15, 2026 08:48
ToolHive already round-trip-edits ~/.codex/config.toml to register MCP servers
(pkg/client TOMLMapConfigUpdater + pkg/fileutils AtomicWriteFile/WithFileLock,
with a test proving hand-maintained fields survive). Reframe the Codex
adapter's config mutation from "riskiest piece" to reuse of well-trodden code;
narrow Open Question #1b to the local-marketplace-vs-direct-[plugins] choice.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Per review: v1 does not run/proxy/rewrite a plugin's bundled MCP servers.
The artifact carries .mcp.json verbatim but ToolHive ignores it; the intended
managed model references first-class ToolHive servers via the `requires`
mechanism rather than executing servers bundled in the plugin. Add a dedicated
"MCP servers in a plugin" section, Alternative 7 (run/repackage bundled MCP,
considered+deferred), reframe Open Question #4 and Forward Compatibility,
update Goals/Non-Goals/Summary/Security/mitigations, and add a deferred-work
note. Managed MCP via references = follow-up RFC.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@aponcedeleonch

aponcedeleonch commented Jun 15, 2026

Member

The RFC treats "bundle vs. primitive" as a single decision, but I think it's actually two decisions with different answers.

Distribution / trust unit. The bundle clearly wins here, and the RFC's reasoning holds. Agreed on keeping the bundle as what we package and sign.

Materialization unit. This is where I think a primitive-aware model beats what's proposed. Right now materialization is "place the whole tree and let the client auto-discover it," which leans on the Claude Code skills-directory-plugin trick and forces Codex into a special case where commands/agents are silently dropped. If instead materialization fans the bundle's primitives out to each client's native discovery locations, given that client's support for each primitive type, then:

  • The Codex "subset of components" behavior stops being a special case and becomes the general algorithm: install the primitives the client supports, skip and report the rest.
  • We stop depending on the Claude-Code-specific skills-dir auto-discovery mechanism.
  • We can reach clients that have no plugin concept at all but do support some primitives, which the current adapter seam can't help with by definition.
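
The "install what the client supports, skip and report the rest" algorithm from the first bullet is small enough to sketch. The component names follow the inventory keys mentioned elsewhere in this thread; the `partition` function and the Codex-like support set are assumptions for illustration.

```go
package main

import "fmt"

// ComponentType names one kind of primitive inside a plugin bundle.
type ComponentType string

const (
	Commands   ComponentType = "commands"
	Agents     ComponentType = "agents"
	Skills     ComponentType = "skills"
	Hooks      ComponentType = "hooks"
	MCPServers ComponentType = "mcpServers"
	LSPServers ComponentType = "lspServers"
)

// partition splits the bundle's components into those the client can load
// and those it cannot, preserving the bundle's order in both slices.
func partition(bundle []ComponentType, supported map[ComponentType]bool) (placed, skipped []ComponentType) {
	for _, c := range bundle {
		if supported[c] {
			placed = append(placed, c)
		} else {
			skipped = append(skipped, c)
		}
	}
	return placed, skipped
}

func main() {
	// Assumed support set for a Codex-like client (no commands or agents).
	codexLike := map[ComponentType]bool{Skills: true, Hooks: true, MCPServers: true}
	placed, skipped := partition([]ComponentType{Commands, Agents, Skills, Hooks}, codexLike)
	fmt.Println("placed:", placed)
	fmt.Println("skipped (report to user):", skipped)
}
```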

Worth noting we already accept the primitive model for the hardest component: bundled MCP servers aren't managed in v1, and the intended end state is to requires-reference first-class ToolHive servers rather than execute bundled command/args. That's primitive-first for MCP. The question I'd ask is why stop there, at least at the materialization layer.

Two constraints decide whether primitive fan-out is even possible for a given client, and I think they belong in the doc explicitly:

  • ${CLAUDE_PLUGIN_ROOT} and shared assets. Hooks call scripts/, commands may call bin/, paths resolve relative to one root. Fan those out individually and you break the intra-bundle references. So full decomposition isn't always safe.
  • In-client plugin identity. On clients with a real plugin concept (Claude Code, Codex, Cursor), keeping the tree intact preserves name:command namespacing and enable/disable-as-a-unit. Fan it out and the client sees loose primitives, not "a plugin."

So I don't think this is "primitives instead of plugins." It's: keep the bundle as the distribution and trust unit, and make materialization a per-client choice between placing the bundle intact and fanning primitives out, with ${CLAUDE_PLUGIN_ROOT} and native-plugin-identity as the test for which is possible. Bundle-intact for clients with a native plugin concept, primitive fan-out for clients without one. That reads to me like a third MaterializationAdapter strategy alongside the in-place and cache-install ones, and it's what actually widens client coverage past the .claude-plugin family.

Might be worth capturing as an Open Question, or folding into the materialization section so the adapter seam is described as choosing a placement strategy rather than always placing the whole tree.

@JAORMX

JAORMX commented Jun 16, 2026

Contributor Author

Did the cross-client research on this. The set of "no plugin concept" clients that fan-out would target is nearly empty now: Cursor, Copilot, Gemini, Kiro, Goose, and Continue all shipped native bundle formats, alongside Claude Code and Codex. Only OpenCode and Zed's AI surface lack one. So native-bundle materialization already reaches ~8 of 10 harnesses; a fan-out engine for the remaining ~1.5 isn't worth the per-client scope (the Alternative 6 surface).

It also doesn't widen coverage much: fan-out can't add a capability a client lacks (Codex has no commands/agents concept at all), and your ${CLAUDE_PLUGIN_ROOT} point cuts the other way — the root-variable clients are the full-bundle ones, so it argues for keeping the tree intact.

Keeping native bundle as the v1 thesis. I'll fold in the one durable bit — the adapter seam shouldn't assume "always place the whole tree" forever, so fan-out stays open as a future adapter for the loose-only clients.

@aponcedeleonch aponcedeleonch left a comment

Member

Agreed on the call. Native-bundle materialization for v1, fan-out deferred to a future adapter. And your two structural points land: fan-out can't add a capability a client lacks, and the ${CLAUDE_PLUGIN_ROOT} constraint argues for keeping the tree intact, not against it. I'll concede both.

I did go and check what each client's bundle format actually is, since "shipped a native bundle format" is doing a lot of work in the ~8-of-10 number, and the formats turn out not to be the same kind of thing. Three tiers:

Near drop-in, and the reason matters: Cursor, Copilot, and Codex are cheap to target specifically because they converged on the .claude-plugin format. Cursor mirrors the component dirs and SKILL.md layout almost exactly. Copilot literally accepts .claude-plugin/plugin.json and ships a ${PLUGIN_ROOT} variable. Codex accepts both .claude-plugin/plugin.json and .claude-plugin/marketplace.json and even aliases CLAUDE_PLUGIN_ROOT. That convergence is the actual strength of the native-bundle thesis, and I'd lead the doc with it rather than with a client count.

Real bundle, but a genuine per-client adapter: Gemini CLI has full primitive parity but its own gemini-extension.json manifest and TOML command files, so commands need transcoding. Kiro Powers are a real multi-primitive bundle but the manifest is a Markdown POWER.md with no commands/agents/skills split. Neither is "place the tree"; both are mapping work we'd need to scope.

Not an installable plugin bundle at all: Goose "recipes" are run-on-invocation workflow files (a parameterized prompt plus inline MCP), not a set of primitives you install. Continue "bundles" are explicitly a hub-side UI grouping that, per their own docs, isn't represented in config files; "add bundle" just expands into individual block references.

That last tier is where I'd push back on the "~1.5 fan-out targets" number. Goose and Continue are in the bundle column in your count, but you can't materialize a bundle into either. They're fan-out-or-skip clients, same as OpenCode and Zed. So the real fan-out-relevant set is closer to four than to one and a half.

None of this changes the v1 decision. Defer fan-out, ship native bundle, and the .claude-plugin convergence makes that genuinely cheap for the three clients that matter most. I'd just adjust the doc to (1) rest the thesis on that convergence rather than an 8-of-10 reach, (2) name the Gemini and Kiro adapter cost explicitly instead of folding them into "native bundle just works," and (3) be upfront that Goose and Continue are fan-out-or-skip, not bundle-place.

Approving on that basis.

@jhrozek jhrozek left a comment

Contributor

adding a few comments as I read, none blocking. the two I'd actually want to settle before merge are the materialization target (in-place skills-dir vs the native staging your top comment proposes) and the matching mcp question at the bottom, since the doc and your own comment point opposite ways there. the rest are nits / a couple factual fixes on the codex section.

5 comment threads on rfcs/THV-0077-plugins-lifecycle-management.md

@jhrozek jhrozek left a comment

Contributor

solid RFC, really thorough research behind it. left a few comments inline but nothing blocking — the only two I'd genuinely like settled are the materialization target and the matching mcp question, since the doc and your top comment point opposite ways there. happy for those to be a follow-up. approving.

@Sanskarzz


Hey, I am following this because the plugin lifecycle work looks really interesting, and I am trying to understand the v1 boundary better.

One thing that stood out to me is that the implementation contract for materialization seems important to make explicit. Since different clients can activate different parts of the same bundle, would it be useful for the RFC to define what the materialization result records per client?

For example, things like components placed, components actually active, components skipped or dropped, components gated by client trust or scope, client config or state mutated, and anything left unmanaged, such as bundled MCP servers in v1.

I am still getting familiar with this area so apologies if I missed an existing section.

@JAORMX

JAORMX commented Jun 22, 2026

Contributor Author

@Sanskarzz Thanks, this is a good callout. You didn’t miss it, the RFC had the MaterializeResult return type in the adapter interface, but it didn’t actually say what that result is supposed to capture.

I added that contract now. The result records the target client/scope, where the plugin was placed, which components were placed, which ones are expected to be active, which ones were skipped because the client can’t load them, which ones are gated by client trust/scope, any bundled MCP servers left unmanaged in v1, and the client state ToolHive mutated so uninstall can roll it back.

So, this should make the client-specific behavior explicit instead of implied. For example, Codex dropping commands/agents, Claude/Codex mutating different client state, and bundled MCP being present but not managed by ToolHive in v1. Good catch.
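
For readers who want a concrete shape, here is a hedged sketch of what such a result record could look like. Only the name MaterializeResult comes from the RFC discussion; every field name, the `Mutation` type, and the `RollbackPlan` helper are hypothetical.

```go
package main

import "fmt"

// Mutation records one change ToolHive made to client-owned state.
type Mutation struct {
	File string // client file that was edited
	Key  string // entry that was added
}

// MaterializeResult mirrors the contract described above (field names assumed).
type MaterializeResult struct {
	Client, Scope, Location string
	Placed                  []string   // components written to disk
	Active                  []string   // components the client is expected to load
	Skipped                 []string   // components the client cannot load
	Gated                   []string   // components behind client trust or scope prompts
	UnmanagedMCPServers     []string   // bundled servers ToolHive does not manage in v1
	Mutations               []Mutation // client state to revert on uninstall
}

// RollbackPlan returns the mutations in reverse order, so uninstall undoes
// the most recent write first. The receiver's slice is left untouched.
func (r MaterializeResult) RollbackPlan() []Mutation {
	plan := make([]Mutation, 0, len(r.Mutations))
	for i := len(r.Mutations) - 1; i >= 0; i-- {
		plan = append(plan, r.Mutations[i])
	}
	return plan
}

func main() {
	// Illustrative values only.
	res := MaterializeResult{
		Client:              "codex",
		Scope:               "user",
		Location:            "~/.codex/plugins/cache/demo",
		Placed:              []string{"skills", "hooks"},
		Active:              []string{"skills", "hooks"},
		Skipped:             []string{"commands", "agents"},
		UnmanagedMCPServers: []string{"db"},
		Mutations:           []Mutation{{File: "~/.codex/config.toml", Key: "plugins.demo"}},
	}
	fmt.Printf("%+v\n", res.RollbackPlan())
}
```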

@Sanskarzz


@JAORMX Thanks, that makes sense. I'll reread the related sections soon.

@JAORMX JAORMX merged commit 0a5d30c into main Jun 29, 2026
1 check passed
JAORMX added a commit to stacklok/toolhive that referenced this pull request Jun 29, 2026
…5676)

* Add plugin core service, validation, and storage (Phase 2, THV-0077)

Stand up the plugin distribution layer's foundation: types, manifest
parser, validator, the build/push/validate/content service, and SQLite
storage. Mirrors pkg/skills and pkg/skills/skillsvc file-for-file,
substituting toolhive-core's oci/plugins package for oci/skills.

Phase 2 implements only the build/push/validate/list-builds/delete-build/
get-content surface on pluginsvc.New (returning plugins.PluginService).
Install/uninstall/list/info and the MaterializationAdapter are declared
in the issue but land in Phase 3 (#5527); REST API and CLI are Phase 4
(#5528). App wiring is deferred to Phase 4 — this lands the library and
storage layer plus the migration, which applies to the shared DB on
existing deployments.

Storage introduces a typed EntryType ("skill"/"plugin") replacing the
stringly-typed entry_type literal in skill_store.go, and a new 002
migration adding installed_plugins + plugin_dependencies off the existing
entries table (reusing, not redefining, its UNIQUE(entry_type, name)).

Exit gate tests: parser/validator units (keywords-must-be-array type
mismatch, component-path traversal rejection, bundled-skill validation
reuse), packager determinism via the service, migration up/down, and a
build→push round-trip against a mock OCI registry.
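
As an illustration of the component-path checks these tests exercise, a lexical guard can be built on the standard library's `filepath.IsLocal` (Go 1.20+). The function name `checkComponentPath` is hypothetical, and this is not the validator in the PR; it also does not resolve symlinks, which need a separate check at extraction time.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

var errUnsafePath = errors.New("unsafe component path")

// checkComponentPath rejects manifest paths that could point outside the
// plugin root. The check is purely lexical.
func checkComponentPath(p string) error {
	if strings.ContainsRune(p, 0) {
		return fmt.Errorf("%w: contains a null byte", errUnsafePath)
	}
	if !filepath.IsLocal(p) {
		return fmt.Errorf("%w: %q is empty, absolute, or escapes the plugin root", errUnsafePath, p)
	}
	return nil
}

func main() {
	for _, p := range []string{
		"commands/review.md",
		"../outside.md",
		"/etc/passwd",
		"commands/../../outside.md",
	} {
		fmt.Printf("%-28q %v\n", p, checkComponentPath(p))
	}
}
```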

Part of #5525
Refs RFC stacklok/toolhive-rfcs#77

* Address panel review round 2 findings

Narrow PluginService to the 6 Phase-2 methods and drop the unused store
param from pluginsvc.New (now an option, WithStore, for Phase 3). Round-1
declared the full 10-method interface but *service only implemented 6,
making var s plugins.PluginService = pluginsvc.New(...) a compile error.

Backfill test gaps the QA reviewer flagged:
- parser symlink/oversize guards (security-relevant TOCTOU/bomb paths)
- validateLocalPath null-byte rejection
- List ORDER BY e.name assertion
- Update shrink-to-zero-dependencies (DELETE-then-empty-insert path)
- migration-down test now asserts skill_dependencies survives 002-Down

Part of #5525
Refs RFC stacklok/toolhive-rfcs#77

* Address PR review: cap components, persist license, fix error wrapping

- Add MaxComponentsPerGroup cap (100) to parser to bound validation cost
- Persist License field: add license column to migration 002, wire
  through Create/Update/scanPluginFields, assert round-trip in test
- Use dual %w in parser error wrapping so errors.As/Is can reach the
  underlying JSON error (Go 1.20+, go.mod is 1.26)
- Clarify absolute-path component error message
- Switch content.go from deprecated ociskills.DecompressTar to
  oci/artifact.DecompressTar, dropping the deprecation and cross-package
  dependency
- Guard OCI config blob read with maxConfigSize (1MB) since
  manifest.Config.Size is advisory
- Drop unused store field and WithStore option from pluginsvc.service
  (re-add in Phase 3 when the store is actually read)
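
The dual-%w point can be shown in isolation. Since Go 1.20, `fmt.Errorf` accepts more than one `%w` verb, and `errors.Is`/`errors.As` search all wrapped errors. The sentinel `ErrInvalidManifest` and the helper names below are hypothetical, not the parser's actual identifiers.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ErrInvalidManifest is a hypothetical sentinel for manifest parse failures.
var ErrInvalidManifest = errors.New("invalid plugin manifest")

// parseManifest wraps both the sentinel and the underlying JSON error.
func parseManifest(data []byte) (map[string]any, error) {
	var m map[string]any
	if err := json.Unmarshal(data, &m); err != nil {
		return nil, fmt.Errorf("%w: %w", ErrInvalidManifest, err)
	}
	return m, nil
}

// isInvalidManifest matches on the sentinel.
func isInvalidManifest(err error) bool {
	return errors.Is(err, ErrInvalidManifest)
}

// syntaxOffset reaches through the wrapper to the JSON syntax error, if any.
func syntaxOffset(err error) (int64, bool) {
	var se *json.SyntaxError
	if errors.As(err, &se) {
		return se.Offset, true
	}
	return 0, false
}

func main() {
	_, err := parseManifest([]byte(`{"name": `))
	fmt.Println("error:", err)
	fmt.Println("is invalid manifest:", isInvalidManifest(err))
	_, ok := syntaxOffset(err)
	fmt.Println("has JSON syntax error:", ok)
}
```

With a single `%w` and a `%v` for the second error, one of the two checks above would fail, which is the gap the commit closes.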
JAORMX added a commit to stacklok/toolhive that referenced this pull request Jun 30, 2026
…V-0077)

The Wave 0 client/discovery and groups changes added SupportsPlugins to
ClientAppStatus and Plugins to the group model, but the swagger spec wasn't
regenerated. CI's Swagger verification check caught the drift.

Part of #5525
Refs RFC stacklok/toolhive-rfcs#77
JAORMX added a commit to stacklok/toolhive that referenced this pull request Jun 30, 2026
… metadata (Wave 0, THV-0077)

Stand up the Phase-3 plugin install/list/info/uninstall foundation that
the later waves build on:

- plugins.MaterializationAdapter interface (pkg/plugins/adapter.go) generalizes
  skills.PathResolver to multi-component plugins. Instead of resolving a single
  skill path, the adapter owns extraction + optional config mutation for a
  multi-component plugin tree, because the materialization strategy differs per
  client (Claude Code = pure filesystem; Codex = FS cache + TOML mutation).
  ComponentType enum mirrors ociplugins.ComponentInventory keys
  (commands/agents/skills/hooks/mcpServers/lspServers).

- pluginsvc.service widens with store/groupManager/materializers/installer/
  gitResolver/pluginLookup fields and matching With* options, plus a
  pluginLock (per-(scope,name,projectRoot) mutex mirroring skillsvc.skillLock).
  New() defaults installer and gitResolver. The PluginService interface itself
  is NOT widened yet (that is Wave 2, after the methods exist); New still
  returns the narrowed Phase-2 interface.

- groups.Group gains a Plugins []string field, with AddPluginToGroup/
  RemovePluginFromAllGroups (pkg/groups/plugins.go) ported line-for-line from
  the skills analogues.

- client.clientAppConfig gains SupportsPlugins/PluginsGlobalPath/
  PluginsProjectPath/PluginsPlatformPrefix, populated for Claude Code
  (~/.claude/plugins) and Codex (~/.codex/plugins/cache). ClientManager gains
  SupportsPlugins/ListPluginSupportingClients/GetPluginPath
  (pkg/client/plugins.go), ported from the skills helpers. DiscoveryStatus
  surfaces SupportsPlugins.

- plugins.PluginInfo gains UnmaterializedComponents (per-client component
  types the adapter does not load) for the Phase-3 Info surface.
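
A keyed mutex of the kind described for pluginLock can be sketched as below. This is an assumed shape, not the code in the PR: entries are never evicted here, and the two-phase release mentioned later in this thread is not modeled.

```go
package main

import (
	"fmt"
	"sync"
)

// lockKey identifies one install target.
type lockKey struct{ scope, name, projectRoot string }

// pluginLock hands out one mutex per (scope, name, projectRoot), so
// operations on the same plugin serialize while different plugins proceed
// in parallel.
type pluginLock struct {
	mu    sync.Mutex
	locks map[lockKey]*sync.Mutex
}

func newPluginLock() *pluginLock {
	return &pluginLock{locks: make(map[lockKey]*sync.Mutex)}
}

// lock blocks until the keyed mutex is held and returns its unlock function.
func (l *pluginLock) lock(scope, name, projectRoot string) (unlock func()) {
	k := lockKey{scope, name, projectRoot}
	l.mu.Lock()
	m, ok := l.locks[k]
	if !ok {
		m = &sync.Mutex{}
		l.locks[k] = m
	}
	l.mu.Unlock() // release the map guard before waiting on the keyed mutex
	m.Lock()
	return m.Unlock
}

// countSerialized runs n goroutines that all touch the same key and
// increment an unsynchronized counter while holding the keyed lock.
func countSerialized(l *pluginLock, n int) int {
	var wg sync.WaitGroup
	counter := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			unlock := l.lock("user", "demo", "")
			defer unlock()
			counter++
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println(countSerialized(newPluginLock(), 100))
}
```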

Part of #5525
Refs RFC stacklok/toolhive-rfcs#77
JAORMX added a commit to stacklok/toolhive that referenced this pull request Jun 30, 2026
…entations (Phase 3, THV-0077)

Widen the plugin service from the Phase-2 build/validate/push/content surface
to the full install/uninstall/list/info lifecycle, plus the per-client
materialization seam:

- plugins.MaterializationAdapter gains DegradesOnProjectScope so Info can
  surface which clients degrade a project-scope install without re-running
  Materialize. ClaudeCode reports false (supports both scopes); Codex reports
  true (registration always lands in the user-scoped config.toml).

- adapters.ClaudeCodeAdapter: pure-filesystem extraction into
  ~/.claude/plugins/<name>/ via the hardened skills.Installer (path-traversal/
  symlink/oversize guards). No config mutation. Drops mcpServers/lspServers.

- adapters.CodexAdapter: cache extraction under ~/.codex/plugins/cache/<name>/
  + config.toml [plugins.<name>] table mutation. Reuses the shared TOML
  read/write helpers (exported as client.ReadTOMLConfig/WriteTOMLConfig) under
  fileutils.WithFileLock — no new TOML code. Drops commands/agents/lspServers.
  Dematerialize reverts its own [plugins.*] additions (and the empty table)
  while unrelated [mcp_servers.*]/[other] tables survive (exit-gate test).

- pluginsvc.install.go dispatches by reference type (git → OCI → registry
  name), with the per-(scope,name,projectRoot) pluginLock and two-phase
  release. install_oci.go enforces the name==repo-last-component consistency
  check (422 on mismatch). install_git.go reuses gitresolver's skill-agnostic
  helpers (ParseGitReference/IsGitReference/ResolveAuth/WriteFiles) but clones
  + reads .claude-plugin/plugin.json directly, since gitresolver.Resolve is
  skill-specific. install_extraction.go is the shared materialize+persist core
  with rollback; list.go mirrors skillsvc; info.go surfaces
  UnmaterializedComponents (static diff) and ProjectScopeDegradedClients;
  uninstall.go dematerializes per-client then deletes the record.

- plugins.PluginService widens with Install/Uninstall/List/Info; mocks
  regenerated.

- Removed dead WithInstaller/WithGitResolver wiring: adapters own their
  skills.Installer, and install_git uses WithGitClient (test seam) +
  gitresolver.ResolveAuth directly, so the service-level fields were never
  read.
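
The dispatch order (git, then OCI, then registry name) and the name/repository consistency check could look roughly like this. The classification heuristics are assumptions made for the sketch: the real code relies on gitresolver.IsGitReference and OCI reference parsing, and the HTTP 422 mapping is omitted.

```go
package main

import (
	"fmt"
	"strings"
)

type refKind string

const (
	refGit          refKind = "git"
	refOCI          refKind = "oci"
	refRegistryName refKind = "registry-name"
)

// classify applies the checks in dispatch order. Treating anything with
// '/', ':' or '@' as an OCI reference is a simplification.
func classify(ref string) refKind {
	switch {
	case strings.HasPrefix(ref, "git://"):
		return refGit
	case strings.ContainsAny(ref, "/:@"):
		return refOCI
	default:
		return refRegistryName
	}
}

// repoLastComponent extracts "my-plugin" from "ghcr.io/org/my-plugin:v1".
func repoLastComponent(ociRef string) string {
	repo, _, _ := strings.Cut(ociRef, "@") // drop a digest, if present
	if i := strings.LastIndex(repo, "/"); i >= 0 {
		repo = repo[i+1:]
	}
	name, _, _ := strings.Cut(repo, ":") // drop a tag, if present
	return name
}

// checkNameMatchesRepo enforces that the manifest name equals the last
// component of the repository path.
func checkNameMatchesRepo(manifestName, ociRef string) error {
	if got := repoLastComponent(ociRef); got != manifestName {
		return fmt.Errorf("plugin name %q does not match repository %q", manifestName, got)
	}
	return nil
}

func main() {
	for _, ref := range []string{"git://github.com/org/repo", "ghcr.io/org/my-plugin:v1", "my-plugin"} {
		fmt.Println(ref, "->", classify(ref))
	}
	fmt.Println(checkNameMatchesRepo("other", "ghcr.io/org/my-plugin:v1"))
}
```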

Part of #5525
Refs RFC stacklok/toolhive-rfcs#77
JAORMX added a commit to stacklok/toolhive that referenced this pull request Jun 30, 2026
Document the plugins system mirroring docs/arch/12-skills-system.md: manifest
format, OCI artifact layout, install dispatch (git/OCI/name),
MaterializationAdapter seam (Claude Code pure-FS vs Codex FS+TOML), component
inventory + dropped-component warnings, name/repo consistency check, extraction
safety, and per-client scope degradation. Add entry #14 to the docs/arch index
and mermaid graph.

Part of #5525
Refs RFC stacklok/toolhive-rfcs#77
JAORMX added a commit to stacklok/toolhive that referenced this pull request Jul 1, 2026
Mirror MaterializeRequest with a DematerializeRequest struct so the
two sides of the MaterializationAdapter interface stay symmetric — a
future client needing more context to revert cleanly gets it without
forcing an interface break. Replace DegradesOnProjectScope() bool with
a ScopeSupport struct carrying DegradesOnProjectScope plus an optional
Reason (not yet consumed by Info). Update both adapters, all callers,
regenerate mocks, and fix tests. Resolves PR #5685 review comments on
adapter symmetry and scope-support descriptor.

Part of #5525
Refs RFC stacklok/toolhive-rfcs#77
JAORMX added a commit to stacklok/toolhive that referenced this pull request Jul 1, 2026
Claude Code does not auto-discover plugins from the filesystem alone;
it requires an enabledPlugins entry plus an extraKnownMarketplaces
registration in settings.json. The adapter now writes a per-plugin
marketplace.json declaring the plugin under the toolhive marketplace
with a local source, and patches settings.json (hujson, under
WithFileLock + AtomicWriteFile 0600) to add the marketplace entry
pointing at the shared plugins parent directory and enable the plugin
as <name>@ToolHive. Dematerialize reverts both mutations and removes
the marketplace entry when no toolhive plugins remain. Also extracts
cleanupAfterRemove into helpers.go and adds HomeDir() to ClientManager.
Resolves the Claude Code blocker from PR #5685 review.
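
The revert rule above (drop the plugin's enable entry, then drop the marketplace registration only when no plugins from that marketplace remain) can be sketched on a decoded settings map. The shapes used here are assumptions: `enabledPlugins` is treated as an object keyed by `<name>@<marketplace>` and `extraKnownMarketplaces` as an object keyed by marketplace name. The helper name `disablePlugin` is hypothetical, and plain maps stand in for the hujson patching and file locking the commit describes.

```go
package main

import (
	"fmt"
	"strings"
)

// disablePlugin removes one plugin's enable entry and reports whether any
// other plugin from the same marketplace is still enabled. When none
// remain, the marketplace registration is removed as well.
func disablePlugin(settings map[string]any, name, marketplace string) bool {
	suffix := "@" + marketplace
	enabled, _ := settings["enabledPlugins"].(map[string]any)
	delete(enabled, name+suffix) // deleting from a nil map is a no-op
	for key := range enabled {
		if strings.HasSuffix(key, suffix) {
			return true // another plugin still needs the marketplace
		}
	}
	if known, ok := settings["extraKnownMarketplaces"].(map[string]any); ok {
		delete(known, marketplace)
	}
	return false
}

func main() {
	settings := map[string]any{
		"enabledPlugins": map[string]any{
			"a@ToolHive": true,
			"b@ToolHive": true,
			"c@other":    true,
		},
		"extraKnownMarketplaces": map[string]any{
			"ToolHive": map[string]any{},
			"other":    map[string]any{},
		},
	}
	fmt.Println(disablePlugin(settings, "a", "ToolHive"))
	fmt.Println(disablePlugin(settings, "b", "ToolHive"))
	fmt.Println(settings["extraKnownMarketplaces"])
}
```

Entries belonging to other marketplaces are left untouched, which is the property the revert is meant to preserve.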

Part of #5525
Refs RFC stacklok/toolhive-rfcs#77
JAORMX added a commit to stacklok/toolhive that referenced this pull request Jul 1, 2026
Codex does not read the invented [plugins.<name>] table with a path key.
Switch to the marketplace model: a shared
~/.agents/plugins/marketplace.json declares the toolhive marketplace with
a local source pointing at the stable plugins cache parent directory, and
~/.codex/config.toml holds enable state as [plugins."<name>@ToolHive"]
with enabled = true. Dematerialize removes the enable table and deletes
the marketplace file when no toolhive plugins remain.

The getPluginsMap helper now returns an error when the plugins key exists
but is not a table, preventing silent clobbering of a malformed config.

Resolves the Codex blocker from PR #5685 review.
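
The getPluginsMap behaviour described above can be illustrated on a decoded TOML document. This is a sketch only: the signature and the `map[string]any` representation are assumptions, and only the helper's name and its error-on-non-table rule come from the commit message.

```go
package main

import "fmt"

// getPluginsMap returns the "plugins" table from a decoded config.
// A missing key yields a fresh table; a key of any other type is an
// error, so a malformed config is never silently overwritten.
func getPluginsMap(cfg map[string]any) (map[string]any, error) {
	raw, ok := cfg["plugins"]
	if !ok {
		m := map[string]any{}
		cfg["plugins"] = m
		return m, nil
	}
	m, ok := raw.(map[string]any)
	if !ok {
		return nil, fmt.Errorf("plugins key is %T, not a table", raw)
	}
	return m, nil
}

func main() {
	// A config where "plugins" was mistakenly set to a string.
	bad := map[string]any{"plugins": "oops"}
	_, err := getPluginsMap(bad)
	fmt.Println(err)

	// A config without the key gets an empty table to fill in.
	good := map[string]any{}
	m, _ := getPluginsMap(good)
	m["demo@ToolHive"] = map[string]any{"enabled": true}
	fmt.Println(good)
}
```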

Part of #5525
Refs RFC stacklok/toolhive-rfcs#77
JAORMX added a commit to stacklok/toolhive that referenced this pull request Jul 1, 2026
The git install path forced every file to mode 0644, dropping the
executable bit that hook scripts and entry points need.
collectPluginFiles now calls f.Mode.ToOSFileMode() to carry the real git
mode through, so a git-sourced plugin lands identically to the same
plugin from OCI.

A subdir-aware name-consistency check (manifest name must match
gitRef.SkillName(), which returns the subdir last segment or repo last
segment) closes the bare-repo squatting vector with the same 422 the OCI
path returns.

The duplicated ref-classification and auth-resolution helpers are removed
from install_git.go in favor of the newly exported
gitresolver.CloneConfigForRef, ClientForURL, and CloneTimeout.

Resolves PR #5685 review comments on exec-bit, name consistency, and
gitresolver dedup.
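
The name-consistency rule described above (manifest name must equal the last segment of the subdirectory, or of the repository when no subdirectory is given) might be sketched as follows. The function names, the error type, and the stripping of a trailing `.git` are assumptions; only the rule itself and the 422 status come from the commit message.

```go
package main

import (
	"fmt"
	"net/http"
	"path"
	"strings"
)

// expectedName returns the subdir's last segment, or the repository's
// last segment when no subdir is given. Stripping ".git" is an assumption.
func expectedName(repoURL, subdir string) string {
	if s := strings.Trim(subdir, "/"); s != "" {
		return path.Base(s)
	}
	base := path.Base(strings.TrimRight(repoURL, "/"))
	return strings.TrimSuffix(base, ".git")
}

// nameMismatchError is a hypothetical error carrying an HTTP status.
type nameMismatchError struct{ want, got string }

func (e nameMismatchError) Error() string {
	return fmt.Sprintf("manifest name %q does not match %q", e.got, e.want)
}

func (nameMismatchError) StatusCode() int { return http.StatusUnprocessableEntity }

// checkName rejects a manifest whose name differs from the source location.
func checkName(manifestName, repoURL, subdir string) error {
	if want := expectedName(repoURL, subdir); manifestName != want {
		return nameMismatchError{want: want, got: manifestName}
	}
	return nil
}

func main() {
	fmt.Println(checkName("my-plugin", "https://example.com/org/my-plugin.git", ""))
	fmt.Println(checkName("popular-plugin", "https://example.com/org/repo", "plugins/other"))
}
```

A manifest that claims a well-known name from an unrelated repository or subdirectory fails the check, which is the squatting case the commit closes.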

Part of #5525
Refs RFC stacklok/toolhive-rfcs#77
JAORMX added a commit to stacklok/toolhive that referenced this pull request Jul 1, 2026
ValidatePluginName errors in Info and Uninstall were returned without a
status code, causing the transport layer to map a bad name to 500 instead
of 400. Both now wrap with httperr.WithCode(err, http.StatusBadRequest),
matching Install and skillsvc.

The three stale //nolint:unused directives on pluginLock are dropped (the
lock is actively used by install/uninstall flows) and the comment is
reworded to describe it as live.

Adds List 500-branch coverage, store/group error propagation tests, and
the all+explicit client rejection test.

Resolves PR #5685 review comments on status-code wrapping and stale
nolint.

Part of #5525
Refs RFC stacklok/toolhive-rfcs#77
JAORMX added a commit to stacklok/toolhive that referenced this pull request Jul 1, 2026
Add ExtractPlugin to the skills Installer interface using a 0755
permission mask (vs 0644 for skills) so plugin hook scripts keep
their executable bit through extraction. Both adapters now call
ExtractPlugin. The existing 0644 mask for skills is unchanged.
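
The effect of the two masks can be shown with a bitwise AND on the permission bits. This is a sketch under the assumption that "permission mask" means the extracted mode is the archived mode ANDed with the mask; the constant and function names are hypothetical.

```go
package main

import (
	"fmt"
	"io/fs"
)

const (
	skillMask  fs.FileMode = 0o644 // strips every executable bit
	pluginMask fs.FileMode = 0o755 // keeps executable bits, strips group/other write
)

// applyMask keeps only the permission bits allowed by the mask.
func applyMask(mode, mask fs.FileMode) fs.FileMode {
	return mode.Perm() & mask
}

func main() {
	hook := fs.FileMode(0o755) // an executable hook script
	fmt.Printf("%04o\n", uint32(applyMask(hook, skillMask)))
	fmt.Printf("%04o\n", uint32(applyMask(hook, pluginMask)))
}
```

Under the 0644 mask an executable script loses its exec bits, while the 0755 mask preserves them and still removes group and other write permission.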

Move the Codex hasToolhivePlugin check inside the removeCodexPlugin
file lock to eliminate a TOCTOU race where concurrent dematerialize
calls could read stale config state. removeCodexPlugin now returns
whether toolhive plugins remain, computed atomically with the
removal.

Replace os.IsNotExist with errors.Is(err, os.ErrNotExist) in both
adapters per go-style.md.

Part of #5525
Refs RFC stacklok/toolhive-rfcs#77
JAORMX added a commit to stacklok/toolhive that referenced this pull request Jul 2, 2026
… metadata (Wave 0, THV-0077)

Stand up the Phase-3 plugin install/list/info/uninstall foundation that
the later waves build on:

- plugins.MaterializationAdapter interface (pkg/plugins/adapter.go) generalizes
  skills.PathResolver to multi-component plugins. Instead of resolving a single
  skill path, the adapter owns extraction + optional config mutation for a
  multi-component plugin tree, because the materialization strategy differs per
  client (Claude Code = pure filesystem; Codex = FS cache + TOML mutation).
  ComponentType enum mirrors ociplugins.ComponentInventory keys
  (commands/agents/skills/hooks/mcpServers/lspServers).

- pluginsvc.service widens with store/groupManager/materializers/installer/
  gitResolver/pluginLookup fields and matching With* options, plus a
  pluginLock (per-(scope,name,projectRoot) mutex mirroring skillsvc.skillLock).
  New() defaults installer and gitResolver. The PluginService interface itself
  is NOT widened yet (that is Wave 2, after the methods exist); New still
  returns the narrowed Phase-2 interface.

- groups.Group gains a Plugins []string field, with AddPluginToGroup/
  RemovePluginFromAllGroups (pkg/groups/plugins.go) ported line-for-line from
  the skills analogues.

- client.clientAppConfig gains SupportsPlugins/PluginsGlobalPath/
  PluginsProjectPath/PluginsPlatformPrefix, populated for Claude Code
  (~/.claude/plugins) and Codex (~/.codex/plugins/cache). ClientManager gains
  SupportsPlugins/ListPluginSupportingClients/GetPluginPath
  (pkg/client/plugins.go), ported from the skills helpers. DiscoveryStatus
  surfaces SupportsPlugins.

- plugins.PluginInfo gains UnmaterializedComponents (per-client component
  types the adapter does not load) for the Phase-3 Info surface.
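
The per-(scope,name,projectRoot) pluginLock mentioned in the list above is a keyed mutex: operations on the same plugin serialize, while different plugins proceed independently. A minimal sketch, with hypothetical type and method names and no claim to match the real skillsvc.skillLock:

```go
package main

import (
	"fmt"
	"sync"
)

// lockKey identifies one installation target.
type lockKey struct {
	scope, name, projectRoot string
}

// keyedLocks hands out one mutex per key, creating it on first use.
type keyedLocks struct {
	mu    sync.Mutex
	locks map[lockKey]*sync.Mutex
}

func (k *keyedLocks) get(key lockKey) *sync.Mutex {
	k.mu.Lock()
	defer k.mu.Unlock()
	if k.locks == nil {
		k.locks = make(map[lockKey]*sync.Mutex)
	}
	m, ok := k.locks[key]
	if !ok {
		m = &sync.Mutex{}
		k.locks[key] = m
	}
	return m
}

func main() {
	var kl keyedLocks
	a := lockKey{scope: "user", name: "demo"}
	b := lockKey{scope: "project", name: "demo", projectRoot: "/work/app"}

	kl.get(a).Lock()
	// Same key: already held. Different key: free.
	fmt.Println(kl.get(a).TryLock(), kl.get(b).TryLock())
}
```

This sketch never evicts entries, so the map grows with the number of distinct keys; a production version would likely need cleanup.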

Part of #5525
Refs RFC stacklok/toolhive-rfcs#77
JAORMX added a commit to stacklok/toolhive that referenced this pull request Jul 2, 2026
…entations (Phase 3, THV-0077)

Widen the plugin service from the Phase-2 build/validate/push/content surface
to the full install/uninstall/list/info lifecycle, plus the per-client
materialization seam:

- plugins.MaterializationAdapter gains DegradesOnProjectScope so Info can
  surface which clients degrade a project-scope install without re-running
  Materialize. ClaudeCode reports false (supports both scopes); Codex reports
  true (registration always lands in the user-scoped config.toml).

- adapters.ClaudeCodeAdapter: pure-filesystem extraction into
  ~/.claude/plugins/<name>/ via the hardened skills.Installer (path-traversal/
  symlink/oversize guards). No config mutation. Drops mcpServers/lspServers.

- adapters.CodexAdapter: cache extraction under ~/.codex/plugins/cache/<name>/
  + config.toml [plugins.<name>] table mutation. Reuses the shared TOML
  read/write helpers (exported as client.ReadTOMLConfig/WriteTOMLConfig) under
  fileutils.WithFileLock — no new TOML code. Drops commands/agents/lspServers.
  Dematerialize reverts its own [plugins.*] additions (and the empty table)
  while unrelated [mcp_servers.*]/[other] tables survive (exit-gate test).

- pluginsvc.install.go dispatches by reference type (git → OCI → registry
  name), with the per-(scope,name,projectRoot) pluginLock and two-phase
  release. install_oci.go enforces the name==repo-last-component consistency
  check (422 on mismatch). install_git.go reuses gitresolver's skill-agnostic
  helpers (ParseGitReference/IsGitReference/ResolveAuth/WriteFiles) but clones
  + reads .claude-plugin/plugin.json directly, since gitresolver.Resolve is
  skill-specific. install_extraction.go is the shared materialize+persist core
  with rollback; list.go mirrors skillsvc; info.go surfaces
  UnmaterializedComponents (static diff) and ProjectScopeDegradedClients;
  uninstall.go dematerializes per-client then deletes the record.

- plugins.PluginService widens with Install/Uninstall/List/Info; mocks
  regenerated.

- Removed dead WithInstaller/WithGitResolver wiring: adapters own their
  skills.Installer, and install_git uses WithGitClient (test seam) +
  gitresolver.ResolveAuth directly, so the service-level fields were never
  read.
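
The install dispatch order in the list above (git, then OCI, then registry name) could be sketched as a small classifier. Only the `git://` prefix and the ordering come from the PR text; the OCI heuristic (a reference containing `/`, `:`, or `@`) and all identifiers here are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

type refKind string

const (
	kindGit  refKind = "git"
	kindOCI  refKind = "oci"
	kindName refKind = "registry-name"
)

// classify checks the most specific form first and falls through to a
// plain registry name. The OCI test is a rough heuristic, not the real one.
func classify(ref string) refKind {
	switch {
	case strings.HasPrefix(ref, "git://"):
		return kindGit
	case strings.ContainsAny(ref, "/:@"):
		return kindOCI
	default:
		return kindName
	}
}

func main() {
	for _, ref := range []string{
		"git://example.com/org/my-plugin",
		"ghcr.io/org/my-plugin:v1",
		"my-plugin",
	} {
		fmt.Println(ref, "->", classify(ref))
	}
}
```

Checking the git prefix first matters because a `git://` URL also contains `/` and `:` and would otherwise be mistaken for an OCI reference.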

Part of #5525
Refs RFC stacklok/toolhive-rfcs#77

4 participants