Write three new or updated architecture documents that formally capture the local-mode vMCP story now that Phase 4 (optimizer flag wiring) is complete: docs/arch/vmcp-local.md for the thv vmcp CLI surface and optimizer lifecycle, docs/arch/vmcp-library.md for the pkg/vmcp/ library embedding pattern and stability guarantees, and an update to docs/arch/10-virtual-mcp-architecture.md to cross-reference local mode alongside the existing Kubernetes-oriented content. Also add deprecation guidance for the StacklokLabs/mcp-optimizer Python project now that the Go-native optimizer is stable.
Context
Phase 5 of RFC THV-0059 is documentation-only work that can proceed once #4887 lands (the full optimizer lifecycle is stable and observable). The existing docs/arch/10-virtual-mcp-architecture.md covers only the Kubernetes side of vMCP (CRD-managed VirtualMCPServer, dynamic backend discovery, operator status reporting). Two new files are needed to document the local CLI path introduced in this plan:
docs/arch/vmcp-local.md — describes thv vmcp serve/validate/init, quick-mode config generation, three optimizer tiers (Tier 0 no optimizer, Tier 1 FTS5-only, Tier 2 managed TEI container), and the TEI container lifecycle (naming, idempotent reuse, health polling, fail-fast, graceful shutdown).
docs/arch/vmcp-library.md — formalizes the library embedding pattern already used in production by github.com/stacklok/brood-box, provides the pkg/vmcp/ stability table from RFC THV-0059 (Stable vs Experimental per package), and documents the doc.go annotation convention for package-level stability declarations.
The existing architecture doc needs a single new "Local CLI Mode" section that links to these new docs and updates the "Deployment" section to remove the implication that cmd/vmcp/ is the only CLI entry point.
StacklokLabs/mcp-optimizer is the Python predecessor that thv vmcp serve --optimizer replaces. Now that the Go-native solution ships in every ToolHive release and is stable, a deprecation notice pointing users to thv vmcp is appropriate.
Dependencies: Depends on #4887 (use Temp ID until GitHub issue number is known)
Blocks: (none)
Acceptance Criteria
docs/arch/vmcp-local.md exists and covers all of the following:
Overview diagram showing the thv vmcp serve request path (Cobra CLI → pkg/vmcp/cli/serve.go → vMCP server → backends)
Zero-config quick mode: how --group triggers in-memory YAML config generation, the required groupRef field, and the 127.0.0.1-only binding security requirement
[…] thv vmcp init → thv vmcp validate → thv vmcp serve workflow
Optimizer tier table (Tier 0, Tier 1 FTS5-only, Tier 2 managed TEI) with flag names and behavior
TEI container lifecycle: naming scheme (thv-embedding-<model-short-hash>), idempotent create-or-reuse logic, health polling with exponential backoff (30–60 s first-start budget), fail-fast on explicit --optimizer-embedding failure, graceful stop on shutdown
A note on the ARM64/Apple Silicon Rosetta 2 emulation path for the amd64-only TEI CPU image
docs/arch/vmcp-library.md exists and covers all of the following:
The library embedding pattern: how to import pkg/vmcp/ packages, the brood-box reference implementation (github.com/stacklok/brood-box/internal/infra/mcp/), and what guarantees ToolHive provides
The pkg/vmcp/ stability table per RFC THV-0059 (every sub-package listed as Stable, Experimental, or Internal with a one-line rationale)
The doc.go annotation convention: example showing how //go:doc:stability (or equivalent) is used in package docs to declare stability level
Compatibility guarantees for Stable packages: no breaking API changes, no import-path renames, semver-aligned deprecation policy
Guidance for downstream embedders on how to pin and upgrade
docs/arch/10-virtual-mcp-architecture.md is updated:
A new "Local CLI Mode" section added (or the "Deployment" section expanded) that describes thv vmcp alongside the existing standalone vmcp binary reference
Cross-references to the new vmcp-local.md and vmcp-library.md documents added to the "Related Documentation" section
The existing K8s-oriented content is not altered — only additive changes
A deprecation notice for StacklokLabs/mcp-optimizer is included in docs/arch/vmcp-local.md (or a standalone note in vmcp-library.md) that:
Clearly states the Python mcp-optimizer project is deprecated in favor of thv vmcp serve --optimizer
Provides a migration path (flag equivalents, feature parity notes)
Links to the StacklokLabs/mcp-optimizer repository for historical reference
docs/arch/README.md is updated to list the two new documents in the Documentation Index with their descriptions
All internal cross-references (relative Markdown links) resolve correctly
All code paths referenced in the new docs (file names, function names, flag names) are accurate and match the state of the codebase after Wire optimizer flags into thv vmcp serve #4887 lands
Code reviewed and approved
Technical Approach
Recommended Implementation
This is a pure documentation PR — no Go source changes. The deliverables are two new Markdown files and additive edits to two existing files.
docs/arch/vmcp-local.md should follow the template from docs/arch/README.md (# Topic Name, ## Overview, ## Why This Exists, ## How It Works, ## Key Components, ## Implementation, ## Related Documentation). Use a Mermaid sequence diagram for the TEI lifecycle and a simple table for the optimizer tiers. The TEI lifecycle section should be modelled on the sequence diagram style used in docs/arch/01-deployment-modes.md (detached process model diagram).
docs/arch/vmcp-library.md should open with a "Why a Stability Table" rationale (downstream consumers like brood-box need predictability), then present the stability table as a Markdown table, then give the doc.go annotation example, then the compatibility guarantees.
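To make the doc.go annotation idea concrete, here is a small Go sketch. The issue leaves the exact marker open ("//go:doc:stability (or equivalent)"), so the `Stability:` line used below is an assumed stand-in, and the package description text and the helper name `stabilityOf` are hypothetical. The sketch shows one possible doc.go and how a tool could read the declared level back using the standard go/parser package.

```go
package main

import (
	"errors"
	"fmt"
	"go/parser"
	"go/token"
	"strings"
)

// exampleDocGo is a hypothetical doc.go. The "Stability:" line is an assumed
// convention, not one confirmed by RFC THV-0059.
const exampleDocGo = `// Package optimizer is a placeholder description for this sketch.
//
// Stability: Experimental
package optimizer
`

// stabilityOf parses Go source and returns the level declared on a
// "Stability:" line of the package comment.
func stabilityOf(src string) (string, error) {
	f, err := parser.ParseFile(token.NewFileSet(), "doc.go", src, parser.ParseComments)
	if err != nil {
		return "", err
	}
	if f.Doc == nil {
		return "", errors.New("no package comment")
	}
	for _, line := range strings.Split(f.Doc.Text(), "\n") {
		line = strings.TrimSpace(line)
		if strings.HasPrefix(line, "Stability:") {
			return strings.TrimSpace(strings.TrimPrefix(line, "Stability:")), nil
		}
	}
	return "", errors.New("no Stability line in package comment")
}

func main() {
	level, err := stabilityOf(exampleDocGo)
	fmt.Println(level, err)
}
```

A plain comment line keeps the declaration visible in rendered package docs, which is one reason to prefer it over a directive-style marker; whichever form the RFC settles on should replace the assumed one here.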
Update to docs/arch/10-virtual-mcp-architecture.md: Add a short section after the existing "Deployment" section (currently at the bottom of the file, which lists cmd/thv-operator/... and cmd/vmcp/). The new section should be titled "Local CLI Mode" and describe thv vmcp as the recommended path for local/non-Kubernetes use. Add the two new files to the "Related Documentation" section at the bottom.
docs/arch/README.md: Insert two new entries (items 14 and 15) in the Documentation Index after item 13 (Skills System). Update the Architecture Map Mermaid diagram to include vMCPLocal and vMCPLibrary nodes connected from the existing vMCP node.
Deprecation notice: Add a top-level ## Migration from StacklokLabs/mcp-optimizer section in docs/arch/vmcp-local.md. Map the Python CLI flags to thv vmcp equivalents, and note that the Go-native optimizer provides FTS5 keyword search (Tier 1) and TEI semantic search (Tier 2) as feature-complete replacements.
Patterns & Frameworks
Docs structure: follow docs/arch/README.md template — ## Overview, ## Why This Exists, ## How It Works, ## Key Components, ## Implementation, ## Related Documentation
Mermaid diagrams: match the style used in existing arch docs — sequenceDiagram for lifecycle flows, graph TB for component hierarchies, graph LR for pipelines; use consistent node fill colors (#90caf9 blue for ToolHive components, #ffb74d orange for containers/external services)
Stability table format: use a Markdown table with columns Package, Stability, Rationale; list every sub-package under pkg/vmcp/ (aggregator, auth, cache, client, composer, config, discovery, health, k8s, optimizer, router, schema, server, session, status, workloads, and the new cli/)
Cross-references: use relative paths (e.g., [Local CLI Mode](vmcp-local.md)) consistent with existing links in the arch docs
Code pointers: use backtick paths relative to repo root (e.g., `pkg/vmcp/cli/serve.go`) — consistent with the style throughout the existing arch docs
No nolint comments, no Go changes: this item contains zero Go source modifications
Code Pointers
docs/arch/10-virtual-mcp-architecture.md — File to update; the "Deployment" section (line 293–300 area) and "Related Documentation" section (bottom) are the insertion points
docs/arch/README.md — Documentation Index (items 1–13) and Architecture Map Mermaid diagram need the two new entries
docs/arch/01-deployment-modes.md — Reference for Mermaid diagram style (detached process model sequence diagram); and for the table-based mode comparison format to mirror in the optimizer tier table
cmd/thv/app/vmcp.go (created by Add thv vmcp serve and thv vmcp validate subcommands #4883, extended by Implement zero-config quick mode for thv vmcp serve #4886 and Wire optimizer flags into thv vmcp serve #4887) — Source of truth for flag names (--group, --config, --optimizer, --optimizer-embedding, --embedding-model, --embedding-image) and subcommand names (serve, validate, init)
pkg/vmcp/cli/serve.go (created by Extract shared vMCP logic into pkg/vmcp/cli/ (serve + validate) #4879, extended through Wire optimizer flags into thv vmcp serve #4887) — Source of truth for the optimizer wiring logic, quick-mode config generation, and TEI lifecycle integration; document the ServeConfig struct fields
pkg/vmcp/cli/embedding_manager.go (created by Implement EmbeddingServiceManager in pkg/vmcp/cli/ #4884) — Source of truth for TEI container naming convention, health-polling logic, and idempotent reuse; document EmbeddingServiceManagerConfig.Model and .Image fields
pkg/vmcp/cli/init.go (created by Add init.go to pkg/vmcp/cli/ for config scaffolding #4882) — Source of truth for the config scaffolding template and --group/--config/--output flags
pkg/vmcp/optimizer/optimizer.go — GetAndValidateConfig, NewOptimizerFactory — describe these as the optimizer activation calls in the architecture doc
pkg/vmcp/config/config.go — Config.Optimizer *OptimizerConfig, OptimizerConfig.EmbeddingService — describe how Tier 3 (config-file) passes the TEI URL directly
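The naming scheme thv-embedding-<model-short-hash> is what makes create-or-reuse idempotent: the same model always maps to the same container name, so a second start can find the existing container instead of creating another. The Go sketch below illustrates that property only. The hash function (SHA-256), the 8-character length, the helper name `containerNameFor`, and the sample model string are all assumptions; pkg/vmcp/cli/embedding_manager.go remains the source of truth for the real derivation.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// containerNameFor is a hypothetical helper that derives a deterministic
// container name from the embedding model identifier. SHA-256 and the
// 8-hex-character suffix are illustrative choices, not the confirmed scheme.
func containerNameFor(model string) string {
	sum := sha256.Sum256([]byte(model))
	return "thv-embedding-" + hex.EncodeToString(sum[:])[:8]
}

func main() {
	// The model string is only an example input.
	first := containerNameFor("BAAI/bge-small-en-v1.5")
	second := containerNameFor("BAAI/bge-small-en-v1.5")
	other := containerNameFor("some-other-model")

	fmt.Println(first)
	fmt.Println("same model, same name:", first == second)
	fmt.Println("different model, different name:", first != other)
}
```

When documenting EmbeddingServiceManagerConfig.Model and .Image, it is worth stating which fields feed the hash, since that determines when a changed setting results in a new container rather than reuse of an old one.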
Component Interfaces
The following is the optimizer tier table to be reproduced in docs/arch/vmcp-local.md:
| Tier | Flag(s) | Optimizer | External Service | Exposed Tools |
|------|---------|-----------|-----------------|---------------|
| 0 | (none) | None | None | All backend tools passed through |
| 1 | `--optimizer` | FTS5 keyword (SQLite in-process) | None | `find_tool`, `call_tool` only |
| 2 | `--optimizer-embedding` | FTS5 + TEI semantic | Managed TEI container | `find_tool`, `call_tool` only |
| 3 | `optimizer.embeddingService` in config YAML | FTS5 + external embedding service | User-managed | `find_tool`, `call_tool` only |
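The tier table can be read as a small decision function. The Go sketch below encodes it under stated assumptions: the type and function names are hypothetical, and the precedence between a config-file embedding service and the CLI flags is a guess made for the example, so the real behavior should be confirmed against pkg/vmcp/cli/serve.go once #4887 lands.

```go
package main

import "fmt"

// Tier mirrors the rows of the optimizer tier table.
type Tier int

const (
	TierNone              Tier = iota // 0: no optimizer
	TierFTS5                          // 1: --optimizer
	TierManagedTEI                    // 2: --optimizer-embedding
	TierExternalEmbedding             // 3: optimizer.embeddingService in config YAML
)

// options is a hypothetical view of the inputs that select a tier.
type options struct {
	Optimizer           bool   // --optimizer
	OptimizerEmbedding  bool   // --optimizer-embedding
	EmbeddingServiceURL string // optimizer.embeddingService from the config file
}

// tierFor picks a tier. Assumed precedence: config-file service first,
// then the managed TEI flag, then the FTS5-only flag.
func tierFor(o options) Tier {
	switch {
	case o.EmbeddingServiceURL != "":
		return TierExternalEmbedding
	case o.OptimizerEmbedding:
		return TierManagedTEI
	case o.Optimizer:
		return TierFTS5
	default:
		return TierNone
	}
}

// exposedTools follows the last table column: Tier 0 passes backend tools
// through, every other tier exposes only find_tool and call_tool.
func exposedTools(t Tier, backendTools []string) []string {
	if t == TierNone {
		return backendTools
	}
	return []string{"find_tool", "call_tool"}
}

func main() {
	backend := []string{"fetch", "search", "read_file"}
	for _, o := range []options{
		{},
		{Optimizer: true},
		{OptimizerEmbedding: true},
		{EmbeddingServiceURL: "http://localhost:8080"},
	} {
		t := tierFor(o)
		fmt.Printf("tier %d exposes %v\n", t, exposedTools(t, backend))
	}
}
```

Writing the selection out this way also surfaces the questions the doc should answer explicitly, such as what happens when both `--optimizer-embedding` and a config-file embedding service are supplied.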
The following is the pkg/vmcp/ stability table structure to be reproduced in docs/arch/vmcp-library.md:
| Package | Stability | Notes |
|---------|-----------|-------|
| `pkg/vmcp/config` | Stable | Config structs and YAML loader; public API |
| `pkg/vmcp/aggregator` | Stable | Backend discovery and capability merge |
| `pkg/vmcp/router` | Stable | Request routing and tool name translation |
| `pkg/vmcp/server` | Stable | Server constructor and lifecycle |
| `pkg/vmcp/session` | Stable | Session factory and per-session routing table |
| `pkg/vmcp/auth` | Stable | Incoming/outgoing auth interfaces |
| `pkg/vmcp/client` | Stable | Backend HTTP client |
| `pkg/vmcp/health` | Stable | Health monitor |
| `pkg/vmcp/status` | Stable | StatusReporter interface and CLI/K8s reporters |
| `pkg/vmcp/optimizer` | Experimental | Optimizer interface; TEI integration may evolve |
| `pkg/vmcp/cli` | Experimental | New in this plan; API may change before stabilization |
| `pkg/vmcp/composer` | Experimental | Composite tool DAG executor |
| `pkg/vmcp/cache` | Internal | Token cache; not for external use |
| `pkg/vmcp/discovery` | Internal | Discovery middleware; use via aggregator |
| `pkg/vmcp/k8s` | Internal | Kubernetes-specific discovery; not for local embedding |
| `pkg/vmcp/workloads` | Internal | Backend workload helpers for K8s mode |
| `pkg/vmcp/schema` | Internal | MCP schema parsing; subject to change |
Note: the exact stability ratings should be verified against the final state of RFC THV-0059 before publishing. The table above is derived from the RFC's stability section and the intake document. If any ratings differ in the merged RFC, use the RFC as the authoritative source.
Testing Strategy
This item produces no Go source code; there are no unit or integration tests to write.
Manual verification checklist (reviewer should confirm before approving):
[…] Wire optimizer flags into thv vmcp serve #4887 is merged
[…] docs/arch/ resolve (no 404s)
[…] Add thv vmcp serve and thv vmcp validate subcommands #4883, Implement zero-config quick mode for thv vmcp serve #4886, and Wire optimizer flags into thv vmcp serve #4887
[…] StacklokLabs/mcp-optimizer and the Go-native replacement flags
Out of Scope
[…] *.go) — this is a documentation-only PR
[…] docs-website repository — those are a separate follow-up
[…] README.md at the repo root — natural follow-up after this arch doc lands
[…] thv vmcp status documentation — the subcommand is deferred per RFC open questions
[…] vmcp binary beyond noting it exists and is preserved for K8s use
References
docs/arch/10-virtual-mcp-architecture.md — Existing K8s-oriented vMCP architecture doc; the file being updated
docs/arch/README.md — Documentation index and template; guides the new file structure
docs/arch/01-deployment-modes.md — Reference for Mermaid diagram style and table format
Wire optimizer flags into thv vmcp serve #4887 — Upstream dependency; optimizer flags must be fully wired before this doc can be accurate