Architecture documentation for local vMCP #4890

Description

@yrobla

Write three new or updated architecture documents that formally capture the local-mode vMCP story now that Phase 4 (optimizer flag wiring) is complete:

  • docs/arch/vmcp-local.md for the thv vmcp CLI surface and optimizer lifecycle
  • docs/arch/vmcp-library.md for the pkg/vmcp/ library embedding pattern and stability guarantees
  • An update to docs/arch/10-virtual-mcp-architecture.md to cross-reference local mode alongside the existing Kubernetes-oriented content

Also add deprecation guidance for the StacklokLabs/mcp-optimizer Python project now that the Go-native optimizer is stable.

Context

Phase 5 of RFC THV-0059 is documentation-only work that can proceed once #4887 lands, at which point the full optimizer lifecycle is stable and observable. The existing docs/arch/10-virtual-mcp-architecture.md covers only the Kubernetes side of vMCP (CRD-managed VirtualMCPServer, dynamic backend discovery, operator status reporting). Two new files are needed to document the local CLI path introduced in this plan:

  • docs/arch/vmcp-local.md — describes thv vmcp serve/validate/init, quick-mode config generation, the optimizer tiers (Tier 0 no optimizer, Tier 1 FTS5-only, Tier 2 managed TEI container, Tier 3 user-managed embedding service set in the config file), and the TEI container lifecycle (naming, idempotent reuse, health polling, fail-fast, graceful shutdown).
  • docs/arch/vmcp-library.md — formalizes the library embedding pattern already used in production by github.com/stacklok/brood-box, provides the pkg/vmcp/ stability table from RFC THV-0059 (Stable vs Experimental per package), and documents the doc.go annotation convention for package-level stability declarations.

The existing architecture doc needs a single new "Local CLI Mode" section that links to these new docs and updates the "Deployment" section to remove the implication that cmd/vmcp/ is the only CLI entry point.

StacklokLabs/mcp-optimizer is the Python predecessor that thv vmcp serve --optimizer replaces. Now that the Go-native solution ships in every ToolHive release and is stable, a deprecation notice pointing users to thv vmcp is appropriate.

Dependencies: Depends on #4887
Blocks: (none)

Acceptance Criteria

  • docs/arch/vmcp-local.md exists and covers all of the following:
    • Overview diagram showing the thv vmcp serve request path (Cobra CLI → pkg/vmcp/cli/serve.go → vMCP server → backends)
    • Zero-config quick mode: how --group triggers in-memory YAML config generation, the required groupRef field, and the 127.0.0.1-only binding security requirement
    • Config-file mode: the thv vmcp init → thv vmcp validate → thv vmcp serve workflow
    • Optimizer tier table (Tier 0, Tier 1 FTS5-only, Tier 2 managed TEI, Tier 3 external embedding service via config file) with flag or config-key names and behavior
    • TEI container lifecycle: naming scheme (thv-embedding-<model-short-hash>), idempotent create-or-reuse logic, health polling with exponential backoff (30–60 s first-start budget), fail-fast on explicit --optimizer-embedding failure, graceful stop on shutdown
    • A note on the ARM64/Apple Silicon Rosetta 2 emulation path for the amd64-only TEI CPU image
  • docs/arch/vmcp-library.md exists and covers all of the following:
    • The library embedding pattern: how to import pkg/vmcp/ packages, the brood-box reference implementation (github.com/stacklok/brood-box/internal/infra/mcp/), and what guarantees ToolHive provides
    • The pkg/vmcp/ stability table per RFC THV-0059 (every sub-package listed as Stable, Experimental, or Internal with a one-line rationale)
    • The doc.go annotation convention: example showing how //go:doc:stability (or equivalent) is used in package docs to declare stability level
    • Compatibility guarantees for Stable packages: no breaking API changes, no import-path renames, semver-aligned deprecation policy
    • Guidance for downstream embedders on how to pin and upgrade
  • docs/arch/10-virtual-mcp-architecture.md is updated:
    • A new "Local CLI Mode" section added (or the "Deployment" section expanded) that describes thv vmcp alongside the existing standalone vmcp binary reference
    • Cross-references to the new vmcp-local.md and vmcp-library.md documents added to the "Related Documentation" section
    • The existing K8s-oriented content is not altered — only additive changes
  • A deprecation notice for StacklokLabs/mcp-optimizer is included in docs/arch/vmcp-local.md (or a standalone note in vmcp-library.md) that:
    • Clearly states the Python mcp-optimizer project is deprecated in favor of thv vmcp serve --optimizer
    • Provides a migration path (flag equivalents, feature parity notes)
    • Links to the StacklokLabs/mcp-optimizer repository for historical reference
  • docs/arch/README.md is updated to list the two new documents in the Documentation Index with their descriptions
  • All internal cross-references (relative Markdown links) resolve correctly
  • All code paths referenced in the new docs (file names, function names, flag names) are accurate and match the state of the codebase after #4887 ("Wire optimizer flags into thv vmcp serve") lands
  • Code reviewed and approved
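
To make the TEI naming and idempotent create-or-reuse criterion above concrete, here is a minimal Go sketch. It is not the code in pkg/vmcp/cli/embedding_manager.go: the hash function (SHA-256), the 8-character suffix length, the `ensureContainer` helper, and the in-memory map standing in for a container runtime lookup are all assumptions made for illustration. Only the `thv-embedding-<model-short-hash>` shape comes from this issue.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// shortHashLen is an assumed length for the "<model-short-hash>" suffix.
const shortHashLen = 8

// embeddingContainerName derives a deterministic name from the embedding
// model identifier, so the same model always maps to the same container.
// SHA-256 is an assumption; the real implementation may hash differently.
func embeddingContainerName(model string) string {
	sum := sha256.Sum256([]byte(model))
	return "thv-embedding-" + hex.EncodeToString(sum[:])[:shortHashLen]
}

// ensureContainer sketches create-or-reuse. The existing map is a
// hypothetical stand-in for asking the container runtime what is running.
func ensureContainer(model string, existing map[string]bool) (name string, created bool) {
	name = embeddingContainerName(model)
	if existing[name] {
		return name, false // already present: reuse it
	}
	existing[name] = true // stand-in for actually creating the container
	return name, true
}

func main() {
	running := map[string]bool{}
	// The model identifier below is only an example value.
	name, created := ensureContainer("BAAI/bge-small-en-v1.5", running)
	fmt.Println(name, "created:", created)
	name, created = ensureContainer("BAAI/bge-small-en-v1.5", running)
	fmt.Println(name, "created:", created)
}
```

Because the name depends only on the model identifier, a second call for the same model finds the existing container instead of creating a new one, which is the property the architecture doc should describe.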

Technical Approach

Recommended Implementation

This is a pure documentation PR — no Go source changes. The deliverables are two new Markdown files and additive edits to two existing files.

docs/arch/vmcp-local.md should follow the template from docs/arch/README.md (# Topic Name, ## Overview, ## Why This Exists, ## How It Works, ## Key Components, ## Implementation, ## Related Documentation). Use a Mermaid sequence diagram for the TEI lifecycle and a simple table for the optimizer tiers. The TEI lifecycle section should be modelled on the sequence diagram style used in docs/arch/01-deployment-modes.md (detached process model diagram).

docs/arch/vmcp-library.md should open with a "Why a Stability Table" rationale (downstream consumers like brood-box need predictability), then present the stability table as a Markdown table, then give the doc.go annotation example, then the compatibility guarantees.
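
As a starting point for the doc.go annotation example, the sketch below shows one possible shape of the convention and a small helper that reads it back. The `//go:doc:stability` spelling is taken from this issue and is not a Go toolchain directive; the value syntax (`stable`), the sample package comment, and the `stabilityLevel` helper are assumptions for illustration. The final convention should follow RFC THV-0059.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// exampleDocGo is a hypothetical doc.go for pkg/vmcp/config.
const exampleDocGo = `// Package config provides the vMCP configuration structs and YAML loader.
//
//go:doc:stability stable
package config
`

// stabilityDirective is the assumed annotation prefix.
const stabilityDirective = "//go:doc:stability "

// stabilityLevel scans the source of a doc.go file and returns the declared
// stability level, if any. Only lines before the package clause are read.
func stabilityLevel(src string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, "package ") {
			break // annotations after the package clause are ignored
		}
		if strings.HasPrefix(line, stabilityDirective) {
			return strings.TrimSpace(strings.TrimPrefix(line, stabilityDirective)), true
		}
	}
	return "", false
}

func main() {
	level, ok := stabilityLevel(exampleDocGo)
	fmt.Println(level, ok)
}
```

A helper like this could back a CI check that every package under pkg/vmcp/ declares a level, though whether such a check exists or is planned is not stated in this issue.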

Update to docs/arch/10-virtual-mcp-architecture.md: Add a short section after the existing "Deployment" section (currently near the bottom of the file, which lists cmd/thv-operator/... and cmd/vmcp/). The new section should be titled "Local CLI Mode" and describe thv vmcp as the recommended path for local/non-Kubernetes use. Add the two new files to the "Related Documentation" section at the bottom.

docs/arch/README.md: Insert two new entries (items 14 and 15) in the Documentation Index after item 13 (Skills System). Update the Architecture Map Mermaid diagram to include vMCPLocal and vMCPLibrary nodes connected from the existing vMCP node.

Deprecation notice: Add a top-level ## Migration from StacklokLabs/mcp-optimizer section in docs/arch/vmcp-local.md. Map the Python CLI flags to their thv vmcp equivalents, and note that the Go-native optimizer provides FTS5 keyword search (Tier 1) and TEI semantic search (Tier 2) as feature-complete replacements.

Patterns & Frameworks

  • Docs structure: follow docs/arch/README.md template — ## Overview, ## Why This Exists, ## How It Works, ## Key Components, ## Implementation, ## Related Documentation
  • Mermaid diagrams: match the style used in existing arch docs — sequenceDiagram for lifecycle flows, graph TB for component hierarchies, graph LR for pipelines; use consistent node fill colors (#90caf9 blue for ToolHive components, #ffb74d orange for containers/external services)
  • Stability table format: use a Markdown table with columns Package, Stability, Rationale; list every sub-package under pkg/vmcp/ (aggregator, auth, cache, client, composer, config, discovery, health, k8s, optimizer, router, schema, server, session, status, workloads, and the new cli/)
  • Cross-references: use relative paths (e.g., [Local CLI Mode](vmcp-local.md)) consistent with existing links in the arch docs
  • Code pointers: use backtick paths relative to repo root (e.g., `pkg/vmcp/cli/serve.go`) — consistent with the style throughout the existing arch docs
  • No nolint comments, no Go changes: this item contains zero Go source modifications

Code Pointers

  • docs/arch/10-virtual-mcp-architecture.md — File to update; the "Deployment" section (around lines 293–300) and the "Related Documentation" section (bottom) are the insertion points
  • docs/arch/README.md — Documentation Index (items 1–13) and Architecture Map Mermaid diagram need the two new entries
  • docs/arch/01-deployment-modes.md — Reference for Mermaid diagram style (detached process model sequence diagram); and for the table-based mode comparison format to mirror in the optimizer tier table
  • cmd/thv/app/vmcp.go (created by #4883 "Add thv vmcp serve and thv vmcp validate subcommands", extended by #4886 "Implement zero-config quick mode for thv vmcp serve" and #4887 "Wire optimizer flags into thv vmcp serve") — Source of truth for flag names (--group, --config, --optimizer, --optimizer-embedding, --embedding-model, --embedding-image) and subcommand names (serve, validate, init)
  • pkg/vmcp/cli/serve.go (created by #4879 "Extract shared vMCP logic into pkg/vmcp/cli/ (serve + validate)", extended through #4887 "Wire optimizer flags into thv vmcp serve") — Source of truth for the optimizer wiring logic, quick-mode config generation, and TEI lifecycle integration; document the ServeConfig struct fields
  • pkg/vmcp/cli/embedding_manager.go (created by #4884 "Implement EmbeddingServiceManager in pkg/vmcp/cli/") — Source of truth for TEI container naming convention, health-polling logic, and idempotent reuse; document EmbeddingServiceManagerConfig.Model and .Image fields
  • pkg/vmcp/cli/init.go (created by #4882 "Add init.go to pkg/vmcp/cli/ for config scaffolding") — Source of truth for the config scaffolding template and --group/--config/--output flags
  • pkg/vmcp/optimizer/optimizer.go (GetAndValidateConfig, NewOptimizerFactory) — describe these as the optimizer activation calls in the architecture doc
  • pkg/vmcp/config/config.go (Config.Optimizer *OptimizerConfig, OptimizerConfig.EmbeddingService) — describe how Tier 3 (config-file) passes the TEI URL directly

Component Interfaces

The following is the optimizer tier table to be reproduced in docs/arch/vmcp-local.md:

| Tier | Flag(s) | Optimizer | External Service | Exposed Tools |
|------|---------|-----------|-----------------|---------------|
| 0 | (none) | None | None | All backend tools passed through |
| 1 | `--optimizer` | FTS5 keyword (SQLite in-process) | None | `find_tool`, `call_tool` only |
| 2 | `--optimizer-embedding` | FTS5 + TEI semantic | Managed TEI container | `find_tool`, `call_tool` only |
| 3 | `optimizer.embeddingService` in config YAML | FTS5 + external embedding service | User-managed | `find_tool`, `call_tool` only |
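
The table can also be read as a small decision function. The Go sketch below is illustrative only: the `options` struct, the `selectTier` and `exposedTools` helpers, and in particular the precedence order (config-file embedding service first, then `--optimizer-embedding`, then `--optimizer`) are assumptions, not behavior confirmed by pkg/vmcp/cli/serve.go.

```go
package main

import "fmt"

// Tier mirrors the rows of the optimizer tier table.
type Tier int

const (
	TierNone              Tier = iota // Tier 0: no optimizer
	TierFTS5                          // Tier 1: --optimizer
	TierManagedTEI                    // Tier 2: --optimizer-embedding
	TierExternalEmbedding             // Tier 3: optimizer.embeddingService in config
)

// options is a hypothetical flattened view of the CLI flags and config file.
type options struct {
	Optimizer          bool   // --optimizer
	OptimizerEmbedding bool   // --optimizer-embedding
	EmbeddingService   string // optimizer.embeddingService from the config YAML
}

// selectTier picks a tier. The precedence used here is an assumption.
func selectTier(o options) Tier {
	switch {
	case o.EmbeddingService != "":
		return TierExternalEmbedding
	case o.OptimizerEmbedding:
		return TierManagedTEI
	case o.Optimizer:
		return TierFTS5
	default:
		return TierNone
	}
}

// exposedTools returns what a client sees: all backend tools at Tier 0,
// otherwise only the two optimizer entry points from the table.
func exposedTools(t Tier, backendTools []string) []string {
	if t == TierNone {
		return backendTools
	}
	return []string{"find_tool", "call_tool"}
}

func main() {
	backend := []string{"fetch", "search", "summarize"} // example tool names
	for _, o := range []options{
		{},
		{Optimizer: true},
		{OptimizerEmbedding: true},
		{EmbeddingService: "http://127.0.0.1:8080"}, // example URL
	} {
		t := selectTier(o)
		fmt.Println(int(t), exposedTools(t, backend))
	}
}
```
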

The following is the pkg/vmcp/ stability table structure to be reproduced in docs/arch/vmcp-library.md:

| Package | Stability | Rationale |
|---------|-----------|-------|
| `pkg/vmcp/config` | Stable | Config structs and YAML loader; public API |
| `pkg/vmcp/aggregator` | Stable | Backend discovery and capability merge |
| `pkg/vmcp/router` | Stable | Request routing and tool name translation |
| `pkg/vmcp/server` | Stable | Server constructor and lifecycle |
| `pkg/vmcp/session` | Stable | Session factory and per-session routing table |
| `pkg/vmcp/auth` | Stable | Incoming/outgoing auth interfaces |
| `pkg/vmcp/client` | Stable | Backend HTTP client |
| `pkg/vmcp/health` | Stable | Health monitor |
| `pkg/vmcp/status` | Stable | StatusReporter interface and CLI/K8s reporters |
| `pkg/vmcp/optimizer` | Experimental | Optimizer interface; TEI integration may evolve |
| `pkg/vmcp/cli` | Experimental | New in this plan; API may change before stabilization |
| `pkg/vmcp/composer` | Experimental | Composite tool DAG executor |
| `pkg/vmcp/cache` | Internal | Token cache; not for external use |
| `pkg/vmcp/discovery` | Internal | Discovery middleware; use via aggregator |
| `pkg/vmcp/k8s` | Internal | Kubernetes-specific discovery; not for local embedding |
| `pkg/vmcp/workloads` | Internal | Backend workload helpers for K8s mode |
| `pkg/vmcp/schema` | Internal | MCP schema parsing; subject to change |

Note: the exact stability ratings should be verified against the final state of RFC THV-0059 before publishing. The table above is derived from the RFC's stability section and the intake document. If any ratings differ in the merged RFC, use the RFC as the authoritative source.
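
For the downstream-embedder guidance, the stability table lends itself to a simple import check. The sketch below is a hypothetical helper, not part of ToolHive: the `stability` map copies the provisional ratings from the table above (which the RFC may override), and `nonStableImports`, `vmcpPackage`, and the `example.com/toolhive` module path are placeholders for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// stability copies the provisional ratings from the table above.
var stability = map[string]string{
	"pkg/vmcp/config":     "Stable",
	"pkg/vmcp/aggregator": "Stable",
	"pkg/vmcp/router":     "Stable",
	"pkg/vmcp/server":     "Stable",
	"pkg/vmcp/session":    "Stable",
	"pkg/vmcp/auth":       "Stable",
	"pkg/vmcp/client":     "Stable",
	"pkg/vmcp/health":     "Stable",
	"pkg/vmcp/status":     "Stable",
	"pkg/vmcp/optimizer":  "Experimental",
	"pkg/vmcp/cli":        "Experimental",
	"pkg/vmcp/composer":   "Experimental",
	"pkg/vmcp/cache":      "Internal",
	"pkg/vmcp/discovery":  "Internal",
	"pkg/vmcp/k8s":        "Internal",
	"pkg/vmcp/workloads":  "Internal",
	"pkg/vmcp/schema":     "Internal",
}

// vmcpPackage maps an import path to its top-level pkg/vmcp/ sub-package.
func vmcpPackage(importPath string) (string, bool) {
	const marker = "pkg/vmcp/"
	i := strings.Index(importPath, marker)
	if i < 0 {
		return "", false
	}
	rest := importPath[i+len(marker):]
	if j := strings.Index(rest, "/"); j >= 0 {
		rest = rest[:j]
	}
	if rest == "" {
		return "", false
	}
	return marker + rest, true
}

// nonStableImports returns the vMCP imports that are not rated Stable.
// Sub-packages missing from the table are treated as non-stable.
func nonStableImports(imports []string) []string {
	var out []string
	for _, imp := range imports {
		pkg, ok := vmcpPackage(imp)
		if ok && stability[pkg] != "Stable" {
			out = append(out, imp)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// example.com/toolhive is a placeholder module path.
	fmt.Println(nonStableImports([]string{
		"example.com/toolhive/pkg/vmcp/server",
		"example.com/toolhive/pkg/vmcp/optimizer",
		"example.com/toolhive/pkg/vmcp/cache",
		"fmt",
	}))
}
```

An embedder could run a check like this over its own import list before upgrading, and treat any reported path as something to pin more tightly or replace.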

Testing Strategy

This item produces no Go source code; there are no unit or integration tests to write.

Manual verification checklist (reviewer should confirm before approving):

Out of Scope

  • Changes to any Go source files (*.go) — this is a documentation-only PR
  • User-facing docs in the docs-website repository — those are a separate follow-up
  • Quickstart additions to README.md at the repo root — natural follow-up after this arch doc lands
  • thv vmcp status documentation — the subcommand is deferred per RFC open questions
  • GPU/ARM64-native TEI image selection guidance — deferred; CPU image with Rosetta 2 is the only supported path
  • Documentation of the standalone vmcp binary beyond noting it exists and is preserved for K8s use
  • Unix socket transport for vMCP — future follow-up

References

  • RFC THV-0059 — Primary source of truth for optimizer tiers, stability table, TEI lifecycle, and library embedding pattern
  • GitHub Issue #4808 — Parent tracking issue
  • docs/arch/10-virtual-mcp-architecture.md — Existing K8s-oriented vMCP architecture doc; the file being updated
  • docs/arch/README.md — Documentation index and template; guides the new file structure
  • docs/arch/01-deployment-modes.md — Reference for Mermaid diagram style and table format
  • #4887 ("Wire optimizer flags into thv vmcp serve") — Upstream dependency; optimizer flags must be fully wired before this doc can be accurate
  • StacklokLabs/mcp-optimizer — Python predecessor project; referenced in the deprecation notice

Metadata

Labels: cli (Changes that impact CLI functionality), enhancement (New feature or request), vmcp (Virtual MCP Server related issues)