Skip to content

Add project-level skills lock file (POC)#5715

Open
samuv wants to merge 1 commit into
mainfrom
poc-thv-skills-lock
Open

Add project-level skills lock file (POC)#5715
samuv wants to merge 1 commit into
mainfrom
poc-thv-skills-lock

Conversation

@samuv

@samuv samuv commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Why: Skills installed with --scope project have no reproducible pin. Team members can't restore an identical set of skills, and there's no controlled path to pull newer content when the catalog moves. Every other package manager (npm, pnpm, Cargo, go) solves this with a committed lock file; skills didn't have one.
  • What: Introduce toolhive.lock.yaml at the project root that pins name, version, source, resolvedReference, and digest for every project-scoped skill install. Adds thv skill sync (restore pinned state) and thv skill upgrade (re-resolve sources, install newer content). Lock entries are client-agnostic; sync installs for all detected clients by default.

Note: This is a POC. It is intentionally a single large commit to make the end-to-end flow reviewable as one unit. If accepted, it should be split into reviewable PRs (lockfile pkg → install/uninstall hooks → sync → upgrade → CLI) per the repo's 400-line PR guideline. An RFC capturing the design context is being opened in stacklok/toolhive-rfcs.

Fixes #

Type of change

  • Bug fix
  • New feature
  • Refactoring (no behavior change)
  • Dependency update
  • Documentation
  • Other (describe):

Test plan

  • Unit tests (task test)
  • E2E tests (task test-e2e)
  • Linting (task lint-fix)
  • Manual testing (describe below)

Unit tests cover the lockfile package, the install/uninstall lock hooks, Sync, Upgrade, the new API routes, and CLI helpers (resolveProjectRoot, shortDigest). pkg/skills/..., pkg/api/v1, and cmd/thv/app all pass task test. Manual end-to-end verification of thv skill install --scope projectsyncupgrade against real GHCR artifacts was done locally.

API Compatibility

  • This PR does not break the v1beta1 API, OR the api-break-allowed label is applied and the migration guidance is described above.

New read-only-ish endpoints (POST /skills/sync, POST /skills/upgrade) are additive; no existing operator CRD surface is touched.

Changes

File Change
pkg/skills/lockfile/lockfile.go New: schema + file-locked load/save/upsert/remove for toolhive.lock.yaml
pkg/skills/skillsvc/install.go Project-scope install upserts a lock entry (records original source)
pkg/skills/skillsvc/uninstall.go Project-scope uninstall removes the lock entry
pkg/skills/skillsvc/sync.go New: Sync — install each entry at pinned resolvedReference@digest; report/prune unmanaged
pkg/skills/skillsvc/upgrade.go New: Upgrade — re-resolve source, compare digests, install + rewrite on change; --dry-run; immutable-source detection
pkg/skills/skillsvc/pin.go New: digest-pinning + immutability helpers (pinOCIReference, pinGitReference, isImmutableSource)
pkg/skills/service.go, options.go Sync/Upgrade on the SkillService interface + option/result types
pkg/skills/client/{client,dto}.go HTTP client methods + DTOs for sync/upgrade
pkg/api/v1/skills.go, skills_types.go POST /skills/sync, POST /skills/upgrade routes + DTOs
cmd/thv/app/skill_sync.go, skill_upgrade.go New CLI commands with git-root auto-detection, --prune/--dry-run/--clients/--project-root
cmd/thv/app/skill_helpers.go resolveProjectRoot helper (git-root auto-detection)
docs/arch/12-skills-system.md New "Project Lock File" section, diagram, endpoint/CLI/key-files tables
docs/cli/*, docs/server/* Regenerated CLI + swagger docs

Does this introduce a user-facing change?

Yes. New thv skill sync and thv skill upgrade commands, and project-scoped installs now write a toolhive.lock.yaml at the project root (intended to be committed to git). User-scope installs are unchanged.

Implementation plan

Approved implementation plan

Agreed design (from interview):

  • Single lock file written by install commands — no separate manifest, no version-constraint resolution.
  • File: toolhive.lock.yaml at the project root (committed to git).
  • Entry schema: name, version, source (what the user typed), resolvedReference (concrete OCI ref / git URL), digest (OCI sha256:... or git commit). Top-level version: 1; entries sorted by name for stable diffs. (An installedAt field was originally proposed and later dropped to match the convention every other package manager follows — lock files store identity + source + integrity, not chronology.)
  • Client-agnostic: the lock pins content; sync installs for all detected clients (override with --clients).
  • Scope: only --scope project operations touch the lock; user-scope installs never do.
  • Sync: installs exactly resolvedReference@digest for missing/drifted entries; reports unmanaged project-scope skills; --prune uninstalls them.
  • Upgrade: digest-based. Re-resolves each entry's source, installs and rewrites the entry if the digest changed. Immutable sources (git commit, @sha256) reported as not upgradable. --dry-run prints what would change.
  • Placement: logic lives server-side in skillsvc with new API endpoints; CLI commands are thin wrappers. CLI auto-detects the git root from cwd for sync/upgrade, --project-root overrides.

Flow:

flowchart LR
    installCmd["thv skill install --scope project"] --> svc[skillsvc install]
    svc --> files["client skill dirs (.claude/skills/...)"]
    svc --> db[(SQLite state)]
    svc --> lock["toolhive.lock.yaml (upsert entry)"]
    syncCmd["thv skill sync"] --> lock2["read lock"] --> pinned["install resolvedReference@digest"]
    upgradeCmd["thv skill upgrade"] --> reresolve["re-resolve source vs catalog"] --> compare{"digest changed?"} -->|yes| rewrite["install + rewrite lock entry"]
Loading

Risks / notes for the POC:

  • Git pinned-commit installs: gitresolver supports checkout of arbitrary commit hashes (full clone when needed).
  • OCI pull-by-digest: ociskills.RegistryClient.Pull() accepts repo@sha256:... refs.
  • PR sizing: exceeds the 400-line guideline; intended to be split post-acceptance.

Special notes for reviewers

  • The installedAt field was deliberately omitted from the lock entry schema. Every major lock file (npm/pnpm/yarn/Cargo/go/poetry) stores identity + source + integrity but not chronology, for good reasons: reproducibility is guaranteed by the digest not the timestamp; timestamps are environment-local and produce meaningless merge churn on no-op regenerations; and "when was this pinned" is already answered better by git blame. git log -p -- toolhive.lock.yaml serves any recency/audit need.
  • Lock-file write errors during install/uninstall are logged but never fail the operation — the skill's files and DB record are already correct at that point, and a subsequent sync/upgrade can repair the lock file. This mirrors the existing best-effort file-cleanup pattern in uninstall.go.
  • Sync and Upgrade call installInternal (not Install) so they don't overwrite the entry's Source with an already-resolved reference — Source must stay as the user's original input so upgrade can re-resolve it.
  • A separate, unrelated bug was found and fixed during this work (TOOLHIVE_DEV=true blindly enabled plain HTTP for all OCI pulls, breaking real-registry auth via oras-go's cross-origin redirect guard). That fix is not in this PR; it's stashed locally and will be a separate PR.

Made with Cursor

Introduce toolhive.lock.yaml, committed at the project root, that pins the
name, version, source, resolved reference, and digest of every project-scoped
skill install. This gives teams reproducible skill installs (restorable via
"thv skill sync") and a path to refresh pinned content when the catalog moves
("thv skill upgrade"), mirroring the role package-lock.json plays for npm.

- New pkg/skills/lockfile package: schema, load/save, and file-locked
  upsert/remove using pkg/fileutils.WithFileLock; entries sorted by name for
  stable diffs.
- skillsvc install/uninstall hooks: project-scope installs upsert a lock
  entry (recording the original source string for later re-resolution);
  uninstalls remove it. User-scope installs never touch the lock.
- Sync: installs each entry strictly at its pinned resolvedReference@digest,
  reports unmanaged project-scope skills, and prunes them with --prune.
- Upgrade: re-resolves each entry's original source, compares digests, and
  installs + rewrites the entry on change; immutable sources (OCI digest,
  git commit) reported as not-upgradable; --dry-run supported.
- API: POST /skills/sync and POST /skills/upgrade plus skills client methods.
- CLI: thv skill sync and thv skill upgrade with git-root auto-detection and
  --project-root override; JSON/text output.
- Docs: regenerated CLI/swagger docs; updated docs/arch/12-skills-system.md
  with a Project Lock File section, diagram, and endpoint/CLI tables.

Co-authored-by: Cursor <cursoragent@cursor.com>
@github-actions github-actions Bot added the size/XL Extra large PR: 1000+ lines changed label Jul 3, 2026

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.


This review will be automatically dismissed once you add the justification section.

@JAORMX

JAORMX commented Jul 3, 2026

Copy link
Copy Markdown
Collaborator

Love it! Let's talk on Monday about this. Shall we have an RFC too? Just for docs

// Load reads and parses the lock file for projectRoot. A missing lock file
// is not an error — it returns an empty Lockfile ready to be populated.
func Load(projectRoot string) (*Lockfile, error) {
data, err := os.ReadFile(Path(projectRoot)) // #nosec G304 -- projectRoot is validated by callers before reaching this package

path := Path(projectRoot)
tmp := path + ".tmp"
if err := os.WriteFile(tmp, data, 0o644); err != nil { // #nosec G306 -- lock file is committed to git, not sensitive
if err := os.WriteFile(tmp, data, 0o644); err != nil { // #nosec G306 -- lock file is committed to git, not sensitive
return fmt.Errorf("writing lock file: %w", err)
}
if err := os.Rename(tmp, path); err != nil {
if err := os.WriteFile(tmp, data, 0o644); err != nil { // #nosec G306 -- lock file is committed to git, not sensitive
return fmt.Errorf("writing lock file: %w", err)
}
if err := os.Rename(tmp, path); err != nil {
return fmt.Errorf("writing lock file: %w", err)
}
if err := os.Rename(tmp, path); err != nil {
_ = os.Remove(tmp)
@codecov

codecov Bot commented Jul 3, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 62.42424% with 124 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.63%. Comparing base (d70cc41) to head (ecb41a8).
⚠️ Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
pkg/skills/skillsvc/upgrade.go 51.92% 38 Missing and 12 partials ⚠️
pkg/skills/client/client.go 0.00% 19 Missing ⚠️
pkg/skills/lockfile/lockfile.go 73.84% 9 Missing and 8 partials ⚠️
pkg/skills/skillsvc/pin.go 41.37% 13 Missing and 4 partials ⚠️
pkg/skills/skillsvc/sync.go 71.15% 9 Missing and 6 partials ⚠️
pkg/skills/skillsvc/install.go 84.21% 2 Missing and 1 partial ⚠️
pkg/skills/skillsvc/uninstall.go 57.14% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##             main    #5715    +/-   ##
========================================
  Coverage   70.62%   70.63%            
========================================
  Files         681      686     +5     
  Lines       68809    69184   +375     
========================================
+ Hits        48596    48865   +269     
- Misses      16664    16739    +75     
- Partials     3549     3580    +31     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@samuv

samuv commented Jul 3, 2026

Copy link
Copy Markdown
Contributor Author

Love it! Let's talk on Monday about this. Shall we have an RFC too? Just for docs

Yeah 🚀 , I opened this RFC: stacklok/toolhive-rfcs#80. I just refined it a bit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/XL Extra large PR: 1000+ lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants