
feat(cli, migrate): add stash encrypt commands + @cipherstash/migrate #357

Draft
coderdan wants to merge 10 commits into main from encryption-migrations


Summary

Adds first-class support for migrating existing plaintext columns to eql_v2_encrypted — a production-shaped flow that today has no good answer in either Stack or Proxy land. Ships as a new CLI command group + library, usable by both Stack (Protect.js) and Proxy users.

Lifecycle

Each column walks through:

schema-added → dual-writing → backfilling → backfilled → cut-over → dropped
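
The lifecycle can be read as a small linear state machine. The sketch below is illustrative only: the phase names come from the line above, but `nextPhase`, `canAdvance`, and the rule that a column may only move one step at a time are assumptions, not the behaviour of `stash encrypt advance`.

```typescript
// Phase names are taken from the lifecycle above; everything else is hypothetical.
const PHASES = [
  "schema-added",
  "dual-writing",
  "backfilling",
  "backfilled",
  "cut-over",
  "dropped",
] as const;

type Phase = (typeof PHASES)[number];

// Returns the phase that follows `current`, or null once the column is dropped.
function nextPhase(current: Phase): Phase | null {
  const i = PHASES.indexOf(current);
  return PHASES[i + 1] ?? null;
}

// Assumption: only single-step forward transitions are allowed.
function canAdvance(from: Phase, to: Phase): boolean {
  return nextPhase(from) === to;
}
```

Under that assumption, `canAdvance("backfilling", "backfilled")` holds while a jump from `schema-added` straight to `backfilled` is rejected.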

State model (three layers, kept separate on purpose)

  • Repo manifest (.cipherstash/migrations.json): desired columns, index set, target phase. Code-reviewable intent.
  • EQL intent (eql_v2_configuration): unchanged. Proxy continues to read this as its source of truth.
  • Runtime state, new (cipherstash.cs_migrations): append-only event log of per-column phase, backfill cursor, and rows processed. Installed by stash db install. Designed to be upstreamed into EQL as eql_v2_migrations in a later release so Stack and Proxy own it jointly.

Why a new table instead of reusing eql_v2_configuration: its CHECK constraint rejects custom metadata, its state enum is global (only one {active, pending, encrypting} at a time) so it can't represent multiple columns in different phases, and backfill-cadence writes would collide with Proxy's 60s config refresh. Full reasoning in the design doc.
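
To illustrate why an append-only log can represent several columns in different phases at once, here is a minimal sketch of folding events into per-column state. The `MigrationEventRow` shape and `latestPerColumn` are hypothetical stand-ins; the real cs_migrations columns and the package's own readback helpers may look different.

```typescript
// Hypothetical event shape; the actual cs_migrations schema may differ.
interface MigrationEventRow {
  id: number; // insert order; higher means more recent
  table: string;
  column: string;
  event: string; // e.g. "backfill_started", "backfill_checkpoint"
  phase: string;
  rowsProcessed: number;
  cursor: string | null; // last primary key processed, if any
}

// Folds the log into the most recent event per (table, column) pair.
function latestPerColumn(
  events: MigrationEventRow[],
): Map<string, MigrationEventRow> {
  const latest = new Map<string, MigrationEventRow>();
  for (const e of events) {
    const key = `${e.table}.${e.column}`;
    const prev = latest.get(key);
    if (prev === undefined || e.id > prev.id) latest.set(key, e);
  }
  return latest;
}
```

Because nothing is updated in place, each column's current phase and cursor are simply its newest row, and columns never contend for a single global state value.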

New CLI commands (under stash encrypt)

Command                 Purpose
status                  per-column table: phase, EQL state, indexes, progress, drift flags
plan                    diff intent (.cipherstash/migrations.json) vs observed state
advance --to <phase>    record a phase transition (dual-writing is user-declared)
backfill                chunked, resumable, idempotent; txn-per-chunk with atomic checkpoint; SIGINT-safe; auto-detects single-column PK
cutover                 eql_v2.rename_encrypted_columns() in a txn; optional Proxy refresh via CIPHERSTASH_PROXY_URL
drop                    generates DROP COLUMN <col>_plaintext migration file
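
The backfill row packs several properties into one line (chunked, resumable, idempotent, checkpointed). The sketch below shows the general technique using an in-memory array in place of Postgres; `Row`, `Checkpoint`, and `backfillChunked` are hypothetical names, and the real implementation is described as committing each chunk's writes and its checkpoint in a single transaction, which this simulation only notes in a comment.

```typescript
interface Row {
  id: number;
  plaintext: string;
  encrypted: string | null;
}

interface Checkpoint {
  cursor: number; // last primary key processed
  rowsProcessed: number;
}

// In-memory simulation of a keyset-paginated, resumable backfill.
function backfillChunked(
  rows: Row[],
  start: Checkpoint,
  chunkSize: number,
  encrypt: (plaintext: string) => string,
  shouldStop: () => boolean, // stands in for a SIGINT / AbortSignal check
): Checkpoint {
  const ordered = [...rows].sort((a, b) => a.id - b.id);
  let cp = start;
  while (!shouldStop()) {
    // Equivalent of: WHERE id > cursor AND encrypted IS NULL ORDER BY id LIMIT n
    const chunk = ordered
      .filter((r) => r.id > cp.cursor && r.encrypted === null)
      .slice(0, chunkSize);
    if (chunk.length === 0) break;

    // In a real run, these writes and the checkpoint would share one transaction.
    let lastId = cp.cursor;
    for (const r of chunk) {
      r.encrypted = encrypt(r.plaintext);
      lastId = r.id;
    }
    cp = { cursor: lastId, rowsProcessed: cp.rowsProcessed + chunk.length };
  }
  return cp;
}
```

Passing the returned checkpoint back in as `start` resumes where the previous run stopped, and rows that already have a value in `encrypted` are skipped, which is what makes a re-run a no-op.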

New package @cipherstash/migrate

Exposes the same primitives (runBackfill, appendEvent, progress, renameEncryptedColumns, …) so users can embed backfill in their own workers/cron without the CLI. Example in packages/migrate/README.md.

Phase 1 scope / Phase 2 follow-ups

  • Phase 1 (this PR): Protect/Stack client-side backfill — CLI dynamic-imports the user's encryption client, encrypts in-process, writes payloads directly.
  • Phase 2: Proxy-mode backfill (SQL-through-Proxy using the same cs_migrations state), stash db introspect --json / stash env set CLI subcommands, upstream cs_migrations → eql_v2_migrations in EQL.

Test plan

  • pnpm --filter @cipherstash/migrate test — 14 unit tests pass (state DAO, manifest round-trip, SQL identifier quoting)
  • pnpm --filter @cipherstash/cli test — all 126 existing tests still pass
  • pnpm -w build — full workspace builds clean
  • pnpm exec biome check <changed files> — clean
  • ./dist/bin/stash.js --help shows the six new encrypt subcommands
  • Manual e2e against a local Postgres: bash packages/cli/scripts/e2e-encrypt.sh — seeds 5000-row users table, runs install → advance → backfill (with SIGINT + resume) → status → cutover → drop. Requires CipherStash credentials in env.
  • Verify Proxy interop after cutover: SELECT email FROM users via Proxy returns plaintext, direct Postgres returns ciphertext JSON.

Design doc

docs/plans/encryption-migrations.md — full architecture including state-layer rationale, index-on-backfill implications, Proxy compatibility gotchas, and phased rollout.

Adds first-class support for migrating existing plaintext columns to
`eql_v2_encrypted` in production databases — the flow that currently has
no good answer in either Stack or Proxy land.

Per-column lifecycle:
  schema-added → dual-writing → backfilling → backfilled → cut-over → dropped

State lives in three layers so Proxy interop stays clean:
  - `.cipherstash/migrations.json` — repo-side intent (indexes, target phase)
  - `eql_v2_configuration` — EQL intent, unchanged; Proxy reads as before
  - `cipherstash.cs_migrations` — NEW append-only event log for per-column
    runtime state (phase, backfill cursor, rows processed). Installed by
    `stash db install`. Designed to upstream into EQL as `eql_v2_migrations`
    in a later release so Stack and Proxy own it jointly.

New CLI commands under `stash encrypt`:
  - status    per-column table: phase, EQL state, indexes, progress, drift
  - plan      diff intent vs observed
  - advance   record a phase transition (dual-writing is user-declared)
  - backfill  chunked, resumable, idempotent; txn-per-chunk with checkpoint;
              SIGINT-safe; uses user's encryption client via jiti dynamic
              import; auto-detects single-column PK
  - cutover   `eql_v2.rename_encrypted_columns()` in a txn; optional Proxy
              refresh via CIPHERSTASH_PROXY_URL
  - drop      generates a DROP COLUMN <col>_plaintext migration file

New package `@cipherstash/migrate` exposes the same primitives as a library
(`runBackfill`, `appendEvent`, `progress`, `renameEncryptedColumns`, …) so
users can embed backfill in their own workers/cron without the CLI process.

Design doc: docs/plans/encryption-migrations.md
Manual e2e script: packages/cli/scripts/e2e-encrypt.sh

Phase 1 scope: Protect/Stack client-side backfill. Proxy-mode backfill
(SQL-through-Proxy using the same cs_migrations state) is Phase 2.
@changeset-bot

changeset-bot Bot commented Apr 23, 2026

🦋 Changeset detected

Latest commit: 700009a

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
Name                    Type
@cipherstash/cli        Minor
@cipherstash/migrate    Minor


@coderabbitai

coderabbitai Bot commented Apr 23, 2026

Important

Review skipped

Draft detected.


Expand TypeDoc across the @cipherstash/migrate public API and the stash
encrypt command option interfaces. No behaviour change — docs only.

Highlights:
  - BackfillOptions: each field now explains the three separate name
    spaces (physical table/column vs. schema column key) and common
    defaults (chunkSize = 1000, encryptedColumn = <col>_encrypted).
  - BackfillCommandOptions: CLI flag semantics with an example of when
    schemaColumnKey needs to differ from column.
  - MigrationEvent / MigrationPhase: describes the event-vs-phase
    mapping and the backfill_started/backfill_checkpoint distinction.
  - EQL wrappers: explain that renameEncryptedColumns is the cut-over
    primitive, and that reloadConfig must run through Proxy.
  - installMigrationsSchema: documents why cs_migrations is kept
    separate from eql_v2_configuration (CHECK constraint, global
    state enum, write-frequency mismatch).
  - Manifest: field-level documentation of cast_as values, index kinds,
    and how targetPhase interacts with advance/plan/drop.
  - Module-level @packageDocumentation in src/index.ts for TypeDoc's
    package overview.
…stgres

Adds packages/migrate/src/__tests__/backfill.integration.test.ts —
gated on PG_TEST_URL so it skips in CI without a Postgres available.

Covers the full backfill state machine against a real transactional
Postgres using a stub encryption client (no CipherStash credentials
required):

  - happy-path completion + correct terminal state event
  - idempotency on re-run (row-level hash unchanged; zero new writes)
  - resume from checkpoint after mid-run AbortSignal
  - error event recorded + exception rethrown on encrypt failure
  - pre-encrypted rows preserved (the `encrypted IS NULL` guard)
  - empty-table fast path
  - event log ordering (backfill_started → checkpoint* → backfilled)
  - latestByColumn / progress readbacks

Run locally:
  cd local && docker compose up -d
  PG_TEST_URL=postgres://cipherstash:password@localhost:5432/cipherstash \
    pnpm -F @cipherstash/migrate test backfill.integration
…ation

`stash db install --drizzle` now appends the cipherstash.cs_migrations
schema DDL to the generated EQL migration file, so `drizzle-kit migrate`
rolls the tracking table out to every environment alongside EQL itself.

Before this change the drizzle path only wrote EQL SQL; the cs_migrations
schema was installed directly against the connected DB (in the non-drizzle
branch) and never appeared in migration history. That meant prod deploys
running from drizzle migrations alone got EQL but no cs_migrations, and
`stash encrypt ...` would fail with "schema cipherstash does not exist"
until someone ran an out-of-band install.

Also exports MIGRATIONS_SCHEMA_SQL from @cipherstash/migrate so other
consumers can embed the DDL in their own migration pipelines.
…orts

loadEncryptionContext used to require the user's encryption client file
to export an EncryptedTable-shaped object (tableName + build()). Users
following the drizzle pattern typically only export the pgTable and the
initialised client, leaving the extractEncryptionSchema(...) result as
a non-exported const — which the loader couldn't see. Backfill would
then fail with "Table X was not found in the encryption client exports.
Available: (none)".

Now the loader does a second pass over module exports, detects drizzle
pgTables via Symbol.for('drizzle:Name'), dynamic-imports
@cipherstash/stack/drizzle, and calls extractEncryptionSchema() on each
to derive the EncryptedTable on the fly. Silently no-ops if the drizzle
subpath isn't installed (Supabase / generic projects are unaffected).

Manually-exported EncryptedTables still win over auto-derived ones
(the set-if-absent check preserves the explicit export).
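
The two-pass lookup described above can be sketched roughly as follows. This is not the loader's actual code: `collectTables` and `EncryptedTableLike` are hypothetical, and the `derive` parameter stands in for `extractEncryptionSchema()` so the example runs without `@cipherstash/stack/drizzle` installed. Only the `Symbol.for('drizzle:Name')` marker and the set-if-absent rule are taken from the commit message.

```typescript
// Marker named in the commit message for recognising drizzle pgTables.
const DRIZZLE_NAME = Symbol.for("drizzle:Name");

// Hypothetical minimal shape: tableName + build(), as described above.
interface EncryptedTableLike {
  tableName: string;
  build(): unknown;
}

function isEncryptedTable(v: unknown): v is EncryptedTableLike {
  if (typeof v !== "object" || v === null) return false;
  const o = v as Record<string, unknown>;
  return typeof o.tableName === "string" && typeof o.build === "function";
}

function isDrizzleTable(v: unknown): v is object {
  if (typeof v !== "object" || v === null) return false;
  return typeof (v as Record<symbol, unknown>)[DRIZZLE_NAME] === "string";
}

function collectTables(
  moduleExports: Record<string, unknown>,
  derive: (table: object) => EncryptedTableLike, // stand-in for extractEncryptionSchema
): Map<string, EncryptedTableLike> {
  const tables = new Map<string, EncryptedTableLike>();

  // Pass 1: explicitly exported EncryptedTable-shaped objects.
  for (const value of Object.values(moduleExports)) {
    if (isEncryptedTable(value)) tables.set(value.tableName, value);
  }

  // Pass 2: derive from drizzle tables, but never overwrite an explicit export.
  for (const value of Object.values(moduleExports)) {
    if (!isDrizzleTable(value)) continue;
    const derived = derive(value);
    if (!tables.has(derived.tableName)) tables.set(derived.tableName, derived);
  }
  return tables;
}
```

Running the explicit pass first and guarding the second pass with `has()` is what lets a hand-written export take precedence over an auto-derived one.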
Two correctness bugs in the backfill path, diagnosed from a real run
that wrote plaintext values through to the encrypted column:

1) The CLI defaulted `schemaColumnKey` to the plaintext column name
   (`--column`). But under the drizzle convention the EncryptedTable's
   column keys are the *encrypted* column names — because that's what
   the user declared via `encryptedType('foo_encrypted', ...)`. With
   the wrong key, `bulkEncryptModels` saw a model key that didn't
   match any configured encrypted column and returned the models
   unchanged. The runner then wrote the plaintext into the encrypted
   column, which Postgres rendered as `(82.60)`-shaped composite values
   because `eql_v2_encrypted` is a composite type. Default now uses
   the encrypted column name.

2) Added a leak guard inside runBackfill: after bulkEncryptModels
   returns, inspect `data[0][schemaColumnKey]`. Real ciphertext is
   always an object (the EQL envelope with c/k/v fields); if we see
   a primitive, throw with an actionable message that names the key
   the schema should use. Prevents any future schema/key mismatch
   from silently corrupting data — it fails loudly on the first chunk
   before any write commits.

Updated the TypeDoc on BackfillOptions to make the two conventions
(drizzle-extracted vs handwritten encryptedTable) explicit.
… leak guard

Replace the hand-rolled object-shape check in runBackfill with the
canonical isEncryptedPayload helper already exported by @cipherstash/stack.
The helper checks for the actual EQL envelope shape (v, i, and either
c or sv) rather than just `typeof === 'object'`, so it also catches
non-null objects that happen to lack ciphertext fields.

Also validates every row in the returned chunk (not just the first)
and reports the offending primary key in the error message so a user
hitting a partial failure knows which row to look at.

Integration test stubs updated to return valid-shaped payloads
({v, i, c}) so they still exercise the write path under the new guard.
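
For readers who want a picture of the per-row guard, here is a minimal sketch. It deliberately does not call the real `isEncryptedPayload` helper, whose exact signature is not shown in this PR; `looksLikeEncryptedPayload` is a local stand-in that checks only the shape stated above (v, i, and either c or sv), and `assertAllEncrypted` is a hypothetical name.

```typescript
// Local stand-in for the envelope check: v, i, and either c or sv.
function looksLikeEncryptedPayload(value: unknown): boolean {
  if (typeof value !== "object" || value === null) return false;
  const o = value as Record<string, unknown>;
  return "v" in o && "i" in o && ("c" in o || "sv" in o);
}

// Validates every row in a chunk before anything is written.
function assertAllEncrypted(
  rows: Array<Record<string, unknown>>,
  schemaColumnKey: string,
  pkColumn: string,
): void {
  for (const row of rows) {
    if (!looksLikeEncryptedPayload(row[schemaColumnKey])) {
      throw new Error(
        `Row ${pkColumn}=${String(row[pkColumn])}: value under ` +
          `"${schemaColumnKey}" is not an encrypted payload; refusing to write.`,
      );
    }
  }
}
```

Checking the whole chunk rather than only the first row means a partial failure is caught before the transaction commits, and the error names the row to inspect.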
…ryption

pg's node driver returns `numeric` as a JS string (to preserve
precision), but an EncryptedTable schema declaring `dataType('number')`
expects a JS number — so bulkEncryptModels errored out with "Cannot
convert String to Float. String values can only be used with Utf8Str".

Fix is split across both packages:

- @cipherstash/migrate: new optional `transformPlaintext` callback on
  BackfillOptions. Invoked on each row's plaintext before it goes into
  the model passed to bulkEncryptModels. Library stays generic; does
  not know anything about schemas.

- @cipherstash/cli: new `buildPlaintextCoercer` inspects
  `tableSchema.build().columns[schemaColumnKey].cast_as` and returns
  an appropriate coercer:
    number / double / real / int / decimal → Number(string)
    bigint / big_int                        → BigInt(string)
    date / timestamp                        → new Date(string)
    boolean                                 → "true"/"false" → boolean
    string / text / json / jsonb / unknown  → identity

Null and undefined are always passed through unchanged.
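
A rough sketch of the mapping in the table above, for illustration. `buildCoercerSketch` is a hypothetical name rather than the CLI's `buildPlaintextCoercer`, and two details are assumptions: values that are not strings are passed through untouched, and a boolean column value other than "true" or "false" is left as-is.

```typescript
type Coercer = (value: unknown) => unknown;

// Applies `parse` to strings only; null, undefined, and other types pass through.
function fromString(parse: (s: string) => unknown): Coercer {
  return (value) => (typeof value === "string" ? parse(value) : value);
}

function buildCoercerSketch(castAs: string | undefined): {
  transform: Coercer;
  castAs: string | undefined;
} {
  switch (castAs) {
    case "number":
    case "double":
    case "real":
    case "int":
    case "decimal":
      return { castAs, transform: fromString((s) => Number(s)) };
    case "bigint":
    case "big_int":
      return { castAs, transform: fromString((s) => BigInt(s)) };
    case "date":
    case "timestamp":
      return { castAs, transform: fromString((s) => new Date(s)) };
    case "boolean":
      // Assumption: anything other than "true"/"false" is left unchanged.
      return {
        castAs,
        transform: fromString((s) =>
          s === "true" ? true : s === "false" ? false : s,
        ),
      };
    default:
      // string / text / json / jsonb / unknown: identity
      return { castAs, transform: (value) => value };
  }
}
```

A caller could hand `transform` to the backfill as its per-row plaintext hook, so that a `numeric` column arriving from the driver as the string `"82.60"` reaches the encryption client as a JS number.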
The backfill "Backfilling x.y → y_enc" log line now also prints the
schema's cast_as value so a user diagnosing a type-coercion issue can
see immediately whether the coercer is reading the right dataType from
the EncryptedTable (vs. falling through to identity).

Refactored buildPlaintextCoercer to return { transform, castAs } so
the caller can log the detected value; behaviour unchanged.
… by protect-ffi

Investigation into "Cannot convert String to Date" for a column with
cast_as: 'date' turned up a genuine protect-ffi 0.21.2 limitation:
its JsPlaintext wire enum has only String/Number/Boolean/JsonB
variants — no JS Date representation. napi-rs serialises JS Date to
ISO string via Date.toJSON, and the Rust side then refuses it because
string values are only valid for Utf8Str columns. The Rust-internal
NaiveDate / Timestamp types exist but have no JS-visible wire format.

Not a tool bug; not fixable here. But running a backfill that will
inevitably fail on the first chunk is a poor UX. Add a pre-flight
check: if the schema declares cast_as 'date' or 'timestamp', print a
warning explaining the FFI limitation and the mitigation (change to
dataType: 'string' / ISO strings) and prompt before continuing.
Accepts --yes-style confirmation via the standard clack confirm UI.