feat: opt-in anonymous telemetry client #171
Greptile Summary
This PR adds an opt-in anonymous telemetry client to VectorFlow. It introduces three new

Confidence Score: 4/5
Safe to merge: no bugs or security issues found; implementation is correct and well-tested.

No P0 or P1 findings. The feature is complete, follows all tRPC/audit/RBAC patterns, payloads are non-PII, and every code path has unit tests. Score is 4 rather than 5 because several pre-existing CI failures are called out in the PR description as blockers that must be resolved before merge.

No files require special attention; all changed files are clean.
Sequence Diagram
sequenceDiagram
participant U as User (Browser)
participant SW as Setup Wizard / Settings UI
participant API as /api/setup POST
participant TRPC as tRPC telemetry.update
participant DB as PostgreSQL (SystemSettings)
participant Cron as node-cron (03:42 UTC)
participant Pulse as pulse.terrifiedbug.com
Note over U,Pulse: First-time enable (Setup Wizard "Yes")
U->>SW: Click "Yes, share anonymous stats" + Complete Setup
SW->>API: POST /api/setup {telemetryChoice: "yes", ...}
API->>DB: completeSetup() upsert — telemetryEnabled=true, ULID, enabledAt
API-->>SW: {success: true}
API--)Pulse: sendTelemetryHeartbeat() fire-and-forget
Note over U,Pulse: Toggle in Settings → Telemetry
U->>SW: Toggle on
SW->>TRPC: telemetry.update {enabled: true}
TRPC->>DB: findUnique SystemSettings
DB-->>TRPC: {telemetryEnabled: false, instanceId: null, ...}
Note right of TRPC: isFirstEnable=true → generate ULID
TRPC->>DB: update {telemetryEnabled: true, instanceId: ULID, enabledAt: now}
TRPC-->>SW: {ok: true}
TRPC--)Pulse: sendTelemetryHeartbeat() fire-and-forget
Note over U,Pulse: Daily cron tick
Cron->>DB: findUnique SystemSettings
DB-->>Cron: {telemetryEnabled: true, instanceId: ..., oidcIssuer: ...}
Cron->>DB: pipeline.count (draft/active/paused) + vectorNode.count
DB-->>Cron: counts
Cron->>Pulse: POST /api/v1/ping {schema_version:1, instance_id, ...}
Pulse-->>Cron: 204 No Content
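The toggle branch of the diagram can be read as a small state transition. The following is a minimal sketch under that reading, not the PR's actual router code: `applyTelemetryToggle`, `TelemetryState`, and the injected `generateId` (standing in for a ULID generator) are hypothetical names, and the field names follow the diagram.

```typescript
// Hypothetical shape of the telemetry fields read from SystemSettings.
interface TelemetryState {
  telemetryEnabled: boolean;
  instanceId: string | null;
  enabledAt: Date | null;
}

interface ToggleResult {
  next: TelemetryState;
  // True only when no instance ID existed before this call.
  isFirstEnable: boolean;
}

// Pure transition: computes the row to write, performs no I/O.
// `generateId` is an assumption standing in for a ULID generator.
function applyTelemetryToggle(
  current: TelemetryState,
  enabled: boolean,
  now: Date,
  generateId: () => string,
): ToggleResult {
  if (!enabled) {
    // Disable: flip the flag only; instanceId and enabledAt are kept.
    return {
      next: { ...current, telemetryEnabled: false },
      isFirstEnable: false,
    };
  }
  const isFirstEnable = current.instanceId === null;
  return {
    next: {
      telemetryEnabled: true,
      // Re-enable reuses the stored values so the instance stays the same.
      instanceId: current.instanceId ?? generateId(),
      enabledAt: current.enabledAt ?? now,
    },
    isFirstEnable,
  };
}
```

Keeping the transition pure means the caller decides separately when to write the row and when to fire the fire-and-forget heartbeat.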
Reviews (1): Last reviewed commit: "fix(telemetry): immediate heartbeat from..."
Skip credit-card VRL fixtures when the installed Vector binary lacks repeat() (added after 0.54). Uses a runtime capability probe so the tests self-enable once CI is updated to a newer Vector release. Bump bcrypt "rejects incorrect token" tests to 15 s timeout — the full bcrypt comparison is intentionally slow and was flaking at 5 s on loaded CI runners.
Summary
Adds an opt-in anonymous telemetry client. When enabled, each VectorFlow instance sends one daily heartbeat to the centralised `pulse.terrifiedbug.com` receiver containing aggregate, non-PII fields (instance ID, version, agent count, pipeline count, auth method, deployment mode). Off by default; never sends without explicit consent.

Implements design at `docs/superpowers/specs/2026-04-25-vf-telemetry-client-design.md` (gitignored, in main worktree).

What's included
- Three new columns on `SystemSettings`: `telemetryEnabled`, `telemetryInstanceId` (ULID), `telemetryEnabledAt`
- `buildHeartbeatPayload` pure function and `sendTelemetryHeartbeat` orchestrator (`src/server/services/telemetry-payload.ts` + `telemetry-sender.ts`)
- `node-cron` scheduler hooked into `instrumentation.ts` singleton startup (leader-elected, multi-replica safe)
- tRPC `telemetry` router with `get` and `update` procedures, both `requireSuperAdmin`-gated; `update` is audit-logged
- `docs/public/operations/telemetry.md`

What's NOT collected
Hostnames, IP addresses, pipeline names/configs/VRL, user identifiers, source/sink endpoints, or any data flowing through pipelines. Receiver derives country server-side from request IP and never stores the IP itself.
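To make the collected/not-collected split concrete, here is a hedged sketch of a pure payload builder. Only `schema_version` and `instance_id` appear in the PR text; every other field name, the `HeartbeatInput` type, the `buildHeartbeatPayloadSketch` name, and the rule that derives the auth method from `oidcIssuer` are assumptions for illustration, not the contents of `telemetry-payload.ts`.

```typescript
// Hypothetical input gathered by the caller (settings row plus counts).
interface HeartbeatInput {
  instanceId: string;
  version: string;
  agentCount: number;
  pipelineCount: number;
  oidcIssuer: string | null; // read only to classify the auth method
  deploymentMode: string;
}

// Hypothetical wire shape; only schema_version and instance_id are
// confirmed by the PR description.
interface HeartbeatPayload {
  schema_version: 1;
  instance_id: string;
  version: string;
  agent_count: number;
  pipeline_count: number;
  auth_method: "oidc" | "local";
  deployment_mode: string;
}

// Pure function: no I/O, so it can be unit-tested without a database.
function buildHeartbeatPayloadSketch(input: HeartbeatInput): HeartbeatPayload {
  return {
    schema_version: 1,
    instance_id: input.instanceId,
    version: input.version,
    agent_count: input.agentCount,
    pipeline_count: input.pipelineCount,
    // Assumption: only the presence of an issuer is reported, never its URL.
    auth_method: input.oidcIssuer ? "oidc" : "local",
    deployment_mode: input.deploymentMode,
  };
}
```

Because the builder returns a closed object literal, anything not listed (hostnames, pipeline names, endpoints) cannot leak into the request body by accident.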
Behaviour notes
- First enable: generates the ULID instance ID, sets `enabledAt`, fires an immediate fire-and-forget heartbeat so the instance shows up on Pulse without waiting up to 24 hours.
- Re-enable: keeps the existing `instanceId` and `enabledAt` so Pulse sees the same anonymous instance.
- Disable: sets `enabled=false`. Doesn't touch `instanceId` or `enabledAt`.
- `Retry-After`: honored once for the next call, then forgotten on process restart.

Schema migration
Adds three nullable/defaulted columns to `SystemSettings`. No destructive operations. Pre-existing schema drift (AlertMetric enum on WebhookDelivery/WebhookEndpoint, FK changes on AuditLog/Environment) was deliberately excluded from this migration; it predates this branch and should be addressed separately.

Test plan
- `pnpm test:run`, `pnpm build`, `pnpm lint`
- `npx prisma migrate reset --force` then `npx prisma migrate deploy`
- `/setup`: complete Step 1 + 2, verify Step 3 has both buttons, "Complete setup" disabled until a button is clicked
- `SystemSettings` row has `telemetryEnabled=true`, 26-char `telemetryInstanceId`, `telemetryEnabledAt` set; immediate heartbeat arrives at Pulse within seconds
- `SystemSettings` row has `telemetryEnabled=false`, both other fields null; no heartbeat fires
- `node-cron` schedule registered, fires at `42 3 * * *`
- `/admin/audit` shows a `telemetry.update` entry on toggle changes
- `/docs/operations/telemetry` resolves

Out of scope
Pre-existing CI failures (not caused by this branch)
- `dlp-vrl-integration.test.ts`: 4 tests fail because the locally installed Vector binary doesn't define `repeat()` in VRL
- `agent-token.test.ts`: 1 test times out at 5000 ms in a bcrypt-heavy negative path

These predate this branch and should be tracked separately. They will need to be resolved before this PR can merge.