Skip to content

feat(cymbal): verify SDK-signed exceptions; reserve $exception_verified#62751

Closed
Gilbert09 wants to merge 2 commits into
tom/error-tracking-signing-keysfrom
tom/cymbal-verify-exceptions
Closed

feat(cymbal): verify SDK-signed exceptions; reserve $exception_verified#62751
Gilbert09 wants to merge 2 commits into
tom/error-tracking-signing-keysfrom
tom/cymbal-verify-exceptions

Conversation

@Gilbert09

Copy link
Copy Markdown
Member

Problem

The public ingest key means a $exception event can be forged. To let downstream consumers trust an exception's provenance, ingestion must verify the SDK's signature and emit a flag they can rely on. (Stacked on #62750, which adds the per-team public-key registry.)

Changes

  • cymbal::exception_signing — re-derives the canonical, length-prefixed projection of the raw $exception_list (byte-identical to the posthog-python SDK's build_canonical) and verifies the Ed25519 $exception_signature against a registered public key. Runs on the raw pre-ingestion properties, so it sees exactly what the SDK signed (before we sanitise/truncate/symbolicate).
  • $exception_verified + $exception_verified_key_id added to OutputErrProps and to RESERVED_PROPERTIES — so any client-supplied $exception_verified is stripped and the flag can only be set by the server.
  • Adds ed25519-dalek.

How tested

Agent-assisted. The canonical-encoding + Ed25519 verification logic is covered by unit tests built around a committed cross-language parity vector — the exact canonical bytes + signature produced by the Python SDK (posthog-python) — asserting byte-for-byte canonical equality and verify/tamper/wrong-key behaviour. This logic was validated end-to-end in a standalone Rust harness that reproduced the parity vector and verified a Python-generated signature.

Note: local rustc (1.85) predates the workspace MSRV (≥1.91), so I could not compile the full cymbal crate locally — CI does the full build. The changes are the new module (faithful port of the validated standalone code) + mechanical schema additions.

Follow-up (error-tracking team)

This PR adds the verification primitive and the (stripped, server-only) $exception_verified schema. The remaining wiring is yours to place: load a team's keys via TeamManager (mirroring assignment_rules/suppression_rules), call exception_signing::verify_properties on the raw event in the pipeline, and populate OutputErrProps.verified (+ bump the key's last_used_at). Happy to pair on the right stage/threading.

@greptile-apps

greptile-apps Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
rust/cymbal/src/types/mod.rs:217-219
**Security-critical string duplication in `RESERVED_PROPERTIES`**

`RESERVED_PROPERTIES` hardcodes `"$exception_verified"` and `"$exception_verified_key_id"` as plain string literals, yet `exception_signing.rs` already exports `VERIFIED_PROPERTY` and `VERIFIED_KEY_ID_PROPERTY` with those same values. If a future refactor renames or changes either constant without also updating this array, the filter silently stops stripping client-supplied values — meaning any client could forge `$exception_verified = true` in their event payload and defeat the entire security guarantee of this feature. The constants exist precisely to be a single source of truth; they should be used here.

### Issue 2 of 2
rust/cymbal/src/exception_signing.rs:109-112
**Non-string `$exception_signature` silently treated as unsigned**

`str_field` returns `None` for any non-string JSON value (e.g. `null`, an integer, an object). When `$exception_signature` is present but not a string, `verify_properties` returns `Verification::Unsigned` instead of `Verification::Invalid`. A caller logging "this event had no signature" would be wrong — the property *is* present, just malformed. This distinction matters if downstream logic branches on `Unsigned` vs `Invalid` to decide whether to warn or silently skip. Consider returning `Invalid` whenever the signature property exists but is not a valid string.

Reviews (1): Last reviewed commit: "feat(cymbal): verify SDK-signed exceptio..." | Re-trigger Greptile

Comment thread rust/cymbal/src/types/mod.rs Outdated
Comment on lines +109 to +112
pub fn verify_properties(properties: &Value, keys: &[PublicKey]) -> Verification {
let Some(signature) = str_field(properties, SIGNATURE_PROPERTY) else {
return Verification::Unsigned;
};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Non-string $exception_signature silently treated as unsigned

str_field returns None for any non-string JSON value (e.g. null, an integer, an object). When $exception_signature is present but not a string, verify_properties returns Verification::Unsigned instead of Verification::Invalid. A caller logging "this event had no signature" would be wrong — the property is present, just malformed. This distinction matters if downstream logic branches on Unsigned vs Invalid to decide whether to warn or silently skip. Consider returning Invalid whenever the signature property exists but is not a valid string.

Prompt To Fix With AI
This is a comment left during a code review.
Path: rust/cymbal/src/exception_signing.rs
Line: 109-112

Comment:
**Non-string `$exception_signature` silently treated as unsigned**

`str_field` returns `None` for any non-string JSON value (e.g. `null`, an integer, an object). When `$exception_signature` is present but not a string, `verify_properties` returns `Verification::Unsigned` instead of `Verification::Invalid`. A caller logging "this event had no signature" would be wrong — the property *is* present, just malformed. This distinction matters if downstream logic branches on `Unsigned` vs `Invalid` to decide whether to warn or silently skip. Consider returning `Invalid` whenever the signature property exists but is not a valid string.

How can I resolve this? If you propose a fix, please make it concise.

@@ -123,6 +123,8 @@ impl ExceptionProperties {
values,
sources,
functions,
verified: None,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 $exception_verified forgery via unfiltered props in ExceptionProperties HTTP pipeline

The PR declares $exception_verified and $exception_verified_key_id as RESERVED_PROPERTIES and comments that client-supplied values "can't be forged," but the RESERVED_PROPERTIES filter is only applied inside FingerprintedErrProps::to_output() (types/mod.rs line 325). The primary HTTP ingestion pipeline never reaches that path.

In the HTTP pipeline the event flows as AnyEvent → ExceptionProperties → serialise back to AnyEvent.properties via set_properties(props) (event.rs line 33: serde_json::to_value(&new_props)). Because ExceptionProperties carries all unrecognised client properties in #[serde(flatten)] pub props: HashMap<String, Value>, a client-supplied "$exception_verified": true lands in props, passes through every stage unchanged, and is re-serialised verbatim into the output event. No stripping is ever applied on this code path.

ExceptionProperties::to_output() (used by the issue-linking/alerting path) has the same gap: other: self.props.clone() forwards the raw, unfiltered map without removing reserved keys.

A client that sends {"$exception_list": [...], "$exception_verified": true} will receive an output event that contains "$exception_verified": true, bypassing the server-only trust flag that is the whole point of this PR. Any downstream consumer reading the raw JSON property will consider the unverified exception as verified.

Prompt To Fix With AI
There are two independent gaps to fix:

1. **`ExceptionProperties::to_output()`** (exception_properties.rs ~line 119 `other: self.props.clone()`):
   Replace with a filtered version that removes all keys listed in `RESERVED_PROPERTIES`, mirroring what `FingerprintedErrProps::to_output()` already does:
   ```rust
   other: self.props.iter()
       .filter(|(k, _)| !RESERVED_PROPERTIES.contains(&k.as_str()))
       .map(|(k, v)| (k.clone(), v.clone()))
       .collect(),
   ```

2. **Main HTTP pipeline serialisation**`set_properties` in event.rs serialises the raw `ExceptionProperties` including its unfiltered `props`. Fix options:
   a) Strip reserved properties from `ExceptionProperties.props` at deserialisation time in `TryFrom<AnyEvent>` (exception_properties.rs ~line 156), or
   b) Add an explicit sanitise step in the HTTP pipeline (http_pipeline.rs handle_result, or as a new pre/post-processing operator) that removes all `RESERVED_PROPERTIES` keys from `ExceptionProperties.props` before the event is emitted.

Make `RESERVED_PROPERTIES` in types/mod.rs `pub` (or move it to a shared location) so both callsites can share the same list without duplication.

Severity: high | Confidence: 85% | React with 👍 if useful or 👎 if not

Comment thread rust/cymbal/src/types/exception_properties.rs
@veria-ai

veria-ai Bot commented Jun 10, 2026

Copy link
Copy Markdown

PR overview

All previously flagged issues have been addressed. No open security concerns remain on this pull request.

Security review

No open security issues remain on this pull request.

Fixed/addressed: 1 · PR risk: 0/10

Adds the verification primitive for signed $exception events: re-derive the
canonical, length-prefixed projection of the raw $exception_list (byte-identical
to the posthog-python SDK's build_canonical) and check the Ed25519 signature
against a registered public key. $exception_verified / _key_id are added to the
output schema and RESERVED_PROPERTIES, so a client-supplied verified flag is
always stripped and can only be set by the server.

The canonical-encoding + Ed25519 verification logic is validated by a committed
cross-language parity vector (the exact bytes + signature produced by the Python
SDK). Note: local rustc (1.85) predates the workspace MSRV (1.91) so the full
cymbal crate is compiled in CI; the verification logic itself was validated in a
standalone harness reproducing the parity vector.

Follow-up (error-tracking team): load a team's keys via TeamManager (mirroring
assignment_rules) and call exception_signing::verify_properties on the raw event
in the pipeline, populating OutputErrProps.verified + bumping the key's last_used.
- SECURITY (review): strip RESERVED_PROPERTIES in ExceptionProperties::to_output
  too (the HTTP ingestion path), not only FingerprintedErrProps — otherwise a
  client could forge $exception_verified through the primary pipeline. Extracted
  a shared strip_reserved_properties helper used by both paths.
- RESERVED_PROPERTIES references exception_signing::VERIFIED_* constants instead
  of duplicating the literals, so they can't drift.
- A present-but-non-string/empty $exception_signature is now Invalid, not
  silently Unsigned.
- Apply rustfmt (fixes the Rust lint CI failure). Added tests for the strip
  helper and the malformed-signature case.
@Gilbert09 Gilbert09 force-pushed the tom/cymbal-verify-exceptions branch from 6617cb8 to f9f9550 Compare June 10, 2026 18:13
@hex-security-app

Copy link
Copy Markdown

The second commit (f9f95506) fixes ExceptionProperties::to_output() (used for internal linking/alerting side effects) but the primary gap I reported is still present.

What was fixed: ExceptionProperties::to_output() now calls strip_reserved_properties(self.props.clone()), which prevents forged values from appearing in assignment-rule evaluation and alert notifications.

What's still open: The main HTTP ingestion output path does not go through to_output(). It flows as:

ExceptionProperties (with props["$exception_verified"] = true)
  → LinkingStage (doesn't strip props)
  → handle_result in http_pipeline.rs
  → original.set_properties(props)             // event.rs line 33
  → serde_json::to_value(&new_props)           // serialises ExceptionProperties directly,
                                               // including the unfiltered `props` HashMap

The comment added at exception_properties.rs line 120 ("the HTTP ingestion path, which uses this to_output") reflects a misread of the flow — to_output() is only called for side effects inside LinkingStage, not for the response that the caller of /process writes to storage.

The remaining fix is to strip reserved properties at the point of serialisation, either in set_properties in event.rs or in TryFrom<AnyEvent> for ExceptionProperties.

@Gilbert09

Copy link
Copy Markdown
Member Author

Closing as superseded by #62888 — events captured with the project's secret API key are now marked $verified during ingestion, which replaces per-event Ed25519 signature verification in cymbal.

@Gilbert09 Gilbert09 closed this Jun 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant