feat(prometheus_remote_write source, prometheus_remote_write sink): add support for Prometheus native histograms #25220
Native histograms (also known as sparse or exponential histograms) are a Prometheus feature that uses exponential bucket boundaries determined by a schema parameter rather than fixed explicit boundaries. This enables high-resolution histograms with efficient storage.

This change adds:
- A new `MetricValue::NativeHistogram` variant with supporting types (`NativeHistogramCount`, `NativeHistogramSpan`, `NativeHistogramBuckets`, `NativeHistogramResetHint`)
- Proto definitions in both Vector's internal event proto (for disk buffers) and the Prometheus remote write proto (matching the upstream definition)
- `prometheus_remote_write` source: parses incoming `Histogram` proto messages from `TimeSeries.histograms` and emits them as native histogram metrics
- `prometheus_remote_write` sink: emits native histograms as proper `Histogram` proto messages, enabling lossless pass-through
- Lossy fallback conversion `native_histogram_to_agg_histogram()` for sinks that don't natively support the format (text exposition, InfluxDB, GreptimeDB, Datadog)

The native histogram variant preserves both integer (delta-encoded) and float (absolute) bucket representations, as well as the zero bucket, schema, reset hint, and separate positive/negative bucket spans per the Prometheus specification.
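To make the schema-driven boundaries concrete, here is a minimal sketch of how an upper bound can be derived from a `(schema, index)` pair. It assumes the standard exponential schemas, where the growth factor is `2^(2^-schema)` and the positive bucket at `index` has upper bound `base^index`. The function name `bucket_upper_bound` is hypothetical and is not taken from this PR.

```rust
/// Upper bound of the positive bucket at `index` for a standard
/// exponential schema, assuming base = 2^(2^-schema).
/// Hypothetical helper for illustration only.
fn bucket_upper_bound(schema: i32, index: i32) -> f64 {
    // base^index = 2^(index * 2^-schema)
    let scale = (-f64::from(schema)).exp2();
    (f64::from(index) * scale).exp2()
}
```

With schema 0 the bounds are plain powers of two, so index 3 maps to 8. With schema 1 the base is the square root of two, so two consecutive buckets span one power of two.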
The `NativeHistogram` variant added to `MetricValue` increases the size of `Event`, which trips `clippy::result_large_err` on this test-only helper function that uses `Result<Event, Event>`. Since this is internal test code where the pattern is intentional (routing events to either success or dropped output), allow the lint.
The quickcheck and proptest Arbitrary impls for MetricValue previously only generated the 7 original variants, so property-based roundtrip tests never exercised NativeHistogram proto serialization. Both impls now generate internally-consistent native histograms (int/float type agreement between count and buckets, matching span lengths and bucket counts). Also switch wrapping_add to saturating_add in iter_absolute for safer delta decoding behavior on pathological inputs.
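The difference between wrapping and saturating delta decoding can be sketched as follows. This is an illustration only: the function name `absolute_counts` and its signature are assumptions, and the actual `iter_absolute` in the PR may be shaped differently.

```rust
/// Decode delta-encoded integer bucket counts into absolute counts.
/// Each delta is relative to the previous bucket's absolute count.
/// Hypothetical stand-in for the PR's `iter_absolute`.
fn absolute_counts(deltas: &[i64]) -> Vec<i64> {
    let mut running: i64 = 0;
    deltas
        .iter()
        .map(|&delta| {
            // saturating_add clamps at i64::MIN / i64::MAX instead of
            // silently wrapping around on pathological inputs.
            running = running.saturating_add(delta);
            running
        })
        .collect()
}
```

For deltas `[2, 1, -2, 3]` this yields absolute counts `[2, 3, 1, 4]`. An input such as `[i64::MAX, 1]` stays pinned at `i64::MAX` rather than wrapping to a negative count.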
Both schema and reset_hint are used when constructing the temporary MetricValue for conversion, so the discard tuple and its comment were misleading and unnecessary.
… handling, and variant-preserving zero()
== Motivation ==
Fix wire-compat and conversion correctness bugs in the native histogram support
== Details ==
Three issues found in review:
(1) TimeSeries.histograms proto field number — upstream Prometheus
prompb/types.proto assigns field 3 to `exemplars` and field 4 to
`histograms` [1]. Using field 3 here meant a real Prometheus sender's
native histograms would be silently dropped as an unknown field, and
any exemplars present would trigger a wire-type DecodeError on the
entire WriteRequest batch. Field 3 is now reserved and `histograms`
moved to field 4.
(2) native_histogram_to_agg_histogram() — when `zero_threshold == 0.0`
(the proto default for double) and the histogram has both negative
observations and a nonzero zero_count, the negative-collapse bucket
and the zero bucket both landed at upper_limit 0.0. Downstream this
yielded duplicate `le="0"` lines in text exposition and a HashMap key
collision in the InfluxDB sink. The two are now merged into a single
bucket at 0.0 with combined count.
(3) MetricValue::zero() for NativeHistogram — using `::default()` always
returns the Integer variant, silently flipping a Float (gauge)
histogram to Integer. This changed `is_float()` and therefore which
proto wire field (count_int vs count_float) is selected on
serialization. Added variant-preserving NativeHistogramCount::zero()
and NativeHistogramBuckets::clear() helpers.
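The variant flip described in (3) can be reproduced with a simplified stand-in type. The enum `Count` below is an assumption made for illustration; it is not the actual definition of `NativeHistogramCount`.

```rust
/// Simplified stand-in for a count that is either integer or float.
#[derive(Debug, Clone, PartialEq)]
enum Count {
    Integer(u64),
    Float(f64),
}

impl Default for Count {
    // The default is always the Integer variant.
    fn default() -> Self {
        Count::Integer(0)
    }
}

impl Count {
    /// Zero value that keeps the variant of `self`.
    fn zero(&self) -> Self {
        match self {
            Count::Integer(_) => Count::Integer(0),
            Count::Float(_) => Count::Float(0.0),
        }
    }

    fn is_float(&self) -> bool {
        matches!(self, Count::Float(_))
    }
}
```

Calling `zero()` on a `Count::Float` keeps `is_float()` true, whereas replacing the value with `Count::default()` would switch it to the integer variant.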
Regression tests added for (2) and (3). Also added sanjams2 to the
changelog fragment authors line.
[1] https://github.com/prometheus/prometheus/blob/main/prompb/types.proto
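The bucket merge described in (2) can be sketched as a pass over buckets sorted by upper limit. The `Bucket` struct and `merge_equal_limits` function here are simplified, hypothetical stand-ins; the actual fix inside `native_histogram_to_agg_histogram()` may be implemented differently.

```rust
/// Simplified stand-in for an aggregated histogram bucket.
#[derive(Debug, Clone, PartialEq)]
struct Bucket {
    upper_limit: f64,
    count: u64,
}

/// Merge consecutive buckets that share an upper limit, summing counts.
/// Assumes the input is already sorted by `upper_limit`.
fn merge_equal_limits(buckets: Vec<Bucket>) -> Vec<Bucket> {
    let mut merged: Vec<Bucket> = Vec::with_capacity(buckets.len());
    for bucket in buckets {
        if let Some(last) = merged.last_mut() {
            if last.upper_limit == bucket.upper_limit {
                // Same boundary (e.g. negative-collapse and zero bucket
                // both at 0.0): fold into one bucket.
                last.count += bucket.count;
                continue;
            }
        }
        merged.push(bucket);
    }
    merged
}
```

With a negative-collapse bucket of count 3 and a zero bucket of count 2, both at `0.0`, the result holds a single bucket at `0.0` with count 5, so downstream encoders see one `le="0"` entry.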
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0dafb93d33
    _ => {
        let mut metrics = IndexMap::default();
        metrics.insert(key, NativeHistogramMetric { histogram });
        self.0
            .insert(name.into(), GroupKind::NativeHistogram(metrics));
Avoid overwriting classic histogram groups
When a request contains both classic histogram samples (e.g. *_bucket, *_sum, *_count) and a native histogram for the same metric family, this branch replaces the existing GroupKind::Histogram entry with a new GroupKind::NativeHistogram map. Because self.0 is keyed only by metric name, the replacement drops all previously parsed classic histogram points for that family, leading to silent data loss in mixed-encoding remote-write payloads.
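The overwrite behavior described in this comment follows from how map insertion works when the key is the metric name alone. The sketch below uses a plain `HashMap` and a hypothetical `Group` enum rather than the parser's real types, and the second function is only one possible way to let both encodings coexist, not the fix adopted in this PR.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the parser's group kinds.
#[derive(Debug, PartialEq)]
enum Group {
    Classic(Vec<u64>),
    Native(Vec<i64>),
}

/// Keyed by name only: a second insert displaces the first group.
fn insert_overwriting(
    groups: &mut HashMap<String, Group>,
    name: &str,
    group: Group,
) -> Option<Group> {
    groups.insert(name.to_string(), group)
}

/// Keyed by (name, is_native): both encodings can be kept side by side.
fn insert_keyed_by_kind(
    groups: &mut HashMap<(String, bool), Group>,
    name: &str,
    group: Group,
) {
    let is_native = matches!(group, Group::Native(_));
    groups.insert((name.to_string(), is_native), group);
}
```

In the first variant the returned `Option` carries the displaced classic group, which is the data that would be lost if the return value were ignored.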
Summary
This PR adds support for Prometheus native histograms (also known as sparse/exponential histograms), enabling lossless pass-through of native histogram data between Prometheus-compatible systems via the remote write protocol.
What's included
- `MetricValue::NativeHistogram` variant with supporting types:
  - `NativeHistogramCount`: integer (counter) or float (gauge) count
  - `NativeHistogramSpan`: sparse bucket span (offset + length)
  - `NativeHistogramBuckets`: delta-encoded integer or absolute float bucket counts
  - `NativeHistogramResetHint`: reset hint per Prometheus spec
- `prometheus_remote_write` source: parses incoming `Histogram` proto messages from `TimeSeries.histograms` and emits them as native histogram metrics
- `prometheus_remote_write` sink: emits native histograms as proper `Histogram` proto messages
- Lossy fallback conversion (`native_histogram_to_agg_histogram()`) for sinks without native support:
  - `prometheus_exporter` (text exposition format)
  - `influxdb`
  - `greptimedb`
  - `datadog_metrics` (converted via aggregated histogram → DDSketch)

Design notes
Native histograms use exponential bucket boundaries determined by a `schema` parameter (`2^(2^-schema)` growth factor), with a sparse representation using spans rather than storing every bucket. The implementation preserves both integer (delta-encoded) and float (absolute) bucket representations, as well as the zero bucket, schema, reset hint, and separate positive/negative bucket spans.

For sinks without native support, the conversion computes explicit bucket upper bounds from `(schema, index)` pairs, yielding a classic aggregated histogram. This is lossy (it collapses the sparse exponential structure into fixed buckets) but allows existing pipelines to continue operating.

Native histograms are not handled by `add()`/`subtract()`: both return `false` (reinitialize), which is the same pattern used for mismatched aggregated histograms. Proper merging would require schema alignment and span merging, which can be added in a follow-up if needed.

Test plan
- `cargo test -p prometheus-parser`: 18 tests pass, including the new `parse_request_native_histogram`
- `cargo test -p vector-core --lib event::metric`: 28 tests pass, including 11 native histogram tests
- `cargo test -p vector sinks::prometheus::collector`: 25 tests pass, including `encodes_native_histogram_as_text_fallback` and `encodes_native_histogram_as_remote_write`
- `cargo clippy --workspace --all-targets --all-features`: passes
- `cargo fmt --all --check`: passes
- `./scripts/check_changelog_fragments.sh`: passes
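As an illustration of the sparse span representation mentioned in the design notes, the sketch below expands a list of spans into the bucket indexes they cover. It assumes the Prometheus convention that the first span's offset is the starting index and each later offset is the gap after the previous span. The `Span` struct and `bucket_indexes` function are hypothetical stand-ins, not the PR's `NativeHistogramSpan` API.

```rust
/// Simplified stand-in for a sparse bucket span.
#[derive(Debug, Clone, Copy)]
struct Span {
    offset: i32,
    length: u32,
}

/// Expand spans into the list of populated bucket indexes.
fn bucket_indexes(spans: &[Span]) -> Vec<i32> {
    let mut indexes = Vec::new();
    let mut next: i32 = 0;
    for span in spans {
        // First span: offset is the starting index.
        // Later spans: offset is the number of empty buckets skipped.
        next = next.saturating_add(span.offset);
        for _ in 0..span.length {
            indexes.push(next);
            next = next.saturating_add(1);
        }
    }
    indexes
}
```

For spans `(offset -2, length 2)` followed by `(offset 3, length 1)`, the populated indexes are `-2`, `-1`, and `3`; buckets 0 through 2 are skipped and need no storage.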