Describe the problem
Profiling with async-profiler on native execution shows a significant amount of time spent in metric reporting. This happens on the per-batch hot path, so the cost is paid for every batch returned from DataFusion back to the JVM.
Where the cost comes from
`update_comet_metric` is called from `jni_api.rs`:
- Once for every batch returned to the JVM (`executePlan` fast path)
- Periodically during `ScanExec` polling (every ~100 polls, gated by `metrics_update_interval`)
- At stream end
For each call, the pipeline does the following:
Rust side (`native/core/src/execution/metrics/utils.rs`):
- `to_native_metric_node` recursively walks the `SparkPlan` tree.
- Per node: allocates a fresh `HashMap<String, i64>`, inserts each metric with `name.to_string()` (a per-metric allocation), and pushes the children into a new `Vec`.
- Serializes the resulting `NativeMetricNode` protobuf via prost's `encode_to_vec()`: allocates a `Vec<u8>`.
- `byte_array_from_slice` allocates a new Java `byte[]` and copies the protobuf bytes in.
- One JNI upcall to `set_all_from_bytes([B)V`.
JVM side (`CometMetricNode.scala`):
- `Metric.NativeMetricNode.parseFrom(bytes)`: full protobuf tree re-allocation.
- Recursive walk pairing the deserialized tree with the local `CometMetricNode` tree via `zip(children)`.
- Per metric: `metrics.get(metricName)` lookup on the node's `Map[String, SQLMetric]`, then `SQLMetric.set(v)`.
So every update cycle includes: a tree of per-node `HashMap` allocations, per-metric `String` allocations, protobuf serialization, a Java `byte[]` allocation and copy, protobuf deserialization into a fresh tree, and a per-metric hash-map lookup on the JVM side. None of this carries state across cycles, even though the tree shape and metric names are stable for the lifetime of the plan.
Impact
Visible in async-profiler flame graphs as a disproportionate share of native-thread CPU during query execution, especially for plans with many nodes and small-to-medium batch sizes (where update frequency is high relative to per-batch work).