Skip to content

chore(codecs): centralize events_dropped emission for batch encoding errors#25199

Open
pront wants to merge 7 commits intomasterfrom
pavlos/centralize-events-dropped-batch-encoding
Open

chore(codecs): centralize events_dropped emission for batch encoding errors#25199
pront wants to merge 7 commits intomasterfrom
pavlos/centralize-events-dropped-batch-encoding

Conversation

@pront
Copy link
Copy Markdown
Member

@pront pront commented Apr 15, 2026

Summary

After #25156 (Parquet encoding) landed, events_dropped emission across batch encoding error paths is inconsistent -- Parquet covers most paths but misses some, Arrow IPC covers almost none. Every new error path requires the author to remember to add emission manually, which is easy to miss.

This PR moves events_dropped emission to a single wrapper at the batch encoding chokepoint (encode_input), so all current and future error paths are covered automatically. Individual internal events keep their error logs and component_errors_total counters for diagnostics.

Risk: The non_log_count partial drop in Parquet (non-log events filtered before encoding) can double-count if the batch also fails to encode. This is a rare intersection and acceptable.

Vector configuration

Used to verify metrics across three error scenarios. Base config (swap batch_encoding and schema file per scenario):

sources:
  demo:
    type: demo_logs
    format: json
    interval: 0.5
  internal_metrics:
    type: internal_metrics
    scrape_interval_secs: 5

sinks:
  s3:
    type: aws_s3
    inputs: ["demo"]
    bucket: test-bucket
    region: us-east-1
    endpoint: http://localhost:4566
    batch:
      max_events: 2
      timeout_secs: 2
    batch_encoding:
      codec: parquet
      schema_mode: strict          # or relaxed for scenarios 2 & 3
      schema_file: /tmp/test.schema
    encoding:
      codec: text
    compression: none
    auth:
      access_key_id: test
      secret_access_key: test
  metrics_console:
    type: console
    inputs: ["internal_metrics"]
    encoding:
      codec: json

How did you test this PR?

  • make check-clippy -- clean
  • cargo nextest run -p codecs --features arrow,parquet -- 277 passed
  • cargo nextest run --package vector --lib sinks::aws_s3 -- 2 passed
  • cargo nextest run --package vector --lib sinks::clickhouse -- 29 passed
  • cargo nextest run --package vector --lib sinks::util::encoding -- 10 passed
  • Local run with three error scenarios, all emitting Events dropped and component_errors_total:

Scenario 1 -- Strict mode (extra field rejected)

Schema: message logs { required binary message (STRING); }

codecs::internal_events: Failed writing bytes.
  error=SerializingError(Strict schema mode: event contains field 'host' not in schema)

component_events_dropped: Events dropped intentional=false count=2 reason="Failed writing bytes."

{"name":"component_errors_total", "tags":{"component_id":"s3", "error_type":"encoder_failed", "stage":"sending"}, "counter":{"value":5.0}}

Scenario 2 -- Null constraint violation (required field missing)

Schema: message logs { required binary message (STRING); required binary nonexistent_field (STRING); }

codecs::internal_events: Schema constraint violation.
  error=Null value for non-nullable field 'nonexistent_field'

codecs::internal_events: Failed writing bytes.
  error=SerializingError(Null value for non-nullable field 'nonexistent_field')

component_events_dropped: Events dropped intentional=false count=2 reason="Failed writing bytes."

{"name":"component_errors_total", "tags":{"error_code":"encoding_null_constraint", "error_type":"encoder_failed", "stage":"sending"}, "counter":{"value":5.0}}

Scenario 3 -- Type mismatch (string field vs int64 schema)

Schema: message logs { required int64 message; }

This is a build_record_batch ArrowJsonDecode error that previously had zero events_dropped emission. Now covered by the centralized wrapper.

codecs::internal_events: Failed writing bytes.
  error=SerializingError(Arrow JSON decoding error: failed to parse "..." as Int64)

component_events_dropped: Events dropped intentional=false count=2 reason="Failed writing bytes."

{"name":"component_errors_total", "tags":{"error_type":"encoder_failed", "stage":"sending"}, "counter":{"value":5.0}}

Change Type

  • Bug fix
  • New feature
  • Dependencies
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

References

pront and others added 2 commits April 15, 2026 12:21
…rors

Move events_dropped emission from individual internal events inside
serializers to a single wrapper in (Transformer, BatchEncoder)::encode_input.
This ensures all batch encoding error paths (Arrow IPC and Parquet) consistently
emit events_dropped without requiring each new error path to remember to add it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added the domain: sinks Anything related to the Vector's sinks label Apr 15, 2026
@pront pront added the no-changelog Changes in this PR do not need user-facing explanations in the release changelog label Apr 15, 2026
Covers the build_record_batch ArrowJsonDecode error path where a schema
expects int64 but the event contains a string value.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@pront pront marked this pull request as ready for review April 15, 2026 16:51
@pront pront requested a review from a team as a code owner April 15, 2026 16:51
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2caa7428ec

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/sinks/util/encoding.rs Outdated
pront and others added 2 commits April 15, 2026 13:18
…nting

Replace EncoderWriteError with a direct ComponentEventsDropped emission
in the batch encode wrapper. EncoderWriteError was incrementing
component_errors_total and logging "Failed writing bytes." which
double-counted errors (codec-specific events already increment
component_errors_total) and was misleading (the failure was encoding,
not writing).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a0def5bf25

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/sinks/util/encoding.rs Outdated
pront and others added 2 commits April 15, 2026 14:24
Move ComponentEventsDropped and UNINTENTIONAL imports inside the
cfg(feature = "codecs-arrow") impl block to avoid unused import
errors when the feature is disabled.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@pront pront changed the title fix(codecs): centralize events_dropped emission for batch encoding errors chore(codecs): centralize events_dropped emission for batch encoding errors Apr 15, 2026
@pront pront changed the title chore(codecs): centralize events_dropped emission for batch encoding errors chore(codecs): centralize events_dropped emission for batch encoding errors Apr 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

domain: sinks Anything related to the Vector's sinks no-changelog Changes in this PR do not need user-facing explanations in the release changelog

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants