Skip to content

Prevent malformed Elasticsearch log entries from crashing task log fetch#69306

Open
hkc-8010 wants to merge 2 commits into
apache:mainfrom
hkc-8010:fix/es-log-handler-malformed-log-entry
Open

Prevent malformed Elasticsearch log entries from crashing task log fetch#69306
hkc-8010 wants to merge 2 commits into
apache:mainfrom
hkc-8010:fix/es-log-handler-malformed-log-entry

Conversation

@hkc-8010

@hkc-8010 hkc-8010 commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Summary

  • A single stored log entry with a non-string event field (for example, a task that
    logs a list or dict as the sole message argument) currently crashes the entire
    task-log-fetch request with an unhandled pydantic.ValidationError, instead of
    degrading gracefully.
  • _read() built StructuredLogMessage objects from stored Elasticsearch hits without
    catching validation failures. This change catches ValidationError per hit and falls
    back to a stringified event, matching the existing fallback pattern in
    file_task_handler.py's _log_stream_to_parsed_log_stream.

Changes

  • providers/elasticsearch/src/airflow/providers/elasticsearch/log/es_task_handler.py:
    wrap StructuredLogMessage construction in a try/except ValidationError with a
    stringified-event fallback and a warning log.
  • providers/elasticsearch/tests/unit/elasticsearch/log/test_es_task_handler.py: add
    coverage for malformed event handling and for read() surviving one bad hit among
    otherwise-valid hits.

PR Checklist

  • My PR is targeted at the main branch
  • Tests added/updated
  • prek hooks all pass
  • Unit tests pass locally

closes: #69304

A single stored log entry with a non-string `event` field (for example,
a task that logs a list or dict as the sole message argument) currently
crashes the entire task-log-fetch request with an unhandled
pydantic.ValidationError, instead of degrading gracefully.

_read() built StructuredLogMessage objects from stored Elasticsearch
hits without catching validation failures. This catches ValidationError
per hit and falls back to a stringified event, matching the existing
fallback pattern in file_task_handler.py's _log_stream_to_parsed_log_stream.

closes: apache#69304
The new StructuredLogMessage fallback tests exercise an Airflow 3-only type, but the provider compatibility matrix still runs this provider against Airflow 2.11.1. Guarding those tests the same way as the runtime path keeps the compatibility job focused on behavior that actually exists in that release line.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Malformed stored log entry crashes task log fetch API with unhandled ValidationError

1 participant