[v3-3-test] Reduce noise in the daily CI duration trend alert (#69113) by potiuk · Pull Request #69337 · apache/airflow

potiuk · 2026-07-03T14:40:44Z

Backport of #69113 to v3-3-test.

The duration monitor flagged jobs by comparing a single nightly canary run against
the median of the preceding runs, so any one slow run tripped the alert. This pins
the monitor to green canary runs, compares the median of the last few runs against
the baseline, and requires a larger absolute jump before flagging individual jobs.
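
To make the comparison concrete, here is a minimal sketch of the median-versus-median check with an absolute threshold. It is not the code in scripts/ci/analyze_ci_job_durations.py: the function name, the data layout, and the default values (3 runs, 5 minutes) are assumptions chosen for illustration.

```python
from statistics import median


def flag_slow_jobs(durations, latest_runs=3, min_abs_increase_minutes=5.0):
    """Return {job: (baseline_median, latest_median)} for jobs that slowed down.

    `durations` maps a job name to its per-run durations in minutes, oldest
    first. All names and defaults here are hypothetical.
    """
    flagged = {}
    for job, minutes in durations.items():
        # Need at least one run on each side of the split to compare.
        if len(minutes) <= latest_runs:
            continue
        baseline = median(minutes[:-latest_runs])
        latest = median(minutes[-latest_runs:])
        # Flag only when the absolute jump reaches the threshold.
        if latest - baseline >= min_abs_increase_minutes:
            flagged[job] = (baseline, latest)
    return flagged


history = {
    # One slow run at the end: the median of the last three runs absorbs it.
    "constraints": [30, 31, 29, 30, 30, 31, 55],
    # A sustained slowdown: all of the last three runs are slower.
    "unit-tests": [20, 21, 20, 19, 28, 29, 30],
}
print(flag_slow_jobs(history))
```

With `latest_runs=1` the same data also flags "constraints", which is the single-run behaviour this change moves away from.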

Cherry-picked cleanly from e99daee. The v3-3-test copy of
scripts/ci/analyze_ci_job_durations.py already supports ONLY_SUCCESSFUL,
LATEST_RUNS, and JOB_MIN_ABS_INCREASE_MINUTES with matching defaults, so the
workflow-only change is self-contained.
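
As a rough illustration of how a script could consume these three settings, the sketch below reads them from a mapping such as the process environment. Only the three setting names come from this PR. That they are environment variables, how they are parsed, and the fallback values ("true", 3, 5) are assumptions, not the script's actual defaults.

```python
import os


def read_settings(env=os.environ):
    """Parse the monitor settings from an environment-like mapping.

    The fallback values are placeholders for illustration only.
    """
    return {
        "only_successful": env.get("ONLY_SUCCESSFUL", "true").lower() == "true",
        "latest_runs": int(env.get("LATEST_RUNS", "3")),
        "job_min_abs_increase_minutes": float(
            env.get("JOB_MIN_ABS_INCREASE_MINUTES", "5")
        ),
    }


# Setting a value explicitly at the call site means it no longer depends on
# whatever default the script happens to carry.
print(read_settings({"ONLY_SUCCESSFUL": "true", "LATEST_RUNS": "5"}))
```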

closes: #69332

Was generative AI tooling used to co-author this PR?

Yes — Claude Code (Opus 4.8)

Generated-by: Claude Code (Opus 4.8) following the guidelines

The duration monitor flagged jobs by comparing a single nightly canary run against the median of the preceding runs, so any one slow run (slow PyPI, runner queue pressure, a cold cache) tripped the alert. Because a different run was "latest" each day, a different set of jobs was flagged each day, and network-bound constraint-resolution jobs that legitimately swing tens of minutes dominated nearly every alert. The result was a near-daily alert whose contents swung wildly and carried little signal.

Compare the median of the last few nightly runs against the baseline so the two sides are symmetric and one unlucky run no longer trips it, and require a larger absolute jump before flagging individual jobs.

Pin the monitor to successful (green) canary runs only. A failed or cancelled canary stops partway, so its truncated wall-clock and per-job durations would skew the baseline downwards and mask real regressions. The script already defaults to this, but the guarantee is now explicit at the call site so it cannot be silently changed.

(cherry picked from commit e99daee)
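
The effect of truncated runs on the baseline can be seen in a small, made-up example. The run records below are invented, and the "conclusion" key is an assumption modelled on the field GitHub Actions reports for workflow runs. The actual script may represent runs differently.

```python
from statistics import median


def successful_only(runs):
    """Keep only runs that finished green (hypothetical record layout)."""
    return [run for run in runs if run["conclusion"] == "success"]


runs = [
    {"conclusion": "success", "minutes": 62},
    {"conclusion": "cancelled", "minutes": 15},  # stopped partway
    {"conclusion": "success", "minutes": 60},
    {"conclusion": "failure", "minutes": 20},  # stopped partway
    {"conclusion": "success", "minutes": 64},
    {"conclusion": "failure", "minutes": 25},  # stopped partway
]

# Baseline over every run versus baseline over green runs only.
all_median = median(run["minutes"] for run in runs)
green_median = median(run["minutes"] for run in successful_only(runs))
print(all_median, green_median)
```

In this data the truncated runs pull the median wall-clock time well below the median of the green runs, so a baseline built from all runs no longer describes a complete canary.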

potiuk requested review from amoghrajesh, ashb, bugraoz93, gopidesupavan, jason810496 and jscheffl as code owners July 3, 2026 14:40

boring-cyborg Bot added the area:dev-tools label Jul 3, 2026

potiuk mentioned this pull request Jul 3, 2026

Remove unused ts-sdk label rule from Boring Cyborg config #69340

Merged

1 task

Merge branch 'v3-3-test' into backport-69113-v3-3-test

fdeff12

potiuk merged commit ddf7776 into apache:v3-3-test Jul 3, 2026
64 checks passed

potiuk deleted the backport-69113-v3-3-test branch July 3, 2026 16:09

potiuk merged 2 commits into apache:v3-3-test from potiuk:backport-69113-v3-3-test
