[v3-3-test] Reduce noise in the daily CI duration trend alert (#69113) #69337

Merged
potiuk merged 2 commits into apache:v3-3-test from potiuk:backport-69113-v3-3-test
Jul 3, 2026

@potiuk potiuk commented Jul 3, 2026

Member

Backport of #69113 to v3-3-test.

The duration monitor flagged jobs by comparing a single nightly canary run against
the median of the preceding runs, so any one slow run tripped the alert. This pins
the monitor to green canary runs, compares the median of the last few runs against
the baseline, and requires a larger absolute jump before flagging individual jobs.

Cherry-picked cleanly from e99daee. The v3-3-test copy of
scripts/ci/analyze_ci_job_durations.py already supports ONLY_SUCCESSFUL,
LATEST_RUNS, and JOB_MIN_ABS_INCREASE_MINUTES with matching defaults, so the
workflow-only change is self-contained.

closes: #69332


Was generative AI tooling used to co-author this PR?
  • Yes — Claude Code (Opus 4.8)

Generated-by: Claude Code (Opus 4.8) following the guidelines

The duration monitor flagged jobs by comparing a single nightly canary run
against the median of the preceding runs, so any one slow run — slow PyPI,
runner queue pressure, a cold cache — tripped the alert. Because a different
run was "latest" each day, a different set of jobs was flagged each day, and
network-bound constraint-resolution jobs that legitimately swing tens of
minutes dominated nearly every alert. The result was a near-daily alert whose
contents swung wildly and carried little signal.

Compare the median of the last few nightly runs against the baseline so the
two sides are symmetric and one unlucky run no longer trips it, and require a
larger absolute jump before flagging individual jobs.
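
As a rough illustration of the comparison described above (not the actual code in scripts/ci/analyze_ci_job_durations.py), the check can be sketched as follows. The function name, the data layout, and the default values of 3 runs and 10 minutes are assumptions made for this example; only the idea of comparing a median of recent runs against a baseline median with an absolute threshold comes from the description.

```python
from statistics import median


def flag_slow_jobs(durations_by_job, latest_runs=3, min_abs_increase_minutes=10.0):
    """Return {job: (baseline_median, latest_median)} for jobs that got slower.

    durations_by_job maps a job name to its durations in minutes, ordered
    oldest to newest. The defaults are illustrative assumptions, not the
    values used by the real script.
    """
    flagged = {}
    for job, durations in durations_by_job.items():
        # Need at least one run left over to form a baseline.
        if len(durations) <= latest_runs:
            continue
        baseline = median(durations[:-latest_runs])
        latest = median(durations[-latest_runs:])
        # Flag only when the median moved by a large absolute amount,
        # so one unlucky run inside the window does not trip the alert.
        if latest - baseline >= min_abs_increase_minutes:
            flagged[job] = (baseline, latest)
    return flagged


history = {
    # One slow run (45 min) inside the latest window: the median absorbs it.
    "constraints": [20, 21, 19, 20, 22, 45, 21, 20],
    # A sustained shift across the whole latest window: this is flagged.
    "tests": [30, 31, 30, 29, 45, 46, 44],
}
result = flag_slow_jobs(history)
```

With this sample data only the "tests" job is flagged; the single 45-minute outlier in "constraints" leaves the median of its last three runs at 21 minutes, one minute above its baseline.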

Pin the monitor to successful (green) canary runs only. A failed or cancelled
canary stops partway, so its truncated wall-clock and per-job durations would
skew the baseline downwards and mask real regressions. The script already
defaults to this, but the guarantee is now explicit at the call site so it
cannot be silently changed.
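
A minimal sketch of why truncated runs matter, assuming each run is a dict with the `conclusion` field that GitHub Actions reports for workflow runs and a hypothetical `duration_minutes` field invented for this example. The helper name `successful_runs` is likewise illustrative and does not come from the script.

```python
from statistics import median


def successful_runs(runs, only_successful=True):
    """Keep only green runs when only_successful is set (illustrative helper)."""
    if not only_successful:
        return list(runs)
    return [run for run in runs if run.get("conclusion") == "success"]


runs = [
    {"conclusion": "success", "duration_minutes": 60},
    {"conclusion": "cancelled", "duration_minutes": 12},  # stopped partway
    {"conclusion": "success", "duration_minutes": 62},
    {"conclusion": "failure", "duration_minutes": 15},  # stopped partway
]

# Baseline over every run is dragged down by the two truncated runs.
baseline_all = median(r["duration_minutes"] for r in successful_runs(runs, only_successful=False))
# Baseline over green runs only reflects the real wall-clock duration.
baseline_green = median(r["duration_minutes"] for r in successful_runs(runs))
```

In this made-up sample the unfiltered baseline is 37.5 minutes against 61 minutes for green runs only, which shows how failed or cancelled canaries could shift the baseline and distort the comparison.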

(cherry picked from commit e99daee)
@potiuk potiuk merged commit ddf7776 into apache:v3-3-test Jul 3, 2026
64 checks passed
@potiuk potiuk deleted the backport-69113-v3-3-test branch July 3, 2026 16:09