Flaky test report: committed-code failures on 2026-06-07 #289

@andrross

Summary

Six distinct test failures were observed across four gradle-check builds running against committed code (Timer/main and Post Merge Action) in the past 24 hours (2026-06-07). None of the five tests that could be run locally reproduced with the original seed, which suggests non-deterministic (environment- or timing-sensitive) flakes. The sixth test could not be run locally.
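For anyone retrying a local reproduction, the sketch below assembles a Gradle command line from a test name and seed taken from this report. It is illustrative only: `reproduce_command` is a hypothetical helper, and the task `:server:internalClusterTest` is an assumption, since the report does not say which Gradle task owns each test. `--tests` is Gradle's standard test filter, and `-Dtests.seed` is the seed property of the randomized testing framework.

```python
import shlex


def reproduce_command(task, test, seed):
    """Build a Gradle command that reruns one test with a fixed seed."""
    return [
        "./gradlew",
        task,                    # Gradle task that owns the test (assumed by the caller)
        "--tests", test,         # Gradle test filter
        f"-Dtests.seed={seed}",  # seed copied from the failure table
    ]


cmd = reproduce_command(
    ":server:internalClusterTest",  # assumption: not stated in this report
    "org.opensearch.cluster.coordination.RareClusterStateIT"
    ".testDisassociateNodesWhileShardInit",
    "C85B3FCE78DA4C96",
)
print(shlex.join(cmd))
```

Because these flakes appear timing-sensitive, a single seeded run passing locally does not rule out the failure on CI hardware.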

Failures

1. ClientYamlTestSuiteIT — date_histogram profiler

| Field | Value |
| --- | --- |
| Test | org.opensearch.test.rest.ClientYamlTestSuiteIT.test {p0=search.aggregation/10_histogram/date_histogram profiler} |
| Build | 79670 (Timer, main) |
| Seed | 437FF60E325ECFF2 |
| Reproduced locally | No |
| First failure | 2024-03-26 |
| Total unique builds affected | 234 |
| Pattern | Chronic flake. Active since March 2024 with a peak of 65 builds in Sep 2024. Recent months: 8 in Apr 2026, 12 in May 2026, 5 in Jun 2026. Stable/chronic. |

2. FullRollingRestartIT — testFullRollingRestart_withNoRecoveryPayloadAndSource

| Field | Value |
| --- | --- |
| Test | org.opensearch.recovery.FullRollingRestartIT.testFullRollingRestart_withNoRecoveryPayloadAndSource {p0={"cluster.indices.replication.strategy":"SEGMENT"}} |
| Build | 79670 (Timer, main) |
| Seed | 437FF60E325ECFF2 |
| Reproduced locally | No |
| First failure | 2024-10-11 |
| Total unique builds affected | 136 |
| Pattern | Appeared Oct 2024, heavy burst in Jul-Aug 2025 (47+24 builds). Resurfaced Feb 2026 onward with 9-19 builds/month. Worsening since mid-April 2026 runner change. |

3. MixedClusterClientYamlTestSuiteIT — cluster health with closed index

| Field | Value |
| --- | --- |
| Test | org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=cluster.health/10_basic/cluster health with closed index} |
| Build | 79636 (Post Merge Action) |
| Seed | 41AB9E2A76F93150 |
| Reproduced locally | Could not run (requires bwc build with JAVA21_HOME) |
| First failure | 2024-03-25 |
| Total unique builds affected | 140 |
| Pattern | Chronic flake since Mar 2024. Big spike in Sep 2024 (56 builds). Quieter in late 2025 but resurfaced with 9 builds in Apr 2026 and 11 in May 2026. Worsening since mid-April 2026 runner change. |

4. IngestFromKinesisIT — testKinesisIngestion_RewindByOffset

| Field | Value |
| --- | --- |
| Test | org.opensearch.plugin.kinesis.IngestFromKinesisIT.testKinesisIngestion_RewindByOffset |
| Build | 79629 (Post Merge Action) |
| Seed | E8356F7D14169C51 |
| Reproduced locally | No |
| First failure | 2025-03-24 |
| Total unique builds affected | 128 |
| Pattern | Burst of 51 failures in Mar 2025, subsided, then another burst of 51 in Mar 2026. Sporadic in between. Bursty/episodic. |

5. RestoreShallowSnapshotV2IT — testHashedPrefixTranslogMetadataCombination

| Field | Value |
| --- | --- |
| Test | org.opensearch.remotestore.RestoreShallowSnapshotV2IT.testHashedPrefixTranslogMetadataCombination {p0={"opensearch.experimental.feature.writable_warm_index.enabled":"false"}} |
| Build | 79652 (Post Merge Action) |
| Seed | C85B3FCE78DA4C96 |
| Reproduced locally | No |
| First failure | 2024-11-28 |
| Total unique builds affected | 77 |
| Pattern | Steady low-level flake (3-8 builds/month) since Nov 2024. No clear trend change. Stable/chronic. |

6. RareClusterStateIT — testDisassociateNodesWhileShardInit

| Field | Value |
| --- | --- |
| Test | org.opensearch.cluster.coordination.RareClusterStateIT.testDisassociateNodesWhileShardInit |
| Build | 79652 (Post Merge Action) |
| Seed | C85B3FCE78DA4C96 |
| Reproduced locally | No |
| First failure | 2024-11-04 |
| Total unique builds affected | 54 |
| Pattern | Low-frequency flake until a jump to 12 builds in Apr 2026 and 18 in May 2026. Worsening; correlates with the mid-April 2026 runner change to m7a.8xlarge. |

Summary Table (sorted by builds affected)

| Test | Builds Affected | First Seen | Trend | Reproduced |
| --- | --- | --- | --- | --- |
| ClientYamlTestSuiteIT (date_histogram profiler) | 234 | 2024-03-26 | Stable/chronic | No |
| MixedClusterClientYamlTestSuiteIT (cluster health) | 140 | 2024-03-25 | Worsening | N/A (env) |
| FullRollingRestartIT (SEGMENT replication) | 136 | 2024-10-11 | Worsening | No |
| IngestFromKinesisIT (RewindByOffset) | 128 | 2025-03-24 | Bursty/episodic | No |
| RestoreShallowSnapshotV2IT (hashed prefix) | 77 | 2024-11-28 | Stable/chronic | No |
| RareClusterStateIT (disassociate nodes) | 54 | 2024-11-04 | Worsening | No |
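As a quick cross-check, the summary table can be encoded as data and re-sorted. This is a minimal sketch: the tuples are copied from the table above, and the names `ROWS`, `by_builds`, and `trend_counts` are illustrative rather than part of any tooling.

```python
from collections import Counter

# (test, builds affected, trend), copied from the summary table.
ROWS = [
    ("ClientYamlTestSuiteIT", 234, "Stable/chronic"),
    ("MixedClusterClientYamlTestSuiteIT", 140, "Worsening"),
    ("FullRollingRestartIT", 136, "Worsening"),
    ("IngestFromKinesisIT", 128, "Bursty/episodic"),
    ("RestoreShallowSnapshotV2IT", 77, "Stable/chronic"),
    ("RareClusterStateIT", 54, "Worsening"),
]

# Sort by builds affected, highest first, as the table heading promises.
by_builds = sorted(ROWS, key=lambda row: row[1], reverse=True)

# Count how many tests fall into each trend bucket.
trend_counts = Counter(trend for _, _, trend in ROWS)

# Tests flagged as worsening, in table order.
worsening = [name for name, _, trend in by_builds if trend == "Worsening"]

print(trend_counts)
print(worsening)
```

The sorted order matches the table as written, and three tests carry the "Worsening" trend.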

Notes

  • None of the 5 runnable tests reproduced locally with the original seed, which is consistent with non-deterministic, timing-sensitive flakes. The sixth (MixedClusterClientYamlTestSuiteIT) could not be run because it requires a bwc build with JAVA21_HOME.
  • Three tests (FullRollingRestartIT, MixedClusterClientYamlTestSuiteIT, RareClusterStateIT) show a clear worsening pattern starting in April 2026, correlating with the CI runner migration from m5.8xlarge to m7a.8xlarge.
  • IngestFromKinesisIT shows a distinctive bursty pattern (51 failures in each of Mar 2025 and Mar 2026), suggesting it may be triggered by specific infrastructure or dependency changes.
  • All six tests first failed more than a year ago (between March 2024 and March 2025), so these are long-standing chronic flakes rather than new regressions.
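The claim about how long these tests have been failing can be checked from the first-failure dates. The sketch below counts whole calendar months between each first failure and the report date (2026-06-07). `months_between` is an illustrative helper, not part of any existing tooling.

```python
from datetime import date

REPORT_DATE = date(2026, 6, 7)

# First-failure dates copied from the per-test tables.
FIRST_FAILURE = {
    "ClientYamlTestSuiteIT": date(2024, 3, 26),
    "MixedClusterClientYamlTestSuiteIT": date(2024, 3, 25),
    "FullRollingRestartIT": date(2024, 10, 11),
    "IngestFromKinesisIT": date(2025, 3, 24),
    "RestoreShallowSnapshotV2IT": date(2024, 11, 28),
    "RareClusterStateIT": date(2024, 11, 4),
}


def months_between(start, end):
    """Whole calendar months elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not yet complete
    return months


ages = {name: months_between(first, REPORT_DATE)
        for name, first in FIRST_FAILURE.items()}

for name, age in sorted(ages.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {age} months")
```

The youngest flake here is IngestFromKinesisIT at 14 whole months, so every test comfortably exceeds six months.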
