CNTRLPLANE-2549: test/library/controlplane: add WaitForControlPlaneRolloutAll helper #2137
wangke19 wants to merge 1 commit into openshift:master
Migrate the refresh-CA e2e test to be compatible with the OTE (openshift-tests-extension) framework while retaining the existing go-test path for backwards compatibility.

Key changes:
- Extract testRefreshCA(testing.TB) for dual Ginkgo/go-test usage
- Add [Disruptive][Timeout:20m] labels so openshift-tests grants a 20-minute per-test timeout (vs the default 15min) for this deliberately disruptive CA rotation test
- Add waitForControlPlaneRolloutAll() to wait for kube-apiserver, kube-controller-manager, and kube-scheduler static-pod revision rollouts to complete after CA rotation, reducing monitor test failures from expected transient disruption

The stabilization wait tracks rollout via operator/v1 nodeStatuses (not node annotations, which are absent on OCP 4.21+). It re-reads LatestAvailableRevision each poll to chase mid-rollout re-revisions automatically. The wait is best-effort: if operators don't fully stabilize, the CA rotation test still passes.

Relates to: openshift/library-go#2137
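The best-effort behaviour described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `logger`, `printLogger`, `bestEffortWait`, `alreadyStable`, and `neverStable` are hypothetical names, and the 50ms deadline is chosen only so the sketch finishes quickly.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// logger is a hypothetical stand-in for the logging subset of testing.TB.
type logger interface {
	Logf(format string, args ...any)
}

// printLogger writes to stdout so the sketch runs outside a test binary.
type printLogger struct{}

func (printLogger) Logf(format string, args ...any) {
	fmt.Printf(format+"\n", args...)
}

// bestEffortWait runs wait under its own deadline. A failure is logged and
// reported through the return value, but never fails the calling test.
func bestEffortWait(ctx context.Context, l logger, timeout time.Duration, wait func(context.Context) error) bool {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()
	if err := wait(ctx); err != nil {
		l.Logf("control plane did not stabilize, continuing: %v", err)
		return false
	}
	return true
}

// alreadyStable simulates operators that have finished rolling out.
func alreadyStable(context.Context) error { return nil }

// neverStable simulates operators that are still rolling out at the deadline.
func neverStable(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

// demo runs both cases with a short deadline and returns their results.
func demo() (stable, unstable bool) {
	l := printLogger{}
	stable = bestEffortWait(context.Background(), l, 50*time.Millisecond, alreadyStable)
	unstable = bestEffortWait(context.Background(), l, 50*time.Millisecond, neverStable)
	return stable, unstable
}

func main() {
	stable, unstable := demo()
	fmt.Println(stable, unstable)
}
```

Returning a bool instead of calling a failure method keeps the decision with the caller, which matches the stated intent that an unstabilized control plane should not fail the CA rotation test itself.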
f01795a to 597220d
Add WaitForControlPlaneRolloutAll, WaitForControlPlaneRollout, and WaitForClusterOperatorStable to the test/library package for use by operators that need to wait for control-plane stabilization after disruptive cluster changes (e.g. CA rotation).

Tracks rollout via operator/v1 nodeStatuses[].currentRevision (not node annotations, which are absent on OCP 4.21+). Re-reads LatestAvailableRevision each poll so mid-rollout re-revisions are chased automatically. Logs only on state transitions.
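To make the revision check and the "logs only on state transitions" behaviour concrete, here is a minimal sketch under stated assumptions. The `nodeStatus` type is a local stand-in carrying only the two fields discussed above, and `pendingNodes`, `rolloutComplete`, and `transitionLogger` are hypothetical names, not the helpers added by this commit.

```go
package main

import "fmt"

// nodeStatus is a local stand-in for an operator/v1 node status entry,
// reduced to the fields this sketch needs.
type nodeStatus struct {
	NodeName        string
	CurrentRevision int32
}

// pendingNodes returns the nodes whose current revision differs from latest.
func pendingNodes(latest int32, nodes []nodeStatus) []string {
	var pending []string
	for _, n := range nodes {
		if n.CurrentRevision != latest {
			pending = append(pending, n.NodeName)
		}
	}
	return pending
}

// rolloutComplete reports whether every known node runs the latest revision.
// An empty node list is treated as "not complete" (an assumption of this sketch).
func rolloutComplete(latest int32, nodes []nodeStatus) bool {
	return len(nodes) > 0 && len(pendingNodes(latest, nodes)) == 0
}

// transitionLogger records a line only when the observed state changes.
type transitionLogger struct {
	last  string
	lines []string
}

// observe summarizes the state and keeps the summary only if it is new.
func (t *transitionLogger) observe(latest int32, nodes []nodeStatus) {
	summary := fmt.Sprintf("revision %d: waiting on %v", latest, pendingNodes(latest, nodes))
	if summary == t.last {
		return // same state as the previous poll, stay quiet
	}
	t.last = summary
	t.lines = append(t.lines, summary)
}

// demo replays four polls, two of which observe an identical state.
func demo() (done bool, logged []string) {
	nodes := []nodeStatus{{"master-0", 6}, {"master-1", 6}}
	tl := &transitionLogger{}
	tl.observe(7, nodes) // first observation is recorded
	tl.observe(7, nodes) // unchanged, suppressed
	nodes[0].CurrentRevision = 7
	tl.observe(7, nodes) // master-0 caught up, recorded
	nodes[1].CurrentRevision = 7
	tl.observe(7, nodes) // nothing pending, recorded
	return rolloutComplete(7, nodes), tl.lines
}

func main() {
	done, logged := demo()
	for _, line := range logged {
		fmt.Println(line)
	}
	fmt.Println("complete:", done)
}
```

Comparing a rendered summary string is a simple way to detect transitions; a real helper might compare structured state instead.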
597220d to 716944f
@wangke19: This pull request references CNTRLPLANE-2549 which is a valid jira issue.
@wangke19: all tests passed!
Summary
Add a reusable test helper package, test/library/controlplane, for waiting on control-plane static-pod operator revision rollouts after disruptive cluster changes (e.g. CA rotation, cert regeneration).

Motivation
After a disruptive CA rotation, the three control-plane static-pod operators (kube-apiserver, kube-controller-manager, kube-scheduler) each trigger a node-by-node revision rollout. Tests that need to wait for the cluster to stabilize before asserting monitor health had no reusable utility for this in library-go.

New API: test/library/controlplane

Design
- Reads LatestAvailableRevision and nodeStatuses from the operator/v1 resource (KubeAPIServer, KubeControllerManager, KubeScheduler), the canonical source for static pod rollout progress on OCP 4.x (no node annotations required).
- Re-reads LatestAvailableRevision each poll interval so mid-rollout re-revisions are chased automatically.
- Accepts a context.Context for cancellation; callers control the deadline.
- Logs through the library.LoggingT interface (compatible with both *testing.T and Ginkgo's GinkgoTB()).

Usage
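A minimal sketch of how a caller might drive such a wait under its own deadline, and why re-reading the latest revision on every poll matters. Everything here is an assumption for illustration: `operatorStatus` and `nodeStatus` are local stand-ins for the operator/v1 status fields, `statusGetter` replaces the real client call, and `waitForRollout` is a hypothetical name rather than the package's actual API. Read errors are treated as transient, which is also an assumption.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// nodeStatus and operatorStatus are local stand-ins for the operator/v1
// status fields discussed above.
type nodeStatus struct {
	NodeName        string
	CurrentRevision int32
}

type operatorStatus struct {
	LatestAvailableRevision int32
	NodeStatuses            []nodeStatus
}

// statusGetter is a hypothetical hook; a real helper would read the
// operator resource through a client instead.
type statusGetter func(ctx context.Context) (operatorStatus, error)

// settled reports whether every node runs the status's own latest revision.
func settled(st operatorStatus) bool {
	if len(st.NodeStatuses) == 0 {
		return false
	}
	for _, n := range st.NodeStatuses {
		if n.CurrentRevision != st.LatestAvailableRevision {
			return false
		}
	}
	return true
}

// waitForRollout polls until the rollout settles or ctx is done. The target
// revision is taken from each fresh read, never latched at the start.
func waitForRollout(ctx context.Context, interval time.Duration, get statusGetter) (int32, error) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		// Read errors are treated as transient and simply retried.
		if st, err := get(ctx); err == nil && settled(st) {
			return st.LatestAvailableRevision, nil
		}
		select {
		case <-ctx.Done():
			return 0, ctx.Err()
		case <-ticker.C:
		}
	}
}

// demo scripts a rollout in which revision 8 appears while 7 is rolling out.
func demo() (rev int32, polls int, err error) {
	steps := []operatorStatus{
		{7, []nodeStatus{{"master-0", 6}, {"master-1", 6}}},
		{7, []nodeStatus{{"master-0", 7}, {"master-1", 6}}},
		{8, []nodeStatus{{"master-0", 7}, {"master-1", 7}}}, // re-revision
		{8, []nodeStatus{{"master-0", 8}, {"master-1", 8}}},
	}
	get := func(context.Context) (operatorStatus, error) {
		i := polls
		if i >= len(steps) {
			i = len(steps) - 1
		}
		polls++
		return steps[i], nil
	}
	// The caller owns the deadline, as in the design notes above.
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	rev, err = waitForRollout(ctx, 5*time.Millisecond, get)
	return rev, polls, err
}

func main() {
	rev, polls, err := demo()
	fmt.Println(rev, polls, err)
}
```

In the scripted third poll both nodes sit at revision 7, so a wait that had latched revision 7 at the start would have returned early; comparing against the freshly read value keeps the wait going until revision 8 is rolled out.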
Test plan
- go vet ./test/library/controlplane/... passes