Switch Tempo and OTEL 4.22 upstream jobs from GCP to baremetal-lab-ipi and update downstream testing jobs post RHOSDT 3.10 release. #80806
Provision credential-free baremetal clusters instead of GCP IPI for the qe-agent post-step so the Claude CLI cannot access cloud credentials stored in kube-system on GCP clusters.
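A job could also verify the constraint described above at run time. The following is only a minimal sketch and is not part of this PR: `is_non_cloud_platform` is a hypothetical helper, and the allow-list of platform types (`BareMetal`, `None`) is an assumption about what counts as "non-cloud" here.

```shell
# Hypothetical guard (not part of this PR): succeed only for platform types
# that this sketch assumes are non-cloud.
is_non_cloud_platform() {
  # $1 = platform type string, e.g. the value of
  #      .status.platformStatus.type on the cluster Infrastructure object
  case "$1" in
    BareMetal|None) return 0 ;;
    *) return 1 ;;
  esac
}

# Intended use inside a step script (requires a logged-in oc client):
#   platform="$(oc get infrastructure cluster -o jsonpath='{.status.platformStatus.type}')"
#   is_non_cloud_platform "$platform" || { echo "refusing to run on ${platform}" >&2; exit 1; }
```

Keeping the check as a pure function of the platform string makes it easy to exercise without a cluster.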
Walkthrough
Two upstream OCP 4.22 amd64 jobs, grafana-tempo-operator and opentelemetry-operator, migrate from GCP observability clusters to Equinix bare-metal infrastructure with updated capabilities, environment variables, and workflows. Stage and downstream product configurations update cloud profiles and OpenShift versions across 4.12–4.22. QE Agent documentation is refreshed to enforce non-cloud cluster requirements and introduce an autonomous skills-based test-failure triage framework. Distributed-tracing downstream tests add parameterized branch handling for operator repository checkouts.
Changes
- Upstream bare-metal infrastructure migration
- QE Agent documentation and skills framework
- Product version and cloud profile migrations
- Distributed-tracing downstream test parameterization
Estimated code review effort: 3 (Moderate), ~25 minutes
Pre-merge checks: 15 passed
/pj-rehearse pull-ci-openshift-open-telemetry-opentelemetry-operator-main-upstream-ocp-4.22-amd64-opentelemetry-upstream-tests
@IshwarKanse: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse pull-ci-openshift-open-telemetry-opentelemetry-operator-main-upstream-ocp-4.22-amd64-opentelemetry-upstream-tests
@IshwarKanse: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse ack
@IshwarKanse: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
Update the blast radius section to document the requirement for non-cloud provisioned clusters. Add a skills README documenting the required structure and conventions for new skill files.
🧹 Nitpick comments (1)
ci-operator/step-registry/openshift-observability/qe-agent/README.md (1)
188-192: ⚡ Quick win
Duplicate section: consolidate "Cluster requirement" and "Required".
Lines 188–192 ("Cluster requirement: non-cloud provisioned clusters only") and lines 233–235 ("Required: non-cloud provisioned clusters") convey the same constraint with nearly identical wording. Consolidate them into a single section to avoid maintainability drift and reduce cognitive load.
Example consolidation:
- Keep the detailed explanation at lines 188–192 (more complete rationale: RBAC additive limitation, kube-system credential risk).
- Remove lines 233–235.
- Cross-reference the full explanation from the brief warning at line 145.
Also applies to: 233-235
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.
In `@ci-operator/step-registry/openshift-observability/qe-agent/README.md` around lines 188-192: Remove the duplicate "Required: non-cloud provisioned clusters" section that repeats the same constraint as the earlier "Cluster requirement: non-cloud provisioned clusters only" section. Keep the more detailed explanation in the "Cluster requirement" section, which includes the complete rationale about RBAC limitations and credential risks, and delete the redundant "Required" section to eliminate maintainability drift and improve clarity.
Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@ci-operator/step-registry/openshift-observability/qe-agent/README.md`:
- Around line 188-192: Remove the duplicate "Required: non-cloud provisioned
clusters" section that repeats the same constraint as the earlier "Cluster
requirement: non-cloud provisioned clusters only" section. Keep the more
detailed explanation in the "Cluster requirement" section which includes the
complete rationale about RBAC limitations and credential risks, and delete the
redundant "Required" section to eliminate maintainability drift and improve
clarity.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 2ff625b3-24aa-482b-af6e-009910743973
📒 Files selected for processing (2)
- ci-operator/step-registry/openshift-observability/qe-agent/README.md
- ci-operator/step-registry/openshift-observability/qe-agent/skills/README.md
✅ Files skipped from review due to trivial changes (1)
- ci-operator/step-registry/openshift-observability/qe-agent/skills/README.md
Recommend running Skillsaw and Agent Eval Harness before submitting new or modified skills to catch security and quality issues early.
/pj-rehearse ack
@IshwarKanse: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse pull-ci-openshift-grafana-tempo-operator-main-upstream-ocp-4.22-amd64-tempo-upstream-tests
@IshwarKanse: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse pull-ci-openshift-grafana-tempo-operator-main-upstream-ocp-4.22-amd64-tempo-upstream-tests
@IshwarKanse: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
pj-rehearse pull-ci-openshift-grafana-tempo-operator-main-upstream-ocp-4.22-amd64-tempo-upstream-tests |
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@ci-operator/step-registry/distributed-tracing/tests/opentelemetry/downstream/distributed-tracing-tests-opentelemetry-downstream-commands.sh`:
- Around line 110-111: The OTEL_CSV_NAME variable retrieved from the oc get csv
command on line 110 is used unquoted in the oc patch command on line 111, which
can cause parsing issues if the variable is empty or contains unexpected values.
Add a validation check immediately after line 110 to fail fast if OTEL_CSV_NAME
is empty, using a condition like checking if the variable is zero-length. Then
quote the variable reference on line 111 where it is used in the oc patch
command to prevent word splitting and ensure proper argument parsing.
In
`@ci-operator/step-registry/distributed-tracing/tests/opentelemetry/downstream/distributed-tracing-tests-opentelemetry-downstream-ref.yaml`:
- Line 18: The parameter documentation on line 18 incorrectly refers to "stage
tests" when describing the purpose of the branch checkout. Since this is a
downstream test reference file (as indicated by the file path and name), update
the text to say "downstream tests" instead of "stage tests" to accurately
describe what this branch parameter is used for.
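The first inline comment above (fail fast on an empty `OTEL_CSV_NAME`, then quote it) could look roughly like the following. This is a minimal sketch, not the fix that was committed: `require_non_empty` is a hypothetical helper name, and the `oc` arguments are elided because the script's real lookup and patch arguments are not shown in this conversation.

```shell
# Hypothetical helper (not from the PR): fail fast when a looked-up value is empty.
require_non_empty() {
  # $1 = variable name used in the error message, $2 = value to check
  if [ -z "$2" ]; then
    echo "ERROR: $1 is empty" >&2
    return 1
  fi
}

# Intended use in the step script (oc arguments elided, not the script's real ones):
#   OTEL_CSV_NAME="$(oc get csv ...)"
#   require_non_empty OTEL_CSV_NAME "$OTEL_CSV_NAME" || exit 1
#   oc patch "$OTEL_CSV_NAME" ...   # quoted to prevent word splitting
```

Quoting the variable matters even after the emptiness check, since an unquoted expansion would still be subject to word splitting and globbing.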
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 03761690-d3f8-4702-9636-3b809b2fc97f
⛔ Files ignored due to path filters (4)
- ci-operator/jobs/openshift/grafana-tempo-operator/openshift-grafana-tempo-operator-main-periodics.yaml is excluded by !ci-operator/jobs/**
- ci-operator/jobs/openshift/grafana-tempo-operator/openshift-grafana-tempo-operator-main-presubmits.yaml is excluded by !ci-operator/jobs/**
- ci-operator/jobs/openshift/open-telemetry-opentelemetry-operator/openshift-open-telemetry-opentelemetry-operator-main-periodics.yaml is excluded by !ci-operator/jobs/**
- ci-operator/jobs/openshift/open-telemetry-opentelemetry-operator/openshift-open-telemetry-opentelemetry-operator-main-presubmits.yaml is excluded by !ci-operator/jobs/**
📒 Files selected for processing (11)
- ci-operator/config/openshift/grafana-tempo-operator/openshift-grafana-tempo-operator-main__tempo-product-ocp-4.12-stage.yaml
- ci-operator/config/openshift/grafana-tempo-operator/openshift-grafana-tempo-operator-main__tempo-product-ocp-4.20-downstream.yaml
- ci-operator/config/openshift/grafana-tempo-operator/openshift-grafana-tempo-operator-main__tempo-product-ocp-4.22-downstream.yaml
- ci-operator/config/openshift/open-telemetry-opentelemetry-operator/openshift-open-telemetry-opentelemetry-operator-main__opentelemetry-product-ocp-4.12-stage.yaml
- ci-operator/config/openshift/open-telemetry-opentelemetry-operator/openshift-open-telemetry-opentelemetry-operator-main__opentelemetry-product-ocp-4.20-downstream.yaml
- ci-operator/config/openshift/open-telemetry-opentelemetry-operator/openshift-open-telemetry-opentelemetry-operator-main__opentelemetry-product-ocp-4.21-downstream.yaml
- ci-operator/config/openshift/open-telemetry-opentelemetry-operator/openshift-open-telemetry-opentelemetry-operator-main__opentelemetry-product-ocp-4.22-downstream.yaml
- ci-operator/step-registry/distributed-tracing/tests/opentelemetry/downstream/distributed-tracing-tests-opentelemetry-downstream-commands.sh
- ci-operator/step-registry/distributed-tracing/tests/opentelemetry/downstream/distributed-tracing-tests-opentelemetry-downstream-ref.yaml
- ci-operator/step-registry/distributed-tracing/tests/tempo/downstream/distributed-tracing-tests-tempo-downstream-commands.sh
- ci-operator/step-registry/distributed-tracing/tests/tempo/downstream/distributed-tracing-tests-tempo-downstream-ref.yaml
✅ Files skipped from review due to trivial changes (1)
- ci-operator/config/openshift/open-telemetry-opentelemetry-operator/openshift-open-telemetry-opentelemetry-operator-main__opentelemetry-product-ocp-4.21-downstream.yaml
[REHEARSALNOTIFIER]
Interacting with pj-rehearse
Comment: Once you are satisfied with the results of the rehearsals, comment:
@IshwarKanse: The following tests failed, say
/pj-rehearse ack
@IshwarKanse: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/lgtm |
Summary
- cucushift-installer-rehearse-gcp-ipi (GCP) to baremetal-lab-ipi (Equinix bare metal)
Changes
- cluster_profile: gcp-observability → equinix-ocp-metal-qe
- workflow: cucushift-installer-rehearse-gcp-ipi → baremetal-lab-ipi
- cucushift-installer-rehearse-gcp-ipi-deprovision → baremetal-lab-post
- capabilities: [intranet] and baremetal env vars (AUX_HOST, RESERVE_BOOTSTRAP, architecture, masters, workers)
Affected configs
- openshift-grafana-tempo-operator-main__upstream-ocp-4.22-amd64.yaml
- openshift-open-telemetry-opentelemetry-operator-main__upstream-ocp-4.22-amd64.yaml
Summary by CodeRabbit
This PR updates OpenShift CI upstream test rehearsals for the Grafana Tempo and OpenTelemetry Operator jobs (OCP 4.22, amd64) to run on bare metal clusters on Equinix instead of GCP infrastructure, to avoid relying on in-cluster cloud credential access.
What changed in CI/infrastructure (Tempo + OpenTelemetry Operator)
- gcp-observability → equinix-ocp-metal-qe
- cucushift-installer-rehearse-gcp-ipi → baremetal-lab-ipi
- cucushift-installer-rehearse-gcp-ipi-deprovision → baremetal-lab-post
- intranet requirement
- AUX_HOST: openshift-qe-metal-ci.arm.eng.rdu2.redhat.com
- RESERVE_BOOTSTRAP: "false"
- architecture: amd64
- masters: "3" and workers: "2"
- ref: ipi-install-rbac
- AGENT_SKILL: TEMPO
- AGENT_SKILL: OTEL
Documentation/step-registry updates for non-cloud execution
- openshift-observability-qe-agent README to emphasize it is for non-cloud provisioned clusters only, calling out the cloud credential access risk for cloud-provisioned clusters.
- openshift-observability-qe-agent/skills/README.md to define the required skill YAML format and conventions for autonomous test-failure triage (including tooling requirements and expected output artifacts).