OCPBUGS-90560: Resolve GCP boot image from worker machines#1512
OCPBUGS-90560: Resolve GCP boot image from worker machines#1512nader-ziada wants to merge 1 commit into
Conversation
|
@nader-ziada: This pull request references Jira Issue OCPBUGS-90560, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository: openshift/coderabbit/.coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (5)
🚧 Files skipped from review as they are similar to previous changes (5)
WalkthroughAdds a shared ChangesGCP Boot Image Resolution and Configurable Defaulting
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 14 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (14 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.12.2)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/webhooks/machine_webhook_test.go`:
- Line 6805: The TestResolveGCPBootImage test cases have hardcoded expected and
mismatch image values that are specific to x86-64 architecture, but the
resolveGCPBootImage function filters results by the arch parameter. Make the
test architecture-aware by parameterizing the expected image assertions based on
the architecture being tested (such as by deriving expected images from the arch
field in each test case) so that assertions pass for both x86-64 and ARM64
architectures. This should be applied to all the test cases mentioned at the
specified line ranges.
- Line 6827: In the machine_webhook_test.go file, the json.Marshal calls at
lines 6827, 6857, 6887, 6911, and 6941 are currently ignoring errors by using
blank identifier (_). Instead of discarding the error, capture it in a variable
and check for non-nil errors. When json.Marshal fails, the test should fail
immediately using the testing.T helper (commonly using t.Fatalf or t.Errorf) to
prevent invalid ProviderSpec payloads from being silently used in the test
fixtures. This ensures test reliability and follows the project guideline of
never ignoring error returns in Go files.
In `@pkg/webhooks/machine_webhook.go`:
- Around line 354-366: The function resolveGCPBootImage currently treats all
List API failures as "no image found" by returning an empty string, which masks
RBAC and API outages. Modify resolveGCPBootImage to return both a string and an
error instead of just a string. When the c.List() call fails in the error
handling block, return the actual error along with an empty string rather than
silently returning an empty string. Reserve the empty string return for
successful list operations that find no matching worker machines. Additionally,
replace context.Background() with a context that supports cancellation and
timeouts, such as one derived from a timeout or cancellation mechanism
appropriate to your codebase standards.
- Around line 378-379: The loop iterating over providerSpec.Disks dereferences
each disk element without checking if it is nil, which will cause a panic if the
slice contains nil elements. Add a nil guard check inside the for loop that
iterates over providerSpec.Disks to ensure each disk is not nil before accessing
its Boot and Image fields.
- Around line 378-380: In the disk iteration loop where you check disk.Boot and
disk.Image, replace the strings.Contains call with strings.HasSuffix to ensure
the architecture suffix matches exactly at the end of the image path rather than
anywhere within it. This prevents accidentally selecting the wrong image variant
if the suffix text appears elsewhere in the image path.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 4245399b-3c94-4d86-a8ab-a45017c1c659
📒 Files selected for processing (5)
cmd/machineset/main.gopkg/webhooks/machine_webhook.gopkg/webhooks/machine_webhook_test.gopkg/webhooks/machineset_webhook.gopkg/webhooks/machineset_webhook_test.go
Instead of hardcoding a stale RHCOS 4.14 boot disk image for GCP defaults, resolve the image dynamically from existing worker machines at webhook startup. Falls back to the hardcoded default when no workers are found. Filters by architecture (x86-64/aarch64) to avoid selecting the wrong image on mixed-arch clusters. Introduces ResolveDefaulterConfig() so both Machine and MachineSet defaulters share a single client and boot image resolution call.
fef4618 to
8389bf4
Compare
|
/retest |
|
/pipeline auto |
|
/test ? |
|
/test e2e-gcp-ovn |
|
/retest |
|
@nader-ziada: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
@theobarberbany all the tests passed, is there other specific tests to kick off? |
Instead of hardcoding a stale RHCOS 4.14 boot disk image for GCP defaults, resolve the image dynamically from existing worker machines at webhook startup. Falls back to the hardcoded default when no workers are found. Filters by architecture (x86-64/aarch64) to avoid selecting the wrong image on mixed-arch clusters.
Introduces ResolveDefaulterConfig() so both Machine and MachineSet defaulters share a single client and boot image resolution call.
Summary by CodeRabbit