Skip to content

fix: add metadata.name length and format validation to all CRDs#1113

Open
IshwarKanse wants to merge 2 commits into
rhobs:mainfrom
IshwarKanse:fix/observabilityinstaller-name-length-validation
Open

fix: add metadata.name length and format validation to all CRDs#1113
IshwarKanse wants to merge 2 commits into
rhobs:mainfrom
IshwarKanse:fix/observabilityinstaller-name-length-validation

Conversation

@IshwarKanse

@IshwarKanse IshwarKanse commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

Summary

Resolves https://redhat.atlassian.net/browse/COO-1261

The app.kubernetes.io/part-of label value is set from the owner resource's .metadata.name via util.AddCommonLabels. Kubernetes enforces a hard 63-character limit on label values, so CR names exceeding that length caused repeated reconciliation failures with no path to recovery.

This PR fixes the issue at two layers (defence in depth):

Layer 1 — CRD admission validation (primary guard)

Added two CEL x-kubernetes-validations rules at the root openAPIV3Schema level for all three affected CRDs (ObservabilityInstaller, MonitoringStack, ThanosQuerier):

  • size(self.metadata.name) <= 63 — enforces the Kubernetes label-value length limit
  • self.metadata.name.matches('^[a-z0-9]([-a-z0-9]*[a-z0-9])?$') — enforces RFC 1123 DNS label format

Rules are applied to both bundle/manifests/ and deploy/crds/common/ CRD YAMLs and backed by +kubebuilder:validation:XValidation markers in the Go types so they survive future generate-crds runs.

Layer 2 — Runtime guard in AddCommonLabels (secondary / defensive)

Changed util.AddCommonLabels signature from (client.Object, string) client.Object to (client.Object, string) (client.Object, error). The function now returns a hard error for names > 63 characters instead of silently producing invalid label values.

Error propagation (all controllers)

To surface the new error cleanly, introduced a ReconcilerBuilder accumulator in pkg/reconciler/reconciler.go. It short-circuits on the first error so component builder functions (stackComponentReconcilers, thanosComponentReconcilers, pluginComponentReconcilers, operatorComponentReconcilers, observability reconcilers) can call b.Add(reconciler.NewUpdater(...)) without per-call error handling. All reconciler constructors that call AddCommonLabels now return (Reconciler, error).

Pre-existing bug fix (CodeRabbit)

Fixed a missing hyphen in the UIPlugin ClusterRoleBinding name:
plugin.Name+"cluster-monitoring-view"plugin.Name+"-cluster-monitoring-view"

Files changed

File Change
pkg/controllers/util/common.go AddCommonLabels returns (client.Object, error); rejects names > 63 chars
pkg/controllers/util/common_test.go New unit tests for AddCommonLabels
pkg/reconciler/reconciler.go ReconcilerBuilder accumulator; all constructors return errors
pkg/reconciler/create_update_reconciler.go Propagates error from AddCommonLabels
pkg/controllers/monitoring/monitoring-stack/components.go Uses ReconcilerBuilder; returns error
pkg/controllers/monitoring/monitoring-stack/controller.go Handles error from stackComponentReconcilers
pkg/controllers/monitoring/thanos-querier/components.go Uses ReconcilerBuilder; returns error
pkg/controllers/monitoring/thanos-querier/controller.go Handles error from thanosComponentReconcilers
pkg/controllers/uiplugin/components.go Uses ReconcilerBuilder; returns error; fixes ClusterRoleBinding name
pkg/controllers/uiplugin/controller.go Handles error from pluginComponentReconcilers and NewMerger
pkg/controllers/operator/components.go Uses ReconcilerBuilder; returns error
pkg/controllers/operator/controller.go Handles error from operatorComponentReconcilers
pkg/controllers/observability/reconcilers.go Handles errors from NewCreateUpdateReconciler and NewUpdater
pkg/apis/monitoring/v1alpha1/types.go CEL markers on MonitoringStack and ThanosQuerier
pkg/apis/monitoring/v1alpha1/types_test.go New CEL unit tests for MonitoringStack and ThanosQuerier
pkg/apis/observability/v1alpha1/types.go CEL markers on ObservabilityInstaller (first commit)
pkg/apis/observability/v1alpha1/types_test.go CEL unit tests for ObservabilityInstaller (first commit)
bundle/manifests/monitoring.rhobs_monitoringstacks.yaml Added x-kubernetes-validations
bundle/manifests/monitoring.rhobs_thanosqueriers.yaml Added x-kubernetes-validations
bundle/manifests/observability.openshift.io_observabilityinstallers.yaml Added x-kubernetes-validations (first commit)
deploy/crds/common/monitoring.rhobs_monitoringstacks.yaml Added x-kubernetes-validations
deploy/crds/common/monitoring.rhobs_thanosqueriers.yaml Added x-kubernetes-validations
deploy/crds/common/observability.openshift.io_observabilityinstallers.yaml Added x-kubernetes-validations (first commit)

Test plan

  • New unit tests — pkg/controllers/util/common_test.go: covers valid names (≤ 63 chars), boundary (exactly 63), over-length (64, 108), label preservation, and nil-label initialisation
  • New CEL unit tests — pkg/apis/monitoring/v1alpha1/types_test.go: extracts rules from source annotations via regex and evaluates them with cel-go for both MonitoringStack and ThanosQuerier (length and format)
  • New CEL unit tests — pkg/apis/observability/v1alpha1/types_test.go: same for ObservabilityInstaller
  • All unit tests pass: go test ./pkg/... and cd pkg/apis && go test ./...
  • go build and go vet clean across the module
  • Full e2e suite run on OCP 4.22 nightly cluster — all tests pass; TestUIPlugin/Cluster_health_analyzer timed out due to alert-firing latency on a fresh cluster (pre-existing, unrelated to this change)
  • CodeRabbit review: 0 new findings

@openshift-ci

openshift-ci Bot commented Jun 3, 2026

Copy link
Copy Markdown

Hi @IshwarKanse. Thanks for your PR.

I'm waiting for a rhobs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Tip

We noticed you've done this a few times! Consider joining the org to skip this step and gain /lgtm and other bot rights. We recommend asking approvers on your previous PRs to sponsor you.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@coderabbitai

coderabbitai Bot commented Jun 3, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@IshwarKanse, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 57 minutes and 36 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: b1e34c92-a83d-4efa-9afe-78e11f0e4a67

📥 Commits

Reviewing files that changed from the base of the PR and between eb33497 and e1635c0.

📒 Files selected for processing (23)
  • bundle/manifests/monitoring.rhobs_monitoringstacks.yaml
  • bundle/manifests/monitoring.rhobs_thanosqueriers.yaml
  • bundle/manifests/observability.openshift.io_observabilityinstallers.yaml
  • deploy/crds/common/monitoring.rhobs_monitoringstacks.yaml
  • deploy/crds/common/monitoring.rhobs_thanosqueriers.yaml
  • deploy/crds/common/observability.openshift.io_observabilityinstallers.yaml
  • pkg/apis/monitoring/v1alpha1/types.go
  • pkg/apis/monitoring/v1alpha1/types_test.go
  • pkg/apis/observability/v1alpha1/types.go
  • pkg/apis/observability/v1alpha1/types_test.go
  • pkg/controllers/monitoring/monitoring-stack/components.go
  • pkg/controllers/monitoring/monitoring-stack/controller.go
  • pkg/controllers/monitoring/thanos-querier/components.go
  • pkg/controllers/monitoring/thanos-querier/controller.go
  • pkg/controllers/observability/reconcilers.go
  • pkg/controllers/operator/components.go
  • pkg/controllers/operator/controller.go
  • pkg/controllers/uiplugin/components.go
  • pkg/controllers/uiplugin/controller.go
  • pkg/controllers/util/common.go
  • pkg/controllers/util/common_test.go
  • pkg/reconciler/create_update_reconciler.go
  • pkg/reconciler/reconciler.go
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci

openshift-ci Bot commented Jun 3, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: IshwarKanse
Once this PR has been reviewed and has the lgtm label, please assign danielmellado for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@IshwarKanse IshwarKanse force-pushed the fix/observabilityinstaller-name-length-validation branch from 6130c69 to e66cece Compare June 3, 2026 10:39
@IshwarKanse IshwarKanse force-pushed the fix/observabilityinstaller-name-length-validation branch from f9e6340 to f735c65 Compare June 3, 2026 11:09
@simonpasquier

Copy link
Copy Markdown
Contributor

I'm surprised that you can create the ObservabilityInstaller custom resource in the first place since its name breaks the Kubernetes limits already.

@IshwarKanse

Copy link
Copy Markdown
Contributor Author

I'm surprised that you can create the ObservabilityInstaller custom resource in the first place since its name breaks the Kubernetes limits already.
Its created but the operator fails to reconcile it with errors in the operator logs as documented in the issue https://redhat.atlassian.net/browse/COO-1261

@simonpasquier

Copy link
Copy Markdown
Contributor

I think that's a more systemic issue than just the ObservabilityInstaller CRD which we need to address globally.

@IshwarKanse

Copy link
Copy Markdown
Contributor Author

@simonpasquier Yes, I added the fix currently for ObservabilityInstaller but this needs to be fixed globally as it affects CRD validation on all affected types. Do you want me to add the validation for all the affected types as well. ?

We could fix util.AddCommonLabels to truncate long names — cap the label value at 63 characters (e.g. truncate to 57 chars + ashort hash suffix to keep uniqueness). This fixes all callers in one place but means the label no longer directly reflects the resource name, which would cause confusion for anyone using label selectors like app.kubernetes.io/part-of=

Enforcing the limit at admission time with a clear error message seems like the right UX — users know immediately what to fix and why. The label always remains trustworthy.

@IshwarKanse IshwarKanse force-pushed the fix/observabilityinstaller-name-length-validation branch from d508c13 to e1635c0 Compare June 5, 2026 10:35
@IshwarKanse IshwarKanse changed the title fix: add metadata.name length and format validation to ObservabilityInstaller CRD fix: add metadata.name length and format validation to all CRDs Jun 5, 2026
@IshwarKanse

Copy link
Copy Markdown
Contributor Author

@simonpasquier I updated the PR to fix this issue for all types.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants