
PR for the next 0.68.x release #1979

Open
AndrewChubatiuk wants to merge 22 commits into release-0.68 from release-0.68-next-release

Contributor

@AndrewChubatiuk AndrewChubatiuk commented Mar 17, 2026

PR for next 0.68.x release


Summary by cubic

Makes PVC operations safer by waiting for claims to bind and fully resize, and refactors probes and app defaults into CommonAppsParams with Proxy Protocol–aware health checks, more predictable reconcile, and quieter logs.

  • New Features

    • Added PVC/VM wait envs: VM_PVC_WAIT_READY_INTERVAL, VM_PVC_WAIT_READY_TIMEOUT, VM_WAIT_READY_INTERVAL; documented in docs/env.md.
    • Introduced per-app default TerminationGracePeriodSeconds via env (e.g. VM_VLOGSDEFAULT_TERMINATION_GRACE_PERIOD_SECONDS).
    • Added statefulRollingUpdateStrategyBehavior for VMAgent in stateful mode.
    • Consolidated deployment, probe, and security settings into CommonAppsParams; unified probe builder respects httpListenAddr.useProxyProtocol (comma-separated supported).
  • Bug Fixes

    • PVC reconcile and StatefulSet PVC expansion now wait for Bound phase, target capacity, and non-Resizing; added conflict-retry, post-create wait, and return errors on failed updates.
    • Preserve HPA-managed replicas during Deployment/StatefulSet reconcile; normalize Service defaults to avoid needless updates.
    • Proxy Protocol health checks fixed and covered by tests.
    • VMAlertmanager no longer ignores tracingConfig when no AlertmanagerConfig CRs exist.
    • Prevent infinite rollouts by forcing maxUnavailable=1 when set to 0 and replicas=1; added tests.
    • Align shardCount to int32 in APIs/CRDs; render SHARD_NUM placeholders when shardCount > 0.
    • Controllers: requeue on context cancellation unless it’s a graceful shutdown; unified error handling across CRs.
    • Load balancer image tag for VL/VT/VM no longer derives from clusterVersion.
    • VMDistributed: prioritize any unhealthy zone first during zone updates; filter VMAuth targets by owner reference.
    • Reconcile ConfigMaps/Secrets when data keys change; added VMRule rebalance tests.
    • Watch DaemonSets for VMAgent/VLAgent so DaemonSet mode updates trigger reconcile.
    • VMAgent: apply scrape class relabelings before job ones.
    • VL/VT cluster: do not ignore extraStorageNode when default storage is disabled.
    • Logging: make frequent VMRule/VMScrape selection logs verbose; waitForStatus reports errors periodically.

Written for commit 14c05a0. Summary will update on new commits.
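The `maxUnavailable` fix listed in the summary can be pictured with a minimal sketch. The function name `effectiveMaxUnavailable` and the plain-integer signature are assumptions made for illustration; the operator works with Kubernetes API types and its real helper may look different.

```go
package main

import "fmt"

// effectiveMaxUnavailable is a hypothetical helper, not the operator's API.
// The summary states that maxUnavailable=0 combined with replicas=1 could
// lead to rollouts that never finish, so the value is forced to 1 there.
func effectiveMaxUnavailable(replicas, maxUnavailable int32) int32 {
	if replicas == 1 && maxUnavailable == 0 {
		return 1
	}
	// Any other combination is left untouched.
	return maxUnavailable
}

func main() {
	fmt.Println(effectiveMaxUnavailable(1, 0)) // single replica, zero allowed
	fmt.Println(effectiveMaxUnavailable(3, 0)) // several replicas
}
```

Keeping the rule in one small function makes it easy to cover with table-driven tests, which matches the "added tests" note in the summary.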

@AndrewChubatiuk AndrewChubatiuk changed the base branch from master to release-0.68 March 17, 2026 13:37
@AndrewChubatiuk AndrewChubatiuk changed the title Release 0.68 next release PR for next 0.68.x release Mar 17, 2026
@AndrewChubatiuk AndrewChubatiuk changed the title PR for next 0.68.x release PR for the next 0.68.x release Mar 17, 2026
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

5 issues found across 15 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="docs/env.md">

<violation number="1" location="docs/env.md:237">
P3: Narrow this description: `VM_WAIT_READY_INTERVAL` does not apply to all VM CRs, only to the `waitForStatus` loop used for VMAgent, VMCluster, and VMAuth.</violation>
</file>

<file name="internal/controller/operator/factory/reconcile/statefulset_pvc_expand_test.go">

<violation number="1" location="internal/controller/operator/factory/reconcile/statefulset_pvc_expand_test.go:163">
P2: This helper switch makes PVC expansion tests auto-complete by mutating `Status.Capacity` during `Update`, so the resize/wait path is no longer tested realistically.</violation>
</file>

<file name="internal/controller/operator/factory/reconcile/statefulset_pvc_expand.go">

<violation number="1" location="internal/controller/operator/factory/reconcile/statefulset_pvc_expand.go:127">
P1: This wait uses the pre-update PVC size, so resized claims can be treated as ready before expansion has completed.</violation>
</file>

<file name="internal/controller/operator/factory/reconcile/pvc.go">

<violation number="1" location="internal/controller/operator/factory/reconcile/pvc.go:67">
P2: `waitForPVCReady` treats an unprovisioned PVC as ready by returning success when `status.capacity` is empty. Keep polling instead, otherwise new PVCs bypass the new readiness wait entirely.</violation>
</file>

<file name="internal/controller/operator/factory/k8stools/interceptors.go">

<violation number="1" location="internal/controller/operator/factory/k8stools/interceptors.go:49">
P2: Creating VMAuth/VMCluster/VMAgent no longer persists the mocked status, so tests that read them back after `Create` will see empty status.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

| VM_PODWAITREADYTIMEOUT: `80s` <a href="#variables-vm-podwaitreadytimeout" id="variables-vm-podwaitreadytimeout">#</a><br>Defines single pod deadline to wait for transition to ready state |
| VM_PVC_WAIT_READY_INTERVAL: `5s` <a href="#variables-vm-pvc-wait-ready-interval" id="variables-vm-pvc-wait-ready-interval">#</a><br>Defines poll interval for PVC ready check |
| VM_PVC_WAIT_READY_TIMEOUT: `80s` <a href="#variables-vm-pvc-wait-ready-timeout" id="variables-vm-pvc-wait-ready-timeout">#</a><br>Defines poll timeout for PVC ready check |
| VM_WAIT_READY_INTERVAL: `5s` <a href="#variables-vm-wait-ready-interval" id="variables-vm-wait-ready-interval">#</a><br>Defines poll interval for VM CRs |
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 17, 2026

P3: Narrow this description: VM_WAIT_READY_INTERVAL does not apply to all VM CRs, only to the waitForStatus loop used for VMAgent, VMCluster, and VMAuth.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At docs/env.md, line 237:

<comment>Narrow this description: `VM_WAIT_READY_INTERVAL` does not apply to all VM CRs, only to the `waitForStatus` loop used for VMAgent, VMCluster, and VMAuth.</comment>

<file context>
@@ -230,7 +230,10 @@
+| VM_PODWAITREADYTIMEOUT: `80s` <a href="#variables-vm-podwaitreadytimeout" id="variables-vm-podwaitreadytimeout">#</a><br>Defines single pod deadline to wait for transition to ready state |
+| VM_PVC_WAIT_READY_INTERVAL: `5s` <a href="#variables-vm-pvc-wait-ready-interval" id="variables-vm-pvc-wait-ready-interval">#</a><br>Defines poll interval for PVC ready check |
+| VM_PVC_WAIT_READY_TIMEOUT: `80s` <a href="#variables-vm-pvc-wait-ready-timeout" id="variables-vm-pvc-wait-ready-timeout">#</a><br>Defines poll timeout for PVC ready check |
+| VM_WAIT_READY_INTERVAL: `5s` <a href="#variables-vm-wait-ready-interval" id="variables-vm-wait-ready-interval">#</a><br>Defines poll interval for VM CRs |
 | VM_FORCERESYNCINTERVAL: `60s` <a href="#variables-vm-forceresyncinterval" id="variables-vm-forceresyncinterval">#</a><br>configures force resync interval for VMAgent, VMAlert, VMAlertmanager and VMAuth. |
 | VM_ENABLESTRICTSECURITY: `false` <a href="#variables-vm-enablestrictsecurity" id="variables-vm-enablestrictsecurity">#</a><br>EnableStrictSecurity will add default `securityContext` to pods and containers created by operator Default PodSecurityContext include: 1. RunAsNonRoot: true 2. RunAsUser/RunAsGroup/FSGroup: 65534 '65534' refers to 'nobody' in all the used default images like alpine, busybox. If you're using customize image, please make sure '65534' is a valid uid in there or specify SecurityContext. 3. FSGroupChangePolicy: &onRootMismatch If KubeVersion>=1.20, use `FSGroupChangePolicy="onRootMismatch"` to skip the recursive permission change when the root of the volume already has the correct permissions 4. SeccompProfile:      type: RuntimeDefault Use `RuntimeDefault` seccomp profile by default, which is defined by the container runtime, instead of using the Unconfined (seccomp disabled) mode. Default container SecurityContext include: 1. AllowPrivilegeEscalation: false 2. ReadOnlyRootFilesystem: true 3. Capabilities:      drop:        - all turn off `EnableStrictSecurity` by default, see https://github.com/VictoriaMetrics/operator/issues/749 for details |
</file context>
Suggested change
-| VM_WAIT_READY_INTERVAL: `5s` <a href="#variables-vm-wait-ready-interval" id="variables-vm-wait-ready-interval">#</a><br>Defines poll interval for VM CRs |
+| VM_WAIT_READY_INTERVAL: `5s` <a href="#variables-vm-wait-ready-interval" id="variables-vm-wait-ready-interval">#</a><br>Defines poll interval for status checks of VMAgent, VMCluster and VMAuth CRs |

@AndrewChubatiuk AndrewChubatiuk force-pushed the release-0.68-next-release branch from 6afecdc to 1efa39b Compare March 17, 2026 14:13
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

2 issues found across 8 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="test/e2e/vmsingle_test.go">

<violation number="1" location="test/e2e/vmsingle_test.go:514">
P2: The new proxy-protocol e2e case has an empty verify block, so it doesn't actually validate that `httpListenAddr.useProxyProtocol=true` is applied.</violation>
</file>

<file name="test/e2e/vlsingle_test.go">

<violation number="1" location="test/e2e/vlsingle_test.go:177">
P2: This e2e case doesn't assert anything about proxy protocol, so it will pass even if the operator ignores `httpListenAddr.useProxyProtocol`.</violation>
</file>


"httpListenAddr.useProxyProtocol": "true",
}
},
verify: func(cr *vmv1beta1.VMSingle) {},
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 18, 2026

P2: The new proxy-protocol e2e case has an empty verify block, so it doesn't actually validate that httpListenAddr.useProxyProtocol=true is applied.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At test/e2e/vmsingle_test.go, line 514:

<comment>The new proxy-protocol e2e case has an empty verify block, so it doesn't actually validate that `httpListenAddr.useProxyProtocol=true` is applied.</comment>

<file context>
@@ -503,6 +503,17 @@ var _ = Describe("test vmsingle Controller", Label("vm", "single"), func() {
+								"httpListenAddr.useProxyProtocol": "true",
+							}
+						},
+						verify: func(cr *vmv1beta1.VMSingle) {},
+					},
+				),
</file context>
Suggested change
-verify: func(cr *vmv1beta1.VMSingle) {},
+verify: func(cr *vmv1beta1.VMSingle) {
+	var createdDeploy appsv1.Deployment
+	Expect(k8sClient.Get(ctx, types.NamespacedName{Namespace: namespace, Name: cr.PrefixedName()}, &createdDeploy)).ToNot(HaveOccurred())
+	Expect(createdDeploy.Spec.Template.Spec.Containers).ToNot(BeEmpty())
+	Expect(createdDeploy.Spec.Template.Spec.Containers[0].Args).To(ContainElement("-httpListenAddr.useProxyProtocol=true"))
+},

RetentionPeriod: "1",
},
},
func(cr *vmv1.VLSingle) {},
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 18, 2026

P2: This e2e case doesn't assert anything about proxy protocol, so it will pass even if the operator ignores httpListenAddr.useProxyProtocol.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At test/e2e/vlsingle_test.go, line 177:

<comment>This e2e case doesn't assert anything about proxy protocol, so it will pass even if the operator ignores `httpListenAddr.useProxyProtocol`.</comment>

<file context>
@@ -159,6 +159,23 @@ var _ = Describe("test vlsingle Controller", Label("vl", "single", "vlsingle"),
+							RetentionPeriod: "1",
+						},
+					},
+					func(cr *vmv1.VLSingle) {},
+				),
 			)
</file context>
Suggested change
-func(cr *vmv1.VLSingle) {},
+func(cr *vmv1.VLSingle) {
+	createdChildObjects := types.NamespacedName{Namespace: namespace, Name: cr.PrefixedName()}
+	var createdDeploy appsv1.Deployment
+	Expect(k8sClient.Get(ctx, createdChildObjects, &createdDeploy)).ToNot(HaveOccurred())
+	Expect(createdDeploy.Spec.Template.Spec.Containers).To(HaveLen(1))
+	Expect(createdDeploy.Spec.Template.Spec.Containers[0].Args).To(ContainElement("-httpListenAddr.useProxyProtocol=true"))
+},

Ensure that we configure correct healthchecks for components with proxy protocol enabled
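The idea behind Proxy Protocol-aware health checks can be sketched as follows. This is an illustration under stated assumptions, not the operator's probe builder: `proxyProtocolEnabled` and `probeKind` are hypothetical names, the comma-separated value is assumed to carry one entry per listen address, and switching to a TCP probe when Proxy Protocol is on is an assumption of this sketch (a plain HTTP probe does not send a PROXY header).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// proxyProtocolEnabled is a hypothetical helper. It parses a value such as
// "true" or "false,true" and reports whether the entry at idx enables
// Proxy Protocol. Missing or invalid entries count as disabled.
func proxyProtocolEnabled(flagValue string, idx int) bool {
	parts := strings.Split(flagValue, ",")
	if idx < 0 || idx >= len(parts) {
		return false
	}
	enabled, err := strconv.ParseBool(strings.TrimSpace(parts[idx]))
	return err == nil && enabled
}

// probeKind picks a probe type for the first listen address.
// The tcpSocket choice is an assumption made for this example.
func probeKind(extraArgs map[string]string) string {
	if proxyProtocolEnabled(extraArgs["httpListenAddr.useProxyProtocol"], 0) {
		return "tcpSocket"
	}
	return "httpGet"
}

func main() {
	withProxy := map[string]string{"httpListenAddr.useProxyProtocol": "true"}
	fmt.Println(probeKind(withProxy))
	fmt.Println(probeKind(map[string]string{}))
}
```

An e2e assertion like the ones suggested in the review would then check both the rendered container argument and the resulting probe definition.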
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

2 issues found across 101 files (changes from recent commits).

Note: This PR contains a large number of files. cubic only reviews up to 75 files per PR, so some files may not have been reviewed.

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="go.mod">

<violation number="1" location="go.mod:129">
P2: Avoid replacing a core dependency with a personal fork in release code; this creates supply-chain and long-term maintenance risk for reproducible builds.</violation>
</file>

<file name="internal/controller/operator/factory/vmalertmanager/vmalertmanager_reconcile_test.go">

<violation number="1" location="internal/controller/operator/factory/vmalertmanager/vmalertmanager_reconcile_test.go:181">
P2: This does not simulate a status-only change; it switches the test into last-applied-spec reconciliation instead.</violation>
</file>



replace github.com/VictoriaMetrics/operator/api => ./api

replace github.com/caarlos0/env/v11 => github.com/AndrewChubatiuk/env/v11 v11.0.0-20260302065400-14d0354881b6
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 18, 2026

P2: Avoid replacing a core dependency with a personal fork in release code; this creates supply-chain and long-term maintenance risk for reproducible builds.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At go.mod, line 129:

<comment>Avoid replacing a core dependency with a personal fork in release code; this creates supply-chain and long-term maintenance risk for reproducible builds.</comment>

<file context>
@@ -125,3 +125,5 @@ require (
 
 replace github.com/VictoriaMetrics/operator/api => ./api
+
+replace github.com/caarlos0/env/v11 => github.com/AndrewChubatiuk/env/v11 v11.0.0-20260302065400-14d0354881b6
</file context>

Co-authored-by: Vadim Rutkovsky <vadim@vrutkovs.eu>
@AndrewChubatiuk AndrewChubatiuk force-pushed the release-0.68-next-release branch from 9675f15 to de3c010 Compare March 18, 2026 23:24
@AndrewChubatiuk AndrewChubatiuk force-pushed the release-0.68-next-release branch 4 times, most recently from 8803078 to 677a81f Compare March 22, 2026 06:24
@AndrewChubatiuk AndrewChubatiuk force-pushed the release-0.68-next-release branch from 677a81f to df38d2c Compare March 24, 2026 15:32
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

2 issues found across 3 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="docs/CHANGELOG.md">

<violation number="1" location="docs/CHANGELOG.md:16">
P2: Custom agent: **Changelog Review Agent**

Changelog entries must include the required user-centric before/after explanation; these new entries only state implementation details and omit how behavior changed for users, so they don’t meet the mandated structure.</violation>

<violation number="2" location="docs/CHANGELOG.md:18">
P3: The issue number label and linked URL mismatch (`#1970` points to `/issues/1983`), which makes the changelog reference inaccurate.

(Based on your team's feedback about keeping docs links accurate.) [FEEDBACK_USED]</violation>
</file>



## tip

* FEATURE: [vmagent](https://docs.victoriametrics.com/operator/resources/vmagent/): introduce statefulRollingUpdateStrategyBehavior to allow managing VMAgent update strategy in a statefulMode. See [#1987](https://github.com/VictoriaMetrics/operator/issues/1987).
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 30, 2026

P2: Custom agent: Changelog Review Agent

Changelog entries must include the required user-centric before/after explanation; these new entries only state implementation details and omit how behavior changed for users, so they don’t meet the mandated structure.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At docs/CHANGELOG.md, line 16:

<comment>Changelog entries must include the required user-centric before/after explanation; these new entries only state implementation details and omit how behavior changed for users, so they don’t meet the mandated structure.</comment>

<file context>
@@ -13,6 +13,11 @@ aliases:
 
 ## tip
 
+* FEATURE: [vmagent](https://docs.victoriametrics.com/operator/resources/vmagent/): introduce statefulRollingUpdateStrategyBehavior to allow managing VMAgent update strategy in a statefulMode. See [#1987](https://github.com/VictoriaMetrics/operator/issues/1987).
+
+* BUGFIX: [vmoperator](https://docs.victoriametrics.com/operator/): wait till PVC resize finished. See [#1970](https://github.com/VictoriaMetrics/operator/issues/1983).
</file context>

Signed-off-by: Vadim Rutkovsky <vadim@vrutkovs.eu>
Co-authored-by: Vadim Rutkovsky <vadim@vrutkovs.eu>
@AndrewChubatiuk AndrewChubatiuk force-pushed the release-0.68-next-release branch from ab62768 to 060ec35 Compare March 30, 2026 07:09
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 3 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="docs/CHANGELOG.md">

<violation number="1" location="docs/CHANGELOG.md:18">
P1: Custom agent: **Changelog Review Agent**

These new BUGFIX entries violate the changelog structure rule’s required user-centric explanation section: they do not describe before/after behavior and explicit user-visible impact.</violation>
</file>


vrutkovs and others added 2 commits March 30, 2026 11:51
* fix: add cancellation reason for graceful shutdown, retry other requests

* make waitForEmptyPQ return no error

---------

Co-authored-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com>
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

3 issues found across 54 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="api/operator/v1beta1/vmuser_types.go">

<violation number="1" location="api/operator/v1beta1/vmuser_types.go:269">
P1: Avoid embedding raw VMUser spec JSON in parse errors; it can leak credentials via reconcile error events/status.</violation>
</file>

<file name="internal/controller/operator/factory/vmdistributed/zone.go">

<violation number="1" location="internal/controller/operator/factory/vmdistributed/zone.go:199">
P2: Wrap `ctx.Err()` when returning this timeout/cancellation error so callers can preserve and inspect the root cause.</violation>
</file>

<file name="api/operator/v1beta1/vmrule_types.go">

<violation number="1" location="api/operator/v1beta1/vmrule_types.go:247">
P2: This `VMRule` unmarshaler hides all object decode failures by always returning nil. Keep parse-error swallowing at `VMRuleSpec` level so top-level/metadata decode errors still fail fast.</violation>
</file>


return fmt.Errorf("zone=%s: failed to wait till VMCluster=%s queue is empty: %w", item, nsnCluster.String(), err)
zs.waitForEmptyPQ(ctx, rclient, defaultMetricsCheckInterval, i)
if ctx.Err() != nil {
return fmt.Errorf("zone=%s: failed to wait till VMCluster=%s queue is empty", item, nsnCluster.String())
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 30, 2026

P2: Wrap ctx.Err() when returning this timeout/cancellation error so callers can preserve and inspect the root cause.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At internal/controller/operator/factory/vmdistributed/zone.go, line 199:

<comment>Wrap `ctx.Err()` when returning this timeout/cancellation error so callers can preserve and inspect the root cause.</comment>

<file context>
@@ -195,8 +194,9 @@ func (zs *zones) upgrade(ctx context.Context, rclient client.Client, cr *vmv1alp
-			return fmt.Errorf("zone=%s: failed to wait till VMCluster=%s queue is empty: %w", item, nsnCluster.String(), err)
+		zs.waitForEmptyPQ(ctx, rclient, defaultMetricsCheckInterval, i)
+		if ctx.Err() != nil {
+			return fmt.Errorf("zone=%s: failed to wait till VMCluster=%s queue is empty", item, nsnCluster.String())
 		}
 
</file context>
Suggested change
-return fmt.Errorf("zone=%s: failed to wait till VMCluster=%s queue is empty", item, nsnCluster.String())
+return fmt.Errorf("zone=%s: failed to wait till VMCluster=%s queue is empty: %w", item, nsnCluster.String(), ctx.Err())

// UnmarshalJSON implements json.Unmarshaler interface
func (r *VMRule) UnmarshalJSON(src []byte) error {
type rcfg VMRule
if err := json.Unmarshal(src, (*rcfg)(r)); err != nil {
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 30, 2026

P2: This VMRule unmarshaler hides all object decode failures by always returning nil. Keep parse-error swallowing at VMRuleSpec level so top-level/metadata decode errors still fail fast.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At api/operator/v1beta1/vmrule_types.go, line 247:

<comment>This `VMRule` unmarshaler hides all object decode failures by always returning nil. Keep parse-error swallowing at `VMRuleSpec` level so top-level/metadata decode errors still fail fast.</comment>

<file context>
@@ -230,6 +241,16 @@ type VMRule struct {
+// UnmarshalJSON implements json.Unmarshaler interface
+func (r *VMRule) UnmarshalJSON(src []byte) error {
+	type rcfg VMRule
+	if err := json.Unmarshal(src, (*rcfg)(r)); err != nil {
+		r.Spec.ParsingError = fmt.Sprintf("cannot parse vmrule config: %s, err: %s", string(src), err)
+		return nil
</file context>

AndrewChubatiuk and others added 3 commits March 30, 2026 14:38
Any zone that is not Operating (that is, Failed or Extending) should be picked up first. The
reason is that a timed-out zone update won't change the zone status to Failed, so the zone would
still be considered functioning.

This commit also adds unit tests for zone sorting
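The ordering described in this commit message can be sketched with a stable sort. The `zone` type and its `Operational` field are hypothetical simplifications; the operator's zone state carries richer status information.

```go
package main

import (
	"fmt"
	"sort"
)

// zone is a hypothetical, simplified zone record.
type zone struct {
	Name        string
	Operational bool // false for any status other than the healthy one
}

// unhealthyFirst returns a copy with non-operational zones first.
// The stable sort keeps the original relative order inside each group.
func unhealthyFirst(zones []zone) []zone {
	out := append([]zone(nil), zones...)
	sort.SliceStable(out, func(i, j int) bool {
		return !out[i].Operational && out[j].Operational
	})
	return out
}

func main() {
	zones := []zone{
		{Name: "a", Operational: true},
		{Name: "b", Operational: false},
		{Name: "c", Operational: true},
		{Name: "d", Operational: false},
	}
	for _, z := range unhealthyFirst(zones) {
		fmt.Println(z.Name, z.Operational)
	}
}
```

Treating every non-healthy status the same way covers the timed-out case from the commit message, where the status never reaches Failed.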
…positive number (#2002)

Co-authored-by: Vadim Rutkovsky <vadim@vrutkovs.eu>
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 19 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="file">

<violation number="1" location="file:7">
P0: Do not commit a plaintext license key in the manifest; source it from a secure secret management flow at deploy time.</violation>
</file>


* test: add rule rebalance tests

* fix: reconcile objects when configmap/secret gets removed or changes a list of keys
@AndrewChubatiuk AndrewChubatiuk force-pushed the release-0.68-next-release branch from 7e2fb03 to 3f2bdea Compare March 30, 2026 17:12
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 3 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="api/operator/v1/vlagent_types.go">

<violation number="1" location="api/operator/v1/vlagent_types.go:299">
P2: CSV metadata now lists only DaemonSet, but VLAgent also manages StatefulSets. Add StatefulSet to the `gen-csv` resource annotations to keep OLM metadata accurate.</violation>
</file>


VMAgent and VLAgent may run in DaemonSet mode, so the controller should watch and reconcile their DaemonSets too
@AndrewChubatiuk AndrewChubatiuk force-pushed the release-0.68-next-release branch from b96a741 to 9c2f004 Compare March 31, 2026 05:27
djluck and others added 6 commits March 31, 2026 10:13
* Improving logging telemetry in the VMDistributed path:
- Common log entries are now marked as "debug" level
- waitForStatus now reports any errors periodically rather than waiting until the timeout occurs

* Making changes suggested

* applied suggestions

---------

Co-authored-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com>
Co-authored-by: Vadim Rutkovsky <vadim@vrutkovs.eu>
…led (#2013)

Co-authored-by: Vadim Rutkovsky <vadim@vrutkovs.eu>
…rage is not enabled (#1911)

Co-authored-by: Vadim Rutkovsky <vadim@vrutkovs.eu>
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

2 issues found across 7 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="internal/controller/operator/factory/vmdistributed/vmauth.go">

<violation number="1" location="internal/controller/operator/factory/vmdistributed/vmauth.go:17">
P2: Include the OwnerReference UID in the match to avoid attaching resources from a previous VMDistributed instance with the same name.</violation>
</file>

<file name="docs/CHANGELOG.md">

<violation number="1" location="docs/CHANGELOG.md:22">
P1: Custom agent: **Changelog Review Agent**

Changelog entry lacks the required user‑centric before/after explanation and user‑visible impact; the rule’s required structure is not met.</violation>
</file>


* BUGFIX: [vmalertmanager](https://docs.victoriametrics.com/operator/resources/vmalertmanager/): fixed ignored tracing config, when no alertmanagerconfig CRs collected. See [#1983](https://github.com/VictoriaMetrics/operator/issues/1983).
* BUGFIX: [vmagent](https://docs.victoriametrics.com/operator/resources/vmagent/): apply scrape class relabellings before job ones. See [#1997](https://github.com/VictoriaMetrics/operator/issues/1997).
* BUGFIX: [vmanomaly](https://docs.victoriametrics.com/operator/resources/vmanomaly/) and [vmagent](https://docs.victoriametrics.com/operator/resources/vmagent/): render %SHARD_NUM% placeholder when shard count is greater than 0. See [#2001](https://github.com/VictoriaMetrics/operator/issues/2001).
* BUGFIX: [vlcluster](https://docs.victoriametrics.com/operator/resources/vlcluster/) and [vtcluster](https://docs.victoriametrics.com/operator/resources/vtcluster/): do not ignore ExtraStorageNodes for select, when default storage is disabled. See [#1910](https://github.com/VictoriaMetrics/operator/issues/1910).
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 31, 2026

P1: Custom agent: Changelog Review Agent

Changelog entry lacks the required user‑centric before/after explanation and user‑visible impact; the rule’s required structure is not met.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At docs/CHANGELOG.md, line 22:

<comment>Changelog entry lacks the required user‑centric before/after explanation and user‑visible impact; the rule’s required structure is not met.</comment>

<file context>
@@ -19,6 +19,7 @@ aliases:
 * BUGFIX: [vmalertmanager](https://docs.victoriametrics.com/operator/resources/vmalertmanager/): fixed ignored tracing config, when no alertmanagerconfig CRs collected. See [#1983](https://github.com/VictoriaMetrics/operator/issues/1983).
 * BUGFIX: [vmagent](https://docs.victoriametrics.com/operator/resources/vmagent/): apply scrape class relabellings before job ones. See [#1997](https://github.com/VictoriaMetrics/operator/issues/1997).
 * BUGFIX: [vmanomaly](https://docs.victoriametrics.com/operator/resources/vmanomaly/) and [vmagent](https://docs.victoriametrics.com/operator/resources/vmagent/): render %SHARD_NUM% placeholder when shard count is greater than 0. See [#2001](https://github.com/VictoriaMetrics/operator/issues/2001).
+* BUGFIX: [vlcluster](https://docs.victoriametrics.com/operator/resources/vlcluster/) and [vtcluster](https://docs.victoriametrics.com/operator/resources/vtcluster/): do not ignore ExtraStorageNodes for select, when default storage is disabled. See [#1910](https://github.com/VictoriaMetrics/operator/issues/1910).
 
 ## [v0.68.3](https://github.com/VictoriaMetrics/operator/releases/tag/v0.68.3)
</file context>
Suggested change
* BUGFIX: [vlcluster](https://docs.victoriametrics.com/operator/resources/vlcluster/) and [vtcluster](https://docs.victoriametrics.com/operator/resources/vtcluster/): do not ignore ExtraStorageNodes for select, when default storage is disabled. See [#1910](https://github.com/VictoriaMetrics/operator/issues/1910).
* BUGFIX: [vlcluster](https://docs.victoriametrics.com/operator/resources/vlcluster/) and [vtcluster](https://docs.victoriametrics.com/operator/resources/vtcluster/): previously, `extraStorageNodes` were ignored for select traffic when default storage was disabled, so queries could miss those nodes; now the operator includes `extraStorageNodes`, restoring expected query availability. See [#1910](https://github.com/VictoriaMetrics/operator/issues/1910).

func hasOwnerReference(owners []metav1.OwnerReference, owner *metav1.OwnerReference) bool {
	for i := range owners {
		o := &owners[i]
		if o.APIVersion == owner.APIVersion && o.Kind == owner.Kind && o.Name == owner.Name {

@cubic-dev-ai cubic-dev-ai bot Mar 31, 2026


P2: Include the OwnerReference UID in the match to avoid attaching resources from a previous VMDistributed instance with the same name.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At internal/controller/operator/factory/vmdistributed/vmauth.go, line 17:


<file context>
@@ -11,14 +11,24 @@ import (
+func hasOwnerReference(owners []metav1.OwnerReference, owner *metav1.OwnerReference) bool {
+	for i := range owners {
+		o := &owners[i]
+		if o.APIVersion == owner.APIVersion && o.Kind == owner.Kind && o.Name == owner.Name {
+			return true
+		}
</file context>
Suggested change
if o.APIVersion == owner.APIVersion && o.Kind == owner.Kind && o.Name == owner.Name {
if o.APIVersion == owner.APIVersion && o.Kind == owner.Kind && o.Name == owner.Name && o.UID == owner.UID {
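
For illustration, a minimal self-contained sketch of the UID-aware match suggested above. It assumes a local `OwnerReference` struct as a stand-in for `metav1.OwnerReference` (only the compared fields are modelled) and a placeholder `example.com/v1` API version; neither is taken from the operator code. Kubernetes assigns a fresh UID when an object is deleted and recreated under the same name, which is why name-only matching can pick up references to a previous instance.

```go
package main

import "fmt"

// OwnerReference is a hypothetical stand-in for metav1.OwnerReference that
// keeps only the fields compared below, so the sketch builds without
// k8s.io/apimachinery.
type OwnerReference struct {
	APIVersion string
	Kind       string
	Name       string
	UID        string
}

// hasOwnerReference reports whether owners contains a reference to the same
// object as owner. Comparing the UID in addition to APIVersion, Kind, and Name
// rejects references left behind by a deleted object with the same name.
func hasOwnerReference(owners []OwnerReference, owner *OwnerReference) bool {
	for i := range owners {
		o := &owners[i]
		if o.APIVersion == owner.APIVersion && o.Kind == owner.Kind &&
			o.Name == owner.Name && o.UID == owner.UID {
			return true
		}
	}
	return false
}

func main() {
	// Reference stored on a child resource by an earlier instance.
	previous := OwnerReference{APIVersion: "example.com/v1", Kind: "VMDistributed", Name: "demo", UID: "uid-old"}
	// Same name, but the object was recreated and received a new UID.
	recreated := previous
	recreated.UID = "uid-new"

	owners := []OwnerReference{previous}
	fmt.Println(hasOwnerReference(owners, &previous))  // true
	fmt.Println(hasOwnerReference(owners, &recreated)) // false
}
```

With the name-only comparison both calls would return `true`, so a recreated instance could adopt resources that still point at its predecessor.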


@cubic-dev-ai cubic-dev-ai bot left a comment


1 issue found across 4 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="internal/controller/operator/factory/k8stools/interceptors.go">

<violation number="1" location="internal/controller/operator/factory/k8stools/interceptors.go:66">
P2: Avoid resetting CreationTimestamp on update; it should represent the original creation time. Otherwise update calls will make objects appear newly created and can break age-based logic in tests.</violation>
</file>
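
A rough sketch of the behaviour this violation asks for, not the actual code in `interceptors.go`: the fake store stamps `CreationTimestamp` once on create and copies the stored value back on update. The `object`, `fakeStore`, and `newSteppingClock` names are hypothetical and exist only for this example.

```go
package main

import (
	"fmt"
	"time"
)

// object is a hypothetical stand-in for the metadata of a Kubernetes object.
type object struct {
	Name              string
	CreationTimestamp time.Time
	Labels            map[string]string
}

// fakeStore mimics an in-memory test client with an injectable clock.
type fakeStore struct {
	objects map[string]object
	now     func() time.Time
}

func newFakeStore(now func() time.Time) *fakeStore {
	return &fakeStore{objects: map[string]object{}, now: now}
}

// newSteppingClock returns a clock that advances by one hour on every call,
// so a timestamp reset on update would be visible.
func newSteppingClock() func() time.Time {
	t := time.Date(2026, time.March, 31, 0, 0, 0, 0, time.UTC)
	return func() time.Time {
		t = t.Add(time.Hour)
		return t
	}
}

// create stamps CreationTimestamp exactly once, when the object is first stored.
func (s *fakeStore) create(obj object) object {
	obj.CreationTimestamp = s.now()
	s.objects[obj.Name] = obj
	return obj
}

// update replaces the stored object but keeps the original CreationTimestamp,
// so age-based logic still sees the real creation time.
func (s *fakeStore) update(obj object) (object, error) {
	existing, ok := s.objects[obj.Name]
	if !ok {
		return object{}, fmt.Errorf("object %q not found", obj.Name)
	}
	obj.CreationTimestamp = existing.CreationTimestamp
	s.objects[obj.Name] = obj
	return obj, nil
}

func main() {
	store := newFakeStore(newSteppingClock())
	created := store.create(object{Name: "demo"})
	updated, err := store.update(object{Name: "demo", Labels: map[string]string{"tier": "test"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(updated.CreationTimestamp.Equal(created.CreationTimestamp)) // true
}
```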


@AndrewChubatiuk AndrewChubatiuk force-pushed the release-0.68-next-release branch 3 times, most recently from 668bb8b to 256faa1 Compare March 31, 2026 18:35
@AndrewChubatiuk AndrewChubatiuk force-pushed the release-0.68-next-release branch from 256faa1 to 14c05a0 Compare March 31, 2026 19:10