Skip to content

[OSPRH-29575] Clean up certificate handling#105

Merged
openshift-merge-bot[bot] merged 1 commit into
openstack-lightspeed:lcore-migrationfrom
lpiwowar:lpiwowar/certs
Jun 2, 2026
Merged

[OSPRH-29575] Clean up certificate handling#105
openshift-merge-bot[bot] merged 1 commit into
openstack-lightspeed:lcore-migrationfrom
lpiwowar:lpiwowar/certs

Conversation

@lpiwowar
Copy link
Copy Markdown
Contributor

@lpiwowar lpiwowar commented May 20, 2026

Issue

There were multiple issues with how the operator handled certificates:

  1. The lightspeed-stack container used REQUESTS_CA_BUNDLE and SSL_CERT_FILE environment variables, which bypassed the system-configured certificates.

  2. When a user provided a custom CA certificate, it was expected under the cert.crt key in their ConfigMap. This was undocumented and the required key name was not obvious.

  3. PostgresDB appeared to be configured for mTLS because ssl_ca_file was set in postgres.conf and the openshift-service-ca.crt was mounted into the PostgresDB pod. This created a false sense of mTLS being in place, but with the default pg_hba.conf, client certificate verification [1] is not enabled. Neither OGX [3][4] nor Lightspeed Stack [5] supports providing client certificates to PostgresDB.

  4. PostgresDB SSL connection settings were configured for OGX even though they have no effect. OGX does not support configuring the SSL mode for its PostgresDB connection [3][4], so the PostgresDB certificate verification cannot be strictly enforced on the OGX side (the default is "prefer" [2], which does not enforce certificate verification and can fall back to unencrypted communication). OGX uses a non-strict config mode, so unrecognized options are silently ignored. Securing the connection between OGX container and Postgres DB container should be done as part of follow up work (OSPRH-30503).

  5. The operator did not watch for changes to ConfigMaps. When the content of the CA bundle ConfigMap was updated, the operator did not automatically reconcile.

  6. Not a bug strictly speaking, but Lightspeed Stack used ssl_mode "require" when it could have used "verify-full", which checks both that the certificate is signed by a trusted CA and that the server hostname matches the CN field in the certificate.

Proposed Solution

This commit simplifies certificate handling with the following changes:

  • Introduce a single combined CA bundle ConfigMap (openstack-lightspeed-combined-ca-bundle) containing the system CAs, user-provided CA certificates from the OpenStackLightspeed CRD, kube-root-ca.crt, and openshift-service-ca.crt. This bundle is mounted into all containers in the lightspeed-stack-deployment pod, eliminating the need for REQUESTS_CA_BUNDLE and SSL_CERT_FILE.

  • When a user specifies a ConfigMap with custom CA certificates, iterate over all keys, validate that each holds a valid certificate, and append it to the combined CA bundle (resolves #2).

  • Stop mounting openshift-service-ca.crt into the Postgres pod and remove ssl_ca_file from postgres.conf. These gave a false sense of client certificate validation. Actually enforcing it requires configuring pg_hba.conf [1], and neither OGX [3][4][6] nor Lightspeed Stack [5] currently supports providing client certificates.

  • Remove ssl_mode, ca_cert_path, and gss_encmode from storage.backends.postgres_backend in ogx_config.yaml. These options are not supported by OGX [3][4][6] and gave a false sense of SSL being configured.

  • Add a Watch() on ConfigMaps to the reconciler so that whenever a user updates the CA bundle ConfigMap, the reconcile loop runs automatically.

  • Configure Lightspeed Stack with ssl_mode "verify-full" for its PostgreSQL connection, ensuring both CA trust and hostname verification.

[1] https://www.postgresql.org/docs/current/ssl-tcp.html#SSL-CLIENT-CERTIFICATES
[2] https://magicstack.github.io/asyncpg/current/api/index.html#asyncpg.connection.connect
[3] https://github.com/ogx-ai/ogx/blob/34d7901/src/ogx/core/storage/datatypes.py#L200
[4] https://github.com/ogx-ai/ogx/blob/34d7901/src/ogx/core/storage/sqlstore/sqlalchemy_sqlstore.py#L125
[5] https://github.com/lightspeed-core/lightspeed-stack/blob/7503ebd/src/models/config.py#L181
[6] ogx-ai/ogx#5978

Summary by CodeRabbit

  • New Features

    • Consolidated CA certificate management: operator now builds and serves a single combined CA bundle used by services and sidecars.
    • Reconciler watches CA ConfigMaps and triggers updates when relevant CA bundles change.
  • Behavior Changes

    • PostgreSQL connections now use stricter certificate verification (ssl_mode: verify-full) and point to the system CA bundle path.
  • Tests

    • Added comprehensive tests covering CA bundle parsing, deduplication, expiry handling, encoding, and reconciliation.

Review Change Stack

@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented May 20, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 20, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 90a76b87-ed1f-4b86-a1ab-c0aaabb4d759

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR consolidates operator/system, kube-root, OpenShift service CA, and optional user CAs into a single reconciled ConfigMap and mounts it into deployments; it updates constants, errors, controller watches, deployment wiring, Postgres TLS config paths/SSL mode, and test fixtures.

Changes

Combined CA Bundle System

Layer / File(s) Summary
CA Bundle Configuration and Error Definitions
internal/controller/constants.go, internal/controller/errors.go
Adds combined-CA constants (ConfigMap name/key, volume/mount/path, system CA path, kube-root CA name), Postgres constants, changes PostgresDefaultSSLMode"verify-full", and new CA-related error variables.
CA Bundle Data Model and Core Operations
internal/controller/ca_bundle.go
Defines internal cert structures (DER hash, expiry), reads system CA bundle, parses/validates PEM (CERTIFICATE-only, skip expired), deduplicates by SHA-256 of DER, encodes PEM, merges ConfigMap-backed CA sources, and reconciles a managed combined CA ConfigMap with controller ownership.
CA Bundle Reconciliation Test Suite
internal/controller/ca_bundle_test.go
Adds TestMain helpers and comprehensive unit/integration tests for PEM parsing/encoding, expiry handling, deduplication, and reconcile behavior (merge and failure modes) using a fake client.
Controller Watch Setup for CA Bundle Changes
internal/controller/openstacklightspeed_controller.go
Adds ConfigMap watch filtered by resource-version changes and a notifier mapping changed ConfigMaps to OpenStackLightspeed reconcile requests when Spec.TLSCACertBundle matches.
Deployment and Configuration Integration
internal/controller/lcore_deployment.go, internal/controller/lcore_config.go, internal/controller/lcore_reconciler.go, internal/controller/postgres_deployment.go, internal/controller/llama_stack_config.go
Replaces per-source CA mounts with a combined CA volume/mount helper, mounts combined bundle into sidecars, removes additional-CA env vars, adds combined-CA annotation for rollout, updates config to use combined CA mount path, removes Postgres CA volume, removes explicit Postgres TLS fields from Llama Stack backend, and adds CombinedCABundle reconciliation task.
Removal of Legacy CA System and Test Updates
internal/controller/common.go, internal/controller/assets/postgres.conf, test/kuttl/*
Removes legacy Postgres CA helpers and additional-CA reconciler, replaces ssl_ca_file with explanatory comments in postgres.conf, and updates KUTTL fixtures/assertions to include the combined CA bundle ConfigMap and mounts.

Sequence Diagram

sequenceDiagram
  participant Operator as OpenStackLightspeed Operator
  participant KubeAPI as Kubernetes API
  participant ConfigMapSources as ConfigMaps (kube-root-ca, service-ca, user CA)
  participant CombinedCM as combined-ca-bundle ConfigMap
  participant Deployment as LCore Deployment

  Operator->>KubeAPI: Watch OpenStackLightspeed CRs & ConfigMaps
  KubeAPI->>Operator: Notify on ConfigMap resourceVersion change
  Operator->>ConfigMapSources: Read system CA, kube-root CA, service CA, optional user CA
  ConfigMapSources-->>Operator: Return CA PEM data
  Operator->>Operator: Parse PEMs, skip expired, dedupe by SHA256(DER)
  Operator->>Operator: Merge and encode combined PEM
  Operator->>CombinedCM: Create/patch combined-ca-bundle ConfigMap
  CombinedCM-->>Operator: ConfigMap created/updated
  Operator->>Deployment: Ensure pod templates mount combined CA at configured path
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • openstack-lightspeed/operator#107: Both PRs modify Postgres pod-template construction (buildPostgresPodTemplateSpec); this one removes CA volumes while that PR adjusts data volume mounting.
  • openstack-lightspeed/operator#94: Introduced initial lightspeed-stack reconciliation and CA/TLS wiring; this PR refactors that wiring into a combined CA bundle reconciliation.

Suggested labels

lgtm

Suggested reviewers

  • umago

Poem

🐰 Hopped in with a bundle bright,
Collected certs from left and right,
Expired ones tossed, duplicates shorn,
One ConfigMap now greets the morn,
Pods sip TLS with unified delight.

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 48.72% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Linked Issues check ❓ Inconclusive The linked issue #2 'Cleanup README' is a non-coding documentation task and is not related to the code changes (certificate handling, CA bundle reconciliation, TLS configuration) implemented in this PR. Verify the linked issue reference is correct; the #2 issue appears unrelated to certificate handling cleanup. Confirm that the PR addresses the intended OSPRH-29575 issue objectives.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly references the main objective (OSPRH-29575) and summarizes the primary focus on certificate handling cleanup, which aligns with the extensive changes across certificate configuration, CA bundle management, and TLS setup throughout the changeset.
Out of Scope Changes check ✅ Passed All changes are focused on certificate handling cleanup, CA bundle management, and TLS configuration as described in the PR objectives; no out-of-scope modifications detected.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@lpiwowar
Copy link
Copy Markdown
Contributor Author

/test all

@lpiwowar lpiwowar changed the title Support arbitrary ConfigMap key names for TLS CA bundle Consolidate CA certificates into a single combined CA bundle May 21, 2026
@lpiwowar
Copy link
Copy Markdown
Contributor Author

/test all

@lpiwowar
Copy link
Copy Markdown
Contributor Author

/test all

@lpiwowar lpiwowar changed the title Consolidate CA certificates into a single combined CA bundle [OSPRH-29575] Clean up certificate handling May 27, 2026
@lpiwowar lpiwowar force-pushed the lpiwowar/certs branch 2 times, most recently from 01d0e1c to ef39cc1 Compare May 27, 2026 14:05
@lpiwowar
Copy link
Copy Markdown
Contributor Author

/test all

@lpiwowar
Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 27, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
internal/controller/ca_bundle.go (1)

71-72: 💤 Low value

Consider extracting "CERTIFICATE" to a constant.

The string "CERTIFICATE" appears multiple times in this file. Extracting it to a package-level constant would satisfy the linter and improve maintainability.

♻️ Suggested constant definition
+const pemTypeCertificate = "CERTIFICATE"
+
 type caCert struct {

Then replace occurrences:

-		if block.Type != "CERTIFICATE" {
+		if block.Type != pemTypeCertificate {
 		block := &pem.Block{
-			Type:  "CERTIFICATE",
+			Type:  pemTypeCertificate,
 			Bytes: c.cert.Raw,
 		}

Also applies to: 109-109

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/controller/ca_bundle.go` around lines 71 - 72, Extract the repeated
literal "CERTIFICATE" into a package-level constant (e.g., const pemCertificate
= "CERTIFICATE") and replace all literal uses in this file—such as the
block.Type comparison in the error path that currently uses block.Type !=
"CERTIFICATE" and any other occurrences (including the fmt.Errorf message and
the occurrence around line 109)—with that constant; ensure the error message
still prints block.Type but references the constant name in the code instead of
the raw string.
internal/controller/openstacklightspeed_controller.go (1)

306-326: ⚡ Quick win

Consider also triggering reconciliation when system CA ConfigMaps change.

The handler only triggers reconciliation when a user-provided CA ConfigMap (matching Spec.TLSCACertBundle) changes. However, reconcileCombinedCABundleConfigMap also merges openshift-service-ca.crt and kube-root-ca.crt into the combined bundle. If those system ConfigMaps rotate (e.g., service CA renewal), the combined bundle won't be updated until another reconciliation occurs.

Consider extending the handler to also trigger on changes to the system CA ConfigMaps:

♻️ Suggested enhancement
 func (r *OpenStackLightspeedReconciler) NotifyOpenStackLightspeedsByCAConfigMap(ctx context.Context, obj client.Object) []ctrl.Request {
+	// System CA ConfigMaps affect all instances - trigger reconciliation for all
+	if obj.GetName() == OpenShiftServiceCAConfigMap || obj.GetName() == KubeRootCAConfigMap {
+		return r.NotifyAllOpenStackLightspeeds(ctx, obj)
+	}
+
+	// User-provided CA ConfigMaps - only trigger for matching instances
 	var lightspeedList apiv1beta1.OpenStackLightspeedList
 	if err := r.List(ctx, &lightspeedList, client.InNamespace(obj.GetNamespace())); err != nil {
 		return nil
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/controller/openstacklightspeed_controller.go` around lines 306 -
326, The current NotifyOpenStackLightspeedsByCAConfigMap only enqueues
reconcilations when a user-specified CA ConfigMap matches Spec.TLSCACertBundle;
extend it to also detect system CA ConfigMap changes (names
"openshift-service-ca.crt" and "kube-root-ca.crt") and, when the incoming
obj.GetName() matches either of those, enqueue ctrl.Requests for all
OpenStackLightspeed items in the list (same way as the loop currently builds
requests) so reconcileCombinedCABundleConfigMap will run and rebuild the
combined bundle; update NotifyOpenStackLightspeedsByCAConfigMap to check
obj.GetName() against those system names and append requests for every item when
a match is found.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/controller/ca_bundle_test.go`:
- Around line 590-592: The test assertion expects "service CA" but
reconcileCombinedCABundleConfigMap wraps errors with the ConfigMap name
(openshift-service-ca.crt); update the assertion in ca_bundle_test.go to check
for the actual ConfigMap name or the exact wrapped message from
reconcileCombinedCABundleConfigMap (e.g., assert strings.Contains(err.Error(),
"openshift-service-ca.crt") or match the full wrapped error), referencing the
reconcileCombinedCABundleConfigMap function to ensure the test matches its error
text.
- Around line 628-630: The test assertion is checking for "kube root CA" but the
actual error message contains "kube-root-ca.crt"; update the assertion in the
test (the strings.Contains(err.Error(), "kube root CA") check) to look for
"kube-root-ca.crt" (or a more robust substring like "kube-root-ca" or a
case-insensitive match) so the test matches the real error text returned by err.

---

Nitpick comments:
In `@internal/controller/ca_bundle.go`:
- Around line 71-72: Extract the repeated literal "CERTIFICATE" into a
package-level constant (e.g., const pemCertificate = "CERTIFICATE") and replace
all literal uses in this file—such as the block.Type comparison in the error
path that currently uses block.Type != "CERTIFICATE" and any other occurrences
(including the fmt.Errorf message and the occurrence around line 109)—with that
constant; ensure the error message still prints block.Type but references the
constant name in the code instead of the raw string.

In `@internal/controller/openstacklightspeed_controller.go`:
- Around line 306-326: The current NotifyOpenStackLightspeedsByCAConfigMap only
enqueues reconcilations when a user-specified CA ConfigMap matches
Spec.TLSCACertBundle; extend it to also detect system CA ConfigMap changes
(names "openshift-service-ca.crt" and "kube-root-ca.crt") and, when the incoming
obj.GetName() matches either of those, enqueue ctrl.Requests for all
OpenStackLightspeed items in the list (same way as the loop currently builds
requests) so reconcileCombinedCABundleConfigMap will run and rebuild the
combined bundle; update NotifyOpenStackLightspeedsByCAConfigMap to check
obj.GetName() against those system names and append requests for every item when
a match is found.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f7880f53-8b9a-4e6d-8e8e-c55751fd4542

📥 Commits

Reviewing files that changed from the base of the PR and between 0f8b424 and a176b84.

📒 Files selected for processing (20)
  • internal/controller/assets/postgres.conf
  • internal/controller/ca_bundle.go
  • internal/controller/ca_bundle_test.go
  • internal/controller/common.go
  • internal/controller/constants.go
  • internal/controller/errors.go
  • internal/controller/lcore_config.go
  • internal/controller/lcore_deployment.go
  • internal/controller/lcore_reconciler.go
  • internal/controller/llama_stack_config.go
  • internal/controller/openstacklightspeed_controller.go
  • internal/controller/postgres_deployment.go
  • test/kuttl/common/expected-configs/lightspeed-stack-update.yaml
  • test/kuttl/common/expected-configs/lightspeed-stack.yaml
  • test/kuttl/common/expected-configs/ogx_config-update.yaml
  • test/kuttl/common/expected-configs/ogx_config.yaml
  • test/kuttl/common/openstack-lightspeed-instance/assert-openstack-lightspeed-instance.yaml
  • test/kuttl/common/openstack-lightspeed-instance/errors-openstack-lightspeed-instance.yaml
  • test/kuttl/tests/update-openstacklightspeed/08-assert-openstacklightspeed-update.yaml
  • test/kuttl/tests/update-openstacklightspeed/11-assert-configmaps-update.yaml
💤 Files with no reviewable changes (5)
  • test/kuttl/common/expected-configs/ogx_config.yaml
  • test/kuttl/common/expected-configs/ogx_config-update.yaml
  • internal/controller/common.go
  • internal/controller/llama_stack_config.go
  • internal/controller/postgres_deployment.go

Comment thread internal/controller/ca_bundle_test.go Outdated
Comment thread internal/controller/ca_bundle_test.go Outdated
@lpiwowar
Copy link
Copy Markdown
Contributor Author

/test all

@lpiwowar
Copy link
Copy Markdown
Contributor Author

/test all

@lpiwowar
Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 28, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@lpiwowar
Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 28, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
internal/controller/ca_bundle_test.go (1)

434-638: ⚡ Quick win

Add reconcile coverage for Spec.TLSCACertBundle (including multi-key user ConfigMap).

Current reconcile tests don’t exercise the user-specified CA ConfigMap path, which is central to this change. Add a test that sets instance.Spec.TLSCACertBundle, provides a ConfigMap with multiple keys, and verifies all valid certs are merged (plus invalid-key failure case).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/controller/ca_bundle_test.go` around lines 434 - 638, Add a new unit
test that exercises reconcileCombinedCABundleConfigMap using an
OpenStackLightspeed instance with Spec.TLSCACertBundle set to a user ConfigMap
name; create that user ConfigMap with multiple data keys containing valid PEM
certs (and include a case with an invalid/non-cert key to assert failure), call
reconcileCombinedCABundleConfigMap, then fetch the resulting
CombinedCABundleConfigMapName and validate CombinedCABundleKey by parsing with
caBundle.getCertsFromPEM and asserting the expected number of
merged/deduplicated certs; reference functions/types:
reconcileCombinedCABundleConfigMap, OpenStackLightspeed.Spec.TLSCACertBundle,
CombinedCABundleConfigMapName, CombinedCABundleKey, and
caBundle.getCertsFromPEM.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/controller/ca_bundle.go`:
- Around line 63-101: The function getCertsFromPEM should reject trailing
non-PEM garbage: after the for loop that repeatedly calls pem.Decode(rest) and
before returning nil, check whether the remaining bytes in rest contain any
non-whitespace characters; if so return an error (e.g., fmt.Errorf("trailing
non-PEM data")) so inputs like "valid cert + junk bytes" fail validation. Locate
the loop that uses pem.Decode(rest) and the variable rest, and add this
post-loop validation to prevent accepting malformed ConfigMap values.

---

Nitpick comments:
In `@internal/controller/ca_bundle_test.go`:
- Around line 434-638: Add a new unit test that exercises
reconcileCombinedCABundleConfigMap using an OpenStackLightspeed instance with
Spec.TLSCACertBundle set to a user ConfigMap name; create that user ConfigMap
with multiple data keys containing valid PEM certs (and include a case with an
invalid/non-cert key to assert failure), call
reconcileCombinedCABundleConfigMap, then fetch the resulting
CombinedCABundleConfigMapName and validate CombinedCABundleKey by parsing with
caBundle.getCertsFromPEM and asserting the expected number of
merged/deduplicated certs; reference functions/types:
reconcileCombinedCABundleConfigMap, OpenStackLightspeed.Spec.TLSCACertBundle,
CombinedCABundleConfigMapName, CombinedCABundleKey, and
caBundle.getCertsFromPEM.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1830c078-881c-4c9e-ba9e-7a12b59a421c

📥 Commits

Reviewing files that changed from the base of the PR and between a176b84 and b3a96f5.

📒 Files selected for processing (20)
  • internal/controller/assets/postgres.conf
  • internal/controller/ca_bundle.go
  • internal/controller/ca_bundle_test.go
  • internal/controller/common.go
  • internal/controller/constants.go
  • internal/controller/errors.go
  • internal/controller/lcore_config.go
  • internal/controller/lcore_deployment.go
  • internal/controller/lcore_reconciler.go
  • internal/controller/llama_stack_config.go
  • internal/controller/openstacklightspeed_controller.go
  • internal/controller/postgres_deployment.go
  • test/kuttl/common/expected-configs/lightspeed-stack-update.yaml
  • test/kuttl/common/expected-configs/lightspeed-stack.yaml
  • test/kuttl/common/expected-configs/ogx_config-update.yaml
  • test/kuttl/common/expected-configs/ogx_config.yaml
  • test/kuttl/common/openstack-lightspeed-instance/assert-openstack-lightspeed-instance.yaml
  • test/kuttl/common/openstack-lightspeed-instance/errors-openstack-lightspeed-instance.yaml
  • test/kuttl/tests/update-openstacklightspeed/08-assert-openstacklightspeed-update.yaml
  • test/kuttl/tests/update-openstacklightspeed/11-assert-configmaps-update.yaml
💤 Files with no reviewable changes (5)
  • test/kuttl/common/expected-configs/ogx_config.yaml
  • test/kuttl/common/expected-configs/ogx_config-update.yaml
  • internal/controller/llama_stack_config.go
  • internal/controller/common.go
  • internal/controller/postgres_deployment.go
✅ Files skipped from review due to trivial changes (1)
  • test/kuttl/common/openstack-lightspeed-instance/errors-openstack-lightspeed-instance.yaml
🚧 Files skipped from review as they are similar to previous changes (9)
  • test/kuttl/tests/update-openstacklightspeed/11-assert-configmaps-update.yaml
  • test/kuttl/common/expected-configs/lightspeed-stack-update.yaml
  • internal/controller/lcore_config.go
  • test/kuttl/tests/update-openstacklightspeed/08-assert-openstacklightspeed-update.yaml
  • internal/controller/lcore_reconciler.go
  • test/kuttl/common/expected-configs/lightspeed-stack.yaml
  • internal/controller/openstacklightspeed_controller.go
  • internal/controller/constants.go
  • test/kuttl/common/openstack-lightspeed-instance/assert-openstack-lightspeed-instance.yaml

Comment thread internal/controller/ca_bundle.go
@lpiwowar lpiwowar marked this pull request as ready for review May 28, 2026 10:43
@openshift-ci openshift-ci Bot requested review from Akrog and umago May 28, 2026 10:43
umago
umago previously approved these changes May 29, 2026
Copy link
Copy Markdown
Contributor

@umago umago left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't fault the patch, it LGTM! Thanks for working on this and simplifying the process we handle these certificates.

Comment thread internal/controller/constants.go Outdated
There were multiple issues with how the operator handled certificates:

1) The lightspeed-stack pod used REQUESTS_CA_BUNDLE and SSL_CERT_FILE
   environment variables, which bypassed the system-configured
   certificates.

2) When a user provided a custom CA certificate, it was expected under
   the cert.crt key in their ConfigMap. This was undocumented and the
   required key name was not obvious.

3) PostgresDB appeared to be configured for mTLS because ssl_ca_file
   was set in postgres.conf and the openshift-service-ca.crt was
   mounted into the PostgresDB pod. This created a false sense of mTLS
   being in place, but with the default pg_hba.conf, client certificate
   verification [1] is not enabled. Neither OGX [3][4] nor Lightspeed
   Stack [5] supports providing client certificates to PostgresDB.

4) PostgresDB SSL connection settings were configured for OGX even
   though they have no effect. OGX does not support configuring the
   SSL mode for its PostgresDB connection [3][4], so the PostgresDB
   certificate verification cannot be strictly enforced on the OGX
   side (the default is "prefer" [2], which does not enforce
   certificate verification and can fall back to unencrypted
   communication). OGX uses a non-strict config mode, so unrecognized
   options are silently ignored.

5) The operator did not watch for changes to ConfigMaps. When the
   content of the CA bundle ConfigMap was updated, the operator did
   not automatically reconcile.

6) Not a bug strictly speaking, but Lightspeed Stack used ssl_mode
   "require" when it could have used "verify-full", which checks both
   that the certificate is signed by a trusted CA and that the server
   hostname matches the CN field in the certificate.

This commit simplifies certificate handling with the following changes:

- Introduce a single CA bundle ConfigMap
  (openstack-lightspeed-ca-bundle) containing the system CAs,
  user-provided CA certificates from the OpenStackLightspeed CRD,
  kube-root-ca.crt, and openshift-service-ca.crt. This bundle is
  mounted into all containers in the lightspeed-stack-deployment pod,
  eliminating the need for REQUESTS_CA_BUNDLE and SSL_CERT_FILE.

- When a user specifies a ConfigMap with custom CA certificates,
  iterate over all keys, validate that each holds a valid certificate,
  and append it to the CA bundle (resolves openstack-lightspeed#2).

- Stop mounting openshift-service-ca into the Postgres pod and remove
  ssl_ca_file from postgres.conf. These gave a false sense of client
  certificate validation; actually enforcing it requires configuring
  pg_hba.conf [1], and neither OGX [3][4] nor Lightspeed Stack [5]
  currently supports providing client certificates.

- Remove ssl_mode, ca_cert_path, and gss_encmode from
  storage.backends.postgres_backend in ogx_config.yaml. These options
  are not supported by OGX [3][4] and gave a false sense of SSL being
  configured.

- Add a Watch() on ConfigMaps to the reconciler so that whenever a
  user updates the CA bundle ConfigMap, the reconcile loop runs
  automatically.

- Configure Lightspeed Stack with ssl_mode "verify-full" for its
  PostgreSQL connection, ensuring both CA trust and hostname
  verification.

[1]
https://www.postgresql.org/docs/current/ssl-tcp.html#SSL-CLIENT-CERTIFICATES
[2]
https://magicstack.github.io/asyncpg/current/api/index.html#asyncpg.connection.connect
[3]
https://github.com/ogx-ai/ogx/blob/34d7901/src/ogx/core/storage/datatypes.py#L200
[4]
https://github.com/ogx-ai/ogx/blob/34d7901/src/ogx/core/storage/sqlalchemy_sqlstore.py#L125
[5]
https://github.com/lightspeed-core/lightspeed-stack/blob/7503ebd/src/models/config.py#L181

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Jun 2, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lpiwowar, umago

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@umago
Copy link
Copy Markdown
Contributor

umago commented Jun 2, 2026

/lgtm

@openshift-ci openshift-ci Bot added the lgtm label Jun 2, 2026
@openshift-merge-bot openshift-merge-bot Bot merged commit 45676d8 into openstack-lightspeed:lcore-migration Jun 2, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants