Skip to content

Run Codex auth refreshers in target namespaces#1283

Open
kelos-bot[bot] wants to merge 2 commits into
mainfrom
kelos-task-1282
Open

Run Codex auth refreshers in target namespaces#1283
kelos-bot[bot] wants to merge 2 commits into
mainfrom
kelos-task-1282

Conversation

@kelos-bot
Copy link
Copy Markdown

@kelos-bot kelos-bot Bot commented Jun 3, 2026

What type of PR is this?

/kind feature

What this PR does / why we need it:

Runs managed Codex auth refresher CronJobs in the same namespace as the Secret they refresh.

The controller now creates namespace-local refresher access for each opted-in Secret:

  • kelos-codex-auth-refresher ServiceAccount in the target namespace
  • per-Secret Role scoped with resourceNames to the target Secret
  • per-Secret RoleBinding for the refresher ServiceAccount

It uses non-blocking Secret owner references for managed CronJobs, Roles, and RoleBindings so the controller does not require Secret finalizer update permissions. The refresher prompt is now Reply with the single word OK. Docs describe the target-namespace behavior without exposing unnecessary implementation detail.

Which issue(s) this PR is related to:

Fixes #1282

Special notes for your reviewer:

Verification run locally:

  • make verify
  • make test
  • make test-integration

Does this PR introduce a user-facing change?

Codex OAuth refresher CronJobs now run in the target Secret namespace with namespace-local, per-Secret RBAC.

@github-actions github-actions Bot added needs-triage kind/feature Categorizes issue or PR as related to a new feature needs-kind Indicates an issue or PR lacks a kind/* label needs-priority needs-actor release-note and removed needs-kind Indicates an issue or PR lacks a kind/* label labels Jun 3, 2026
@kelos-bot
Copy link
Copy Markdown
Author

kelos-bot Bot commented Jun 3, 2026

/kelos review

@kelos-bot
Copy link
Copy Markdown
Author

kelos-bot Bot commented Jun 3, 2026

🤖 Kelos Task Status

Task kelos-reviewer-issue-comment-555785b50ae8 has succeeded. ✅

Copy link
Copy Markdown
Author

@kelos-bot kelos-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Kelos Reviewer Agent @gjkim42

Review Summary

Verdict: COMMENT
Overall correctness: patch is incorrect
Scope: Moves managed Codex auth refresher CronJobs into the target Secret namespace with namespace-local ServiceAccount, Role, and RoleBinding cleanup.

Findings Overview

Priority Count File:Line Summary
P0 0 none
P1 1 internal/controller/codex_auth_refresher_controller.go:61 Secret owner refs require missing finalizers RBAC
P2 0 none
P3 0 none

Findings

Correctness

  • [P1] internal/controller/codex_auth_refresher_controller.go:61 — controllerutil.SetControllerReference defaults blockOwnerDeletion to true, and Kubernetes' OwnerReferencesPermissionEnforcement admission controller requires update on the referenced owner's finalizers subresource before a client can set that bit. The new children are owned by Secrets, but the controller RBAC only grants secrets get/list/watch/update, not secrets/finalizers, so creating or updating these CronJob/Role/RoleBinding objects can be rejected on default clusters and the refresher will never be installed. Either grant secrets/finalizers in the generated/controller RBAC or set controllerutil.WithBlockOwnerDeletion(false) for the Secret owner references.

Key takeaways

  • This would be a REQUEST_CHANGES review, but GitHub does not allow this bot account to request changes on its own PR.
  • The target-namespace RBAC model is otherwise aligned with the PR goal, but the new owner references need matching RBAC semantics before this can work reliably after upgrade.

Copy link
Copy Markdown
Collaborator

@gjkim42 gjkim42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/kelos pick-up

Comment thread docs/reference.md Outdated
```

For each labeled Secret with a non-empty `CODEX_AUTH_JSON` key, the controller manages a dedicated CronJob. The CronJob runs the Codex image under the controller's ServiceAccount, seeds `auth.json` from that one Secret, invokes Codex so the CLI performs its own refresh, and writes the refreshed file back to only the `CODEX_AUTH_JSON` key — other keys are preserved and the token is never logged. Removing the label or deleting the Secret removes its CronJob. Secrets without a `CODEX_AUTH_JSON` bundle, or whose bundle carries no `refresh_token` (e.g. API-key credentials), are skipped. Externally-managed Secrets (ExternalSecrets, Vault, sealed-secrets) will overwrite the refreshed value on their next sync and are not supported.
For each labeled Secret with a non-empty `CODEX_AUTH_JSON` key, the controller manages a dedicated CronJob in the Secret's namespace. The CronJob runs the Codex image under a namespace-local `kelos-codex-auth-refresher` ServiceAccount, and the controller creates per-Secret Role/RoleBinding objects that grant that ServiceAccount `get` and `update` on only the target Secret. The CronJob seeds `auth.json` from that one Secret, invokes Codex so the CLI performs its own refresh, and writes the refreshed file back to only the `CODEX_AUTH_JSON` key — other keys are preserved and the token is never logged. Removing the label or deleting the Secret removes its CronJob and per-Secret RBAC. Secrets without a `CODEX_AUTH_JSON` bundle, or whose bundle carries no `refresh_token` (e.g. API-key credentials), are skipped. Externally-managed Secrets (ExternalSecrets, Vault, sealed-secrets) will overwrite the refreshed value on their next sync and are not supported.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not be too specific for internal implementation.


type CodexAuthRefresherBuilder struct {
// Namespace is the controller namespace used to clean up legacy CronJobs
// created before refreshers moved into the target Secret namespace.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't have to care about the migration since the feature is not released yet.

}
logger.Info("Updated Codex auth refresher CronJob", "cronJob", current.Name, "secret", secret.Name, "namespace", secret.Namespace)
}
if err := r.deleteLegacyCronJob(ctx, secret.Namespace, secret.Name); err != nil {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't care about the legacy implementation and its migration.

@kelos-bot
Copy link
Copy Markdown
Author

kelos-bot Bot commented Jun 3, 2026

🤖 Kelos Task Status

Task kelos-pr-responder-pull-request-review-fa4fdbe12c17 has failed. ❌

@gjkim42
Copy link
Copy Markdown
Collaborator

gjkim42 commented Jun 3, 2026

Update the prompt as well to

'Reply with the single word OK.'

@gjkim42
Copy link
Copy Markdown
Collaborator

gjkim42 commented Jun 3, 2026

/kelos pick-up

@kelos-bot
Copy link
Copy Markdown
Author

kelos-bot Bot commented Jun 3, 2026

🤖 Kelos Task Status

Task kelos-pr-responder-issue-comment-41c70689a853 has succeeded. ✅

@kelos-bot
Copy link
Copy Markdown
Author

kelos-bot Bot commented Jun 3, 2026

🤖 Kelos Agent @gjkim42

/kelos review

Addressed the latest review feedback: removed unreleased legacy cleanup, softened docs, updated the Codex refresher prompt, and made Secret owner references non-blocking. Local make verify, make test, and make test-integration passed; CI is green. Ready for re-review.

@kelos-bot
Copy link
Copy Markdown
Author

kelos-bot Bot commented Jun 3, 2026

🤖 Kelos Task Status

Task kelos-reviewer-issue-comment-e7c5e0ac33ed has succeeded. ✅

@kelos-bot
Copy link
Copy Markdown
Author

kelos-bot Bot commented Jun 3, 2026

🤖 Kelos Agent @gjkim42

CI is green and the review feedback changes are pushed. I requested /kelos review; the Kelos: kelos-reviewer check has been in progress since 2026-06-03T15:09:18Z and has not posted a result yet, so there are no further code changes I can make in this run. Maintainers can resume with /kelos pick-up if that reviewer later reports findings.

@kelos-bot
Copy link
Copy Markdown
Author

kelos-bot Bot commented Jun 3, 2026

🤖 Kelos Task Status

Task kelos-reviewer-issue-comment-16edf7091b6a has succeeded. ✅

Copy link
Copy Markdown
Author

@kelos-bot kelos-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Kelos Reviewer Agent @gjkim42

Review Summary

Verdict: COMMENT
Overall correctness: patch is correct
Scope: Runs managed Codex auth refresher CronJobs and per-Secret RBAC in the target Secret namespace.

Findings Overview

Priority Count File:Line Summary
P0 0 none
P1 0 none
P2 1 internal/controller/codex_auth_refresher_controller.go:460 Managed ServiceAccount drift does not requeue reconciliation
P3 0 none

Findings

Correctness

  • [P2] internal/controller/codex_auth_refresher_controller.go:460 — ensureServiceAccount now creates and updates the shared kelos-codex-auth-refresher ServiceAccount, but SetupWithManager only watches Secrets, CronJobs, Roles, and RoleBindings. If the ServiceAccount is deleted in a target namespace while the refreshable Secret and per-Secret resources remain unchanged, no event requeues that Secret, so the existing CronJob keeps creating pods that fail because the referenced ServiceAccount is missing. Add a ServiceAccount watch or another namespace-to-secret requeue path so the controller restores it like the other managed access resources.

Copy link
Copy Markdown
Author

@kelos-bot kelos-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Kelos Reviewer Agent @gjkim42

Review Summary

Verdict: COMMENT
Overall correctness: patch is correct
Scope: Runs managed Codex auth refresher CronJobs and namespace-local per-Secret RBAC in the target Secret namespace.

Findings Overview

Priority Count File:Line Summary
P0 0 none
P1 0 none
P2 1 internal/controller/codex_auth_refresher_controller.go:460 Managed ServiceAccount drift does not requeue reconciliation
P3 0 none

Findings

Correctness

  • [P2] internal/controller/codex_auth_refresher_controller.go:460 — ensureServiceAccount creates and updates the shared kelos-codex-auth-refresher ServiceAccount, but SetupWithManager only watches Secrets, CronJobs, Roles, and RoleBindings. If that ServiceAccount is deleted in a target namespace while the opted-in Secret and per-Secret Role/RoleBinding remain unchanged, no watched event requeues the Secret, so the existing CronJob keeps creating pods that fail because the referenced ServiceAccount is missing. Add a ServiceAccount watch with a namespace-to-refreshable-Secret mapping, or another requeue path, so the controller restores this shared dependency like the other managed resources.

Key takeaways

  • No blocking P0/P1 issues found in the current patch.
  • The previously reported non-blocking ServiceAccount drift path still applies.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Run Codex auth refresher CronJobs in target Secret namespace

1 participant