Skip to content

feat: automatic resource pool viewer access via OAuth2 provider linkage#1315

Merged
SalimKayal merged 3 commits into
mainfrom
salimkayal-feat-resource-pools-access-through-integration
Jun 11, 2026
Merged

feat: automatic resource pool viewer access via OAuth2 provider linkage#1315
SalimKayal merged 3 commits into
mainfrom
salimkayal-feat-resource-pools-access-through-integration

Conversation

@SalimKayal

@SalimKayal SalimKayal commented May 19, 2026

Copy link
Copy Markdown
Collaborator

Summary

Automatically authorize individual users as viewer on resource pools when they connect to a matching OAuth2 provider, without manual admin intervention and without hidden groups.

This closes the loop between connected_services (OAuth2 lifecycle) and crc (resource pools) by making the OAuth2 connection the source of truth for resource pool access.

Motivation & Context

Resource pools already support polymorphic members (users, groups, projects) via Authzed/SpiceDB, and connected_services already supports HPC integrations (FirecREST, Run:AI) through OAuth2. However, granting access to a resource pool tied to an OAuth provider previously required manual admin intervention.

This PR implements automatic authorization in both directions:

  1. Resource pool CRUD — when an admin creates or updates a resource pool with remote.provider_id, all currently-connected users are granted viewer access. When the provider is changed or removed, old access is revoked.
  2. OAuth2 lifecycle — when a user connects to a provider, they are automatically granted viewer on all existing resource pools linked to that provider. When they disconnect, that access is revoked.

Key Design Decisions

No Database Migrations

We rely entirely on existing schema:

  • ResourcePoolORM.remote_provider_id already links a resource pool to an OAuth2 client.
  • OAuth2ConnectionORM already tracks user_id, client_id, and status (connected | pending).
  • The Authz/SpiceDB schema (v10) already defines relation viewer: user on resource_pool.

Direct User Authorization (No Hidden Groups)

Rather than creating synthetic/hidden groups, we authorize users directly via the viewer relation in SpiceDB. This is simpler and more transparent.

A known consequence: manual admin viewer grants and auto-grants are indistinguishable in SpiceDB. Disconnecting a provider unconditionally revokes viewer for its connected users — this is accepted because the OAuth connection is the source of truth.

Validation Before Mutation

Before inserting or updating a resource pool with a provider_id, we validate that the referenced OAuth2 client exists. If it doesn't, we raise MissingResourceError. Previously, arbitrary provider IDs were silently accepted.

Best-Effort Membership Sync with Soft Failure

Membership sync happens outside the main DB transaction (because MemberRepository operations write to SpiceDB via their own session). If a bulk grant fails (e.g., a connected user is missing from Keycloak), the resource pool DB row has already been committed. We handle this by syncing users individually with try/except and warning logs, ensuring one bad user record does not block access for everyone else.

Event-Driven Sync at OAuth Lifecycle Boundaries

We hook into existing flows rather than adding new endpoints:

Event Location Action
User connects to provider P connected_services/blueprints.py::authorize_callback Call _on_oauth2_connected(user_id, provider_id)
User disconnects from provider P connected_services/db.py::delete_oauth2_connection Call _on_oauth2_disconnected(user_id, provider_id)

Both calls are wrapped in try/except so that the primary OAuth operation never fails because of an Authz issue.

Changes

components/renku_data_services/crc/db.py

  • ResourcePoolRepository.__init__ now accepts an optional member_repo: MemberRepository.
  • Added _get_connected_user_ids(provider_id) to query OAuth2ConnectionORM for users with status == connected.
  • insert_resource_pool: validates provider existence, then auto-grants connected users after successful insert.
  • update_resource_pool: captures old provider, validates new provider, then syncs memberships (revoke old → grant new) after the transaction.
  • delete_resource_pool: no changes needed — Authz cascade already cleans up all relations.

components/renku_data_services/connected_services/db.py

  • ConnectedServicesRepository.__init__ now accepts an optional member_repo: MemberRepository.
  • Added _on_oauth2_connected(user_id, client_id): queries all RPs linked to the provider and grants the user as viewer on each.
  • Added _on_oauth2_disconnected(user_id, client_id): queries all RPs linked to the provider and revokes the user's viewer access from each.
  • delete_oauth2_connection: calls _on_oauth2_disconnected before deletion and again after flush as best-effort.

components/renku_data_services/connected_services/blueprints.py

  • authorize_callback: after successful fetch_token, extracts user_id and provider_id from client.connection and calls _on_oauth2_connected with failure isolation.

components/renku_data_services/connected_services/models.py & orm.py

  • OAuth2Connection dataclass and ORM dump() now expose user_id and client_id. Previously only provider_id was exposed.

bases/renku_data_services/data_api/dependencies.py & test/utils.py

  • Wired member_repo into both ResourcePoolRepository and ConnectedServicesRepository construction.

Tests

  • test/components/renku_data_services/db/test_sqlalchemy_pool_repo.py
    • Added integration tests for insert with remote, update swap, update remove, delete cleanup, nonexistent provider validation, and resilience against missing Keycloak users.
  • test/components/renku_data_services/connected_services/test_db.py
    • Added integration tests for connect adds user, disconnect removes user, no-matching-RP no-op, multi-user access, and isolated disconnect.
  • test/bases/renku_data_services/data_api/test_resource_pools.py
    • Added end-to-end tests: test_post_resource_pool_with_remote_grants_connected_users, test_delete_resource_pool_removes_access, test_patch_resource_pool_remote_change_swaps_access.
  • test/bases/renku_data_services/data_api/test_connected_services.py
    • Added end-to-end tests: test_oauth_callback_adds_user_to_rp, test_delete_oauth_connection_removes_rp_access.

Behavioral Changes (Notable)

Stricter validation of remote.provider_id

POST /api/data/resource_pools and PATCH /api/data/resource_pools/{id} now validate that the referenced provider_id exists in oauth2_clients. Non-existent IDs now return a 404-style error instead of silent acceptance.

OAuth2 callback now triggers resource pool access grants

After a user completes the OAuth2 callback (/api/data/oauth2/callback), they automatically receive viewer access to all resource pools linked to that provider.

OAuth2 disconnect now triggers resource pool access revocation

Deleting an OAuth2 connection (DELETE /api/data/oauth2/connections/{id}) unconditionally revokes viewer access from all resource pools linked to that provider.

OAuth2Connection now exposes user_id and client_id

This is an additive change — existing consumers are unaffected.

Risks & Mitigations

  1. Bulk operations outside transaction: insert_resource_pool and update_resource_pool return from their DB transaction before calling member_repo bulk operations. If the bulk operation fails, the RP has already been committed. This is acceptable for MVP because SpiceDB is the secondary store; failing to sync memberships is a soft failure that can be retried or fixed by re-updating the RP.
  2. Test flakiness around Authz: Tests that verify RP visibility via GET /resource_pools depend on Authz + DB consistency. The @Authz.authz_change decorator writes to SpiceDB synchronously, so tests should be deterministic as long as they use the same fixture instances.

PR Stack

@coveralls

coveralls commented May 21, 2026

Copy link
Copy Markdown

Coverage Report for CI Build 27291758994

Warning

Build has drifted: This PR's base is out of sync with its target branch, so coverage data may include unrelated changes.
Quick fix: rebase this PR. Learn more →

Coverage at 86.336% (no base build to compare)

Details

  • Coverage remained the same as the base build.
  • Patch coverage: 28 uncovered changes across 3 files (165 of 193 lines covered, 85.49%).
  • No coverage regressions found.

Uncovered Changes

File Changed Covered %
components/renku_data_services/crc/db.py 134 117 87.31%
components/renku_data_services/connected_services/db.py 38 29 76.32%
components/renku_data_services/connected_services/blueprints.py 7 5 71.43%
Total (6 files) 193 165 85.49%

Coverage Regressions

No coverage regressions found.


Coverage Stats

Coverage Status
Relevant Lines: 31338
Covered Lines: 27056
Line Coverage: 86.34%
Coverage Strength: 1.5 hits per line

💛 - Coveralls

@SalimKayal SalimKayal force-pushed the salimkayal-feat-resource-pools-access-through-integration branch from 52fa3bd to 2ac8da5 Compare May 29, 2026 15:01
@SalimKayal SalimKayal changed the title feat: resource pools access through integration feat: automatic resource pool viewer access via OAuth2 provider linkage May 29, 2026
@SalimKayal SalimKayal force-pushed the salimkayal-feat-resource-pools-access-through-integration branch from c98413c to 79b9eb6 Compare June 10, 2026 09:25
@SalimKayal SalimKayal marked this pull request as ready for review June 10, 2026 09:25
@SalimKayal SalimKayal requested review from a team and sgaist as code owners June 10, 2026 09:25
@SalimKayal SalimKayal requested review from leafty, olevski and sambuc June 10, 2026 09:25
@SalimKayal SalimKayal force-pushed the salimkayal-feat-resource-pools-access-through-integration branch from 79b9eb6 to 48540be Compare June 10, 2026 14:06
@SalimKayal

Copy link
Copy Markdown
Collaborator Author

Behavior is verifiable on staging.

…ct (#1317)

* feat: sync RP access on OAuth connect/disconnect

Add _on_oauth2_connected and _on_oauth2_disconnected helpers
that auto-grant/revoke viewer access to resource pools linked
to the provider. Wire member_repo through constructors.

5 new tests cover: connect grant, disconnect revoke, no-match
no-op, multi-user grant, isolated disconnect.

* feat: trigger RP sync in OAuth flows

Call _on_oauth2_connected after successful fetch_token in authorize
callback. Call _on_oauth2_disconnected before deleting connection row.
Add test verifying delete_oauth2_connection revokes RP access.

* fix: wire member_repo and fix types

- Fix forward reference in dependencies.py by initializing
  connected_services_repo without member_repo and setting it later
- Add user_id/client_id to OAuth2Connection model for type safety
- Guard blueprint callback against None values
- Remove debug logging from _on_oauth2_connected

* chore: logging

* fix: ensure session deletion, remove repetition

* chore: public naming for on_oauth2*

* refactor: remove need for `| None` annotation

* refactor: apply changes in rp memberships in one go

* feat: transmit sessions

* fix: one session per authz op

* fix: remove useless session
…OAuth2 provider linkage (#1314)

* chore: feature branch

* feat: inject MemberRepository into ResourcePoolRepository + add failing tests for RP remote provider sync

- Add member_repo parameter to ResourcePoolRepository constructor
- Wire member_repo in dependencies.py and test/utils.py
- Add TDD tests for auto-grant/revoke on RP CRUD operations

* feat: add helper to query connected users by provider_id

Adds _get_connected_user_ids to ResourcePoolRepository for fetching
users with a connected OAuth2 link to a given provider.

* feat: auto-sync viewers on RP remote provider update

When updating a resource pool's remote provider:
- Capture old provider_id before the transaction
- After the transaction, revoke connected users of the old provider
- Grant connected users of the new provider
- Log warnings on soft failure

* test: fix RP update tests by adding required default class

* test: fix sentinel value error by inserting connections in separate sessions

* style: format and sort imports in test_sqlalchemy_pool_repo.py

* fix: validate provider on RP update and handle stale users in sync

- Add OAuth2ClientORM existence check when updating resource pool remote
  provider_id, matching the validation already done on insert.
- Remove pre-transaction query for old_provider_id to eliminate race
  condition; capture it from the already-loaded ORM inside the tx.
- Grant/revoke auto-synced memberships individually instead of bulk,
  so one missing Keycloak user does not abort the whole operation.

* test: add regression tests for RP remote provider edge cases

- Test that updating a resource pool to a non-existent provider_id fails.
- Test that insert/update auto-grants skip users missing from Keycloak
  instead of failing the entire batch.
- Test that swapping remote providers skips stale connected users.

* squashme: formatting

* fix: use authz in tests after rebase

* refactor: batch authz ops

* refactor: remove duplicate session

* fix: allow missing for grant members on resource pools

* feat: sync resource pool viewer access on OAuth2 connect and disconnect (#1317)

* feat: sync RP access on OAuth connect/disconnect

Add _on_oauth2_connected and _on_oauth2_disconnected helpers
that auto-grant/revoke viewer access to resource pools linked
to the provider. Wire member_repo through constructors.

5 new tests cover: connect grant, disconnect revoke, no-match
no-op, multi-user grant, isolated disconnect.

* feat: trigger RP sync in OAuth flows

Call _on_oauth2_connected after successful fetch_token in authorize
callback. Call _on_oauth2_disconnected before deleting connection row.
Add test verifying delete_oauth2_connection revokes RP access.

* fix: wire member_repo and fix types

- Fix forward reference in dependencies.py by initializing
  connected_services_repo without member_repo and setting it later
- Add user_id/client_id to OAuth2Connection model for type safety
- Guard blueprint callback against None values
- Remove debug logging from _on_oauth2_connected

* chore: logging

* fix: ensure session deletion, remove repetition

* chore: public naming for on_oauth2*

* refactor: remove need for `| None` annotation

* refactor: apply changes in rp memberships in one go

* feat: transmit sessions

* fix: one session per authz op

* fix: remove useless session

* chore: feature branch

* refactor: use update_membership for clarity

* chore: raise ValidationError instead of MissingResourceError

* fix: reuse sessions when possible

* chore: add TODO for session reuse in AUTHZ
* test: add RP remote provider end-to-end tests

* test: add OAuth callback RP access end-to-end tests

* chore: remove useless test
@SalimKayal SalimKayal force-pushed the salimkayal-feat-resource-pools-access-through-integration branch from 48540be to 3efd2a7 Compare June 10, 2026 16:52

@leafty leafty left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🎉 👏 🥳

@SalimKayal SalimKayal enabled auto-merge (squash) June 11, 2026 08:32
@SalimKayal SalimKayal merged commit 8aac47e into main Jun 11, 2026
24 of 30 checks passed
@SalimKayal SalimKayal deleted the salimkayal-feat-resource-pools-access-through-integration branch June 11, 2026 08:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants