Skip to content

Feat/add oauth2 and jwt supports#34

Open
Thareesha98 wants to merge 8 commits into
wso2:mainfrom
Thareesha98:feat/add-oauth2-and-jwt-supports
Open

Feat/add oauth2 and jwt supports#34
Thareesha98 wants to merge 8 commits into
wso2:mainfrom
Thareesha98:feat/add-oauth2-and-jwt-supports

Conversation

@Thareesha98

Copy link
Copy Markdown

Purpose

The AFM spec recognizes five authentication types (bearer, basic, api-key, jwt,
oauth2) but never defines the required or optional fields for any of them — the spec only
states that "additional fields are authentication-type specific." This leaves several gaps:

  • No validation — a missing token on a bearer block is only discovered at request time.
  • Cross-runtime inconsistency — runtimes may expect different field names/structures for
    the same type.
  • Poor authoring experience — authors have no spec-defined reference for what each type needs.
  • Blocked typesjwt and oauth2 were recognized but had no runtime support because
    their field schemas were never agreed.

Resolves #35.

Goals

  • Define the required and optional fields for every authentication type.
  • Apply one uniform schema wherever authentication appears.
  • Validate the schema at parse time (fail fast at load, not at request time).
  • Resolve the issue's three open questions (JWT generation, api-key custom header, extensibility).
  • Unblock and implement jwt (runtime signing) and oauth2 (grant flows) in both
    reference runtimes, keeping them at full parity.

Approach

Where it applies. The schema is identical in all three places authentication can appear:
model.authentication, tools.mcp[].transport.authentication, and
interfaces[].subscription.authentication.

Common rules (enforced at parse time, in both runtimes).

  • type is required and matched case-insensitively against the recognized set; any other
    value is rejected with a clear error listing the supported types.
  • Each type's required fields must be present.
  • Unknown fields are rejected per type (closed schema), catching typos like tokn:.
    (oauth2 is the documented exception — its allowed fields depend on grant_type.)
  • Any string field may use ${env:VAR} substitution; secrets SHOULD use ${env:...}.

Field schemas

Type Required Optional
bearer token
basic username, password
api-key api_key header_name (default Authorization)
jwt issuer, audience, signing_key algorithm (default RS256), key_id, subject, custom_claims, expiry_seconds (default 300)
oauth2 grant_type + per-grant fields per-grant

oauth2 grants:

grant_type Required Optional
client_credentials token_url, client_id, client_secret scopes
password token_url, username, password, client_id, client_secret scopes
refresh_token refresh_url, refresh_token, client_id, client_secret scopes
jwt_bearer token_url, assertion client_id, client_secret, scopes
# jwt — runtime-signed (use bearer if you already hold a signed token)
authentication:
  type: jwt
  issuer: "afm-agent"
  audience: "https://api.example.com"
  signing_key: "${env:JWT_PRIVATE_KEY}"
  algorithm: "RS256"

# oauth2 — grant_type discriminates the required fields
authentication:
  type: oauth2
  grant_type: client_credentials
  token_url: "https://auth.example.com/oauth2/token"
  client_id: "${env:OAUTH_CLIENT_ID}"
  client_secret: "${env:OAUTH_CLIENT_SECRET}"
  scopes: ["read", "write"]

Implementation (both runtimes use one open model + a parse-time validator, for parity and
minimal churn).

  • Python (afm-core/afm-langchain): ClientAuthentication carries the typed fields and a
    @model_validator enforcing the rules above; build_httpx_auth wires each type to an
    httpx.Auth. jwt signs per request via PyJWT (HMAC uses signing_key as the shared secret;
    asymmetric algorithms read it as a PEM key file). oauth2 performs a real token exchange
    (sync and async), caches the token until expiry, sends client credentials as HTTP Basic,
    and uses the RFC 7523 URN for jwt_bearer. No new runtime dependency for oauth2.
  • Ballerina: parser.bal validates every authentication block at parse time; agent.bal
    maps jwthttp:JwtIssuerConfig and oauth2→the matching http:OAuth2*GrantConfig, so the
    HTTP client performs signing / token exchange / refresh natively.

Resolution of the open questions. (1) jwt = runtime signing; pre-signed tokens use
bearer. (2) api-key gains an optional header_name. (3) unknown type values are rejected
at parse time.

Known limitations (documented).

  • api-key is not applied to Ballerina MCP/webhook transport — the http client has no
    raw-header auth variant — so it returns a clear "not yet supported for transport" error
    (it remains supported for Ballerina model providers, and everywhere in Python).
  • The Python webhook subscriber does not yet apply subscription.authentication (pre-existing;
    affects all auth types) — tracked as a follow-up.

User stories

  • As an agent author, I get an immediate, clear error at load time when an authentication
    block has an unknown type, a missing required field, or a typo'd field — instead of a late,
    cryptic request-time failure.
  • As an agent author, I can connect to APIs that expect a non-standard header for an API key
    via header_name.
  • As an agent author, I can use jwt to have the runtime mint and sign a JWT from a key and
    claims, without pre-generating tokens.
  • As an agent author, I can use oauth2 (client_credentials / password / refresh_token /
    jwt_bearer) and have the runtime fetch, cache, and attach the access token automatically.
  • As a runtime maintainer, I get identical validation behavior across the Python and Ballerina
    implementations.

Release note

Standardized AFM authentication field schemas with parse-time validation across both reference

Release note

Standardized AFM authentication field schemas with parse-time validation across both reference
runtimes. Added an optional header_name for api-key, runtime JWT signing for jwt, and
OAuth2 grant flows (client_credentials, password, refresh_token, jwt_bearer) for
oauth2. Unknown authentication types and unknown/missing fields are now rejected at load time
with clear, consistent error messages.

Documentation

N/A — reference implementation. The authentication field schemas standardized here are defined
in the proposal on #35.

Standardize AFM authentication field schemas (issue wso2#35), Phase 1.

- parser.bal: validate every authentication block at parse time against
  its type's field schema, across all three sites (model, MCP transport,
  webhook subscription). Reject unknown types, enforce required fields,
  and reject fields that do not belong to the type. bearer/basic/api-key
  are fully validated; jwt/oauth2 are recognized with field validation
  deferred to their phase.
- agent.bal: add an explicit api-key case to mapToHttpClientAuth with a
  clear message. Ballerina's http client has no raw-header auth variant,
  so api-key is not yet supported for MCP/webhook transport (it remains
  supported for model providers).
- tests: cover schema validation, the api-key mapping case, and
  parse-level rejection.
Standardize AFM authentication field schemas (issue wso2#35), Phase 1.

- models.py: ClientAuthentication now validates at parse time - reject
  unknown types, enforce per-type required fields, and reject fields that
  do not belong to the type (catches typos and cross-type fields). Add an
  optional header_name for api-key. bearer/basic/api-key are fully
  validated; jwt/oauth2 are recognized with field validation deferred to
  their phase.
- mcp.py: wire header_name into ApiKeyAuth so api-key can target custom
  headers (e.g. X-API-Key) instead of always Authorization.
- tests: cover the validation rules and the custom header_name wiring.
Standardize AFM authentication field schemas (issue wso2#35), Phase 2.

The runtime now mints and signs a JWT from a key plus claims and sends it
as a bearer token, for MCP and webhook transports.

- agent.bal: map a jwt authentication block to http:JwtIssuerConfig
  (issuer->iss, subject->username/sub, key_id->kid header, custom_claims,
  expiry_seconds->expTime). HMAC algorithms (HS256/384/512) use signing_key
  as a shared secret; asymmetric algorithms use it as a PEM key file path.
- parser.bal: validate jwt fields at parse time (issuer, audience and
  signing_key required); jwt is no longer treated as deferred.
- tests: HMAC and RS256 mapping, plus jwt schema validation.
Standardize AFM authentication field schemas (issue wso2#35), Phase 2.

- models.py: add the jwt fields to ClientAuthentication (issuer, audience,
  signing_key, algorithm, key_id, subject, custom_claims, expiry_seconds)
  and fold jwt into the parse-time validator (issuer, audience and
  signing_key required; unknown jwt fields rejected). oauth2 stays deferred.
- tests: jwt valid/invalid schema cases, audience list, unknown field.
Standardize AFM authentication field schemas (issue wso2#35), Phase 2.

- mcp.py: add JwtAuth (httpx.Auth) that mints and signs a JWT per request
  via PyJWT and sends it as a bearer token, and wire it into
  build_httpx_auth. HMAC algorithms use signing_key as a shared secret;
  asymmetric algorithms read it as a PEM key file. Claims are assembled to
  match the Ballerina runtime (sets nbf; custom_claims merged last). Only
  oauth2 remains not yet supported.
- pyproject.toml / uv.lock: declare the pyjwt[crypto] dependency.
- tests: HMAC sign+decode round-trip, RS256 with a generated key file,
  default-RS256, and custom header_name.
Standardize AFM authentication field schemas (issue wso2#35), Phase 3.

- agent.bal: add buildOAuth2GrantConfig, mapping an oauth2 authentication
  block to the matching http:OAuth2*GrantConfig so the HTTP client performs
  the token exchange (and refresh) natively. Supports the client_credentials,
  password, refresh_token and jwt_bearer grants; field names are mapped from
  the AFM snake_case names to the connector camelCase names per grant.
- parser.bal: validate oauth2 at parse time. grant_type is a required
  discriminator that selects the required/optional fields; unknown grants and
  fields not allowed for the grant are rejected. All five auth types are now
  recognized and validated.
Standardize AFM authentication field schemas (issue wso2#35), Phase 3.

- models.py: validate oauth2 at parse time. grant_type is a required
  discriminator selecting the required/optional fields per grant
  (client_credentials, password, refresh_token, jwt_bearer); unknown grants
  and fields not allowed for the grant are rejected.
- mcp.py: add OAuth2Auth (httpx.Auth) that obtains an access token via a
  token exchange, caches it until expiry, and sends it as a bearer token.
  Implements both sync and async flows; client credentials are sent as HTTP
  Basic to match the Ballerina runtime, and jwt_bearer uses the RFC 7523
  grant URN with a user-supplied assertion. No new dependency. All five auth
  types are now supported (none remain not yet supported).
Standardize AFM authentication field schemas (issue wso2#35), Phase 3.

- cover parse-time validation for all four grants (valid configs, missing
  grant_type, unsupported grant, missing required fields, fields not allowed
  for the grant, case-insensitive grant_type) and the runtime token-exchange
  mapping, in both runtimes.
@Thareesha98 Thareesha98 requested a review from MaryamZi as a code owner June 14, 2026 17:05
@coderabbitai

coderabbitai Bot commented Jun 14, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

OAuth2 and JWT Authentication Support for AFM Specification

This PR adds parse-time validation and runtime support for OAuth2 and JWT authentication types across the AFM specification, complementing existing bearer, basic, and api-key authentication methods. The implementation ensures consistent validation across both Python and Ballerina reference runtimes.

Changes Overview

Authentication Schema Validation

Both runtimes now enforce standardized field requirements at parse time:

  • Bearer: Requires token
  • Basic: Requires username and password
  • API Key: Requires api_key with optional configurable header_name
  • JWT: Requires issuer, audience, signing_key; supports optional algorithm, key_id, subject, custom_claims, and expiry_seconds
  • OAuth2: Requires grant_type with grant-specific required fields for client_credentials, password, refresh_token, and jwt_bearer grant types

All string fields support environment variable substitution via ${env:VAR} syntax.

Ballerina Implementation

  • parser.bal: Added authentication validation layer that runs at parse time, rejecting unknown types, missing required fields, and invalid grant configurations with clear error messages
  • agent.bal: Extended mapToHttpClientAuth to translate validated authentication configurations into Ballerina native http:ClientAuthConfig implementations, including JWT issuer configs and OAuth2 grant configs

Python Implementation

  • models.py: Extended ClientAuthentication model with new optional credential fields and refactored validation logic to support JWT and OAuth2 with comprehensive field enforcement
  • mcp.py: Implemented JwtAuth and OAuth2Auth classes as HTTPX authentication handlers supporting synchronous and asynchronous token handling, including JWT signing and OAuth2 token exchange with caching

Testing

  • Extended test suites in both runtimes to cover all authentication types with positive and negative test cases, including validation of missing fields, unsupported grant types, and unknown authentication types

Known Limitations

  • API-key authentication is not supported for Ballerina MCP/webhook transport
  • Python webhook subscriber does not yet apply subscription-level authentication (pre-existing)

Implementation Impact

Parse-time validation now provides fail-fast error detection for authentication misconfigurations, improving developer experience and ensuring consistency across runtime implementations before any actual authentication attempts occur.

Walkthrough

JWT and OAuth2 authentication support is added to both the Ballerina and Python AFM interpreters. In the Python interpreter, ClientAuthentication gains new optional credential fields for JWT and OAuth2, with a rewritten spec-driven validator and a new _validate_oauth2_fields method. In the Ballerina interpreter, parseAfm now calls validateMetadataAuthentication, backed by new helper functions that enforce field contracts per auth type. On the execution side, Ballerina adds buildJwtIssuerConfig and buildOAuth2GrantConfig adapters; Python adds JwtAuth and OAuth2Auth as httpx.Auth implementations with token caching. Both sides extend their test suites with coverage for all new auth paths.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 1.79% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Feat/add oauth2 and jwt supports' clearly and concisely identifies the main feature additions in the changeset.
Description check ✅ Passed The description is comprehensive and well-structured, covering Purpose, Goals, Approach, User stories, Release note, Documentation, and Automation tests sections with clear, detailed explanations.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (4)
python-interpreter/packages/afm-langchain/tests/test_mcp.py (1)

188-212: ⚡ Quick win

Make JWT TTL assertion tolerant to second-boundary timing.

The strict exp - iat == 600 check can be intermittently brittle. Prefer a small tolerance window.

Proposed test adjustment
-        assert decoded["exp"] - decoded["iat"] == 600
+        ttl = decoded["exp"] - decoded["iat"]
+        assert 599 <= ttl <= 600
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@python-interpreter/packages/afm-langchain/tests/test_mcp.py` around lines 188
- 212, The assertion `assert decoded["exp"] - decoded["iat"] == 600` uses a
strict equality check which is brittle due to potential timing variations
between when the `iat` and `exp` timestamps are set. Replace this strict
equality check with a tolerance-based assertion that allows for a small time
window (e.g., checking that the difference is within approximately 600 seconds
plus or minus a small tolerance, such as 1-2 seconds) to make the test more
reliable and tolerant of second-boundary timing edge cases.
python-interpreter/packages/afm-langchain/src/afm_langchain/tools/mcp.py (2)

111-111: 💤 Low value

custom_claims can overwrite standard JWT claims.

Using claims.update(self.custom_claims) allows custom_claims to overwrite standard claims like iss, aud, exp, iat, or nbf. If this is intentional for flexibility, consider documenting it. Otherwise, consider updating only keys not already present.

Option to preserve standard claims
-        claims.update(self.custom_claims)
+        for key, value in self.custom_claims.items():
+            if key not in claims:
+                claims[key] = value
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@python-interpreter/packages/afm-langchain/src/afm_langchain/tools/mcp.py` at
line 111, The `claims.update(self.custom_claims)` call at line 111 allows custom
claims to overwrite standard JWT claims such as iss, aud, exp, iat, and nbf,
which could be unintended. Instead of using update which unconditionally
overwrites, modify the code to only add custom claims that are not already
present in the claims dictionary. This preserves the standard JWT claims while
still allowing custom claims to be added without conflicts.

156-157: ⚖️ Poor tradeoff

Token cache is not thread-safe.

The _token and _expires_at fields are accessed without synchronization. If multiple threads call sync_auth_flow concurrently when the cache is expired, they may all fetch tokens simultaneously. This causes redundant requests but not data corruption. Consider documenting this limitation or adding a lock if high-concurrency scenarios are expected.

Also applies to: 251-259

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@python-interpreter/packages/afm-langchain/src/afm_langchain/tools/mcp.py`
around lines 156 - 157, The `_token` and `_expires_at` instance variables are
accessed without synchronization, creating a race condition where multiple
threads calling `sync_auth_flow` concurrently when the cache is expired can
fetch tokens simultaneously. Either add thread-safe synchronization (such as a
threading.Lock) around the token cache access and update logic in the method
that reads and writes these fields, or add clear documentation to the class
explaining that this implementation is not thread-safe and should only be used
in single-threaded contexts. If choosing synchronization, ensure the lock
protects all read-modify-write operations on the token cache to prevent
redundant concurrent token requests.
ballerina-interpreter/agent.bal (1)

443-448: 💤 Low value

Minor inconsistency in password grant handling.

The password grant uses addOptional for clientId and clientSecret, but per parser.bal these are required fields for this grant type. The code works correctly since parse-time validation guarantees their presence, but for consistency with client_credentials (which assigns them directly), consider assigning them directly here as well.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ballerina-interpreter/agent.bal` around lines 443 - 448, In the "password"
grant case within the switch statement, the clientId and clientSecret fields are
being added to grantConfig using the addOptional function, but since these are
required fields for this grant type (validated at parse time), they should be
assigned directly to the grantConfig map instead. Replace the two addOptional
calls for clientId and clientSecret with direct assignments to grantConfig,
matching the pattern used in the client_credentials grant for consistency.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ballerina-interpreter/parser.bal`:
- Around line 364-369: The required-field validation in the auth parsing logic
currently only checks if keys are present in auth.keys(), but does not validate
that those fields have non-null values. This allows null-valued fields to pass
parse-time validation even though they would be rejected later at runtime by
ensureType() in getAuthTokenOrApiKey(). Modify the provided array derivation to
filter out keys whose corresponding values are null, ensuring required fields
both exist and have non-null values. Apply this same filtering pattern at both
validation locations: in the requiredAuthFields loop check around lines 364-369
and in the OAuth2 grant validation around lines 429-434.

In `@python-interpreter/packages/afm-core/src/afm/models.py`:
- Around line 139-143: The current logic building the `provided` set at the
initialization filtering excludes `None` values, preventing detection of fields
explicitly set to null. Replace the set comprehension that skips `None` values
with `model_fields_set` to capture all fields that were explicitly provided by
the user. Then update the validation logic that checks for unsupported fields
(which currently relies on the `provided` set) to use this new approach based on
`model_fields_set`, ensuring that fields like `token` explicitly set to `null`
are properly detected and rejected. This change should be consistently applied
wherever the `provided` set is used for validation checks.

In `@python-interpreter/packages/afm-langchain/pyproject.toml`:
- Line 20: The pyjwt[crypto] dependency in the pyproject.toml file is pinned to
version 2.10.0, which contains security vulnerabilities including
CVE-2024-53861. Update the version constraint for pyjwt[crypto] from >=2.10.0 to
>=2.13.0 to address these security advisories and ensure a secure version is
used.

---

Nitpick comments:
In `@ballerina-interpreter/agent.bal`:
- Around line 443-448: In the "password" grant case within the switch statement,
the clientId and clientSecret fields are being added to grantConfig using the
addOptional function, but since these are required fields for this grant type
(validated at parse time), they should be assigned directly to the grantConfig
map instead. Replace the two addOptional calls for clientId and clientSecret
with direct assignments to grantConfig, matching the pattern used in the
client_credentials grant for consistency.

In `@python-interpreter/packages/afm-langchain/src/afm_langchain/tools/mcp.py`:
- Line 111: The `claims.update(self.custom_claims)` call at line 111 allows
custom claims to overwrite standard JWT claims such as iss, aud, exp, iat, and
nbf, which could be unintended. Instead of using update which unconditionally
overwrites, modify the code to only add custom claims that are not already
present in the claims dictionary. This preserves the standard JWT claims while
still allowing custom claims to be added without conflicts.
- Around line 156-157: The `_token` and `_expires_at` instance variables are
accessed without synchronization, creating a race condition where multiple
threads calling `sync_auth_flow` concurrently when the cache is expired can
fetch tokens simultaneously. Either add thread-safe synchronization (such as a
threading.Lock) around the token cache access and update logic in the method
that reads and writes these fields, or add clear documentation to the class
explaining that this implementation is not thread-safe and should only be used
in single-threaded contexts. If choosing synchronization, ensure the lock
protects all read-modify-write operations on the token cache to prevent
redundant concurrent token requests.

In `@python-interpreter/packages/afm-langchain/tests/test_mcp.py`:
- Around line 188-212: The assertion `assert decoded["exp"] - decoded["iat"] ==
600` uses a strict equality check which is brittle due to potential timing
variations between when the `iat` and `exp` timestamps are set. Replace this
strict equality check with a tolerance-based assertion that allows for a small
time window (e.g., checking that the difference is within approximately 600
seconds plus or minus a small tolerance, such as 1-2 seconds) to make the test
more reliable and tolerant of second-boundary timing edge cases.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 80f2db5c-afa3-457c-97b2-2604a8994880

📥 Commits

Reviewing files that changed from the base of the PR and between 9c1fd1c and fc85500.

⛔ Files ignored due to path filters (1)
  • python-interpreter/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • ballerina-interpreter/agent.bal
  • ballerina-interpreter/parser.bal
  • ballerina-interpreter/tests/main_test.bal
  • python-interpreter/packages/afm-core/src/afm/models.py
  • python-interpreter/packages/afm-core/tests/test_parser.py
  • python-interpreter/packages/afm-langchain/pyproject.toml
  • python-interpreter/packages/afm-langchain/src/afm_langchain/tools/mcp.py
  • python-interpreter/packages/afm-langchain/tests/test_mcp.py

Comment on lines +364 to +369
string[] provided = from string key in auth.keys() where key != "type" select key;

foreach string required in requiredAuthFields(authType) {
if provided.indexOf(required) is () {
return error(string `type '${authType}' requires '${required}' field`);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's locate and examine the parser.bal file
fd -t f "parser.bal" | head -20

Repository: wso2/reference-implementations-afm

Length of output: 111


🏁 Script executed:

# Check the file size to ensure we can read it
wc -l ballerina-interpreter/parser.bal

Repository: wso2/reference-implementations-afm

Length of output: 115


🏁 Script executed:

# Read the specific lines mentioned in the review (364-369 and 429-434)
sed -n '360,375p' ballerina-interpreter/parser.bal

Repository: wso2/reference-implementations-afm

Length of output: 613


🏁 Script executed:

# Also read the second location
sed -n '425,440p' ballerina-interpreter/parser.bal

Repository: wso2/reference-implementations-afm

Length of output: 691


🏁 Script executed:

# Search for the function definitions to understand context
rg -n "validateMetadataAuthentication|validateAuthentication|validateOAuth2" ballerina-interpreter/parser.bal | head -20

Repository: wso2/reference-implementations-afm

Length of output: 646


🏁 Script executed:

# Look for test files related to auth validation
fd -t f ".*test.*\.bal" | grep -i "auth\|parse" | head -20

Repository: wso2/reference-implementations-afm

Length of output: 60


🏁 Script executed:

# Find ClientAuthentication type definition
rg -n "type ClientAuthentication" ballerina-interpreter/

Repository: wso2/reference-implementations-afm

Length of output: 148


🏁 Script executed:

# Search for test files
find . -type f -name "*.bal" | xargs grep -l "test\|Test" | grep -i "auth\|parse" | head -10

Repository: wso2/reference-implementations-afm

Length of output: 60


🏁 Script executed:

# Look at the broader context of validateAuthentication function
sed -n '345,380p' ballerina-interpreter/parser.bal

Repository: wso2/reference-implementations-afm

Length of output: 1195


🏁 Script executed:

# Check if there's any null checking already
rg -n "is \(\)" ballerina-interpreter/parser.bal | head -20

Repository: wso2/reference-implementations-afm

Length of output: 678


🏁 Script executed:

# Look for auth field initialization patterns
rg -n "authentication.*=" ballerina-interpreter/parser.bal | head -20

Repository: wso2/reference-implementations-afm

Length of output: 136


🏁 Script executed:

# Check the ClientAuthentication record definition
sed -n '49,80p' ballerina-interpreter/types.bal

Repository: wso2/reference-implementations-afm

Length of output: 603


🏁 Script executed:

# Look for any existing parse tests
find . -type f -name "*.bal" -o -name "*.yaml" -o -name "*.json" | xargs grep -l "bearer\|basic\|api-key" 2>/dev/null | head -10

Repository: wso2/reference-implementations-afm

Length of output: 236


🏁 Script executed:

# Check for test resources or test data
find . -type d -name "*test*" | head -10

Repository: wso2/reference-implementations-afm

Length of output: 203


🏁 Script executed:

# Check the test files for auth validation testing
cat -n ballerina-interpreter/tests/agent_test.bal | head -50

Repository: wso2/reference-implementations-afm

Length of output: 2020


🏁 Script executed:

# Check main_test.bal
cat -n ballerina-interpreter/tests/main_test.bal | head -50

Repository: wso2/reference-implementations-afm

Length of output: 1932


🏁 Script executed:

# Look for YAML/JSON test files with authentication configs
find . -type f \( -name "*.yaml" -o -name "*.yml" -o -name "*.json" \) | xargs grep -l "bearer\|basic\|oauth2" 2>/dev/null | head -10

Repository: wso2/reference-implementations-afm

Length of output: 60


🏁 Script executed:

# Check if there are AFM YAML files with auth sections
find . -type f -name "*.yaml" -o -name "*.yml" | xargs grep -A5 "authentication:" 2>/dev/null | head -40

Repository: wso2/reference-implementations-afm

Length of output: 60


🏁 Script executed:

# Check for sample AFM files that might contain auth configs
find . -type f -name "*.afm*" -o -name "*.md" | xargs grep -l "authentication" 2>/dev/null | head -5

Repository: wso2/reference-implementations-afm

Length of output: 367


🏁 Script executed:

# Look at sample_agent.afm.md if it exists
find . -name "sample_agent.afm.md" -exec cat {} \;

Repository: wso2/reference-implementations-afm

Length of output: 3252


🏁 Script executed:

# Check requiredAuthFields and allowedAuthFields implementations
sed -n '380,410p' ballerina-interpreter/parser.bal

Repository: wso2/reference-implementations-afm

Length of output: 872


🏁 Script executed:

# Check what fields are actually required for each auth type
rg -n "requiredAuthFields|allowedAuthFields" -A 20 ballerina-interpreter/parser.bal | head -80

Repository: wso2/reference-implementations-afm

Length of output: 1945


🏁 Script executed:

# Check how auth records are created from YAML to understand null-field behavior
sed -n '200,250p' ballerina-interpreter/parser.bal

Repository: wso2/reference-implementations-afm

Length of output: 1898


🏁 Script executed:

# Look for the YAML parsing logic that creates ClientAuthentication records
rg -n "ClientAuthentication\|fromYaml\|fromJson" ballerina-interpreter/parser.bal -B2 -A5 | head -60

Repository: wso2/reference-implementations-afm

Length of output: 60


🏁 Script executed:

# Verify Ballerina record key behavior with null values - check related code
rg -n "\.keys()" ballerina-interpreter/parser.bal

Repository: wso2/reference-implementations-afm

Length of output: 260


🏁 Script executed:

# Check how the YAML/JSON is parsed into ClientAuthentication records
rg -n "parsing\|Parser\|yaml\|json" ballerina-interpreter/parser.bal | head -30

Repository: wso2/reference-implementations-afm

Length of output: 60


🏁 Script executed:

# Look for the from/map conversion that creates ClientAuthentication
sed -n '1,100p' ballerina-interpreter/parser.bal

Repository: wso2/reference-implementations-afm

Length of output: 3406


🏁 Script executed:

# Verify if there are any null value tests in the test files
grep -n "null\|:\s*\(\)" ballerina-interpreter/tests/main_test.bal | head -30

Repository: wso2/reference-implementations-afm

Length of output: 1427


🏁 Script executed:

# Check how the deserialized auth records are actually used in the runtime
grep -n "auth\[" ballerina-interpreter/parser.bal | head -20

Repository: wso2/reference-implementations-afm

Length of output: 131


🏁 Script executed:

# Look for any usage of auth fields after validation
rg -n "authentication\." ballerina-interpreter/ | head -30

Repository: wso2/reference-implementations-afm

Length of output: 60


🏁 Script executed:

# Check agent.bal to see how auth is used at runtime
head -100 ballerina-interpreter/agent.bal

Repository: wso2/reference-implementations-afm

Length of output: 3340


🏁 Script executed:

# Check the full ClientAuthentication type definition and its usage
sed -n '49,60p' ballerina-interpreter/types.bal

Repository: wso2/reference-implementations-afm

Length of output: 274


🏁 Script executed:

# Look for how auth values are actually used at runtime
rg -n "transport\.authentication\|model\.authentication" ballerina-interpreter/ -A3 -B1 | head -40

Repository: wso2/reference-implementations-afm

Length of output: 60


🏁 Script executed:

# Check the mapToHttpClientAuth function to see if it handles null values
rg -n "mapToHttpClientAuth" ballerina-interpreter/ -A20 | head -50

Repository: wso2/reference-implementations-afm

Length of output: 4060


🏁 Script executed:

# Find the mapToHttpClientAuth function implementation
rg -n "function mapToHttpClientAuth" ballerina-interpreter/ -A30 | head -50

Repository: wso2/reference-implementations-afm

Length of output: 2319


🏁 Script executed:

# Check specifically for how token/password fields are accessed
rg -n "auth\[\"token\"\]|auth\[\"password\"\]|auth\[\"username\"\]|auth\[\"client_" ballerina-interpreter/ -B2 -A2 | head -60

Repository: wso2/reference-implementations-afm

Length of output: 60


🏁 Script executed:

# Check if there's any test with null auth fields
rg -n "token.*\(\)|password.*\(\)|null" ballerina-interpreter/tests/main_test.bal | head -20

Repository: wso2/reference-implementations-afm

Length of output: 149


🏁 Script executed:

# Check if ClientAuthentication is actually a record with optional fields or how it's defined
sed -n '49,65p' ballerina-interpreter/types.bal

Repository: wso2/reference-implementations-afm

Length of output: 351


🏁 Script executed:

# Look at what fromJsonWithType does - check the imports and usage
head -20 ballerina-interpreter/parser.bal | grep import

Repository: wso2/reference-implementations-afm

Length of output: 127


🏁 Script executed:

# Check if there's a test with explicitly null auth fields that should fail
grep -n "null\|:\s*\(\)" ballerina-interpreter/tests/main_test.bal | grep -i "auth\|bearer\|basic" | head -20

Repository: wso2/reference-implementations-afm

Length of output: 1564


🏁 Script executed:

# Verify how the YAML is parsed and if null fields are even possible
sed -n '48,70p' ballerina-interpreter/parser.bal

Repository: wso2/reference-implementations-afm

Length of output: 711


🏁 Script executed:

# Check if there are any Ballerina tests that actually set auth fields to null
rg -n "null|():" ballerina-interpreter/tests/main_test.bal | grep -i "authentication\|bearer\|basic\|oauth" -B2 -A2 | head -40

Repository: wso2/reference-implementations-afm

Length of output: 2107


🏁 Script executed:

# Check the complete test for missing fields to understand validation behavior
sed -n '418,426p' ballerina-interpreter/tests/main_test.bal

Repository: wso2/reference-implementations-afm

Length of output: 402


🏁 Script executed:

# Search for any null-field handling in mapToHttpClientAuth or related functions
rg -n "is \(\)|!is \(\)" ballerina-interpreter/agent.bal | head -30

Repository: wso2/reference-implementations-afm

Length of output: 534


🏁 Script executed:

# Verify if YAML deserialization includes null fields or skips them
python3 << 'PY'
# Check if Ballerina's YAML parsing would include null fields
# by examining how records handle null in the context

# The key question: does fromJsonWithType include null fields in the record?
# In Ballerina, when a YAML field is null, does it:
# 1. Omit the field from the record?
# 2. Include the field with null value?

# Let's check the test to see if we can infer behavior
test_code = """
ClientAuthentication auth = {'type: "api-key"};
"""
print("Test shows creation without optional fields - suggests fields are optional")
print("This implies missing fields don't cause issues, only present ones with wrong values")
PY

Repository: wso2/reference-implementations-afm

Length of output: 237


🏁 Script executed:

# Check if there's documentation or examples showing how Ballerina handles null in YAML->record conversion
rg -n "null|null:" ballerina-interpreter/tests/ | head -20

Repository: wso2/reference-implementations-afm

Length of output: 191


🏁 Script executed:

# Look at how the error message "api_key not found" is generated
rg -n "not found in" ballerina-interpreter/

Repository: wso2/reference-implementations-afm

Length of output: 947


🏁 Script executed:

# Check the getApiKey function that generates that error
rg -n "function getApiKey" ballerina-interpreter/ -A10

Repository: wso2/reference-implementations-afm

Length of output: 923


🏁 Script executed:

# Look for any actual null field tests in validation tests
sed -n '418,453p' ballerina-interpreter/tests/main_test.bal

Repository: wso2/reference-implementations-afm

Length of output: 1299


🏁 Script executed:

# Search for tests that might create auth with explicit null fields
rg -n "token.*null|\(\)" ballerina-interpreter/tests/ | grep -i "auth\|bearer\|basic" | head -20

Repository: wso2/reference-implementations-afm

Length of output: 2428


🏁 Script executed:

# Check if there's documentation about how null values behave in Ballerina record deserialization
rg -n "cloneWithType" ballerina-interpreter/ -B3 -A3 | head -40

Repository: wso2/reference-implementations-afm

Length of output: 3101


🏁 Script executed:

# Look for how auth is actually constructed from YAML in edge cases
sed -n '320,340p' ballerina-interpreter/agent.bal

Repository: wso2/reference-implementations-afm

Length of output: 751


🏁 Script executed:

# Check the complete getAuthTokenOrApiKey to understand null handling at usage time
sed -n '321,335p' ballerina-interpreter/agent.bal

Repository: wso2/reference-implementations-afm

Length of output: 608


🏁 Script executed:

# Check if there are actual tests showing null field behavior in AFM YAML
find . -type f -name "*.md" -o -name "*.afm" | xargs grep -l "null" 2>/dev/null | head -5

Repository: wso2/reference-implementations-afm

Length of output: 60


🏁 Script executed:

# Look at how the OAuth2Config and other config types handle null/missing fields
rg -n "type OAuth2Config|type BearerTokenConfig|type CredentialsConfig" ballerina-interpreter/

Repository: wso2/reference-implementations-afm

Length of output: 142


🏁 Script executed:

# Check if there's any Ballerina documentation or test showing null field behavior in record deserialization
sed -n '1,50p' ballerina-interpreter/tests/agent_test.bal

Repository: wso2/reference-implementations-afm

Length of output: 1670


🏁 Script executed:

# Look for any test that actually creates an auth record with explicit null value
rg -n ":\s*\(\)\|null" ballerina-interpreter/tests/agent_test.bal | head -20

Repository: wso2/reference-implementations-afm

Length of output: 60


🏁 Script executed:

# Check if Ballerina's YAML module preserves null values or omits them
rg -n "data.yaml|fromJson" ballerina-interpreter/parser.bal -B2 -A2 | head -30

Repository: wso2/reference-implementations-afm

Length of output: 399


🏁 Script executed:

# Check the OAuth2Config and other config types to understand field definitions
sed -n '422,440p' ballerina-interpreter/agent.bal

Repository: wso2/reference-implementations-afm

Length of output: 712


🏁 Script executed:

# Create a test to understand how Ballerina handles null in record deserialization
python3 << 'PY'
import json

# Simulate what happens in Ballerina:
# 1. YAML with null field: {type: "bearer", token: null}
# 2. Deserialize to ClientAuthentication record

# In Ballerina, when deserializing to a typed record:
# - If a field is declared as string, null cannot be assigned
# - The deserialization would fail OR omit the null field
# - Extra fields go into ...rest as map<anydata>

# The key question: does auth.keys() include null-valued fields?
# In Ballerina maps, a key with null value IS included in keys()
# But ClientAuthentication is a record, not a raw map

# The validation uses:
# 1. auth.keys() - gets keys from the record's dynamic fields
# 2. indexOf() - checks key presence only

# The real issue: if auth has {token: ()} (null), keys() includes "token"
# But then auth.get("token") returns null

print("In Ballerina:")
print("- auth.keys() includes keys even if value is null")
print("- The parser validation only checks key presence")
print("- Runtime validation auth.get(key).ensureType() would fail on null")
print("")
print("The concern is valid: parse-time validation allows null values to pass")
print("Runtime would catch it, but parse-time should be stricter")
PY

Repository: wso2/reference-implementations-afm

Length of output: 388


🏁 Script executed:

# Look for how fields are actually used at parse time vs runtime
rg -n "ensureType" ballerina-interpreter/ -B2 -A2

Repository: wso2/reference-implementations-afm

Length of output: 2939


🏁 Script executed:

# Check the complete validation flow - from YAML to validation
sed -n '27,35p' ballerina-interpreter/parser.bal

Repository: wso2/reference-implementations-afm

Length of output: 353


Required-field validation currently checks key presence, not non-nil values.

At lines 364–369 and 429–434, provided is derived from auth.keys(), allowing required fields that are explicitly set to null to pass parse-time validation. The runtime validation via ensureType() in getAuthTokenOrApiKey() (line 334) would later reject null values, but parse-time validation should catch this earlier for fail-fast feedback.

Add filtering to exclude null-valued fields from the required-field check:

Proposed approach
    string[] provided = from string key in auth.keys() where key != "type" select key;
+   string[] providedWithValue = [];
+   foreach string key in provided {
+       if auth[key] !is () {
+           providedWithValue.push(key);
+       }
+   }

    foreach string required in requiredAuthFields(authType) {
-       if provided.indexOf(required) is () {
+       if providedWithValue.indexOf(required) is () {
            return error(string `type '${authType}' requires '${required}' field`);
        }
    }

Apply the same pattern at line 429–434 for OAuth2 grant validation.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
string[] provided = from string key in auth.keys() where key != "type" select key;
foreach string required in requiredAuthFields(authType) {
if provided.indexOf(required) is () {
return error(string `type '${authType}' requires '${required}' field`);
}
string[] provided = from string key in auth.keys() where key != "type" select key;
string[] providedWithValue = [];
foreach string key in provided {
if auth[key] !is () {
providedWithValue.push(key);
}
}
foreach string required in requiredAuthFields(authType) {
if providedWithValue.indexOf(required) is () {
return error(string `type '${authType}' requires '${required}' field`);
}
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ballerina-interpreter/parser.bal` around lines 364 - 369, The required-field
validation in the auth parsing logic currently only checks if keys are present
in auth.keys(), but does not validate that those fields have non-null values.
This allows null-valued fields to pass parse-time validation even though they
would be rejected later at runtime by ensureType() in getAuthTokenOrApiKey().
Modify the provided array derivation to filter out keys whose corresponding
values are null, ensuring required fields both exist and have non-null values.
Apply this same filtering pattern at both validation locations: in the
requiredAuthFields loop check around lines 364-369 and in the OAuth2 grant
validation around lines 429-434.

Source: Coding guidelines

Comment on lines +139 to +143
provided = {
name for name in _AUTH_CREDENTIAL_FIELDS if getattr(self, name) is not None
}
provided |= set(self.model_extra or {})

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

cd python-interpreter/packages/afm-core && find . -name "models.py" -type f | head -5

Repository: wso2/reference-implementations-afm

Length of output: 98


🏁 Script executed:

cd python-interpreter/packages/afm-core && wc -l src/afm/models.py

Repository: wso2/reference-implementations-afm

Length of output: 100


🏁 Script executed:

cd python-interpreter/packages/afm-core && sed -n '130,200p' src/afm/models.py

Repository: wso2/reference-implementations-afm

Length of output: 2722


🏁 Script executed:

cd python-interpreter/packages/afm-core && sed -n '1,50p' src/afm/models.py

Repository: wso2/reference-implementations-afm

Length of output: 1468


🏁 Script executed:

cd python-interpreter/packages/afm-core && sed -n '50,140p' src/afm/models.py

Repository: wso2/reference-implementations-afm

Length of output: 2642


🏁 Script executed:

cd python-interpreter/packages/afm-core && grep -n "_AUTH_CREDENTIAL_FIELDS\|_AUTH_ALLOWED_FIELDS\|_AUTH_REQUIRED_FIELDS" src/afm/models.py | head -20

Repository: wso2/reference-implementations-afm

Length of output: 431


🏁 Script executed:

cd python-interpreter/packages/afm-core && grep -n "class ClientAuthentication" src/afm/models.py

Repository: wso2/reference-implementations-afm

Length of output: 121


🏁 Script executed:

cd python-interpreter/packages/afm-core && grep -n "model_fields_set" src/afm/models.py

Repository: wso2/reference-implementations-afm

Length of output: 60


🏁 Script executed:

cd python-interpreter/packages/afm-core && grep -i "pydantic" pyproject.toml setup.py setup.cfg requirements.txt 2>/dev/null | head -20

Repository: wso2/reference-implementations-afm

Length of output: 114


🏁 Script executed:

cd python-interpreter/packages/afm-core && cat pyproject.toml | grep -A 5 "pydantic"

Repository: wso2/reference-implementations-afm

Length of output: 211


🏁 Script executed:

cd python-interpreter/packages/afm-core && python3 -c "from pydantic import BaseModel; m = BaseModel(); print(hasattr(m, 'model_fields_set'))"

Repository: wso2/reference-implementations-afm

Length of output: 788


🏁 Script executed:

cd python-interpreter/packages/afm-core && python3 <<'PY'
from pydantic import BaseModel

class TestModel(BaseModel):
    field1: str | None = None
    field2: str | None = None

# Test 1: Create with field1 explicitly set to None
m1 = TestModel(field1=None, field2="value")
print("model_fields_set with explicit None:", m1.model_fields_set)

# Test 2: Create without field1 provided at all
m2 = TestModel(field2="value")
print("model_fields_set without field1:", m2.model_fields_set)

# Test 3: Create with field1 set to a value
m3 = TestModel(field1="test", field2="value")
print("model_fields_set with field1 value:", m3.model_fields_set)
PY

Repository: wso2/reference-implementations-afm

Length of output: 237


🏁 Script executed:

cd python-interpreter/packages/afm-core && python3 <<'PY'
# Now test the scenario from the review comment
from pydantic import BaseModel, ConfigDict

_JWT_ALLOWED_FIELDS = {
    "issuer",
    "audience",
    "signing_key",
    "algorithm",
    "key_id",
    "subject",
    "custom_claims",
    "expiry_seconds",
}

_AUTH_ALLOWED_FIELDS = {
    "jwt": _JWT_ALLOWED_FIELDS,
}

_AUTH_CREDENTIAL_FIELDS = (
    "issuer",
    "audience",
    "signing_key",
    "algorithm",
    "key_id",
    "subject",
    "custom_claims",
    "expiry_seconds",
)

class TestAuth(BaseModel):
    model_config = ConfigDict(extra="allow")
    
    type: str
    issuer: str | None = None
    audience: str | None = None
    signing_key: str | None = None
    algorithm: str | None = None
    key_id: str | None = None
    subject: str | None = None
    custom_claims: dict | None = None
    expiry_seconds: int | None = None

# Scenario: type='jwt' but token set to null (unsupported field)
# and an unsupported field 'unsupported_field' set to null
test = TestAuth(type="jwt", issuer="test", audience="aud", signing_key="key", unsupported_field=None)

# Current implementation (non-None check)
provided_current = {
    name for name in _AUTH_CREDENTIAL_FIELDS if getattr(test, name) is not None
}
provided_current |= set(test.model_extra or {})

print("Current 'provided' set:", provided_current)
print("model_fields_set:", test.model_fields_set)
print("model_extra:", test.model_extra)

# Check what happens with unknown field validation
unknown = provided_current - _AUTH_ALLOWED_FIELDS["jwt"]
print("Unknown fields (current):", unknown)

# Proposed implementation
provided_keys = {
    name for name in _AUTH_CREDENTIAL_FIELDS if name in test.model_fields_set
}
provided_keys |= set(test.model_extra or {})

print("\nProposed 'provided_keys' set:", provided_keys)
unknown_proposed = provided_keys - _AUTH_ALLOWED_FIELDS["jwt"]
print("Unknown fields (proposed):", unknown_proposed)
PY

Repository: wso2/reference-implementations-afm

Length of output: 474


🏁 Script executed:

cd python-interpreter/packages/afm-core && python3 <<'PY'
# Test the exact scenario from the review: token field (which is not JWT-allowed) set to null
from pydantic import BaseModel, ConfigDict

_JWT_ALLOWED_FIELDS = {
    "issuer",
    "audience",
    "signing_key",
    "algorithm",
    "key_id",
    "subject",
    "custom_claims",
    "expiry_seconds",
}

_AUTH_ALLOWED_FIELDS = {
    "jwt": _JWT_ALLOWED_FIELDS,
}

_AUTH_CREDENTIAL_FIELDS = (
    "token",  # This is a credential field but NOT allowed for JWT
    "issuer",
    "audience",
    "signing_key",
    "algorithm",
    "key_id",
    "subject",
    "custom_claims",
    "expiry_seconds",
)

class TestAuth(BaseModel):
    model_config = ConfigDict(extra="allow")
    
    type: str
    token: str | None = None
    issuer: str | None = None
    audience: str | None = None
    signing_key: str | None = None
    algorithm: str | None = None
    key_id: str | None = None
    subject: str | None = None
    custom_claims: dict | None = None
    expiry_seconds: int | None = None

# Scenario from review: type='jwt' with token: null (unsupported for JWT)
test = TestAuth(type="jwt", token=None, issuer="test", audience="aud", signing_key="key")

# Current implementation
provided_current = {
    name for name in _AUTH_CREDENTIAL_FIELDS if getattr(test, name) is not None
}
provided_current |= set(test.model_extra or {})

print("Scenario: JWT type with token=null (token not allowed for JWT)")
print("Current 'provided' set:", provided_current)
print("model_extra:", test.model_extra)
print("getattr(test, 'token'):", getattr(test, 'token'))

# Check what happens with unknown field validation
unknown = provided_current - _AUTH_ALLOWED_FIELDS["jwt"]
print("Unknown fields (current):", unknown)
print("-> Would 'token' be caught as unknown?", 'token' in unknown)

print("\n" + "="*60)
print("This shows the issue: token is NOT in the 'provided' set")
print("because getattr(test, 'token') is None, so it's excluded")
print("And model_extra is empty, so nothing adds it back")
PY

Repository: wso2/reference-implementations-afm

Length of output: 551


🏁 Script executed:

cd python-interpreter/packages/afm-core && sed -n '102,170p' src/afm/models.py

Repository: wso2/reference-implementations-afm

Length of output: 2552


🏁 Script executed:

cd python-interpreter/packages/afm-core && python3 <<'PY'
# Final validation: test that model_fields_set correctly tracks explicit None assignments
from pydantic import BaseModel, ConfigDict

_AUTH_CREDENTIAL_FIELDS = ("token", "issuer", "audience", "signing_key")

_JWT_ALLOWED_FIELDS = {"issuer", "audience", "signing_key"}

class TestAuth(BaseModel):
    model_config = ConfigDict(extra="allow")
    type: str
    token: str | None = None
    issuer: str | None = None
    audience: str | None = None
    signing_key: str | None = None

# Test both approaches with explicit None on unsupported field
auth = TestAuth(type="jwt", token=None, issuer="iss", audience="aud", signing_key="key")

print("=== Current approach (fails to catch token=None) ===")
provided_current = {
    name for name in _AUTH_CREDENTIAL_FIELDS if getattr(auth, name) is not None
}
provided_current |= set(auth.model_extra or {})
unknown_current = provided_current - _JWT_ALLOWED_FIELDS
print(f"provided: {provided_current}")
print(f"unknown: {unknown_current}")
print(f"Would reject 'token=None'? {bool(unknown_current)}")

print("\n=== Proposed approach (catches explicit None) ===")
provided_with_value = {
    name for name in _AUTH_CREDENTIAL_FIELDS if getattr(auth, name) is not None
}
provided_keys = {
    name for name in _AUTH_CREDENTIAL_FIELDS if name in auth.model_fields_set
}
provided_keys |= set(auth.model_extra or {})
unknown_proposed = provided_keys - _JWT_ALLOWED_FIELDS
print(f"provided_with_value: {provided_with_value}")
print(f"provided_keys: {provided_keys}")
print(f"unknown: {unknown_proposed}")
print(f"Would reject 'token=None'? {bool(unknown_proposed)}")
PY

Repository: wso2/reference-implementations-afm

Length of output: 450


Use model_fields_set to catch unsupported typed fields explicitly set to null.

The current logic at line 139 excludes None values when building the provided set, which prevents detection of unsupported fields assigned to null. For example, a JWT auth config with token: null bypasses the unsupported-field check at line 154 because token is not included in provided. Use model_fields_set to distinguish between fields with non-null values (for required checks) and fields that were explicitly provided (for allowed/unsupported checks).

Proposed fix
-        provided = {
+        provided_with_value = {
             name for name in _AUTH_CREDENTIAL_FIELDS if getattr(self, name) is not None
         }
-        provided |= set(self.model_extra or {})
+        provided_keys = {
+            name for name in _AUTH_CREDENTIAL_FIELDS if name in self.model_fields_set
+        }
+        provided_keys |= set(self.model_extra or {})

         if auth_type == "oauth2":
-            self._validate_oauth2_fields(provided)
+            self._validate_oauth2_fields(provided_with_value, provided_keys)
             return self

-        missing = _AUTH_REQUIRED_FIELDS[auth_type] - provided
+        missing = _AUTH_REQUIRED_FIELDS[auth_type] - provided_with_value
         if missing:
             fields = ", ".join(f"'{name}'" for name in sorted(missing))
             suffix = "field" if len(missing) == 1 else "fields"
             raise ValueError(f"type '{auth_type}' requires {fields} {suffix}")

-        unknown = provided - _AUTH_ALLOWED_FIELDS[auth_type]
+        unknown = provided_keys - _AUTH_ALLOWED_FIELDS[auth_type]
         if unknown:
             fields = ", ".join(f"'{name}'" for name in sorted(unknown))
             suffix = "field" if len(unknown) == 1 else "fields"
             raise ValueError(f"type '{auth_type}' does not support {fields} {suffix}")

-    def _validate_oauth2_fields(self, provided: set[str]) -> None:
+    def _validate_oauth2_fields(
+        self, provided_with_value: set[str], provided_keys: set[str]
+    ) -> None:
         if self.grant_type is None:
             raise ValueError("type 'oauth2' requires 'grant_type' field")
@@
-        missing = spec["required"] - provided
+        missing = spec["required"] - provided_with_value
@@
-        unknown = provided - allowed
+        unknown = provided_keys - allowed

Also applies to: 154-158, 181-188

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@python-interpreter/packages/afm-core/src/afm/models.py` around lines 139 -
143, The current logic building the `provided` set at the initialization
filtering excludes `None` values, preventing detection of fields explicitly set
to null. Replace the set comprehension that skips `None` values with
`model_fields_set` to capture all fields that were explicitly provided by the
user. Then update the validation logic that checks for unsupported fields (which
currently relies on the `provided` set) to use this new approach based on
`model_fields_set`, ensuring that fields like `token` explicitly set to `null`
are properly detected and rejected. This change should be consistently applied
wherever the `provided` set is used for validation checks.

Source: Coding guidelines

"langchain-anthropic>=1.3.1",
"mcp>=1.26.0",
"langchain-mcp-adapters>=0.2.1",
"pyjwt[crypto]>=2.10.0",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check available PyJWT versions on PyPI
curl -s https://pypi.org/pypi/PyJWT/json | jq -r '.releases | keys | .[]' | sort -V | tail -10

# Check for security advisories
gh api graphql -f query='
{
  securityVulnerabilities(first: 5, ecosystem: PIP, package: "pyjwt") {
    nodes {
      advisory {
        summary
        severity
        publishedAt
      }
      vulnerableVersionRange
      firstPatchedVersion {
        identifier
      }
    }
  }
}'

Repository: wso2/reference-implementations-afm

Length of output: 321


🌐 Web query:

PyJWT 2.10.0 security vulnerabilities CVE advisories

💡 Result:

PyJWT version 2.10.0 was introduced with at least one confirmed security vulnerability and is affected by several subsequent advisories affecting the 2.x line [1][2][3]. Security Vulnerabilities Affecting PyJWT 2.10.0: 1. CVE-2024-53861 (GHSA-75c5-xw7c-p5pm): This vulnerability involves incorrect string comparison when checking the "iss" (issuer) claim [4][2]. The bug was introduced in version 2.10.0 and allows partial matching of the issuer field (e.g., "acb" could be accepted for "abc") [4][2]. This was patched in version 2.10.1 [4][5]. 2. Vulnerabilities Fixed in 2.13.0 (Also affecting 2.10.0): Because version 2.10.0 is an older release, it is affected by a range of vulnerabilities discovered and patched in later versions, including those addressed in the 2.13.0 release (May 2026) [3][6]: - Algorithm Confusion (CVE-2026-48526 / GHSA-xgmm-8j9v-c9wx): A vulnerability where a public-key JWK accepted as an HMAC secret could enable forged HS256 tokens [7][6]. - Algorithm Allow-list Bypass (GHSA-jq35-7prp-9v3f): An issue where the caller's algorithm allow-list could be bypassed when decoding with a PyJWK or PyJWKClient key [6][8]. - SSRF in PyJWKClient (GHSA-993g-76c3-p5m4): Lack of URI scheme validation in PyJWKClient allowed unauthorized access to local files or other schemes via urllib (e.g., file://, ftp://) [6][9]. - Denial of Service (GHSA-w7vc-732c-9m39): An unauthenticated DoS amplifier related to base64 decoding of unused payload segments when b64=false [6][10]. Recommendation: Users of PyJWT 2.10.0 are strongly advised to upgrade to the latest version (2.13.0 or newer) to mitigate these and other security risks [3][6][11]. There are no known workarounds for several of these vulnerabilities [2]. Always review the official changelog and GitHub security advisories for the most comprehensive list of patches [11][10].

Citations:


Upgrade PyJWT dependency to address security advisories.

Version 2.10.0 is affected by multiple security advisories, including CVE-2024-53861 (introduced in that version and patched in 2.10.1) and several others addressed in later releases. Upgrade to version 2.13.0 or newer.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@python-interpreter/packages/afm-langchain/pyproject.toml` at line 20, The
pyjwt[crypto] dependency in the pyproject.toml file is pinned to version 2.10.0,
which contains security vulnerabilities including CVE-2024-53861. Update the
version constraint for pyjwt[crypto] from >=2.10.0 to >=2.13.0 to address these
security advisories and ensure a secure version is used.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR standardizes AFM authentication schemas across the Python and Ballerina reference runtimes, adds parse-time validation for all supported auth types, and implements runtime support for jwt and oauth2 authentication (plus api-key header customization in Python, and explicit transport limitation in Ballerina).

Changes:

  • Add jwt signing and oauth2 grant handling to the Python MCP HTTP client auth wiring (and tests).
  • Introduce strict, type-aware authentication schema validation at parse/load time in both runtimes.
  • Extend Ballerina transport auth mapping to support jwt/oauth2 and explicitly reject api-key for MCP/webhook transport.

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
python-interpreter/uv.lock Adds pyjwt[crypto] to the locked dependency set.
python-interpreter/packages/afm-langchain/pyproject.toml Adds pyjwt[crypto]>=2.10.0 dependency.
python-interpreter/packages/afm-langchain/src/afm_langchain/tools/mcp.py Implements JwtAuth + OAuth2Auth and wires them via build_httpx_auth.
python-interpreter/packages/afm-langchain/tests/test_mcp.py Adds unit tests for api-key header override, JWT signing, and OAuth2 token request behavior/caching.
python-interpreter/packages/afm-core/src/afm/models.py Expands ClientAuthentication fields and adds strict per-type/grant validation.
python-interpreter/packages/afm-core/tests/test_parser.py Adds validation tests ensuring auth errors are raised at parse time.
ballerina-interpreter/parser.bal Adds parse-time validation for authentication blocks across metadata locations.
ballerina-interpreter/agent.bal Maps oauth2/jwt auth into Ballerina http client auth configs; rejects api-key for transport.
ballerina-interpreter/tests/main_test.bal Adds tests for new mapping + validation behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +402 to +410
string? keyId = jwtConfig?.key_id;
if keyId is string {
issuerConfig["keyId"] = keyId;
}
string? subject = jwtConfig?.subject;
if subject is string {
issuerConfig["username"] = subject;
}
map<json>? customClaims = jwtConfig?.custom_claims;
Comment on lines +439 to +460
map<json> grantConfig = {tokenUrl: cfg?.token_url, clientId: cfg?.client_id, clientSecret: cfg?.client_secret};
addScopes(grantConfig, cfg?.scopes);
return wrapOAuth2(grantConfig.cloneWithType(http:OAuth2ClientCredentialsGrantConfig));
}
"password" => {
map<json> grantConfig = {tokenUrl: cfg?.token_url, username: cfg?.username, password: cfg?.password};
addOptional(grantConfig, "clientId", cfg?.client_id);
addOptional(grantConfig, "clientSecret", cfg?.client_secret);
addScopes(grantConfig, cfg?.scopes);
return wrapOAuth2(grantConfig.cloneWithType(http:OAuth2PasswordGrantConfig));
}
"refresh_token" => {
map<json> grantConfig = {refreshUrl: cfg?.refresh_url, refreshToken: cfg?.refresh_token, clientId: cfg?.client_id, clientSecret: cfg?.client_secret};
addScopes(grantConfig, cfg?.scopes);
return wrapOAuth2(grantConfig.cloneWithType(http:OAuth2RefreshTokenGrantConfig));
}
"jwt_bearer" => {
map<json> grantConfig = {tokenUrl: cfg?.token_url, assertion: cfg?.assertion};
addOptional(grantConfig, "clientId", cfg?.client_id);
addOptional(grantConfig, "clientSecret", cfg?.client_secret);
addScopes(grantConfig, cfg?.scopes);
return wrapOAuth2(grantConfig.cloneWithType(http:OAuth2JwtBearerGrantConfig));
Comment on lines +81 to +99
self.issuer = issuer
self.audience = audience
self.signing_key = signing_key
self.algorithm = algorithm
self.key_id = key_id
self.subject = subject
self.custom_claims = custom_claims or {}
self.expiry_seconds = expiry_seconds

def _resolve_key(self) -> str:
if self.algorithm.upper() in _HMAC_JWT_ALGORITHMS:
return self.signing_key
try:
return Path(self.signing_key).read_text()
except OSError as e:
raise MCPAuthenticationError(
f"Could not read JWT signing key file '{self.signing_key}': {e}"
) from e

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants