Support public search for Cosmetology #1305

landonshumway-ia wants to merge 7 commits into csg-org:main
Conversation
📝 Walkthrough

Removes summary-report notification email fields across models, schemas, docs, and tests; adds a new public, unauthenticated license search Lambda with OpenSearch integration, licenseNumber filtering, and cursor-based pagination; wires the search stack into the CDK stacks and updates tests and API models accordingly.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client as Client
    participant APIG as ApiGateway
    participant Handler as PublicSearchHandler
    participant OS as OpenSearch
    participant DB as DynamoDB
    Client->>APIG: POST /v1/compacts/{compact}/providers/query (licenseNumber/jurisdiction/familyName)
    APIG->>Handler: public_search_api_handler(event, context)
    Handler->>Handler: _parse_and_validate_public_query_body()
    Handler->>Handler: _decode_public_cursor() (if lastKey)
    Handler->>Handler: _build_public_license_search_body()
    Handler->>OS: search(index: compact_{compact}_providers)
    OS-->>Handler: hits[] with nested inner_hits (licenses)
    Handler->>Handler: sanitize hits via PublicLicenseSearchResponseSchema
    Handler->>Handler: _encode_public_cursor() → next lastKey
    Handler-->>APIG: {providers, pageSize, lastKey, query}
    APIG-->>Client: 200 OK (paginated results)
```
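The exact request body built by `_build_public_license_search_body()` is not shown in this review. As a rough sketch only, a nested OpenSearch query with `inner_hits` (so each matching provider also returns its matching licenses) could look like the following — the field names, filter clauses, and sort key here are illustrative assumptions, not the PR's actual index mapping:

```python
def build_public_license_search_body(family_name=None, license_number=None, page_size=10):
    # Hypothetical sketch: filter clauses over a nested 'licenses' field,
    # with inner_hits enabled so matching licenses come back per provider.
    filters = []
    if family_name:
        filters.append({'match': {'licenses.familyName': family_name}})
    if license_number:
        filters.append({'term': {'licenses.licenseNumber': license_number}})
    return {
        'size': page_size,
        'query': {
            'nested': {
                'path': 'licenses',
                'query': {'bool': {'filter': filters or [{'match_all': {}}]}},
                'inner_hits': {},  # return the matching nested licenses per hit
            }
        },
        'sort': [{'_id': 'asc'}],  # deterministic order for search_after paging
    }
```

The body is a plain dict, so its shape can be checked without an OpenSearch cluster.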
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (warning)
Also added custom cursor logic for tracking accurate page size of licenses.
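The cursor itself travels as an opaque `lastKey` token. As a minimal sketch of what `_encode_public_cursor()` / `_decode_public_cursor()` could look like — assuming a base64-encoded JSON payload, with field names that are illustrative rather than the PR's actual ones:

```python
import base64
import binascii
import json

def encode_public_cursor(search_after, resume_provider_id=None, license_offset=0):
    # Pack pagination state into an opaque token for the client.
    payload = {
        'search_after': search_after,
        'resume_provider_id': resume_provider_id,
        'license_offset': license_offset,
    }
    return base64.urlsafe_b64encode(json.dumps(payload).encode('utf-8')).decode('ascii')

def decode_public_cursor(last_key):
    # Reverse of encode_public_cursor; a malformed token should surface as
    # a client error (400-style), not an unhandled 500.
    try:
        return json.loads(base64.urlsafe_b64decode(last_key.encode('ascii')))
    except (binascii.Error, UnicodeDecodeError, ValueError) as e:
        raise ValueError('Invalid lastKey') from e
```

The token round-trips cleanly, and anything that isn't valid base64-wrapped JSON is rejected up front.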
Actionable comments posted: 4
🧹 Nitpick comments (3)
backend/cosmetology-app/lambdas/python/search/tests/function/test_public_search_providers.py (1)

15-31: Preserve explicit empty request bodies in the event helper.

Line 29 currently turns `{}` into `None`. Use an explicit `is not None` check so tests can intentionally send an empty JSON object when needed.

🔧 Proposed fix

```diff
-        'body': json.dumps(body) if body else None,
+        'body': json.dumps(body) if body is not None else None,
```

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `backend/cosmetology-app/lambdas/python/search/tests/function/test_public_search_providers.py` around lines 15-31, the helper `_create_public_api_event` currently converts an explicit empty dict body (`{}`) into `None` by using a truthy check; change the body serialization to use an explicit `is not None` check so that `'body': json.dumps(body) if body is not None else None`, ensuring tests can send an intentional empty JSON object. Update the `'body'` field generation inside `_create_public_api_event` to use this explicit check and leave all other fields unchanged.

backend/cosmetology-app/lambdas/python/search/handlers/public_search.py (2)
74-78: Consider logging when `resume_provider_id` is not found.

If the `resume_provider_id` from the cursor doesn't match any provider in the current result set (e.g., due to data changes between pagination requests), the resume logic silently processes all providers without skipping. This could lead to duplicate results across pages. Adding a warning log after the main loop when `resume_provider_id` is still set (not `None`) would help with debugging pagination inconsistencies in production.

🔍 Proposed enhancement

```diff
         prev_sort = last_sort
         last_sort = hit.get('sort')
+    if resume_provider_id is not None:
+        logger.warning(
+            'Resume provider not found in results, possible data change during pagination',
+            resume_provider_id=resume_provider_id,
+        )
+        last_key = None
```

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `backend/cosmetology-app/lambdas/python/search/handlers/public_search.py` around lines 74-78, add a warning log when the resume cursor's provider id is never found in the current result set: after the loop that inspects each hit (where inner_hits, provider_id, skip, resume_offset and resume_provider_id are used and possibly reset), check if resume_provider_id is still not None and if so emit a warning via the existing logger (include the unresolved resume_provider_id and resume_offset) so pagination inconsistencies are visible in production.
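As a condensed sketch of the suggested behavior (using simplified hit shapes rather than the handler's real OpenSearch `inner_hits` structures), the warning fires only when the page is fully assembled without ever seeing the cursor's provider:

```python
import logging

logger = logging.getLogger(__name__)

def collect_page(hits, resume_provider_id, resume_offset, page_size):
    # Simplified resume loop: skip licenses already returned for the
    # cursor's provider, then warn if that provider never appeared.
    results = []
    for hit in hits:
        provider_id = hit['provider_id']
        skip = resume_offset if provider_id == resume_provider_id else 0
        if provider_id == resume_provider_id:
            resume_provider_id = None  # resume point found
        for license_ in hit['licenses'][skip:]:
            if len(results) >= page_size:
                return results
            results.append(license_)
    if resume_provider_id is not None:
        # Data may have changed between pagination requests.
        logger.warning('Resume provider %s not found in results', resume_provider_id)
    return results
```

When the resume provider is missing, nothing is skipped and every license is emitted again, which is exactly the duplicate-results risk the comment describes.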
94-98: Redundant `pop('jurisdiction')` call.

The `PublicLicenseSearchResponseSchema.rename_jurisdiction_to_license_jurisdiction` `@post_load` method already pops `'jurisdiction'` from the data dict. This line is a no-op.

♻️ Proposed fix

```diff
         try:
             sanitized = license_schema.load(license_source)
-            sanitized.pop('jurisdiction', None)
             providers.append(sanitized)
             consumed_from_this_provider += 1
```

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `backend/cosmetology-app/lambdas/python/search/handlers/public_search.py` around lines 94-98, remove the redundant pop call: after deserializing with license_schema.load(license_source) (in the try block that appends to providers and increments consumed_from_this_provider), delete the line sanitized.pop('jurisdiction', None) because PublicLicenseSearchResponseSchema.rename_jurisdiction_to_license_jurisdiction already removes 'jurisdiction'; leave the rest of the block (providers.append(sanitized) and consumed_from_this_provider += 1) unchanged.
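A plain-Python illustration of why the extra `pop` is a no-op — this mimics, as an assumption about its behavior, what the schema's `@post_load` rename hook does:

```python
def rename_jurisdiction(data):
    # Hypothetical equivalent of the schema's post_load hook: it both
    # copies and removes 'jurisdiction', so a later pop('jurisdiction', None)
    # on the loaded result can never find the key.
    if 'jurisdiction' in data:
        data['licenseJurisdiction'] = data.pop('jurisdiction')
    return data
```

After the rename runs, a second `pop('jurisdiction', None)` always returns `None` and changes nothing.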
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (47)
- backend/cosmetology-app/bin/run_python_tests.py
- backend/cosmetology-app/docs/internal/api-specification/latest-oas30.json
- backend/cosmetology-app/docs/internal/postman/postman-collection.json
- backend/cosmetology-app/lambdas/nodejs/email-notification-service/README.md
- backend/cosmetology-app/lambdas/nodejs/lib/models/compact.ts
- backend/cosmetology-app/lambdas/nodejs/lib/models/email-notification-service-event.ts
- backend/cosmetology-app/lambdas/nodejs/lib/models/jurisdiction.ts
- backend/cosmetology-app/lambdas/nodejs/tests/email-notification-service.test.ts
- backend/cosmetology-app/lambdas/nodejs/tests/lib/compact-configuration-client.test.ts
- backend/cosmetology-app/lambdas/nodejs/tests/lib/email/encumbrance-notification-service.test.ts
- backend/cosmetology-app/lambdas/nodejs/tests/lib/email/investigation-notification-service.test.ts
- backend/cosmetology-app/lambdas/nodejs/tests/lib/jurisdiction-client.test.ts
- backend/cosmetology-app/lambdas/nodejs/tests/sample-records.ts
- backend/cosmetology-app/lambdas/python/common/cc_common/data_model/schema/compact/__init__.py
- backend/cosmetology-app/lambdas/python/common/cc_common/data_model/schema/compact/api.py
- backend/cosmetology-app/lambdas/python/common/cc_common/data_model/schema/compact/record.py
- backend/cosmetology-app/lambdas/python/common/cc_common/data_model/schema/jurisdiction/__init__.py
- backend/cosmetology-app/lambdas/python/common/cc_common/data_model/schema/jurisdiction/api.py
- backend/cosmetology-app/lambdas/python/common/cc_common/data_model/schema/jurisdiction/record.py
- backend/cosmetology-app/lambdas/python/common/cc_common/data_model/schema/provider/api.py
- backend/cosmetology-app/lambdas/python/common/common_test/test_data_generator.py
- backend/cosmetology-app/lambdas/python/common/tests/resources/dynamo/compact.json
- backend/cosmetology-app/lambdas/python/common/tests/resources/dynamo/jurisdiction.json
- backend/cosmetology-app/lambdas/python/compact-configuration/handlers/compact_configuration.py
- backend/cosmetology-app/lambdas/python/compact-configuration/tests/function/test_compact_configuration.py
- backend/cosmetology-app/lambdas/python/search/handlers/public_search.py
- backend/cosmetology-app/lambdas/python/search/tests/__init__.py
- backend/cosmetology-app/lambdas/python/search/tests/function/__init__.py
- backend/cosmetology-app/lambdas/python/search/tests/function/test_public_search_providers.py
- backend/cosmetology-app/pipeline/backend_stage.py
- backend/cosmetology-app/stacks/api_lambda_stack/public_lookup_api.py
- backend/cosmetology-app/stacks/api_stack/__init__.py
- backend/cosmetology-app/stacks/api_stack/api.py
- backend/cosmetology-app/stacks/api_stack/v1_api/api.py
- backend/cosmetology-app/stacks/api_stack/v1_api/api_model.py
- backend/cosmetology-app/stacks/api_stack/v1_api/public_lookup_api.py
- backend/cosmetology-app/stacks/search_persistent_stack/search_handler.py
- backend/cosmetology-app/tests/app/test_api/test_public_lookup_api.py
- backend/cosmetology-app/tests/resources/snapshots/GET_COMPACT_CONFIGURATION_RESPONSE_SCHEMA.json
- backend/cosmetology-app/tests/resources/snapshots/GET_JURISDICTION_CONFIGURATION_RESPONSE_SCHEMA.json
- backend/cosmetology-app/tests/resources/snapshots/PUBLIC_QUERY_PROVIDERS_REQUEST_SCHEMA.json
- backend/cosmetology-app/tests/resources/snapshots/PUBLIC_QUERY_PROVIDERS_RESPONSE_SCHEMA.json
- backend/cosmetology-app/tests/resources/snapshots/PUT_COMPACT_CONFIGURATION_REQUEST_SCHEMA.json
- backend/cosmetology-app/tests/resources/snapshots/PUT_JURISDICTION_CONFIGURATION_REQUEST_SCHEMA.json
- backend/cosmetology-app/tests/resources/snapshots/QUERY_PROVIDERS_REQUEST_SCHEMA.json
- backend/cosmetology-app/tests/smoke/compact_configuration_smoke_tests.py
- backend/cosmetology-app/tests/smoke/rollback_license_upload_smoke_tests.py
💤 Files with no reviewable changes (26)
- backend/cosmetology-app/tests/resources/snapshots/GET_JURISDICTION_CONFIGURATION_RESPONSE_SCHEMA.json
- backend/cosmetology-app/lambdas/python/common/tests/resources/dynamo/jurisdiction.json
- backend/cosmetology-app/lambdas/nodejs/lib/models/email-notification-service-event.ts
- backend/cosmetology-app/lambdas/nodejs/tests/email-notification-service.test.ts
- backend/cosmetology-app/lambdas/python/common/cc_common/data_model/schema/compact/api.py
- backend/cosmetology-app/tests/resources/snapshots/GET_COMPACT_CONFIGURATION_RESPONSE_SCHEMA.json
- backend/cosmetology-app/lambdas/python/common/common_test/test_data_generator.py
- backend/cosmetology-app/tests/smoke/rollback_license_upload_smoke_tests.py
- backend/cosmetology-app/tests/resources/snapshots/PUT_JURISDICTION_CONFIGURATION_REQUEST_SCHEMA.json
- backend/cosmetology-app/tests/resources/snapshots/PUT_COMPACT_CONFIGURATION_REQUEST_SCHEMA.json
- backend/cosmetology-app/lambdas/python/compact-configuration/handlers/compact_configuration.py
- backend/cosmetology-app/lambdas/python/common/cc_common/data_model/schema/jurisdiction/record.py
- backend/cosmetology-app/lambdas/nodejs/lib/models/compact.ts
- backend/cosmetology-app/tests/smoke/compact_configuration_smoke_tests.py
- backend/cosmetology-app/lambdas/nodejs/tests/lib/jurisdiction-client.test.ts
- backend/cosmetology-app/lambdas/nodejs/email-notification-service/README.md
- backend/cosmetology-app/lambdas/nodejs/tests/lib/compact-configuration-client.test.ts
- backend/cosmetology-app/lambdas/nodejs/lib/models/jurisdiction.ts
- backend/cosmetology-app/lambdas/python/common/cc_common/data_model/schema/compact/__init__.py
- backend/cosmetology-app/lambdas/python/common/tests/resources/dynamo/compact.json
- backend/cosmetology-app/lambdas/python/common/cc_common/data_model/schema/jurisdiction/__init__.py
- backend/cosmetology-app/lambdas/python/common/cc_common/data_model/schema/compact/record.py
- backend/cosmetology-app/lambdas/python/common/cc_common/data_model/schema/jurisdiction/api.py
- backend/cosmetology-app/docs/internal/api-specification/latest-oas30.json
- backend/cosmetology-app/lambdas/nodejs/tests/sample-records.ts
- backend/cosmetology-app/lambdas/python/compact-configuration/tests/function/test_compact_configuration.py
```python
    jurisdiction = Jurisdiction(required=False, allow_none=False)
    givenName = String(required=False, allow_none=False, validate=Length(min=1, max=100))
    familyName = String(required=False, allow_none=False, validate=Length(min=1, max=100))
    licenseNumber = String(required=False, allow_none=False, validate=Length(min=1, max=100))
```
Align licenseNumber max length with the request contract.

Line 235 enforces `max=100`, but the request schema snapshots in this PR advertise `maxLength: 500`. That mismatch will reject payloads the API contract says are valid.

Proposed fix

```diff
-    licenseNumber = String(required=False, allow_none=False, validate=Length(min=1, max=100))
+    licenseNumber = String(required=False, allow_none=False, validate=Length(min=1, max=500))
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@backend/cosmetology-app/lambdas/python/common/cc_common/data_model/schema/provider/api.py`
at line 235, The schema field licenseNumber currently validates with
Length(min=1, max=100) which conflicts with the API contract's maxLength of 500;
update the validation on the licenseNumber field (the String(...) call in
provider/api.py) to use max=500 so incoming payloads allowed by the request
schema are accepted (keep min=1 and allow_none=False as-is).
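Whichever bound the team settles on, deriving both the API-model schema and the marshmallow validator from a single shared constant prevents this class of drift. A hedged sketch (the constant and helper below are hypothetical, not part of the PR):

```python
# Hypothetical shared bounds; the PR currently defines them inline in two places.
LICENSE_NUMBER_MIN = 1
LICENSE_NUMBER_MAX = 500

def validate_license_number(value):
    # Single source of truth that both the API Gateway model and the
    # marshmallow schema could reference.
    if not isinstance(value, str):
        raise ValueError('licenseNumber must be a string')
    if not (LICENSE_NUMBER_MIN <= len(value) <= LICENSE_NUMBER_MAX):
        raise ValueError(
            f'licenseNumber must be {LICENSE_NUMBER_MIN}-{LICENSE_NUMBER_MAX} characters'
        )
    return value
```

With one constant, raising or lowering the cap later is a one-line change that cannot desynchronize the two validation layers.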
```python
# Instantiate the OpenSearch client outside the handler to cache the connection between invocations
# Set timeout to 20 seconds to give API gateway time to respond with response
opensearch_client = OpenSearchClient(timeout=25)
```
Comment/code mismatch: timeout value.

The comment states "20 seconds" but the code uses `timeout=25`. Update the comment to match the actual value.

📝 Proposed fix

```diff
 # Instantiate the OpenSearch client outside the handler to cache the connection between invocations
-# Set timeout to 20 seconds to give API gateway time to respond with response
+# Set timeout to 25 seconds to give API gateway time to respond with response
 opensearch_client = OpenSearchClient(timeout=25)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/cosmetology-app/lambdas/python/search/handlers/public_search.py`
around lines 19 - 21, The inline comment above the OpenSearch client
instantiation is inconsistent: it says "20 seconds" but the code sets
timeout=25; update the comment that precedes opensearch_client =
OpenSearchClient(timeout=25) to state "25 seconds" (or otherwise match the
actual timeout value set on OpenSearchClient) so the comment and the
instantiation of OpenSearchClient are consistent.
```python
'licenseNumber': JsonSchema(
    type=JsonSchemaType.STRING,
    min_length=1,
    max_length=500,
    description='Filter for licenses with a specific license number',
),
```
Align licenseNumber length constraints across related models.

`query.licenseNumber` here allows up to 500 chars, while the public query/response licenseNumber schemas cap at 100. This creates avoidable contract drift between request and public response surfaces.

🔧 Proposed fix

```diff
 'licenseNumber': JsonSchema(
     type=JsonSchemaType.STRING,
     min_length=1,
-    max_length=500,
+    max_length=100,
     description='Filter for licenses with a specific license number',
 ),
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/cosmetology-app/stacks/api_stack/v1_api/api_model.py` around lines
113 - 118, The query schema's licenseNumber field in api_model.py allows
max_length=500 which is inconsistent with the public query/response schemas
capped at 100; update the JsonSchema for 'licenseNumber' used in the query model
(the entry with key 'licenseNumber' in the query schema block) to use
max_length=100 (and ensure min_length stays 1) so all related models
(query.licenseNumber and the public response/query licenseNumber schemas) share
the same 1..100 constraint and avoid contract drift.
```python
self.public_handler = PythonFunction(
    self,
    'PublicSearchProvidersFunction',
    description='Public search handler for OpenSearch license queries',
    index=os.path.join('handlers', 'public_search.py'),
    lambda_dir='search',
    handler='public_search_api_handler',
    role=lambda_role,
    log_retention=RetentionDays.ONE_MONTH,
    environment={
        'OPENSEARCH_HOST_ENDPOINT': opensearch_domain.domain_endpoint,
        **stack.common_env_vars,
    },
    timeout=Duration.seconds(29),
    memory_size=2048,
    vpc=vpc_stack.vpc,
    vpc_subnets=vpc_subnets,
    security_groups=[vpc_stack.lambda_security_group],
    alarm_topic=alarm_topic,
)
```
Avoid sharing the internal search role with the public search Lambda.

`PublicSearchProvidersFunction` currently reuses `lambda_role`, so it inherits export-bucket permissions granted for the internal handler. That violates least privilege for a public-facing function.

Proposed fix

```diff
 self.public_handler = PythonFunction(
     self,
     'PublicSearchProvidersFunction',
     description='Public search handler for OpenSearch license queries',
     index=os.path.join('handlers', 'public_search.py'),
     lambda_dir='search',
     handler='public_search_api_handler',
-    role=lambda_role,
     log_retention=RetentionDays.ONE_MONTH,
     environment={
         'OPENSEARCH_HOST_ENDPOINT': opensearch_domain.domain_endpoint,
         **stack.common_env_vars,
     },
@@
 )
 opensearch_domain.grant_read(self.public_handler)
+
+NagSuppressions.add_resource_suppressions_by_path(
+    stack,
+    f'{self.public_handler.role.node.path}/DefaultPolicy/Resource',
+    [
+        {
+            'id': 'AwsSolutions-IAM5',
+            'reason': 'Public search lambda needs wildcard index permissions for OpenSearch read operations.',
+        },
+    ],
+)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/cosmetology-app/stacks/search_persistent_stack/search_handler.py`
around lines 80 - 99, PublicSearchProvidersFunction is being created with the
shared lambda_role which grants export-bucket and other internal privileges;
create a dedicated IAM role (e.g., public_search_lambda_role) for the public
handler with only the minimal permissions it needs (CloudWatch Logs,
network/VPC, and specific OpenSearch access) and do not attach the export-bucket
or internal datastore policies; then pass that new role into the PythonFunction
call for PublicSearchProvidersFunction instead of lambda_role and ensure any
code that previously relied on the shared role’s extra policies is updated or
kept out of the public function.
Actionable comments posted: 2

♻️ Duplicate comments (2)

backend/cosmetology-app/stacks/api_stack/v1_api/api_model.py (1)

113-118: ⚠️ Potential issue | 🟠 Major

Align `licenseNumber` length constraints across API model and runtime validation. Line 116 allows up to 500 chars, but runtime validation in backend/cosmetology-app/lambdas/python/common/cc_common/data_model/schema/provider/api.py (Line 236) caps at 100. Requests in the 101-500 range will pass API model validation and fail in Lambda.

🔧 Proposed fix

```diff
 'licenseNumber': JsonSchema(
     type=JsonSchemaType.STRING,
     min_length=1,
-    max_length=500,
+    max_length=100,
     description='Filter for licenses with a specific license number',
 ),
```

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `backend/cosmetology-app/stacks/api_stack/v1_api/api_model.py` around lines 113-118, the API model's JsonSchema for 'licenseNumber' allows max_length=500 but runtime validation in cc_common.data_model.schema.provider.api enforces max 100, causing mismatched behavior; update the 'licenseNumber' JsonSchema entry to set max_length=100 to match the runtime validator (or alternatively raise the runtime cap to 500 if that is desired across the system) and ensure both JsonSchema and the validator use the same max length value.

backend/cosmetology-app/lambdas/python/search/handlers/public_search.py (1)

19-21: ⚠️ Potential issue | 🟡 Minor

Fix timeout comment drift. Line 20 says "20 seconds" while Line 21 configures `timeout=25`.

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `backend/cosmetology-app/lambdas/python/search/handlers/public_search.py` around lines 19-21, the comment and code disagree about the OpenSearch client timeout: update the comment above the OpenSearchClient instantiation (referencing OpenSearchClient(timeout=25)) so it accurately states "25 seconds" (or alternatively change the timeout arg to 20 if you prefer the comment), making the comment wording consistent with the configured timeout value.
🧹 Nitpick comments (1)

backend/cosmetology-app/stacks/api_lambda_stack/public_lookup_api.py (1)

55-58: Add a concrete cleanup owner/timeline for this temporary bridge.

Since this keeps a legacy lambda/export alive during cutover, please attach a linked cleanup issue (or explicit release/date + owner) to ensure it cannot linger past migration.

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `backend/cosmetology-app/stacks/api_lambda_stack/public_lookup_api.py` around lines 55-58, add a concrete cleanup owner and timeline to the temporary export for the legacy lambda: update the TODO near stack.export_value(self.query_providers_handler.function_arn) to include a JIRA/GitHub issue link or issue ID, the responsible owner (team or username), and a specific removal date/release (e.g., "remove by 2026-06-01 or release vX.Y"), and add a short comment that the export should be deleted after SearchPersistentStack.public_handler is fully live; ensure the comment references query_providers_handler.function_arn and stack.export_value so future readers can locate and remove the bridge.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/cosmetology-app/lambdas/python/search/handlers/public_search.py`:
- Around line 76-91: The cursor math is incorrect: when building the next cursor
inside the inner_hits loop update next_cursor_license_offset to reflect the
cumulative offset for the provider (e.g., resume_offset +
consumed_from_this_provider or consumed_from_this_provider if no resume_offset)
instead of only current-page consumption, and set next_cursor_search_after to
the current provider's sort value (use last_sort or hit.get('sort')) rather than
prev_sort so the resume uses the latest sort key; apply the same fixes to the
corresponding block around the other occurrence (the block at lines ~106-110)
and ensure consumed_from_this_provider is incremented consistently before
computing the next cursor.
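A condensed sketch of the corrected cursor math (using simplified hit shapes, with the cursor reduced to a `(provider_id, cumulative_offset)` pair): the next cursor's offset must be cumulative — entry offset plus licenses consumed this page — so consecutive pages neither duplicate nor drop licenses:

```python
def paginate(hits, page_size, cursor=None):
    # cursor is (resume_provider_id, cumulative_license_offset) or None.
    resume_id, resume_offset = cursor if cursor else (None, 0)
    results = []
    for hit in hits:
        provider_id = hit['provider_id']
        offset = resume_offset if provider_id == resume_id else 0
        consumed = 0
        for license_ in hit['licenses'][offset:]:
            if len(results) >= page_size:
                # Cumulative offset: what we skipped on entry plus what
                # this page consumed from the same provider.
                return results, (provider_id, offset + consumed)
            results.append(license_)
            consumed += 1
    return results, None  # no more pages
```

If the cursor stored only the current page's consumption (as the review says the code does), the second page for a provider spanning a boundary would re-serve licenses the first page already returned.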
- Around line 167-172: The code currently only checks presence of resume keys in
decoded but not their types; add explicit type validation after computing
has_resume: if has_resume ensure decoded['license_offset'] is an int (or
convertible to int) and decoded['resume_provider_sort'] is a list (and
optionally its items are of expected type), and if decoded.get('search_after')
is present ensure it is the expected type (e.g., list); for any type mismatch
raise CCInvalidRequestException('Invalid lastKey') so malformed lastKey values
return 400 instead of later causing 500; apply these checks where decoded is
validated (the block that sets has_resume and returns decoded in
public_search.py).
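A sketch of the suggested type checks, with field names taken from the review prompt; the real decoder raises `CCInvalidRequestException`, substituted here with `ValueError` so the example stays self-contained:

```python
def validate_decoded_cursor(decoded):
    # Reject malformed lastKey payloads up front so they surface as a
    # 400-style 'Invalid lastKey' error rather than a later 500.
    if not isinstance(decoded, dict):
        raise ValueError('Invalid lastKey')
    has_resume = 'license_offset' in decoded or 'resume_provider_sort' in decoded
    if has_resume:
        offset = decoded.get('license_offset')
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(offset, int) or isinstance(offset, bool):
            raise ValueError('Invalid lastKey')
        if not isinstance(decoded.get('resume_provider_sort'), list):
            raise ValueError('Invalid lastKey')
    if 'search_after' in decoded and not isinstance(decoded['search_after'], list):
        raise ValueError('Invalid lastKey')
    return decoded
```

Running the validator right after base64/JSON decoding keeps all cursor-shape failures on the same 400 path.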
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- backend/cosmetology-app/lambdas/python/common/cc_common/data_model/schema/provider/api.py
- backend/cosmetology-app/lambdas/python/search/handlers/public_search.py
- backend/cosmetology-app/lambdas/python/search/tests/function/test_public_search_providers.py
- backend/cosmetology-app/stacks/api_lambda_stack/public_lookup_api.py
- backend/cosmetology-app/stacks/api_stack/v1_api/api_model.py
```python
skip = resume_offset if (resume_provider_id == provider_id) else 0
if resume_provider_id == provider_id:
    resume_provider_id = None
    resume_offset = 0
consumed_from_this_provider = 0
for inner in inner_hits:
    if skip > 0:
        skip -= 1
        continue
    if len(providers) >= page_size:
        next_cursor_resume_provider_id = provider_id
        next_cursor_resume_provider_sort = hit.get('sort')
        next_cursor_license_offset = consumed_from_this_provider
        last_sort = hit.get('sort')
        next_cursor_search_after = prev_sort
        break
```
Cursor progress tracking can duplicate pages or loop.
next_cursor_license_offset stores only current-page consumption (Line 88) instead of cumulative provider offset, and next_cursor_search_after uses a stale provider sort (Line 90). This breaks resume correctness when a provider spans multiple pages.
🛠️ Proposed fix
```diff
-    for hit in hits:
+    for hit in hits:
+        prev_sort = last_sort
+        current_sort = hit.get('sort')
         source = hit.get('_source', {})
         provider_id = source.get('providerId')
@@
-        skip = resume_offset if (resume_provider_id == provider_id) else 0
+        base_offset = resume_offset if (resume_provider_id == provider_id) else 0
+        skip = base_offset
         if resume_provider_id == provider_id:
             resume_provider_id = None
             resume_offset = 0
@@
             if len(providers) >= page_size:
                 next_cursor_resume_provider_id = provider_id
-                next_cursor_resume_provider_sort = hit.get('sort')
-                next_cursor_license_offset = consumed_from_this_provider
-                last_sort = hit.get('sort')
+                next_cursor_resume_provider_sort = current_sort
+                next_cursor_license_offset = base_offset + consumed_from_this_provider
+                last_sort = current_sort
                 next_cursor_search_after = prev_sort
                 break
@@
-        prev_sort = last_sort
-        last_sort = hit.get('sort')
+        last_sort = current_sort
```

Also applies to: 106-110
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/cosmetology-app/lambdas/python/search/handlers/public_search.py`
around lines 76 - 91, The cursor math is incorrect: when building the next
cursor inside the inner_hits loop update next_cursor_license_offset to reflect
the cumulative offset for the provider (e.g., resume_offset +
consumed_from_this_provider or consumed_from_this_provider if no resume_offset)
instead of only current-page consumption, and set next_cursor_search_after to
the current provider's sort value (use last_sort or hit.get('sort')) rather than
prev_sort so the resume uses the latest sort key; apply the same fixes to the
corresponding block around the other occurrence (the block at lines ~106-110)
and ensure consumed_from_this_provider is incremented consistently before
computing the next cursor.
```python
has_resume = 'resume_provider_id' in decoded and 'license_offset' in decoded
if not has_resume and not decoded.get('search_after'):
    raise CCInvalidRequestException('Invalid lastKey')
if has_resume and 'resume_provider_sort' not in decoded:
    raise CCInvalidRequestException('Invalid lastKey')
return decoded
```
Validate decoded lastKey field types before using them.
Current checks only verify presence of resume keys. Malformed types (e.g., non-int license_offset, non-list resume_provider_sort) can pass decode and fail later in paging logic, returning a 500 instead of a 400.
🛡️ Proposed fix
```diff
     has_resume = 'resume_provider_id' in decoded and 'license_offset' in decoded
     if not has_resume and not decoded.get('search_after'):
         raise CCInvalidRequestException('Invalid lastKey')
     if has_resume and 'resume_provider_sort' not in decoded:
         raise CCInvalidRequestException('Invalid lastKey')
+    if has_resume:
+        if not isinstance(decoded.get('resume_provider_id'), str):
+            raise CCInvalidRequestException('Invalid lastKey')
+        if not isinstance(decoded.get('resume_provider_sort'), list):
+            raise CCInvalidRequestException('Invalid lastKey')
+        if not isinstance(decoded.get('license_offset'), int) or decoded['license_offset'] < 0:
+            raise CCInvalidRequestException('Invalid lastKey')
+    if decoded.get('search_after') is not None and not isinstance(decoded.get('search_after'), list):
+        raise CCInvalidRequestException('Invalid lastKey')
     return decoded
```

🤖 Prompt for AI Agents
return decoded🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/cosmetology-app/lambdas/python/search/handlers/public_search.py`
around lines 167 - 172, The code currently only checks presence of resume keys
in decoded but not their types; add explicit type validation after computing
has_resume: if has_resume ensure decoded['license_offset'] is an int (or
convertible to int) and decoded['resume_provider_sort'] is a list (and
optionally its items are of expected type), and if decoded.get('search_after')
is present ensure it is the expected type (e.g., list); for any type mismatch
raise CCInvalidRequestException('Invalid lastKey') so malformed lastKey values
return 400 instead of later causing 500; apply these checks where decoded is
validated (the block that sets has_resume and returns decoded in
public_search.py).
Cosmetology has slightly different requirements for public search, given the expected volume of practitioners with similar names: we need to support additional search parameters, such as license number. To accommodate these patterns, we decided to leverage the OpenSearch Domain for public search, while keeping the request schema similar in shape to the public search request body used by JCC.
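The kind of query body this produces can be sketched as follows. This is an illustrative Python sketch, not the project's actual implementation: the helper name and the field names (`licenses.licenseNumber`, `licenses.jurisdiction`, `familyName`, `providerId`) are assumptions standing in for whatever the real OpenSearch mapping uses. The key idea is a nested query with `inner_hits`, so the matching license records come back alongside each provider document:

```python
def build_public_license_search_body(jurisdiction=None, family_name=None,
                                     license_number=None, page_size=10,
                                     search_after=None):
    """Assemble an OpenSearch query body filtering on nested license fields.

    Field names here are illustrative, not the project's actual mapping.
    """
    nested_filters = []
    if jurisdiction:
        nested_filters.append({'term': {'licenses.jurisdiction': jurisdiction}})
    if license_number:
        nested_filters.append({'term': {'licenses.licenseNumber': license_number}})

    body = {
        'size': page_size,
        'sort': [{'providerId': 'asc'}],
        'query': {
            'bool': {
                'filter': [
                    {
                        'nested': {
                            'path': 'licenses',
                            'query': {'bool': {'filter': nested_filters}},
                            # inner_hits returns the matching license records
                            # themselves, not just the parent provider document
                            'inner_hits': {'size': 100},
                        }
                    }
                ]
            }
        },
    }
    if family_name:
        body['query']['bool']['filter'].append(
            {'term': {'familyName': family_name.lower()}}
        )
    if search_after:
        # resume deep pagination from the previous page's last sort values
        body['search_after'] = search_after
    return body
```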
This requires us to support custom pagination tracking on the backend to ensure that all matching license records from a query are accounted for when paging through large numbers of matches.
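One way to implement an opaque cursor like this is to base64-encode a small JSON blob carrying both the OpenSearch `search_after` sort values and the mid-provider resume state. The sketch below is hypothetical, not the PR's code; the field names (`resume_provider_id`, `resume_provider_sort`, `license_offset`, `search_after`) mirror the keys discussed in the review comments above:

```python
import base64
import json


def encode_public_cursor(search_after, resume_provider_id=None,
                         resume_provider_sort=None, license_offset=0):
    """Pack pagination state into an opaque, URL-safe lastKey token."""
    cursor = {'search_after': search_after}
    if resume_provider_id is not None:
        # A provider whose licenses span a page boundary records how many
        # of its licenses were already returned, so the next page can skip
        # exactly that many before emitting more.
        cursor.update(
            resume_provider_id=resume_provider_id,
            resume_provider_sort=resume_provider_sort,
            license_offset=license_offset,
        )
    return base64.urlsafe_b64encode(json.dumps(cursor).encode()).decode()


def decode_public_cursor(last_key):
    """Decode a lastKey token, rejecting anything malformed with ValueError."""
    try:
        decoded = json.loads(base64.urlsafe_b64decode(last_key.encode()))
    except (ValueError, TypeError) as exc:
        raise ValueError('Invalid lastKey') from exc
    if not isinstance(decoded, dict):
        raise ValueError('Invalid lastKey')
    return decoded
```

Because the token is opaque to clients, the resume fields can change shape later without breaking the public API contract.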
This PR also includes the change to remove the financial summary related compact configuration fields, which were missed in the previous PR to remove all unneeded compact config fields.
Testing List
- `yarn test:unit:all` should run without errors or warnings
- `yarn serve` should run without errors or warnings
- `yarn build` should run without errors or warnings
- `backend/compact-connect/tests/unit/test_api.py`
- run `compact-connect/bin/download_oas30.py`

Closes #1295
Summary by CodeRabbit
New Features
Bug Fixes
Infrastructure
Tests & Docs