fix: update plg onboarding existing delivery type#2526
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
|
This PR will trigger a patch release when merged. |
There was a problem hiding this comment.
Hey @anshulk-public,
Strengths
src/controllers/plg/plg-onboarding.js:1118-1135: Step 5a placement between site resolution and canonical URL resolution is architecturally correct - detecting and correcting delivery type before downstream steps depend on it prevents cascading stale-config propagation.src/controllers/plg/plg-onboarding.js:1120-1131: The guard conditions are solid - skips when a preset is provided, ignores OTHER detection results, only clears config when an existing type was present, and checks for inequality before mutating.src/controllers/plg/plg-onboarding.js:1554-1556: Frescopa rejection is placed early in the validation chain (after domain format validation, before any DB work), which is the right ordering for short-circuit rejections. The case-insensitiveincludes()check prevents bypass via mixed-case input.test/controllers/plg/plg-onboarding.test.js:921-933: Frescopa tests cover both bare domain and subdomain cases with assertions on both status code and response message.test/controllers/plg/plg-onboarding.test.js:3373-3389: The delivery type correction test asserts on the state mutation (setDeliveryType) and the log message pattern, verifying observable behavior.
Issues
Important (Should Fix)
1. Config clearing without logging the cleared values - src/controllers/plg/plg-onboarding.js:1129-1130
When deliveryConfig and hlxConfig are set to null, the log message at line 1133 only records the delivery type transition, not what configuration was destroyed. This is a one-way door: once cleared, the old config values are unrecoverable without a database backup. If findDeliveryType produces an incorrect result (even once, even for one site), the site's accumulated configuration is permanently lost.
This aligns with @habansal's comment on line 1129. From an incident-response perspective, if this causes unexpected behavior in production, operators have no way to audit what was cleared.
Fix: Log the previous config values before clearing:
if (existingDeliveryType) {
log.info(`Clearing stale config for site ${site.getId()}`, {
previousDeliveryConfig: site.getDeliveryConfig(),
previousHlxConfig: site.getHlxConfig(),
});
site.setDeliveryConfig(null);
site.setHlxConfig(null);
}2. No error handling around findDeliveryType in Step 5a - src/controllers/plg/plg-onboarding.js:1121
If findDeliveryType(baseURL) throws (network timeout, DNS failure, unexpected response shape), it bubbles up and fails the entire onboarding flow. This call is now unconditional for every PLG onboarding without a preset type, making the blast radius larger than before. A transient failure in delivery type detection should not block the entire onboarding - the site already has a type (or null for new sites), and detection is a correction mechanism, not a prerequisite.
Fix: Wrap in try/catch that logs and continues:
let detectedDeliveryType;
try {
detectedDeliveryType = await findDeliveryType(baseURL);
} catch (e) {
log.warn(`Failed to detect delivery type for ${baseURL}: ${e.message}`);
}3. Missing test assertion for config clearing - test/controllers/plg/plg-onboarding.test.js:3373-3389
The test 'calls findDeliveryType for existing site and updates delivery type when detected type differs' asserts that setDeliveryType was called with the new value but does not assert that setDeliveryConfig(null) and setHlxConfig(null) were also called. This is the highest-risk behavior in the PR (destructive config wipe on existing sites), and it is currently unverified by tests. A regression that removes the clearing logic would pass silently.
Fix: Add assertions:
expect(existingSite.setDeliveryConfig).to.have.been.calledWith(null);
expect(existingSite.setHlxConfig).to.have.been.calledWith(null);Minor (Nice to Have)
4. Test comment references wrong step - test/controllers/plg/plg-onboarding.test.js:3367
The comment says "once in Step 5b.5 to validate delivery type" but the source code labels the block as "Step 5a". This will confuse anyone tracing test assertions back to the implementation.
Recommendations
- Add a test for the no-op path: when
findDeliveryTypereturns the same value asexistingDeliveryType, assert thatsetDeliveryTypeis NOT called. This documents the contract and protects against regressions where the inequality condition is accidentally removed. - Add a test for the new-site path: when
existingDeliveryTypeis falsy andfindDeliveryTypereturns a valid type, the delivery type should be set butsetDeliveryConfig(null)andsetHlxConfig(null)should NOT be called (theif (existingDeliveryType)guard prevents it).
Assessment
Ready to merge? With fixes
The delivery type auto-correction is architecturally sound in intent and the frescopa rejection follows established patterns cleanly. The three Important items are about production resilience: the missing config-value logging creates a recoverability gap, the unhandled findDeliveryType rejection puts a correction mechanism on the critical path without a safety net, and the missing test assertion leaves the highest-risk behavior unverified. All three are straightforward to address.
Next Steps
- Add logging of cleared config values before nulling them (addresses both this review and @habansal's inline feedback).
- Wrap
findDeliveryTypein a try/catch so detection failures do not block onboarding. - Add test assertions for the config-clearing behavior.
Skill: pr-review | Model: us.anthropic.claude-opus-4-6-v1[1m] | Duration: 1m 44s | Cost: $2.92 | Commit: 3816f0ba812d348bcdab578de915373bda6ea94d
If this code review was useful, please react with 👍. Otherwise, react with 👎.
- Skip Step 5a for newly created sites (delivery type was just set in Step 5) - Wrap findDeliveryType in try/catch so network failures don't block onboarding - Log previous deliveryConfig and hlxConfig before clearing so operators can audit what was wiped - Remove redundant inner existingDeliveryType guard (covered by !siteCreated at Step 5a entry) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…boarding Step 5a Instead of silently correcting delivery type (which risks data loss due to findDeliveryType unreliability), post a Slack alert to SLACK_DELIVERY_TYPE_ALERT_CHANNEL_ID so operators can investigate and correct manually. Also adds frescopa domain rejection. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ery type mismatch alert After accepting the refactored PLG onboarding structure from main (monolith split into plg-onboarding/ submodules), re-applied two changes to their correct locations: - Frescopa domain rejection → plg-onboarding.js (main controller, after isValidDomain) - Step 5a delivery type mismatch Slack alert → onboarding-flow.js (performAsoPlgOnboarding) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add test verifying that when postSlackMessage throws during the Step 5a delivery type mismatch alert, the error is logged and onboarding still completes successfully (200). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Hey @anshulk-public,
Verdict: Comment - clean rework that addresses all prior concerns; a few minor improvements noted below.
Changes: Rejects frescopa domains early in PLG onboarding and adds a Slack alert (no site mutation) when detected delivery type differs from stored type (3 files).
Non-blocking (5): minor issues and suggestions
- nit:
src/controllers/plg/plg-onboarding/onboarding-flow.js:660- consider logging whenchannelIdortokenis falsy so operators can distinguish "no mismatch" from "mismatch detected but alert skipped due to missing config" - nit:
src/controllers/plg/plg-onboarding/onboarding-flow.js:647-findDeliveryType(baseURL)is called a second time for existing sites that already resolved it in Step 2; consider reusing the cached result to avoid the extra network round-trip - nit:
src/controllers/plg/plg-onboarding.js:295-domain.toLowerCase().includes('frescopa')matches any domain containing the substring (e.g.notfrescopa.com); consider matching on domain-segment boundaries if only*.frescopa.*domains are intended - suggestion:
src/controllers/plg/plg-onboarding/onboarding-flow.js:643- extend the Step 5a comment to document whypresetDeliveryTypealso skips the check (admin explicitly overrode delivery type) - suggestion: document
SLACK_DELIVERY_TYPE_ALERT_CHANNEL_IDin deployment config or default toSLACK_PLG_ONBOARDING_CHANNEL_IDso the feature is operational without a separate infra change
Previously flagged, now resolved
- Config clearing without logging cleared values - no longer applicable (alert-only approach, no mutation)
- Missing error handling on findDeliveryType - now wrapped in try/catch with log.warn
- Missing test assertion for config clearing - no longer applicable; tests now verify site is NOT mutated
Skill: pr-review | Model: us.anthropic.claude-opus-4-6-v1[1m] | Duration: 2m 0s | Cost: $3.93 | Commit: 1be012f379957a89fd72de24615afa1429723908
If this code review was useful, please react with 👍. Otherwise, react with 👎.
There was a problem hiding this comment.
Hey @anshulk-public,
Verdict: Approve - all prior blocking concerns addressed; alert-only approach is sound.
Changes: Rejects frescopa domains early in PLG onboarding and adds a Slack alert (no site mutation) when detected delivery type differs from stored type (3 files).
Note: CI build is currently pending - ensure it passes before merge. The non-blocking nits from the prior review (frescopa substring broadness, redundant findDeliveryType call for RUM-fail path) are still applicable if you want to address them in a follow-up.
Previously flagged, now resolved
- Config clearing without logging cleared values - no longer applicable (alert-only approach, no mutation)
- Missing error handling on findDeliveryType - now wrapped in try/catch with log.warn
- Missing test assertion for config clearing - no longer applicable; tests verify site is NOT mutated
Skill: pr-review | Model: us.anthropic.claude-opus-4-6-v1[1m] | Duration: 1m 28s | Cost: $3.35 | Commit: db7eb56f050e9fd4b58e1fe4e0afaa4dac5be3fe
If this code review was useful, please react with 👍. Otherwise, react with 👎.
…match alerts Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
## [1.568.1](v1.568.0...v1.568.1) (2026-06-11) ### Bug Fixes * update plg onboarding existing delivery type ([#2526](#2526)) ([b1b90ac](b1b90ac))
|
🎉 This PR is included in version 1.568.1 🎉 The release is available on GitHub release Your semantic-release bot 📦🚀 |
Summary
Reject frescopa domains: Any PLG onboarding request for a domain containing
frescopa(e.g.frescopa.com,shop.frescopa.com) is rejected early with a400response and a Slack notification posted toSLACK_PLG_ONBOARDING_CHANNEL_ID, consistent with howinternal-organdpaid-customerrejections are handled.Alert on delivery type mismatch (Step 5a): After a site is resolved or created,
findDeliveryTypeis called for existing sites (when no preset delivery type is provided) to detect the site's actual delivery type. If the detected type differs from what is stored and is notOTHER, a Slack alert is posted toSLACK_DELIVERY_TYPE_ALERT_CHANNEL_IDwith the site ID, domain, org, and both delivery type values. The site is not mutated — operators can investigate and correct manually sincefindDeliveryTypeis not 100% reliable.Jira:
Changes
src/controllers/plg/plg-onboarding.js'frescopa-domain'entry toPLG_REJECTION_MESSAGESonboard()afterisValidDomainvalidation, before the org lookupsrc/controllers/plg/plg-onboarding/onboarding-flow.jsperformAsoPlgOnboardingbetween site resolution (steps.siteResolved) and canonical URL resolution (Step 5b)findDeliveryTypein a try/catch so network failures do not block onboardingSLACK_DELIVERY_TYPE_ALERT_CHANNEL_IDincluding site ID, domain, org ID, org name/IMS org ID, stored type, and detected typeNew environment variable
SLACK_DELIVERY_TYPE_ALERT_CHANNEL_IDPlease ensure your pull request adheres to the following guidelines:
describe here the problem you're solving.
If the PR is changing the API specification:
yet. Ideally, return a 501 status code with a message explaining the feature is not implemented yet.
If the PR is changing the API implementation or an entity exposed through the API:
If the PR is introducing a new audit type:
Related Issues
Thanks for contributing!