routing: fence payment lifecycles by sequence number#10772
routing: fence payment lifecycles by sequence number#10772yyforyongyu wants to merge 2 commits intolightningnetwork:masterfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request improves the robustness of the payment routing system by serializing payment lifecycles based on their payment hash. By enforcing that only one lifecycle is active per hash, it prevents race conditions during payment retries and ensures that state transitions are handled correctly until the entire process has finished. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
🟠 PR Severity: HIGH
🟠 High (2 files)
AnalysisBoth modified files belong to HIGH-severity packages. Severity bump check:
A knowledgeable engineer familiar with the routing subsystem and the RPC layer should review this PR. To override, add a |
There was a problem hiding this comment.
Code Review
This pull request introduces a payment lifecycle mutex to ensure serialized execution for a given payment hash, preventing concurrent lifecycles. Changes include updating PreparePayment to return a release function and integrating this lock into synchronous, asynchronous, and resumed payment paths. Feedback suggests extending this locking mechanism to SendToRoute and SendToRouteSkipTempErr to prevent potential race conditions and recommends using defer for the release function in SendPaymentV2 to ensure safety during server shutdown or context cancellation.
| ) | ||
| } | ||
|
|
||
| releaseLifecycle := r.lockPaymentLifecycle(payment.Identifier()) |
There was a problem hiding this comment.
The lockPaymentLifecycle call is correctly placed before InitPayment to ensure serialization. However, the SendToRoute and SendToRouteSkipTempErr methods (which call the internal sendToRoute) also perform payment initialization and attempt registration but do not appear to use this new locking mechanism. This could lead to race conditions when multiple attempts are registered for the same payment hash concurrently via these entry points, violating the serialization requirement mentioned in the ControlTower interface documentation.
References
- The payment router/controller layer is responsible for ensuring serialized access per payment hash for calls to RegisterAttempt.
| // miss events. | ||
| sub, err := s.subscribePayment(payHash) | ||
| if err != nil { | ||
| releaseLifecycle() |
There was a problem hiding this comment.
When subscribePayment fails, releaseLifecycle() is called to release the mutex. This is correct. However, if the server is shutting down or the context is canceled immediately after, there's a small window where the lock might be held longer than necessary if other cleanup logic is triggered. It might be safer to use a defer for the release function once it's successfully obtained from PreparePayment, provided the logic is refactored to handle the async hand-off correctly.
6dba2fa to
3b96ebc
Compare
3b96ebc to
68b4e6a
Compare
68b4e6a to
9a8b5fe
Compare
Summary
Testing
GOWORK=off go test ./payments/db ./routing ./lnrpc/routerrpcmake lint