-
Notifications
You must be signed in to change notification settings - Fork 3.7k
branch-4.0: [fix](load_stream) Fix flush token deadlock by ensuring wait_for_flush_tasks is called before destruction #60284 #60285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
branch-4.0: [fix](load_stream) Fix flush token deadlock by ensuring wait_for_flush_tasks is called before destruction #60284 #60285
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes a deadlock issue in load_stream by ensuring flush tasks are properly waited for before TabletStream destruction. The fix reverts the use of shared_from_this() and instead relies on explicit lifetime management through the IndexStream destructor.
Changes:
- Removed
std::enable_shared_from_thisinheritance from TabletStream and allshared_from_this()usages - Added
wait_for_flush_tasks()method to TabletStream to safely wait for all flush tasks before destruction - Added IndexStream destructor to ensure all TabletStreams wait for flush tasks before cleanup
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| be/src/runtime/load_stream.h | Removed shared_from_this inheritance, added wait_for_flush_tasks() declaration, added IndexStream destructor, added _flush_tasks_done flag |
| be/src/runtime/load_stream.cpp | Replaced shared_from_this with this in lambdas, implemented wait_for_flush_tasks() and IndexStream destructor, refactored pre_close() to use wait_for_flush_tasks() |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
1529b65 to
ed0add4
Compare
|
run buildall |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
ed0add4 to
3e507a7
Compare
…h_tasks is called before destruction Problem: When TabletStream is destroyed without pre_close() being called (e.g., on_idle_timeout scenario), the _flush_token destructor calls shutdown() which triggers deadlock detection if called from the pool thread. Root cause: - on_idle_timeout() directly calls brpc::StreamClose() without calling LoadStream::close() - This triggers the destruction chain without calling pre_close() on TabletStreams - If flush tasks are still running, TabletStream may be destroyed in pool thread Solution: - Add IndexStream::~IndexStream() to ensure wait_for_flush_tasks() is called on all TabletStreams - Add TabletStream::wait_for_flush_tasks() to wait for all flush tasks to complete - This ensures _flush_token is properly handled before TabletStream destruction - Revert commit c18ef17 (shared_from_this) as it is no longer needed
3e507a7 to
2a1ef37
Compare
|
run buildall |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
|
skip buildall |
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
pick from:#60284