@janniklinde
Contributor

This PR introduces pipelining support for operators that produce at most one MatrixBlock per incoming block. The current out-of-core primitives are highly concurrent, which can lead to fan-out (and, consequently, OOMs) when source streams produce blocks quickly; this implementation therefore aims to process downstream operations in the same thread. To reliably release objects referenced from the caller's stack frames, we defer the downstream task using a thread-local context (if available). Because the deferred call still executes within the caller's task, a sequence of pipelined operators completes before new tasks are taken from the task queue.
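The deferral idea described above can be illustrated with a minimal sketch. This is not the code from this PR: `DeferredPipeline`, `runTask`, and `emit` are hypothetical names chosen for illustration, and the sketch assumes that a task installs a thread-local queue which is drained only after the calling frame has returned.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names are taken from the PR.
final class DeferredPipeline {
    // Per-thread queue of deferred downstream calls; null when no task context is active.
    private static final ThreadLocal<ArrayDeque<Runnable>> CONTEXT = new ThreadLocal<>();

    /** Runs a task and drains everything it deferred before returning. */
    static void runTask(Runnable task) {
        ArrayDeque<Runnable> queue = new ArrayDeque<>();
        CONTEXT.set(queue);
        try {
            task.run();
            // Drain only after the task's own frame has returned, so its locals
            // are no longer reachable from the stack while downstream work runs.
            Runnable next;
            while ((next = queue.poll()) != null)
                next.run();
        } finally {
            CONTEXT.remove();
        }
    }

    /** Defers the downstream call if a context is available; otherwise runs it directly. */
    static void emit(Runnable downstream) {
        ArrayDeque<Runnable> queue = CONTEXT.get();
        if (queue != null)
            queue.add(downstream);
        else
            downstream.run();
    }

    /** Small demonstration: a source that feeds two chained operators. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        runTask(() -> {
            emit(() -> {
                log.add("op1");
                emit(() -> log.add("op2"));
            });
            log.add("source returned");
        });
        return log;
    }
}
```

In this sketch `demo()` records the source returning first, followed by both chained operators, all inside one `runTask` call, so the chain finishes before the thread could pick up another task.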
Additionally, this PR adds new primitives required for general matrix multiplies and adds new operators relying on these new primitives.
We also added optional messaging capabilities to operators and streams, which may be needed in the future to communicate stream capabilities (e.g., targeted requests of tiles, cached, ...). It is still uncertain to what degree these messages will be used, but they are required for future experiments.
As streams sometimes need size information, they may now hold the corresponding CacheableData object. I am not sure whether these references can cause memory-management issues.
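One way to keep such messaging optional is a default no-op handler, sketched below. All names here (`StreamMessagingSketch`, `MessageAware`, `TileRequest`, and so on) are hypothetical and are not the interfaces added by this PR; the sketch only assumes that a stream may choose to react to a targeted tile request.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; these types are illustrative and not part of the PR.
final class StreamMessagingSketch {
    /** Marker type for messages exchanged between operators and streams. */
    interface StreamMessage {}

    /** Assumed example message: a targeted request for one tile by block index. */
    static final class TileRequest implements StreamMessage {
        final long rowBlock;
        final long colBlock;
        TileRequest(long rowBlock, long colBlock) {
            this.rowBlock = rowBlock;
            this.colBlock = colBlock;
        }
    }

    interface MessageAware {
        /** Returns true if the message was handled; the default ignores it. */
        default boolean onMessage(StreamMessage msg) {
            return false;
        }
    }

    /** A stream that does not opt in to messaging. */
    static final class PlainStream implements MessageAware {}

    /** A stream that records targeted tile requests. */
    static final class TileStream implements MessageAware {
        final List<TileRequest> requested = new ArrayList<>();

        @Override
        public boolean onMessage(StreamMessage msg) {
            if (msg instanceof TileRequest) {
                requested.add((TileRequest) msg);
                return true;
            }
            return false;
        }
    }
}
```

With a default handler, existing streams need no changes, and a sender can tell from the return value whether the receiver supports a given message.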
Finally, this PR contains various bugfixes, safety checks, improved error handling and additional cache features.
Sorry (again) for the large PR.

Better Error Handling
Streams hold underlying CacheableData<?> ?
Generalized primitives for many to many joins
Primitive for (multi-)aggregations
Various bugfixes
Added cache features to merge/prioritize deferred requests
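As an illustration of what merging and prioritizing deferred requests could look like, here is a minimal sketch. It is not the implementation in this PR: `DeferredRequestSketch` and its methods are hypothetical, and it assumes that requests for the same block are merged into one entry that keeps the highest priority seen and all callbacks.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the cache code from this PR.
final class DeferredRequestSketch {
    static final class Request {
        final long blockId;
        int priority;
        final List<Runnable> callbacks = new ArrayList<>();

        Request(long blockId, int priority) {
            this.blockId = blockId;
            this.priority = priority;
        }
    }

    private final Map<Long, Request> pending = new HashMap<>();

    /** Adds a request; a duplicate block id is merged and keeps the higher priority. */
    void submit(long blockId, int priority, Runnable callback) {
        Request r = pending.get(blockId);
        if (r == null) {
            r = new Request(blockId, priority);
            pending.put(blockId, r);
        } else {
            r.priority = Math.max(r.priority, priority);
        }
        r.callbacks.add(callback);
    }

    /** Removes and returns the highest-priority pending request, or null if empty. */
    Request poll() {
        Request best = null;
        for (Request r : pending.values())
            if (best == null || r.priority > best.priority)
                best = r;
        if (best != null)
            pending.remove(best.blockId);
        return best;
    }

    int size() {
        return pending.size();
    }
}
```

The linear scan in `poll` keeps the sketch short; a real queue would likely use an ordered structure, and ties between equal priorities are resolved arbitrarily here.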
@codecov

codecov bot commented Jan 26, 2026

Codecov Report

❌ Patch coverage is 56.74677% with 702 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.61%. Comparing base (b394e32) to head (f92dc69).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| .../sysds/runtime/ooc/cache/OOCLRUCacheScheduler.java | 57.29% | 53 Missing and 26 partials ⚠️ |
| ...sysds/runtime/instructions/ooc/OOCInstruction.java | 79.89% | 47 Missing and 30 partials ⚠️ |
| ...he/sysds/runtime/ooc/cache/OOCMatrixIOHandler.java | 47.58% | 61 Missing and 15 partials ⚠️ |
| ...ime/instructions/ooc/MapMMChainOOCInstruction.java | 53.98% | 61 Missing and 14 partials ⚠️ |
| ...che/sysds/runtime/ooc/cache/DeferredReadQueue.java | 42.10% | 58 Missing and 8 partials ⚠️ |
| .../sysds/runtime/instructions/ooc/CachingStream.java | 44.54% | 44 Missing and 17 partials ⚠️ |
| ...untime/instructions/ooc/SubscribableTaskQueue.java | 49.41% | 36 Missing and 7 partials ⚠️ |
| ...he/sysds/runtime/ooc/stream/FilteredOOCStream.java | 21.73% | 34 Missing and 2 partials ⚠️ |
| ...sysds/runtime/instructions/ooc/PlaybackStream.java | 27.50% | 26 Missing and 3 partials ⚠️ |
| ...instructions/ooc/MatrixIndexingOOCInstruction.java | 56.86% | 18 Missing and 4 partials ⚠️ |

... and 18 more
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2409      +/-   ##
============================================
+ Coverage     71.51%   71.61%   +0.10%     
- Complexity    47441    47721     +280     
============================================
  Files          1539     1547       +8     
  Lines        182605   183747    +1142     
  Branches      35916    36079     +163     
============================================
+ Hits         130585   131587    +1002     
- Misses        42028    42038      +10     
- Partials       9992    10122     +130     

