Skip to content

Conversation

@KyleAMathews
Copy link
Collaborator

@KyleAMathews KyleAMathews commented Jan 22, 2026

Summary

Adds iteration safety limits to prevent app freezes from excessive iterations in ORDER BY + LIMIT queries. When limits are hit, warnings are logged with diagnostic info but queries continue normally with available data.

User impact: Apps no longer freeze. If iteration limits are hit, users get their data (possibly incomplete TopK) and a console warning with diagnostic info.

Root Cause

ORDER BY + LIMIT queries can enter excessive iteration cycles when:

  1. TopK operator requests limit items (e.g., 10)
  2. WHERE clause filters out most data (only 2 items match)
  3. TopK keeps requesting more data that doesn't exist locally
  4. Each iteration triggers more graph processing

The exact trigger conditions in production are still unclear - we observed debugger pausing in WHERE clause evaluation (and, gte cases) during requestLimitedSnapshot.

Approach

Graceful circuit breakers: Add iteration limits that break out of loops and continue with available data:

Location Limit Purpose
D2.run() 100,000 Prevents infinite dataflow cycles
maybeRunGraph 10,000 Prevents graph/loading feedback loops
requestLimitedSnapshot 10,000 Prevents index iteration issues

Key insight: By the time we hit the limit, we likely have all available data - we just couldn't fill the TopK completely because WHERE filtered out most items. So we log a warning and continue rather than erroring.

Iteration breakdown tracking: When limits are hit, warnings show WHERE the loop got stuck:

Iteration breakdown (where the loop spent time):
    1-5: operators with work = [Filter,TopK]
    6-100000: operators with work = [TopK]

This makes it clear that the loop got stuck requesting more data for TopK after filtering completed.

Additional diagnostic info logged:

- Live query ID
- Source collection IDs  
- Run count
- OrderBy optimization info (limit, offset, dataNeeded)
- Cursor position and iteration counts
- Which D2 operators have pending work

Non-goals

  • Root cause fix: We're shipping a graceful band-aid. The diagnostic info will help identify the actual cause when warnings occur in production.
  • Performance optimization: The limits are generous enough to never trigger for legitimate workloads.

Trade-offs

Option Pros Cons
Error state (rejected) Clear signal something is wrong Breaks app, requires refresh
Graceful recovery (chosen) App keeps working, data is shown Users might not notice warnings

Verification

# Run the safety limit tests
pnpm vitest run packages/db/tests/infinite-loop-prevention.test.ts

# Run NaN handling tests (related edge case exploration)
pnpm vitest run packages/db/tests/nan-comparator-infinite-loop.test.ts

Add iteration safeguards to prevent infinite loops that can occur
when using Electric with large datasets and ORDER BY/LIMIT queries:

1. `maybeRunGraph` while loop (collection-config-builder.ts):
   - Can loop infinitely when data loading triggers graph updates
   - Happens when WHERE filters out most data, causing dataNeeded() > 0
   - Loading more data triggers updates that get filtered out
   - Added 10,000 iteration limit with error logging

2. `requestLimitedSnapshot` while loop (subscription.ts):
   - Can loop if index iteration has issues
   - Added 10,000 iteration limit with error logging
   - Removed unused `insertedKeys` tracking

3. `D2.run()` while loop (d2.ts):
   - Can loop infinitely on circular data flow bugs
   - Added 100,000 iteration limit with error logging

The safeguards log errors to help debug the root cause while
preventing the app from freezing.
@changeset-bot
Copy link

changeset-bot bot commented Jan 22, 2026

🦋 Changeset detected

Latest commit: 32bd8e8

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 13 packages
Name Type
@tanstack/db Patch
@tanstack/db-ivm Patch
@tanstack/angular-db Patch
@tanstack/electric-db-collection Patch
@tanstack/offline-transactions Patch
@tanstack/powersync-db-collection Patch
@tanstack/query-db-collection Patch
@tanstack/react-db Patch
@tanstack/rxdb-db-collection Patch
@tanstack/solid-db Patch
@tanstack/svelte-db Patch
@tanstack/trailbase-db-collection Patch
@tanstack/vue-db Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@pkg-pr-new
Copy link

pkg-pr-new bot commented Jan 22, 2026

More templates

@tanstack/angular-db

npm i https://pkg.pr.new/@tanstack/angular-db@1173

@tanstack/db

npm i https://pkg.pr.new/@tanstack/db@1173

@tanstack/db-ivm

npm i https://pkg.pr.new/@tanstack/db-ivm@1173

@tanstack/electric-db-collection

npm i https://pkg.pr.new/@tanstack/electric-db-collection@1173

@tanstack/offline-transactions

npm i https://pkg.pr.new/@tanstack/offline-transactions@1173

@tanstack/powersync-db-collection

npm i https://pkg.pr.new/@tanstack/powersync-db-collection@1173

@tanstack/query-db-collection

npm i https://pkg.pr.new/@tanstack/query-db-collection@1173

@tanstack/react-db

npm i https://pkg.pr.new/@tanstack/react-db@1173

@tanstack/rxdb-db-collection

npm i https://pkg.pr.new/@tanstack/rxdb-db-collection@1173

@tanstack/solid-db

npm i https://pkg.pr.new/@tanstack/solid-db@1173

@tanstack/svelte-db

npm i https://pkg.pr.new/@tanstack/svelte-db@1173

@tanstack/trailbase-db-collection

npm i https://pkg.pr.new/@tanstack/trailbase-db-collection@1173

@tanstack/vue-db

npm i https://pkg.pr.new/@tanstack/vue-db@1173

commit: 32bd8e8

@github-actions
Copy link
Contributor

github-actions bot commented Jan 22, 2026

Size Change: +491 B (+0.54%)

Total Size: 91.4 kB

Filename Size Change
./packages/db/dist/esm/collection/subscription.js 3.9 kB +277 B (+7.65%) 🔍
./packages/db/dist/esm/query/live/collection-config-builder.js 5.65 kB +235 B (+4.34%)
./packages/db/dist/esm/query/live/collection-subscriber.js 1.91 kB -21 B (-1.09%)
ℹ️ View Unchanged
Filename Size
./packages/db/dist/esm/collection/change-events.js 1.39 kB
./packages/db/dist/esm/collection/changes.js 1.19 kB
./packages/db/dist/esm/collection/events.js 388 B
./packages/db/dist/esm/collection/index.js 3.32 kB
./packages/db/dist/esm/collection/indexes.js 1.1 kB
./packages/db/dist/esm/collection/lifecycle.js 1.68 kB
./packages/db/dist/esm/collection/mutations.js 2.34 kB
./packages/db/dist/esm/collection/state.js 3.49 kB
./packages/db/dist/esm/collection/sync.js 2.41 kB
./packages/db/dist/esm/deferred.js 207 B
./packages/db/dist/esm/errors.js 4.7 kB
./packages/db/dist/esm/event-emitter.js 748 B
./packages/db/dist/esm/index.js 2.69 kB
./packages/db/dist/esm/indexes/auto-index.js 742 B
./packages/db/dist/esm/indexes/base-index.js 766 B
./packages/db/dist/esm/indexes/btree-index.js 1.93 kB
./packages/db/dist/esm/indexes/lazy-index.js 1.1 kB
./packages/db/dist/esm/indexes/reverse-index.js 513 B
./packages/db/dist/esm/local-only.js 837 B
./packages/db/dist/esm/local-storage.js 2.1 kB
./packages/db/dist/esm/optimistic-action.js 359 B
./packages/db/dist/esm/paced-mutations.js 496 B
./packages/db/dist/esm/proxy.js 3.75 kB
./packages/db/dist/esm/query/builder/functions.js 733 B
./packages/db/dist/esm/query/builder/index.js 4.08 kB
./packages/db/dist/esm/query/builder/ref-proxy.js 1.05 kB
./packages/db/dist/esm/query/compiler/evaluators.js 1.42 kB
./packages/db/dist/esm/query/compiler/expressions.js 430 B
./packages/db/dist/esm/query/compiler/group-by.js 1.81 kB
./packages/db/dist/esm/query/compiler/index.js 2.02 kB
./packages/db/dist/esm/query/compiler/joins.js 2.07 kB
./packages/db/dist/esm/query/compiler/order-by.js 1.45 kB
./packages/db/dist/esm/query/compiler/select.js 1.06 kB
./packages/db/dist/esm/query/expression-helpers.js 1.43 kB
./packages/db/dist/esm/query/ir.js 673 B
./packages/db/dist/esm/query/live-query-collection.js 360 B
./packages/db/dist/esm/query/live/collection-registry.js 264 B
./packages/db/dist/esm/query/live/internal.js 145 B
./packages/db/dist/esm/query/optimizer.js 2.56 kB
./packages/db/dist/esm/query/predicate-utils.js 2.97 kB
./packages/db/dist/esm/query/subset-dedupe.js 921 B
./packages/db/dist/esm/scheduler.js 1.3 kB
./packages/db/dist/esm/SortedMap.js 1.3 kB
./packages/db/dist/esm/strategies/debounceStrategy.js 247 B
./packages/db/dist/esm/strategies/queueStrategy.js 428 B
./packages/db/dist/esm/strategies/throttleStrategy.js 246 B
./packages/db/dist/esm/transactions.js 2.9 kB
./packages/db/dist/esm/utils.js 924 B
./packages/db/dist/esm/utils/browser-polyfills.js 304 B
./packages/db/dist/esm/utils/btree.js 5.61 kB
./packages/db/dist/esm/utils/comparison.js 852 B
./packages/db/dist/esm/utils/cursor.js 457 B
./packages/db/dist/esm/utils/index-optimization.js 1.51 kB
./packages/db/dist/esm/utils/type-guards.js 157 B

compressed-size-action::db-package-size

@github-actions
Copy link
Contributor

github-actions bot commented Jan 22, 2026

Size Change: 0 B

Total Size: 3.7 kB

ℹ️ View Unchanged
Filename Size
./packages/react-db/dist/esm/index.js 225 B
./packages/react-db/dist/esm/useLiveInfiniteQuery.js 1.17 kB
./packages/react-db/dist/esm/useLiveQuery.js 1.34 kB
./packages/react-db/dist/esm/useLiveSuspenseQuery.js 559 B
./packages/react-db/dist/esm/usePacedMutations.js 401 B

compressed-size-action::react-db-package-size

The infinite loop occurred because `loadMoreIfNeeded` kept trying to load
data even when the local index was exhausted. This happened when:
1. TopK had fewer items than limit (WHERE filtered out data)
2. loadMoreIfNeeded tried to load more → no local data found
3. Loop continued indefinitely since TopK still needed data

Root cause fix:
- Add `localIndexExhausted` flag to CollectionSubscriber
- Track when local index has no more data for current cursor
- Stop calling loadMoreIfNeeded when exhausted
- Reset flag when new data arrives from sync layer (inserts)
- requestLimitedSnapshot now returns boolean indicating if data was found

Error handling improvements (per review feedback):
- D2.run() now throws Error when iteration limit exceeded
- Caller catches and calls transitionToError() for proper error state
- requestLimitedSnapshot returns false when iteration limit hit
- Live query properly shows error state if safeguard limits are hit

This fixes the issue for both eager and progressive syncMode.
@KyleAMathews KyleAMathews force-pushed the claude/debug-electric-infinite-loop-QJD2Z branch from f47fbd0 to 2a796ff Compare January 22, 2026 17:07
These tests verify the localIndexExhausted fix works correctly:

1. Does not infinite loop when WHERE filters out most data
   - Query wants 10 items, only 2 match
   - Verifies status !== 'error' (fix works, not just safeguard)

2. Resumes loading when new matching data arrives
   - Starts with 0 matching items
   - Insert new matching items
   - localIndexExhausted resets, loads new data

3. Handles updates that move items out of WHERE clause
   - Updates change values to no longer match WHERE
   - TopK correctly refills from remaining matching data
@KyleAMathews KyleAMathews force-pushed the claude/debug-electric-infinite-loop-QJD2Z branch from 453e7c7 to e698eb1 Compare January 22, 2026 17:36
KyleAMathews and others added 2 commits January 22, 2026 10:52
Adds a new test that directly reproduces the infinite loop bug by creating
a collection with a custom loadSubset that synchronously injects updates
matching the WHERE clause. This simulates Electric's behavior of sending
continuous updates during graph execution.

The test verifies:
- Without the localIndexExhausted fix, loadSubset is called 100+ times (infinite loop)
- With the fix, loadSubset is called < 10 times (loop terminates correctly)

Also adds additional tests for edge cases around the localIndexExhausted flag.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@KyleAMathews KyleAMathews changed the title Add iteration limits to prevent infinite loops in graph execution Fix infinite loop in ORDER BY + LIMIT queries with Electric sync Jan 22, 2026
KyleAMathews and others added 7 commits January 22, 2026 11:01
The removed test synchronously injected data in loadSubset, which artificially
creates an infinite loop. Electric's loadSubset is async (uses await stream.fetchSnapshot),
so it can't synchronously inject data during the maybeRunGraph loop.

The remaining tests verify the localIndexExhausted flag's behavior correctly:
- Prevents repeated load attempts when exhausted
- Resets when new inserts arrive

Co-Authored-By: Claude Opus 4.5 <[email protected]>
The localIndexExhausted flag was incorrectly resetting when updates arrived
because splitUpdates converts updates to delete+insert pairs for D2, and the
hasInserts check was seeing those fake inserts.

Fix: Check for ORIGINAL inserts before calling splitUpdates, and pass that
information to sendChangesToPipelineWithTracking. Now the flag only resets
for genuine new inserts from the sync layer.

Also adds a test that verifies requestLimitedSnapshot calls are limited after
the index is exhausted - with the fix, only 0-4 calls happen after initial load
even when 20 updates arrive. Without the fix, it would be ~19 calls.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Documents the reasoning for external reviewers:
1. splitUpdates fake inserts don't reset the flag
2. Updates to existing rows don't add new rows to scan
3. Edge case "update makes row match WHERE" is handled by
   filterAndFlipChanges converting unseen key updates to inserts

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Remove the localIndexExhausted optimization and keep only the safety
limits with enhanced diagnostic error messages. This approach:

1. Prevents app freezes by capping iterations
2. Provides detailed diagnostic info when limits are hit:
   - Collection IDs and query structure
   - TopK size vs data needed
   - Cursor position and iteration counts
   - Which D2 operators have pending work

The diagnostic info will help identify the actual root cause when
these errors occur in production.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@KyleAMathews KyleAMathews changed the title Fix infinite loop in ORDER BY + LIMIT queries with Electric sync Add safety limits and diagnostics for ORDER BY + LIMIT infinite loops Jan 26, 2026
KyleAMathews and others added 8 commits January 26, 2026 09:48
Instead of transitioning to error state when iteration limits are hit,
now we:
1. Log a warning with diagnostic info
2. Break out of the loop
3. Continue with the data we have

This makes sense because by the time we hit the limit, we likely have
all available data - we just couldn't fill the TopK completely because
WHERE filtered out most items.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Shows where the loop spent its time (e.g., "iterations 1-3: X, 4-10000: Y")
to help identify stuck patterns when the safety limits are hit. This makes
it much clearer where the infinite loop got stuck in the state machine.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Reduces code duplication across three circuit breaker implementations by
creating a shared createIterationTracker utility. This makes the tracking
logic easier to maintain and the circuit breaker implementations cleaner.

- New: packages/db-ivm/src/iteration-tracker.ts
- Simplified: D2.run(), maybeRunGraph(), requestLimitedSnapshot()
- Net reduction: ~140 lines

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Verifies the iteration breakdown output format and state tracking logic
used by circuit breaker diagnostics.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@kevin-dp
Copy link
Contributor

kevin-dp commented Jan 26, 2026

I would definitely make this opt-in or enable only in debug mode because i'm worried about 2 things:

  • performance hit of doing additional tracking in D2 pipeline
  • limits being hit in a healthy application

For the limits being hit in a healthy application what i mean is that an application could do a LIMIT 1 to get the single best result. But the query may have an additional WHERE clause that filters out most of the data. The table may be big (e.g. 2M rows) and the first result that passes the filter may be at the end of the table. So in order to get to that data, topK needs to make many load requests (more than our artificial limits). So it will hit the limits first and the user will think something's wrong with their app, which isn't the case.

To avoid these kinds of situations i would make it opt-in. Only if you encounter this bug that we are currently unable to reproduce would you enable these limits in an attempt to get tracking information that helps us identifying the root cause.

I'm also worried about this debugging code remaining in the codebase forever. I'd rather ask the users that are currently facing this bug to use the preview package from this PR to help find the root cause, rather than merging this into our main codebase. If we do that, then no need to make this opt-in or enabled only in debug mode.

Copy link
Collaborator

@samwillis samwillis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've had a look through, I agree with Kevin, opt in would be useful, but I'm not wedded - I think ideally we have the confidence to completely remove this at some point.

The comments repeatedly make reference to "local index is exhausted" but there is no index... I think it's referring to the collection and not finding any more matching data, but it makes a little hard to follow. I would like these to be cleared up and explain exactly what is happening.

The tracking in places is causing allocations or stringification even if not needed, if we intended to keep this in long term the parsing of values into a serialised state for warning messages should be lazy.

Also a few other things in here that seem redundant changes, suspect just churn as this PR has evolved a lot - would be nice to make it a little cleaner.

Comment on lines 70 to 74
const operatorsWithWork = this.#operators
.filter((op) => op.hasPendingWork())
.map((op) => op.constructor.name)
.sort()
.join(`,`)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not too keen on this inside the while loop, it's stringifing a bunch of stuff on each iteration even if it's not shown. Ideally anything we pass to trackAndCheckLimit is lazily stringified so we are minimising the work.
At the minimum lift it out of the loop.

const sendChangesInRange = (
changes: Iterable<ChangeMessage<any, string | number>>,
) => {
const changesArray = Array.isArray(changes) ? changes : [...changes]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why it's making this change here - splitUpdates accepts an iterator? I think it's a redundant change thats doing a wasteful allocation.

}

const trackedChanges = this.trackSentValues(changes, orderByInfo.comparator)
const changesArray = Array.isArray(changes) ? changes : [...changes]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also same as above, trackSentValues accepts an iterator, this is redundant.

private loadNextItems(
n: number,
subscription: CollectionSubscription,
): boolean {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This returning a boolean doesn't seem to be used anywhere, can we keep the same signature for now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally the boolean indicates whether it's done loading more items or not. IIRC if we hit the end of the data then we return true indicating we're done loading items (even though the topK may not be completely filled). This is essential to know that we need to stop requesting for more data.

I'm confused about the comment it added about loading this locally. This could be loaded from the backend as well.

minValues,
offset,
}: RequestLimitedSnapshotOptions) {
}: RequestLimitedSnapshotOptions): boolean {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This return values is passed through by loadNextItems but then not used anywhere.

Comment on lines -565 to +604
if (minValue instanceof Date) {
const minValuePlus1ms = new Date(minValue.getTime() + 1)
if (cursorMinValue instanceof Date) {
const minValuePlus1ms = new Date(cursorMinValue.getTime() + 1)
whereCurrentCursor = and(
gte(expression, new Value(minValue)),
gte(expression, new Value(cursorMinValue)),
lt(expression, new Value(minValuePlus1ms)),
)
} else {
whereCurrentCursor = eq(expression, new Value(minValue))
whereCurrentCursor = eq(expression, new Value(cursorMinValue))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renames for no reasion

let hitIterationLimit = false

while (valuesNeeded() > 0 && !collectionExhausted()) {
const insertedKeys = new Set<string | number>() // Track keys we add to `changes` in this iteration
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's interesting that it's spotted this is unused and removed it.

@kevin-dp insertedKeys was introduced in one of your PRs, can you remember what it was for?

I am wandering a little bit if this could be a bug, if it was intended to be used to track sent data, but we are not doing so?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I traced back the original PR that introduced this variable. Keys were added to this set but the set was never used. I think it must have been remnants of a previous approach i had tried to fix the bug and then i ended up switching to a different approach in this variable went unnoticed. (This is the PR that introduced it: #713)

KyleAMathews and others added 4 commits January 26, 2026 11:35
- Simplify iteration tracker to lazy evaluation (only compute diagnostics
  when limit is actually exceeded, not on every iteration)
- Revert unnecessary array conversions in collection-subscriber.ts
- Revert unused boolean return from loadNextItems
- Keep cursorMinValue rename (fixes ESLint shadowing warning)
- Confirmed insertedKeys was dead code (never used after being set)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
The iteration checker now tracks both total iterations AND iterations
without state change. This catches:
- True infinite loops (same state repeating)
- Oscillation patterns (A→B→A→B)
- Slow progress that exceeds total limit

Callers pass a lightweight state key (e.g., count of operators with work,
number of changes collected) that resets the same-state counter when it
changes. If making progress, allows more total iterations; if stuck on
same state, triggers sooner.

Runtime cost: one primitive comparison per iteration (~free).

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Remove confusing "index exhausted" terminology and clarify that this
function loads from local collection via BTree index and also triggers
async backend fetch.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@KyleAMathews KyleAMathews marked this pull request as draft January 27, 2026 17:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants