WIP perf(optimizer): add count threshold comparisons #898
+242
−0
NOTE: work in progress - evaluating whether this could work for `<` and `<=` operators. Draft PR mainly to see benchmarks about the bytecode overhead.

Motivation
In #897 count comparisons like `count(users, .active) >= 1` were optimised by utilising the `any` builtin. Expressions like `count(users, .active) > 100` currently iterate through the entire array even when the 101st match is found early. For large arrays where the threshold is reached quickly, this wastes resources (both CPU and memory).

This optimization enables early termination: once the `count` reaches the required threshold, the loop exits immediately. This is the bytecode-level approach to optimizing count comparisons without introducing new language builtins (and bloating the stdlib in the process).

Changes
There's now a new `Threshold` field on the `BuiltinNode` AST node. It carries information from the optimizer phase to the compiler phase. The new `countThreshold` optimizer detects count comparison patterns and calculates the threshold:

- `count(arr, pred) > N` -> `threshold = N + 1` (need more than N matches)
- `count(arr, pred) >= N` -> `threshold = N` (need at least N matches)

The compiler's `count` builtin handler is modified to emit early-termination bytecode when a threshold is set.

Benchmark run:
Results against `master`:

Further comments
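- Similar to `BuiltinNode.Map`, which the `filterMap` optimizer uses to exchange information between the optimizer and compiler phases.
- The `countAny` optimizer still remains in use for `> 0` and `>= 1` scenarios. It runs before this new `countThreshold` optimizer.
- Without this optimization the comparison is `O(n)`, where n is the array length. With it, the comparison is `O(k)`, where k is the position of the matching element that reaches the threshold (k = n when the threshold is never reached).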
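
The threshold rules and the `O(k)` behaviour described in this PR can be sketched in plain Go. This is only an illustration of the intended semantics, not the bytecode the compiler emits. The names `thresholdFor` and `countReaches` are hypothetical and do not exist in the codebase, and the handling of thresholds `<= 0` is an assumption of this sketch.

```go
package main

import "fmt"

// thresholdFor maps a comparison operator and its constant N to the number of
// matches needed to decide the comparison. Hypothetical helper: it only
// mirrors the two rules listed under "Changes". ok is false for any other
// operator.
func thresholdFor(op string, n int) (threshold int, ok bool) {
	switch op {
	case ">":
		return n + 1, true // need more than N matches
	case ">=":
		return n, true // need at least N matches
	}
	return 0, false
}

// countReaches reports whether at least threshold elements satisfy pred.
// It stops at the element that brings the running count up to threshold.
// visited is the number of elements inspected (the k in O(k)).
func countReaches[T any](items []T, pred func(T) bool, threshold int) (reached bool, visited int) {
	if threshold <= 0 {
		// Assumption of this sketch: a non-positive threshold is trivially met.
		return true, 0
	}
	matches := 0
	for i, item := range items {
		if pred(item) {
			matches++
			if matches >= threshold {
				return true, i + 1 // early termination
			}
		}
	}
	return false, len(items) // threshold never reached: full scan
}

func main() {
	nums := make([]int, 1000)
	for i := range nums {
		nums[i] = i
	}
	isEven := func(v int) bool { return v%2 == 0 }

	// Equivalent of count(nums, isEven) > 2: three matches (0, 2, 4) are needed.
	need, _ := thresholdFor(">", 2)
	reached, visited := countReaches(nums, isEven, need)
	fmt.Println(need, reached, visited) // 3 true 5
}
```

Here only 5 of the 1000 elements are inspected, because the third match sits at index 4. If the predicate never matches often enough, the loop still visits every element, so the worst case remains `O(n)`.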