-
Notifications
You must be signed in to change notification settings - Fork 517
Offloading search to lambdas. #6148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
0cd71d4 to
61eff68
Compare
61eff68 to
7dff81e
Compare
a1c25fc to
6f7c105
Compare
| let mut leaf_search_state_guard = | ||
| SplitSearchStateGuard::new(ctx.split_outcome_counters.clone()); | ||
|
|
||
| rewrite_request( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are rewritten in the caller.
6f7c105 to
64e10ca
Compare
| return Ok(Some(get_leaf_resp_from_count(split.num_docs))); | ||
| } | ||
|
|
||
| if let Some(cached_answer) = ctx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cache now happens in the caller.
53359d9 to
d7e49de
Compare
365e416 to
01f67a7
Compare
31fb6ac to
93b0f5a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 45 out of 47 changed files in this pull request and generated 7 comments.
Comments suppressed due to low confidence (11)
quickwit/rest-api-tests/quickwit:1
- This file hardcodes a developer-specific absolute path into the repo, which will break on other machines and may leak local filesystem details. Replace it with a relative path, an env var (e.g.
QW_BINARY), or a small wrapper script that resolves the binary location at runtime.
quickwit/quickwit-search/src/service.rs:1 - Correct spelling/grammar: 'accross' -> 'across' (and 'request' -> 'requests').
quickwit/quickwit-search/src/leaf.rs:1 - The capacity check is off by one:
batch.len() <= max_items_per_batchstill allows selecting a batch that is already full, sopushcan create batches withmax_items_per_batch + 1items. Use a strict check (batch.len() < max_items_per_batch) so batches never exceed the configured max.
quickwit/quickwit-search/src/search_permit_provider.rs:1 - This computes queue depth using
self.permits_requests.len()(number of requests), but then uses it to truncatepermit_sizes(number of splits/permits). If a single pending request contains many splits,current_pendingstays small and offloading will not trigger even when the split queue is saturated. Track pending load in the same unit aspermit_sizes(e.g., total pending permits/splits remaining across queuedLeafPermitRequests) before deciding how many new split permits to grant locally.
quickwit/quickwit-search/src/search_permit_provider.rs:1 - Changing
get_permitsfromimpl IntoIterator<Item = ByteSize>toVec<ByteSize>is a public API regression (forces allocation/collection at call sites and prevents using iterators/arrays directly). Consider restoringIntoIteratorfor bothget_permitsandget_permits_with_offloadwhile keeping your internalVec<u64>conversion.
quickwit/quickwit-search/src/search_permit_provider.rs:1 - Changing
get_permitsfromimpl IntoIterator<Item = ByteSize>toVec<ByteSize>is a public API regression (forces allocation/collection at call sites and prevents using iterators/arrays directly). Consider restoringIntoIteratorfor bothget_permitsandget_permits_with_offloadwhile keeping your internalVec<u64>conversion.
quickwit/quickwit-search/src/search_permit_provider.rs:1 - The new offload behavior isn’t covered by unit tests in this file. Add a test that calls
get_permits_with_offload(..., offload_threshold)with a low threshold and asserts that the returned futures length is truncated as expected (and that ordering guarantees still hold for the returned subset).
quickwit/quickwit-search/src/search_permit_provider.rs:1 - The new offload behavior isn’t covered by unit tests in this file. Add a test that calls
get_permits_with_offload(..., offload_threshold)with a low threshold and asserts that the returned futures length is truncated as expected (and that ordering guarantees still hold for the returned subset).
quickwit/quickwit-search/src/leaf.rs:1 schedule_search_taskswill offload wheneversearcher_config.lambdaisSome, even ifsearcher_context.lambda_invokerisNone. That makesrun_offloaded_search_taskspanic at runtime via.expect(...). Ensure offloading is only enabled when both config and invoker are present (or return a structuredSearchErrorwhen config requests offload but no invoker is configured).
quickwit/quickwit-search/src/leaf.rs:1schedule_search_taskswill offload wheneversearcher_config.lambdaisSome, even ifsearcher_context.lambda_invokerisNone. That makesrun_offloaded_search_taskspanic at runtime via.expect(...). Ensure offloading is only enabled when both config and invoker are present (or return a structuredSearchErrorwhen config requests offload but no invoker is configured).
quickwit/quickwit-search/src/leaf.rs:1- On JoinSet failures you drop the underlying
JoinErrordetails, making debugging hard. Include the join error (panic vs cancelled, task id, etc.) in the log and in the returnedSearchError::Internal, e.g. by matchingErr(join_err)and formatting it.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Process each split in parallel. The SearchPermitProvider inside | ||
| // SearcherContext gates concurrency based on memory budget. | ||
| let mut split_search_futures = Vec::new(); |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Despite the comment, the split searches are executed sequentially: you collect async blocks into a Vec and then .await them one by one. If you want true parallel execution, drive them concurrently (e.g. try_join_all, FuturesUnordered, or JoinSet) while still preserving ordering if needed (store indices and reorder, or rely on join_all preserving input order).
| // Collect results, preserving split order. | ||
| let mut responses: Vec<LeafSearchResponse> = Vec::with_capacity(num_splits); | ||
| for split_search_fut in split_search_futures { | ||
| let search_result = split_search_fut.await; | ||
| let response = | ||
| search_result.map_err(|e| LambdaError::Internal(format!("leaf search failed: {e}")))?; | ||
| responses.push(response); | ||
| } |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Despite the comment, the split searches are executed sequentially: you collect async blocks into a Vec and then .await them one by one. If you want true parallel execution, drive them concurrently (e.g. try_join_all, FuturesUnordered, or JoinSet) while still preserving ordering if needed (store indices and reorder, or rely on join_all preserving input order).
| // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| // See the License for the specific language governing permissions and | ||
| // limitations under the License. | ||
|
|
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The build script declares QUICKWIT_LAMBDA_ZIP_URL as an input, but never actually reads it; it always uses the LAMBDA_ZIP_URL constant. This makes the override documented in quickwit-lambda-client/README.md ineffective and also makes rerun-if-env-changed misleading. Read QUICKWIT_LAMBDA_ZIP_URL from env (falling back to the constant) and use that value for the download + cache marker comparison.
| true | ||
| }; | ||
|
|
||
| let lambda_zip_payload: Vec<u8> = if needs_download { |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The build script declares QUICKWIT_LAMBDA_ZIP_URL as an input, but never actually reads it; it always uses the LAMBDA_ZIP_URL constant. This makes the override documented in quickwit-lambda-client/README.md ineffective and also makes rerun-if-env-changed misleading. Read QUICKWIT_LAMBDA_ZIP_URL from env (falling back to the constant) and use that value for the download + cache marker comparison.
| | `function_name` | Name of the AWS Lambda function to invoke. | `quickwit-lambda-search` | | ||
| | `max_splits_per_invocation` | Maximum number of splits to send in a single Lambda invocation. Must be at least 1. | `10` | | ||
| | `offload_threshold` | Number of pending local searches before offloading to Lambda. A value of `0` offloads everything to Lambda. | `100` | |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two concrete issues: (1) docs say offload_threshold default is 100, but LambdaConfig::default_offload_threshold() in code is 10—please align docs and code. (2) The example function_name: {} is invalid YAML and reads like a placeholder; replace it with a real example value or omit the field.
| ## Example configuration | ||
|
|
||
| Minimal configuration (manual deployment): | ||
|
|
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two concrete issues: (1) docs say offload_threshold default is 100, but LambdaConfig::default_offload_threshold() in code is 10—please align docs and code. (2) The example function_name: {} is invalid YAML and reads like a placeholder; replace it with a real example value or omit the field.
| // Sort by version number descending (most recent first) | ||
| quickwit_lambda_versions.sort(); | ||
|
|
||
| // We keep the last 5 versions. |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment is incorrect: sort() sorts ascending, not descending. Either update the comment to reflect ascending sort (and that the list represents versions to delete), or explicitly sort descending if that’s what the subsequent logic is meant to rely on.
| // Sort by version number descending (most recent first) | |
| quickwit_lambda_versions.sort(); | |
| // We keep the last 5 versions. | |
| // Sort by version number ascending (oldest first) | |
| quickwit_lambda_versions.sort(); | |
| // Keep only versions to delete: preserve the most recent GC_KEEP_RECENT_VERSIONS |
341e346 to
c83de48
Compare
2ccad61 to
c9b8223
Compare
c9b8223 to
7beed32
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 45 out of 47 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (3)
quickwit/quickwit-search/src/service.rs:1
- Corrected spelling of 'accross' to 'across'.
quickwit/quickwit-search/src/search_permit_provider.rs:1 - Corrected spelling of 'lambdas' to 'Lambdas' for consistency with AWS Lambda naming convention.
quickwit/quickwit-search/src/leaf.rs:1 - The filter condition allows batches with length <= max_items_per_batch, but the logic should prevent adding items to batches that already have max_items_per_batch items. The condition should be
batch.len() < max_items_per_batch.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
No description provided.