test(vector): assert PQ quantizer is honored on-disk through an Engine commit #809
Merged
…e commit (#798)

Adds an end-to-end integration test that commits a dense, well-separated 16-document corpus through VectorStore with HnswOption::quantizer = ProductQuantization, then reads the produced on-disk LVS1 segment header back and asserts QuantHeader::ProductQuantization (quant_kind = 2), with the configured subvector_count and sub_dim.

This is the only behavioral assertion that PQ is honored through a store commit: it exercises the from_hnsw_option converter path (#790). A regression dropping quantizer from that converter would fall back to the default Scalar8Bit (quant_kind = 1) and fail the test. The existing pq_* tests build HnswIndexConfig directly and bypass from_hnsw_option.

Closes #798
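The fallback described above can be illustrated with a small, self-contained sketch. The types below are hypothetical stand-ins that only borrow the names used in this PR; their fields, the `from_hnsw_option_regressed` helper, and the `quant_kind` function are assumptions for illustration and do not reflect the actual laurus definitions.

```rust
// Hypothetical stand-ins for the laurus types; fields are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum QuantizationMethod {
    // Per the PR text, Scalar8Bit is the default quantizer.
    #[default]
    Scalar8Bit,
    ProductQuantization { subvector_count: usize },
}

#[derive(Debug, Clone, Default)]
struct HnswOption {
    dimension: usize,
    quantizer: QuantizationMethod,
}

#[derive(Debug, Clone, Default)]
struct HnswIndexConfig {
    dimension: usize,
    quantizer: QuantizationMethod,
}

// Converter that forwards the quantizer, as intended by #790.
fn from_hnsw_option(opt: &HnswOption) -> HnswIndexConfig {
    HnswIndexConfig {
        dimension: opt.dimension,
        quantizer: opt.quantizer,
    }
}

// Hypothetical regression: `quantizer` is no longer copied, so the
// struct-update syntax silently fills it with the default (Scalar8Bit).
fn from_hnsw_option_regressed(opt: &HnswOption) -> HnswIndexConfig {
    HnswIndexConfig {
        dimension: opt.dimension,
        ..Default::default()
    }
}

// Discriminants as quoted in the PR text: 1 = Scalar8Bit, 2 = ProductQuantization.
fn quant_kind(cfg: &HnswIndexConfig) -> u8 {
    match cfg.quantizer {
        QuantizationMethod::Scalar8Bit => 1,
        QuantizationMethod::ProductQuantization { .. } => 2,
    }
}
```

In this sketch the regressed converter still compiles and still produces a usable config, which is why only a check on the persisted quantization kind would notice the difference.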
Summary
Adds the missing behavioral test that Product Quantization (PQ) is honored end-to-end through a VectorStore/Engine commit, asserting against a deterministic on-disk observable rather than only that search succeeds.

Follow-up of #790: that change propagated HnswOption::quantizer into HnswIndexConfig via from_hnsw_option, but the only coverage was converter-level unit tests. The existing pq_* tests build HnswIndexConfig directly and so bypass the from_hnsw_option path. A regression dropping quantizer from that converter (while keeping rerank_storage) would have gone unnoticed by any on-disk/behavioral test.

What the test does
test_pq_quantizer_honored_through_engine_commit (in laurus/tests/vector_segment_test.rs):

1. Builds a VectorIndexConfig whose HNSW field carries quantizer: QuantizationMethod::ProductQuantization { subvector_count: 2 } (dim = 4, so dim % subvector_count == 0).
2. Commits a 16-document corpus (the test_hnsw_pq_search_returns_corpus_neighbour setup) through VectorStore, exercising from_hnsw_option.
3. Locates the .hnsw segment via Storage::list_files(), skips the 20-byte HNSW preamble (num_vectors: u64 + dim/m/ef: u32 × 3; asserts num_vectors == 16), and reads the LVS1 VectorSegmentHeader back.
4. Asserts QuantHeader::ProductQuantization with params.m == subvector_count and params.sub_dim == dim / subvector_count.

Why it is a real regression guard
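To make the preamble skip and the header guard concrete, here is a minimal sketch. It assumes the preamble fields are stored little-endian in the order num_vectors, dim, m, ef, which the PR text does not state. `HnswPreamble`, `split_preamble`, `read_u32_le`, and `assert_pq_header` are hypothetical names, and the `QuantHeader` enum here is a flattened stand-in rather than the real laurus type (the PR reads `m` and `sub_dim` through `params`).

```rust
/// Fields of the 20-byte HNSW preamble named in the PR text.
#[derive(Debug, PartialEq, Eq)]
struct HnswPreamble {
    num_vectors: u64,
    dim: u32,
    m: u32,
    ef: u32,
}

/// 8 bytes (u64) + 3 * 4 bytes (u32) = 20 bytes.
const PREAMBLE_LEN: usize = 8 + 3 * 4;

// ASSUMPTION: little-endian encoding; the real on-disk byte order may differ.
fn read_u32_le(bytes: &[u8]) -> u32 {
    let mut buf = [0u8; 4];
    buf.copy_from_slice(&bytes[..4]);
    u32::from_le_bytes(buf)
}

/// Splits a segment buffer into the parsed preamble and the remaining bytes,
/// which is where the segment header would start. Returns None if too short.
fn split_preamble(bytes: &[u8]) -> Option<(HnswPreamble, &[u8])> {
    if bytes.len() < PREAMBLE_LEN {
        return None;
    }
    let (head, rest) = bytes.split_at(PREAMBLE_LEN);
    let mut n = [0u8; 8];
    n.copy_from_slice(&head[..8]);
    let preamble = HnswPreamble {
        num_vectors: u64::from_le_bytes(n),
        dim: read_u32_le(&head[8..]),
        m: read_u32_le(&head[12..]),
        ef: read_u32_le(&head[16..]),
    };
    Some((preamble, rest))
}

// Hypothetical, flattened stand-in for the decoded quantization header.
#[derive(Debug)]
enum QuantHeader {
    Scalar8Bit,
    ProductQuantization { m: u32, sub_dim: u32 },
}

/// Panics unless the header is PQ with the configured shape.
fn assert_pq_header(header: &QuantHeader, dim: u32, subvector_count: u32) {
    match header {
        QuantHeader::ProductQuantization { m, sub_dim } => {
            assert_eq!(*m, subvector_count);
            assert_eq!(*sub_dim, dim / subvector_count);
        }
        other => panic!("expected ProductQuantization, got {:?}", other),
    }
}
```

Decoding the header itself is left out on purpose, since its byte layout is not described here; the sketch only shows where the header begins and what the final match checks.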
The default QuantizationMethod is Scalar8Bit (quant_kind = 1). If from_hnsw_option dropped quantizer, the segment header would report quant_kind = 1 and the match arm would panic!. So the test fails iff PQ is not honored through the commit.

Scope
Verification
- cargo test -p laurus --test vector_segment_test: 2 passed (1 new)
- cargo clippy -p laurus --tests -- -D warnings: 0 warnings
- cargo fmt --check -p laurus: clean

Closes #798
🤖 Generated with Claude Code