Add EdgeCache encoding to BulkEdgeEncoder of codec-java#249
Open
Emit EdgeCache rows during bulk load for INDEXED and MULTI_EDGE labels. The wire format matches the V3 EdgeCacheRecordMapper byte layout so that bulk-loaded data is visible to cache-backed multi-hop queries. Verified end-to-end by a V3 Decoder round-trip in V2MultiEdgeBulkLoadTest.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Note: although the migration to V3 is in progress, this PR is a necessary V2 update that brings the latest features to the system currently running in production. @zipdoki
We need to check whether the LabelDTO change affects any of its usages. Once that and the HBase testing are confirmed, I'll go ahead with an optimistic merge.
@em3s
Summary
Extend `BulkEdgeEncoder.bulkEncodeAll` to also emit EdgeCache records during bulk load. Previously only HashEdge/IndexedEdge/CounterEdge rows were produced, so bulk-loaded data was missing from cache-backed (multi-hop) queries for INDEXED and MULTI_EDGE labels.

Emitted rows are byte-compatible with what the V3 `EdgeCacheRecordMapper` writes:
- key layout: `xxhash32(src) | directedSource | labelId | EDGE_CACHE(-6) | direction | cacheCode(int32)`
- qualifier: `cacheValues... | directedTarget`
- value: `ts | (propertyHashKey, propertyValue)...`

The IN-direction src/tgt swap in `BytesKeyValueEdgeEncoder`/`StringKeyFieldValueEdgeEncoder` mirrors V3 `EdgeMutationStrategy.MultiEdge`. Fixes the behavior gap noted in #37.

Test plan
- `./gradlew :codec-java:build`: full codec-java compile; catches type/import breakage from the new `CacheDTO`, the `LabelDTO.caches` field, `EncodedEdgeType.EDGE_CACHE_TYPE(-6)`, and the `EdgeEncoder.encodeCacheEdge`/`encodeAllCacheEdges` additions.
- `./gradlew :codec-java:test --tests "*BulkEdgeEncoderTests*"`: INDEXED label encoding across BOTH / OUT / IN directions with total row-count assertions, HASH / inactive-edge negative cases (no cache rows even when `caches` is set), and the backward-compat path where a label JSON without a `caches` key still deserializes and emits no cache rows.
- `./gradlew :codec-java:test --tests "*MultiEdgeBulkEdgeEncoderTests*"`: MULTI_EDGE label; verifies that the synthetic `outEdge=(src, id)` / `inEdge=(id, tgt)` from the bulk path are reused by the cache encoder and emit 2 cache rows (OUT/IN). Also covers MULTI_EDGE JSON without `caches`.
- `./gradlew :core:test --tests "*V2MultiEdgeBulkLoadTest*"`: end-to-end round-trip. V2 bytes produced by `BulkEdgeEncoder` are decoded via the V3 `EdgeCacheRecordMapper.Decoder` (`testEdgeCacheOut/In`), proving byte compatibility with V3's wire format. Also regression-tests the state/indexed/counter paths to ensure the cache-row addition did not disturb them.
- `getCaches()` / `entity.caches` under `codec-java/` and `engine/`: confirms no production caller expects a non-null `LabelDTO.caches` after the null-guard removal (`BulkEdgeEncoder` is the only Java consumer and already does `caches != null && !caches.isEmpty()`; Kotlin `LabelEntity.caches` is a separate class with a Kotlin `emptyList()` default and is unaffected).
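For reviewers who want a concrete picture of the row key described in the Summary, here is a minimal, hypothetical sketch. It is not the code in this PR. The class name `EdgeCacheKeySketch`, the direction codes, the 4-byte `labelId`, the UTF-8 encoding of the source, and the stand-in hash are all assumptions made for illustration; the real encoder uses xxhash32 and the codec's own value encodings. The sketch also assumes the hashed `src` is the directed source. Only the field order, the `-6` type code, the IN-direction swap, and the `caches != null && !caches.isEmpty()` guard come from the description above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Illustrative only: field widths and encodings are assumptions, not the codec's real ones.
final class EdgeCacheKeySketch {
    static final byte EDGE_CACHE_TYPE = -6; // type code named in the summary
    static final byte DIR_OUT = 0;          // hypothetical direction codes
    static final byte DIR_IN = 1;

    // Stand-in for xxhash32 (assumption); the real encoder hashes with xxhash32.
    static int placeholderHash(byte[] bytes) {
        return Arrays.hashCode(bytes);
    }

    // Guard quoted in the test plan: no cache rows when caches is absent or empty.
    static boolean shouldEmitCacheRows(List<?> caches) {
        return caches != null && !caches.isEmpty();
    }

    // For IN rows, source and target trade places.
    static String directedSource(String src, String tgt, byte direction) {
        return direction == DIR_IN ? tgt : src;
    }

    static String directedTarget(String src, String tgt, byte direction) {
        return direction == DIR_IN ? src : tgt;
    }

    // hash | directedSource | labelId | EDGE_CACHE(-6) | direction | cacheCode(int32)
    static byte[] rowKey(String src, String tgt, int labelId, byte direction, int cacheCode) {
        byte[] source = directedSource(src, tgt, direction).getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + source.length + 4 + 1 + 1 + 4);
        buf.putInt(placeholderHash(source)); // hash prefix of the directed source
        buf.put(source);
        buf.putInt(labelId);
        buf.put(EDGE_CACHE_TYPE);
        buf.put(direction);
        buf.putInt(cacheCode);
        return buf.array();
    }
}
```

Calling `rowKey("a", "b", 7, DIR_OUT, 42)` and `rowKey("a", "b", 7, DIR_IN, 42)` yields two keys that differ in the source segment, the hash prefix, and the direction byte, which is the shape the OUT/IN pair of cache rows in the MULTI_EDGE test is expected to have.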