Add EdgeCache encoding to BulkEdgeEncoder of codec-java #249

Open
zipdoki wants to merge 2 commits into main from feat/bulk-cache-encoding

@zipdoki
Contributor

@zipdoki zipdoki commented Apr 15, 2026

Summary

Extend BulkEdgeEncoder.bulkEncodeAll to also emit EdgeCache records during bulk load. Previously only HashEdge/IndexedEdge/CounterEdge rows were produced, so bulk-loaded data was missing from cache-backed (multi-hop) queries for INDEXED and MULTI_EDGE labels.

Emitted rows are byte-compatible with what the V3 EdgeCacheRecordMapper writes:

  • key: xxhash32(src) | directedSource | labelId | EDGE_CACHE(-6) | direction | cacheCode(int32)
  • qualifier: cacheValues... | directedTarget
  • value: ts | (propertyHashKey, propertyValue)...

The IN-direction src/tgt swap in BytesKeyValueEdgeEncoder / StringKeyFieldValueEdgeEncoder mirrors V3 EdgeMutationStrategy.MultiEdge. Fixes the behavior gap noted in #37.
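
For readers unfamiliar with the layout, the key field order and the IN-direction swap described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the class name, the `Direction` enum and its byte codes, and the field widths for labelId, the type marker, and direction are all assumptions. Only the field order, the -6 marker, and the int32 cacheCode come from the summary. The source hash is passed in precomputed because xxhash32 is not part of the JDK.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the EdgeCache row key described in the summary.
public class EdgeCacheKeySketch {
    // Hypothetical direction enum; the real byte codes are not stated in the PR.
    enum Direction { OUT, IN }

    // Type marker from the summary: EDGE_CACHE(-6). One-byte width is an assumption.
    static final byte EDGE_CACHE_TYPE = -6;

    // For IN rows the endpoints are swapped, so the target acts as the directed source.
    static byte[][] directedEndpoints(byte[] src, byte[] tgt, Direction dir) {
        return dir == Direction.IN
                ? new byte[][] {tgt, src}
                : new byte[][] {src, tgt};
    }

    // Concatenates: hash | directedSource | labelId | EDGE_CACHE | direction | cacheCode.
    // sourceHash stands in for the xxhash32 value; widths other than cacheCode are assumed.
    static byte[] rowKey(int sourceHash, byte[] directedSource, int labelId,
                         Direction dir, int cacheCode) {
        ByteBuffer buf = ByteBuffer.allocate(4 + directedSource.length + 4 + 1 + 1 + 4);
        buf.putInt(sourceHash);          // hash prefix
        buf.put(directedSource);         // already-encoded directed source
        buf.putInt(labelId);             // assumed 4-byte label id
        buf.put(EDGE_CACHE_TYPE);        // -6 marker
        buf.put((byte) dir.ordinal());   // hypothetical direction code
        buf.putInt(cacheCode);           // int32 per the summary
        return buf.array();
    }
}
```

Taking the directed source as pre-encoded bytes keeps the sketch independent of whether the Bytes or String encoder variant produced it.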

Test plan

  • ./gradlew :codec-java:build — full codec-java compile; catches type/import breakage from the new Cache DTO, LabelDTO.caches field, EncodedEdgeType.EDGE_CACHE_TYPE(-6), and the EdgeEncoder.encodeCacheEdge / encodeAllCacheEdges additions.
  • ./gradlew :codec-java:test --tests "*BulkEdgeEncoderTests*" — INDEXED label encoding across BOTH / OUT / IN directions with total row-count assertions, HASH / inactive-edge negative cases (no cache rows even when caches is set), and the backward-compat path where a label JSON without a caches key still deserializes and emits no cache rows.
  • ./gradlew :codec-java:test --tests "*MultiEdgeBulkEdgeEncoderTests*" — MULTI_EDGE label; verifies the synthetic outEdge=(src, id) / inEdge=(id, tgt) from the bulk path are reused by the cache encoder and emit 2 cache rows (OUT/IN). Also covers MULTI_EDGE JSON without caches.
  • ./gradlew :core:test --tests "*V2MultiEdgeBulkLoadTest*" — end-to-end round-trip: V2 bytes produced by BulkEdgeEncoder are decoded via V3 EdgeCacheRecordMapper.Decoder (testEdgeCacheOut/In), proving byte compatibility with V3's wire format. Also regresses state/indexed/counter paths to ensure the cache-row addition did not disturb them.
  • Manual multi-hop cache-backed query after bulk load — confirms bulk-loaded edges surface in cache-path multi-hop queries.
  • Grep audit for getCaches() / entity.caches under codec-java/ and engine/ — confirms no production caller expects a non-null LabelDTO.caches after the null-guard removal (BulkEdgeEncoder is the only Java consumer and already does caches != null && !caches.isEmpty(); Kotlin LabelEntity.caches is a separate class with a Kotlin emptyList() default and is unaffected).
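
The negative cases in the test plan (HASH labels, inactive edges, and labels without caches) amount to a single gating condition. A minimal sketch of that condition follows, assuming a hypothetical `LabelType` enum and method name; only the `caches != null && !caches.isEmpty()` check is quoted from the description above, and the real BulkEdgeEncoder may structure this differently.

```java
import java.util.List;

// Hypothetical sketch of when cache rows are emitted during bulk load.
public class CacheRowGateSketch {
    // Hypothetical enum covering the label types named in this PR.
    enum LabelType { HASH, INDEXED, MULTI_EDGE }

    static boolean shouldEmitCacheRows(LabelType type, boolean active, List<?> caches) {
        // Only INDEXED and MULTI_EDGE labels are cache-backed per the summary.
        boolean cacheBacked = type == LabelType.INDEXED || type == LabelType.MULTI_EDGE;
        // A label JSON without a caches key deserializes to null, so guard for it.
        return cacheBacked && active && caches != null && !caches.isEmpty();
    }
}
```

With this shape, a label deserialized from JSON that has no caches key yields no cache rows, which matches the backward-compat case in the test plan.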

Emit EdgeCache rows during bulk load for INDEXED and MULTI_EDGE labels. The wire format matches the V3 EdgeCacheRecordMapper byte layout so that bulk-loaded data is visible to cache-backed multi-hop queries. Verified end-to-end by a V3 Decoder round-trip in V2MultiEdgeBulkLoadTest.
@zipdoki zipdoki requested a review from em3s April 15, 2026 04:29
@zipdoki zipdoki self-assigned this Apr 15, 2026
@dosubot dosubot bot added the size:L (This PR changes 100-499 lines, ignoring generated files.) and enhancement (New feature or request) labels Apr 15, 2026
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@em3s
Contributor

em3s commented Apr 17, 2026

Note: although the migration to V3 is in progress, this PR is a necessary update to V2 that brings the latest features to the version currently running in production.

@zipdoki
BTW, was this tested against HBase? Mutation and simple query are being implemented, so you can test against those.

@em3s
Contributor

em3s commented Apr 17, 2026

@zipdoki

Need to check whether the LabelDTO change impacts any of its usages.

If the above and HBase testing are confirmed, I'll go ahead with an optimistic merge.

@em3s em3s changed the title from "feat(codec-java): add EdgeCache encoding to BulkEdgeEncoder" to "Add EdgeCache encoding to BulkEdgeEncoder of codec-java" Apr 17, 2026
@em3s em3s removed their request for review April 17, 2026 06:17
@zipdoki
Contributor Author

zipdoki commented Apr 17, 2026

@em3s
Will test these two and report back:

  • HBase (mutation + simple query)
  • LabelDTO usage impact
