Phase 2: enable row-boolean WHERE in connected OPTIONAL MATCH#1224
Merged
Phase 2: enable row-boolean WHERE in connected OPTIONAL MATCH#1224
Conversation
lmeyerov
added a commit
that referenced
this pull request
Apr 27, 2026
d546c8a to
3df138f
Compare
lmeyerov
added a commit
that referenced
this pull request
Apr 27, 2026
3df138f to
e7fa6da
Compare
lmeyerov
added a commit
that referenced
this pull request
Apr 27, 2026
e7fa6da to
329edac
Compare
lmeyerov
added a commit
that referenced
this pull request
Apr 27, 2026
329edac to
b279bcb
Compare
lmeyerov
added a commit
that referenced
this pull request
Apr 27, 2026
b279bcb to
dbf3913
Compare
lmeyerov
added a commit
that referenced
this pull request
Apr 27, 2026
dbf3913 to
24a03d9
Compare
8fa7324 to
eff2a84
Compare
lmeyerov
added a commit
that referenced
this pull request
Apr 27, 2026
24a03d9 to
70d78f2
Compare
lmeyerov
added a commit
that referenced
this pull request
Apr 27, 2026
70d78f2 to
b56f25e
Compare
f9fc71d to
c8bbe38
Compare
lmeyerov
added a commit
that referenced
this pull request
Apr 27, 2026
b56f25e to
cb49359
Compare
c8bbe38 to
2c6d590
Compare
lmeyerov
added a commit
that referenced
this pull request
Apr 27, 2026
cb49359 to
9f49274
Compare
lmeyerov
added a commit
that referenced
this pull request
Apr 27, 2026
9f49274 to
2ca183d
Compare
lmeyerov
added a commit
that referenced
this pull request
Apr 27, 2026
2ca183d to
a152c25
Compare
2eb6cef to
753dec7
Compare
lmeyerov
added a commit
that referenced
this pull request
Apr 27, 2026
a152c25 to
9c7b886
Compare
lmeyerov
added a commit
that referenced
this pull request
Apr 27, 2026
9c7b886 to
bd7729a
Compare
8e5a6e5 to
e7c0f6a
Compare
lmeyerov
added a commit
that referenced
this pull request
Apr 27, 2026
bd7729a to
f3a58fc
Compare
e7c0f6a to
46fc89a
Compare
f3a58fc to
c4b2977
Compare
3 tasks
lmeyerov
added a commit
that referenced
this pull request
Apr 29, 2026
…ils (#1227) * test+docs(#1219): residual row-boolean compositional matrix + guardrails Worker B / independent-hardening stream of #1219. After #1217's Earley swap surfaced row-boolean shapes (OR/NOT/XOR among row predicates) that LALR rejected, four compositional shapes remained unverified beyond the fixtures #1217 covered. This PR locks empirical correctness across the residual matrix + adds two lightweight guardrail comments. ## Compositional matrix tests (test_lowering.py) All four shapes verified correct empirically; locked with sorted-id assertions against discriminating fixtures: 1. **Nullable NOT/OR** — `WHERE NOT n.x = 1 OR n.y IS NULL` against a 4-row fixture mixing real and projected nulls. Locks pandas-backed row-evaluator preserves Cypher 3VL truth table (NULL OR T = T): `{n2, n4}`. 2. **N-ary OR (3 branches)** — `WHERE n.x = 1 OR n.x = 2 OR n.x = 3`. Locks left-associative parse `or(or(=1, =2), =3)` doesn't degenerate under associativity bugs. `{n1, n2, n3}`. 3. **De Morgan compositions** (parametrized × 4) — both `NOT(A OR B)` ≡ `NOT-A AND NOT-B` and `NOT(A AND B)` ≡ `NOT-A OR NOT-B` against a 4-row fixture covering all (x∈{1,2}, y∈{2,3}) combos. Each form and its De-Morganed equivalent return the same row set. 4. **Mixed-string-numeric AND inside OR** — `WHERE (n.s = 'a' AND n.x > 0) OR n.x < -1`, exercising `_StringAllowingComparisonMixin` (#1217) paired with OR composition. `{n1, n2, n4}`. ## Guardrails - `expr_split.py::split_top_level_and` — added load-bearing AND-only docstring with #1219 cross-ref explaining why a sibling `split_top_level_or` would break OR-distributivity-over-join correctness. No future maintainer should accidentally add it without first redesigning topology-aware pushdown safety. - `_boolean_expr_text.py::boolean_expr_to_text` — added explicit `if expr.op == "pattern"` branch with docstring. Currently unreachable in production (lift step extracts pattern leaves before the binder walks the tree) but documents the contract: emit the raw pattern source rather than silently falling through to empty string. ## Test impact Validated on dgx-spark: `graphistry/tests/compute/` → 2524 passed (7 new tests; remaining delta from baseline absorbed by #1224's unrelated additions). Closes the residual frontier portion of #1219. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test+docs(#1219): wave-1 review fixes — discriminating fixtures, equivalence assertions, pattern-op unit test, mirrored guard Wave-1 review on c196393 surfaced 3 IMPORTANTs + 5 SUGGESTIONs. Addressed: 1. N-ary OR test now has a companion (duplicate-leftmost-branch) that isolates the rightmost-drop associativity bug from the any-branch- drop case. Comment rephrased to be honest about what the original test covers (any branch dropped, not specifically rightmost). 2. De Morgan parametrize restructured: paired (compound, distributed, expected) tuples instead of independent rows. Now asserts: - compound matches expected - distributed matches expected - compound == distributed (the actual De Morgan equivalence) Added separate double-negation test (NOT NOT A ≡ A). 3. New test_boolean_expr_to_text_emits_atom_text_for_pattern_op in test_boolean_expr.py exercises the explicit pattern branch added in c196393. Locks the contract even though the branch is currently unreachable in production (lift step extracts pattern leaves before the binder walks the tree). 4. Mirrored AND-only guard comment near _split_conjuncts in predicate_pushdown.py — that's where future maintainers actually look when adding pushdown features; the load-bearing rationale stays in expr_split.py's docstring. Test counts on dgx-spark: 2525 passed (was 2524 + 1 pattern_op unit test; net 8 new tests in PR diff vs master). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test+docs(#1219): wave-2 review fixes — accurate docstring, inline pattern, terser comments Wave-2 (2a targeted + 2b broad) found 1 IMPORTANT + several SUGGESTIONs. IMPORTANT (Wave-2b): - expr_split.split_top_level_and docstring overstated the topology argument. Original draft claimed distribute-OR breaks on fan-out topologies; the cross-alias-OR test in test_lowering.py:3206-3211 (added in #1217) explicitly confirms distribute-OR converges to correct row-set semantics on a 2:2 fan-out fixture. The actual reason pushdown leaves OR opaque is that the multi-alias references on the conjunct cause it to be retained post-join. Rewrote docstring to describe the real mechanism + cite the test. SUGGESTIONs (Wave-2a + 2b): - _ids_for(graph, query) helper inlined to match the established inline-sorted-comprehension style used elsewhere in test_lowering.py (12+ existing call sites). Removes inconsistency within this PR. - Pattern-op branch comment in _boolean_expr_text.py compressed from 9 lines to 3 — the verbose explanation duplicated the test docstring. - Local-variable assignment for graph.gfql() result kept (matches existing test patterns + works around a Plottable.gfql pyright attribute warning that fires on fluent chains). Test counts unchanged: 8 new tests; full gfql suite 1581 passed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test+docs(#1219): wave-3b fixes — accurate split-OR rationale + XOR runtime sibling Wave-3b from-scratch fresh-eyes review found 1 IMPORTANT + 3 SUGGESTIONs. Addressed: - IMPORTANT: split_top_level_and docstring's stated mechanism ('multi-alias refs cause retention post-join') was misleading. Real mechanism: pushdown silently AND-recombines the pushed conjuncts inside PatternMatch.predicates. AND distributes (split + AND-recombine ≡ original); OR does not (split + AND-recombine ≠ original). An OR-aware split would need a UNION-of-pushed-branches recombine path the current pipeline doesn't implement. Rewrote the docstring with the accurate split + AND-recombine rationale. - SUGGESTION 3: XOR runtime row-set test (sibling to OR/AND/NOT tests this PR adds). Locks symmetric-difference semantics. Skipped: - SUGGESTION 4 (NOT IS NULL standalone): nullable_not_or already exercises IS NULL composed with NOT/OR; standalone marginal. - SUGGESTION 5 (cuDF parametrize): real scope creep, defer. Test counts: 9 new tests; full gfql suite green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test+docs(#1219): wave-4 polish — accurate mixin claim, XOR with NULL, mirrored-guard clarification Wave-4 (4a targeted + 4b from-scratch) — 0 BLOCKER + 0 IMPORTANT, 9 SUGGESTIONs. Picked up the cheapest valuable ones: - Wave-4b S1: mixed-string-numeric AND-inside-OR test claimed to exercise _StringAllowingComparisonMixin but used `n.s = 'a'` (plain EQ, supported pre-#1217). Swapped to `n.s > 'a'` (string GT, the mixin-specific path); flipped fixture s-values to keep the same truth-table outcomes. - Wave-4b S3: added test_string_cypher_executes_xor_with_null_uses_three_valued_logic — sibling to the OR/NOT 3VL test, locks NULL XOR T = NULL. Reuses the 3VL fixture from the De Morgan tests. - Wave-4a S1: `_split_conjuncts` mirrored guard now names the actual failure mode (`_combine_conjuncts` AND-joins residuals) before pointing to the fuller rationale in expr_split. Skipped (out of scope or duplicate): - Wave-4b S2: rightmost-only discriminator comment is already honest; test verified correct. - Wave-4b S4: cross-alias OR / cross-product fixture — already covered by test_string_cypher_executes_cross_alias_or_returns_correct_union (#1217). - Wave-4b S6: optional CHANGELOG bullet for test+docs PR. - Wave-4a S2 (OR-analyzer in pushdown_safety.py:58-60): pre-existing unrelated, separate followup. Test counts: 10 new tests; full gfql suite green (1583 passed). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test+docs(#1219): wave-5 polish — extract 3VL fixture helper, clarify pattern-op unreachability Wave-5 formal review (multi-dim per .agents/skills/review/SKILL.md + adversarial pressure-test) found 0 BLOCKER + 0 IMPORTANT, 3 SUGGESTIONs of which 2 were actionable + cheap: - Extract `_three_valued_logic_fixture_graph()` helper paralleling `_de_morgan_fixture_graph()` — eliminates byte-identical 4-row NaN- mixed fixture duplication between nullable_not_or and xor_with_null. - Clarify pattern-op branch comment in `_boolean_expr_text.py` to name BOTH unreachability paths (top-level AND lift + nested-NOT/OR/XOR E108 rejection) instead of just the first. - Adversarial-rejected: malformed-NOT-chain test would be a tautology (right-recursive grammar makes depth-N equivalent to depth-2; existing double-negation test already exercises the path). Test counts unchanged: 11 new tests; full gfql suite 1583 passed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(changelog): add #1227 row-boolean residual matrix entry Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Phase 2 of #1219 stack: targeted disjunction support expansion for connected
MATCH + OPTIONAL MATCHlowering.This PR enables row-boolean
WHEREexpressions in connected optional queries (base and optional clauses) by routing non-filter-dict expressions throughwhere_rows(expr=...)after binding-row materialization.What changed
graphistry/compute/gfql/cypher/lowering.py_apply_where_to_ops()now returns:wherefilter comparisons, plus_compile_connected_optional_match()now appends row-expression filters aswhere_rows(expr=...)calls in base/optional chains.graphistry/compute/gfql_unified.pyASTNode/ASTEdge) forrows(binding_ops=...)where_rows) executed after row materialization.graphistry/tests/compute/gfql/cypher/test_lowering.pyOR,NOT,XORin optional WHEREORin base WHERE with optional arm presentValidation
Ran:
All passing locally.
Stack
Refs #1219