Skip to content

Phase 2: enable row-boolean WHERE in connected OPTIONAL MATCH#1224

Merged
lmeyerov merged 2 commits intomasterfrom
issue-1219-phase2-expand
Apr 28, 2026
Merged

Phase 2: enable row-boolean WHERE in connected OPTIONAL MATCH#1224
lmeyerov merged 2 commits intomasterfrom
issue-1219-phase2-expand

Conversation

@lmeyerov
Copy link
Copy Markdown
Contributor

Summary

Phase 2 of #1219 stack: targeted disjunction support expansion for connected MATCH + OPTIONAL MATCH lowering.

This PR enables row-boolean WHERE expressions in connected optional queries (base and optional clauses) by routing non-filter-dict expressions through where_rows(expr=...) after binding-row materialization.

What changed

  • graphistry/compute/gfql/cypher/lowering.py

    • _apply_where_to_ops() now returns:
      • lowered where filter comparisons, plus
      • row-expression filters for expressions that cannot be lowered to node/edge filter dicts.
    • _compile_connected_optional_match() now appends row-expression filters as where_rows(expr=...) calls in base/optional chains.
  • graphistry/compute/gfql_unified.py

    • connected optional execution now splits each chain into:
      • contiguous binding path ops (ASTNode/ASTEdge) for rows(binding_ops=...)
      • post-binding row ops (e.g., where_rows) executed after row materialization.
  • graphistry/tests/compute/gfql/cypher/test_lowering.py

Validation

Ran:

./bin/pytest.sh graphistry/tests/compute/gfql/cypher/test_lowering.py -k "issue_1219_row_boolean_audit" -q
python3.12 -B -m pytest -q graphistry/tests/compute/gfql/cypher/test_lowering.py -k "issue_1024 or issue_996 or connected_optional_match_null_extension_shapes_without_safe_alignment"
python3.12 -B -m pytest -q graphistry/tests/compute/gfql/cypher/test_parser.py graphistry/tests/compute/gfql/cypher/test_where_bool_conformance.py graphistry/tests/compute/gfql/cypher/test_binder_expr_tree.py

All passing locally.

Stack

Refs #1219

@lmeyerov lmeyerov marked this pull request as ready for review April 26, 2026 19:48
@lmeyerov lmeyerov force-pushed the issue-1219-phase2-expand branch from d546c8a to 3df138f Compare April 27, 2026 18:16
@lmeyerov lmeyerov force-pushed the issue-1219-phase2-expand branch from 3df138f to e7fa6da Compare April 27, 2026 19:29
@lmeyerov lmeyerov force-pushed the issue-1219-phase2-expand branch from e7fa6da to 329edac Compare April 27, 2026 19:41
@lmeyerov lmeyerov force-pushed the issue-1219-phase2-expand branch from 329edac to b279bcb Compare April 27, 2026 19:43
@lmeyerov lmeyerov force-pushed the issue-1219-phase2-expand branch from b279bcb to dbf3913 Compare April 27, 2026 19:53
@lmeyerov lmeyerov force-pushed the issue-1219-phase2-expand branch from dbf3913 to 24a03d9 Compare April 27, 2026 21:28
@lmeyerov lmeyerov force-pushed the issue-1219-phase1-audit branch from 8fa7324 to eff2a84 Compare April 27, 2026 21:53
@lmeyerov lmeyerov force-pushed the issue-1219-phase2-expand branch from 24a03d9 to 70d78f2 Compare April 27, 2026 21:57
@lmeyerov lmeyerov force-pushed the issue-1219-phase2-expand branch from 70d78f2 to b56f25e Compare April 27, 2026 22:18
@lmeyerov lmeyerov force-pushed the issue-1219-phase1-audit branch from f9fc71d to c8bbe38 Compare April 27, 2026 22:45
@lmeyerov lmeyerov force-pushed the issue-1219-phase2-expand branch from b56f25e to cb49359 Compare April 27, 2026 22:46
@lmeyerov lmeyerov force-pushed the issue-1219-phase1-audit branch from c8bbe38 to 2c6d590 Compare April 27, 2026 22:56
@lmeyerov lmeyerov force-pushed the issue-1219-phase2-expand branch from cb49359 to 9f49274 Compare April 27, 2026 22:57
@lmeyerov lmeyerov force-pushed the issue-1219-phase2-expand branch from 9f49274 to 2ca183d Compare April 27, 2026 23:12
@lmeyerov lmeyerov force-pushed the issue-1219-phase2-expand branch from 2ca183d to a152c25 Compare April 27, 2026 23:30
@lmeyerov lmeyerov force-pushed the issue-1219-phase1-audit branch from 2eb6cef to 753dec7 Compare April 27, 2026 23:36
@lmeyerov lmeyerov force-pushed the issue-1219-phase2-expand branch from a152c25 to 9c7b886 Compare April 27, 2026 23:37
@lmeyerov lmeyerov force-pushed the issue-1219-phase2-expand branch from 9c7b886 to bd7729a Compare April 27, 2026 23:39
@lmeyerov lmeyerov force-pushed the issue-1219-phase1-audit branch from 8e5a6e5 to e7c0f6a Compare April 27, 2026 23:44
@lmeyerov lmeyerov force-pushed the issue-1219-phase2-expand branch from bd7729a to f3a58fc Compare April 27, 2026 23:44
@lmeyerov lmeyerov force-pushed the issue-1219-phase1-audit branch from e7c0f6a to 46fc89a Compare April 28, 2026 00:05
@lmeyerov lmeyerov changed the base branch from issue-1219-phase1-audit to master April 28, 2026 00:08
@lmeyerov lmeyerov force-pushed the issue-1219-phase2-expand branch from f3a58fc to c4b2977 Compare April 28, 2026 00:08
@lmeyerov lmeyerov merged commit 864ceb2 into master Apr 28, 2026
5 checks passed
@lmeyerov lmeyerov deleted the issue-1219-phase2-expand branch April 28, 2026 00:08
lmeyerov added a commit that referenced this pull request Apr 29, 2026
…ils (#1227)

* test+docs(#1219): residual row-boolean compositional matrix + guardrails

Worker B / independent-hardening stream of #1219.  After #1217's Earley
swap surfaced row-boolean shapes (OR/NOT/XOR among row predicates) that
LALR rejected, four compositional shapes remained unverified beyond the
fixtures #1217 covered.  This PR locks empirical correctness across the
residual matrix + adds two lightweight guardrail comments.

## Compositional matrix tests (test_lowering.py)

All four shapes verified correct empirically; locked with sorted-id
assertions against discriminating fixtures:

1. **Nullable NOT/OR** — `WHERE NOT n.x = 1 OR n.y IS NULL` against a
   4-row fixture mixing real and projected nulls.  Locks pandas-backed
   row-evaluator preserves Cypher 3VL truth table (NULL OR T = T):
   `{n2, n4}`.

2. **N-ary OR (3 branches)** — `WHERE n.x = 1 OR n.x = 2 OR n.x = 3`.
   Locks left-associative parse `or(or(=1, =2), =3)` doesn't degenerate
   under associativity bugs.  `{n1, n2, n3}`.

3. **De Morgan compositions** (parametrized × 4) — both `NOT(A OR B)`
   ≡ `NOT-A AND NOT-B` and `NOT(A AND B)` ≡ `NOT-A OR NOT-B` against
   a 4-row fixture covering all (x∈{1,2}, y∈{2,3}) combos.  Each form
   and its De-Morganed equivalent return the same row set.

4. **Mixed-string-numeric AND inside OR** — `WHERE (n.s = 'a' AND
   n.x > 0) OR n.x < -1`, exercising `_StringAllowingComparisonMixin`
   (#1217) paired with OR composition.  `{n1, n2, n4}`.

## Guardrails

- `expr_split.py::split_top_level_and` — added load-bearing AND-only
  docstring with #1219 cross-ref explaining why a sibling
  `split_top_level_or` would break OR-distributivity-over-join
  correctness.  No future maintainer should accidentally add it
  without first redesigning topology-aware pushdown safety.

- `_boolean_expr_text.py::boolean_expr_to_text` — added explicit
  `if expr.op == "pattern"` branch with docstring.  Currently
  unreachable in production (lift step extracts pattern leaves before
  the binder walks the tree) but documents the contract: emit the raw
  pattern source rather than silently falling through to empty string.

## Test impact

Validated on dgx-spark: `graphistry/tests/compute/` → 2524 passed
(7 new tests; remaining delta from baseline absorbed by #1224's
unrelated additions).

Closes the residual frontier portion of #1219.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test+docs(#1219): wave-1 review fixes — discriminating fixtures, equivalence assertions, pattern-op unit test, mirrored guard

Wave-1 review on c196393 surfaced 3 IMPORTANTs + 5 SUGGESTIONs.
Addressed:

1. N-ary OR test now has a companion (duplicate-leftmost-branch) that
   isolates the rightmost-drop associativity bug from the any-branch-
   drop case.  Comment rephrased to be honest about what the original
   test covers (any branch dropped, not specifically rightmost).

2. De Morgan parametrize restructured: paired (compound, distributed,
   expected) tuples instead of independent rows.  Now asserts:
   - compound matches expected
   - distributed matches expected
   - compound == distributed (the actual De Morgan equivalence)
   Added separate double-negation test (NOT NOT A ≡ A).

3. New test_boolean_expr_to_text_emits_atom_text_for_pattern_op in
   test_boolean_expr.py exercises the explicit pattern branch added
   in c196393.  Locks the contract even though the branch is currently
   unreachable in production (lift step extracts pattern leaves before
   the binder walks the tree).

4. Mirrored AND-only guard comment near _split_conjuncts in
   predicate_pushdown.py — that's where future maintainers actually
   look when adding pushdown features; the load-bearing rationale
   stays in expr_split.py's docstring.

Test counts on dgx-spark: 2525 passed (was 2524 + 1 pattern_op unit
test; net 8 new tests in PR diff vs master).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test+docs(#1219): wave-2 review fixes — accurate docstring, inline pattern, terser comments

Wave-2 (2a targeted + 2b broad) found 1 IMPORTANT + several SUGGESTIONs.

IMPORTANT (Wave-2b):
- expr_split.split_top_level_and docstring overstated the topology
  argument.  Original draft claimed distribute-OR breaks on fan-out
  topologies; the cross-alias-OR test in test_lowering.py:3206-3211
  (added in #1217) explicitly confirms distribute-OR converges to
  correct row-set semantics on a 2:2 fan-out fixture.  The actual
  reason pushdown leaves OR opaque is that the multi-alias references
  on the conjunct cause it to be retained post-join.  Rewrote
  docstring to describe the real mechanism + cite the test.

SUGGESTIONs (Wave-2a + 2b):
- _ids_for(graph, query) helper inlined to match the established
  inline-sorted-comprehension style used elsewhere in test_lowering.py
  (12+ existing call sites).  Removes inconsistency within this PR.
- Pattern-op branch comment in _boolean_expr_text.py compressed from
  9 lines to 3 — the verbose explanation duplicated the test docstring.
- Local-variable assignment for graph.gfql() result kept (matches
  existing test patterns + works around a Plottable.gfql pyright
  attribute warning that fires on fluent chains).

Test counts unchanged: 8 new tests; full gfql suite 1581 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test+docs(#1219): wave-3b fixes — accurate split-OR rationale + XOR runtime sibling

Wave-3b from-scratch fresh-eyes review found 1 IMPORTANT + 3 SUGGESTIONs.
Addressed:

- IMPORTANT: split_top_level_and docstring's stated mechanism
  ('multi-alias refs cause retention post-join') was misleading.  Real
  mechanism: pushdown silently AND-recombines the pushed conjuncts
  inside PatternMatch.predicates.  AND distributes (split + AND-recombine
  ≡ original); OR does not (split + AND-recombine ≠ original).  An
  OR-aware split would need a UNION-of-pushed-branches recombine path
  the current pipeline doesn't implement.  Rewrote the docstring with
  the accurate split + AND-recombine rationale.

- SUGGESTION 3: XOR runtime row-set test (sibling to OR/AND/NOT tests
  this PR adds).  Locks symmetric-difference semantics.

Skipped:
- SUGGESTION 4 (NOT IS NULL standalone): nullable_not_or already
  exercises IS NULL composed with NOT/OR; standalone marginal.
- SUGGESTION 5 (cuDF parametrize): real scope creep, defer.

Test counts: 9 new tests; full gfql suite green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test+docs(#1219): wave-4 polish — accurate mixin claim, XOR with NULL, mirrored-guard clarification

Wave-4 (4a targeted + 4b from-scratch) — 0 BLOCKER + 0 IMPORTANT, 9
SUGGESTIONs.  Picked up the cheapest valuable ones:

- Wave-4b S1: mixed-string-numeric AND-inside-OR test claimed to
  exercise _StringAllowingComparisonMixin but used `n.s = 'a'` (plain
  EQ, supported pre-#1217).  Swapped to `n.s > 'a'` (string GT, the
  mixin-specific path); flipped fixture s-values to keep the same
  truth-table outcomes.

- Wave-4b S3: added test_string_cypher_executes_xor_with_null_uses_three_valued_logic
  — sibling to the OR/NOT 3VL test, locks NULL XOR T = NULL.  Reuses
  the 3VL fixture from the De Morgan tests.

- Wave-4a S1: `_split_conjuncts` mirrored guard now names the actual
  failure mode (`_combine_conjuncts` AND-joins residuals) before
  pointing to the fuller rationale in expr_split.

Skipped (out of scope or duplicate):
- Wave-4b S2: rightmost-only discriminator comment is already honest;
  test verified correct.
- Wave-4b S4: cross-alias OR / cross-product fixture — already covered
  by test_string_cypher_executes_cross_alias_or_returns_correct_union (#1217).
- Wave-4b S6: optional CHANGELOG bullet for test+docs PR.
- Wave-4a S2 (OR-analyzer in pushdown_safety.py:58-60): pre-existing
  unrelated, separate followup.

Test counts: 10 new tests; full gfql suite green (1583 passed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test+docs(#1219): wave-5 polish — extract 3VL fixture helper, clarify pattern-op unreachability

Wave-5 formal review (multi-dim per .agents/skills/review/SKILL.md +
adversarial pressure-test) found 0 BLOCKER + 0 IMPORTANT, 3 SUGGESTIONs
of which 2 were actionable + cheap:

- Extract `_three_valued_logic_fixture_graph()` helper paralleling
  `_de_morgan_fixture_graph()` — eliminates byte-identical 4-row NaN-
  mixed fixture duplication between nullable_not_or and xor_with_null.
- Clarify pattern-op branch comment in `_boolean_expr_text.py` to name
  BOTH unreachability paths (top-level AND lift + nested-NOT/OR/XOR
  E108 rejection) instead of just the first.
- Adversarial-rejected: malformed-NOT-chain test would be a tautology
  (right-recursive grammar makes depth-N equivalent to depth-2; existing
  double-negation test already exercises the path).

Test counts unchanged: 11 new tests; full gfql suite 1583 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(changelog): add #1227 row-boolean residual matrix entry

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant