feat(gfql/cypher): multi-positive WHERE pattern predicates (#1031 slice 3) #1229
Merged
AND-joined positive WHERE pattern predicates (`WHERE (n)-[:R]->() AND
(n)-[:T]->()`) now lift into structured `WhereClause.predicates` as N
`WherePatternPredicate` entries. The ast_normalizer packs them into a
single appended `MatchClause` whose `patterns: Tuple[Tuple[PatternElement,
...], ...]` carries one tuple per predicate (multi-pattern cartesian
within MATCH), preserving the lowering invariant that only the FINAL
match is connected — pre-binding seeds remain node-only.
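The packing step described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the dataclasses are simplified stand-ins for the real AST classes (which carry more fields), `PatternElement` is reduced to a plain string, and `pack_predicates` is a hypothetical helper name.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

# Hypothetical stand-in: the real PatternElement is a structured AST node.
PatternElement = str


@dataclass(frozen=True)
class WherePatternPredicate:
    # One lifted WHERE pattern, e.g. (n)-[:R]->()
    pattern: Tuple[PatternElement, ...]


@dataclass(frozen=True)
class MatchClause:
    # One inner tuple per pattern; several tuples act as a multi-pattern MATCH.
    patterns: Tuple[Tuple[PatternElement, ...], ...]


def pack_predicates(predicates: Sequence[WherePatternPredicate]) -> MatchClause:
    """Pack N predicates into a single MatchClause with N pattern tuples."""
    return MatchClause(patterns=tuple(p.pattern for p in predicates))


preds = (
    WherePatternPredicate(pattern=("(n)", "-[:R]->", "()")),
    WherePatternPredicate(pattern=("(n)", "-[:T]->", "()")),
)
clause = pack_predicates(preds)
```

Emitting one clause with N patterns, rather than N separate clauses, is what keeps a single final connected match in this sketch.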
Changes:
- `parser.py::pattern_atom` — split the greedy `WHERE_PATTERN` lexer
token (which gobbles `pattern AND pattern AND ...` chains as a single
match) back into individual pattern-item texts via
`_WHERE_PATTERN_ITEM_RE.finditer` and emit one
`BooleanExpr(op="pattern")` per item, joined by an AND-tree via
`_rebuild_and_tree`.
- `parser.py::_build_where_with_pattern_lift` — drop the
`len(pattern_leaves) > 1` E108 gate; build N WherePatternPredicates.
- `parser.py::_parse_single_where_pattern_predicate_text` — rename from
`_parse_where_pattern_predicate_text` and remove its in-helper
multi-item gate (caller now splits before invocation).
- `ast_normalizer.py::_rewrite_where_pattern_predicates_to_matches` —
drop the matching gate; loop over predicates running per-predicate
validation (must include relationship; no new aliases); emit a single
appended MatchClause with N patterns.
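A rough sketch of the split-then-rebuild idea from the `pattern_atom` change, under stated assumptions: `ITEM_RE` is an illustrative regex (the real `_WHERE_PATTERN_ITEM_RE` is not shown here), `BooleanExpr` is a simplified stand-in, and the left-leaning fold is only one plausible shape for the AND-tree.

```python
import re
from dataclasses import dataclass
from functools import reduce
from typing import List, Optional

# Illustrative only: matches a node followed by one or more relationship hops.
ITEM_RE = re.compile(r"\([^()]*\)(?:\s*<?-\[[^\]]*\]->?\s*\([^()]*\))+")


@dataclass(frozen=True)
class BooleanExpr:
    # Simplified stand-in for the parser's boolean expression node.
    op: str
    text: Optional[str] = None
    left: Optional["BooleanExpr"] = None
    right: Optional["BooleanExpr"] = None


def split_pattern_items(token: str) -> List[str]:
    """Split a greedy 'pattern AND pattern ...' token into item texts."""
    return [m.group(0) for m in ITEM_RE.finditer(token)]


def rebuild_and_tree(leaves: List[BooleanExpr]) -> BooleanExpr:
    """Fold leaves into a left-leaning AND-tree."""
    if not leaves:
        raise ValueError("need at least one pattern leaf")
    return reduce(lambda l, r: BooleanExpr(op="and", left=l, right=r), leaves)


def pattern_leaves(expr: BooleanExpr) -> List[str]:
    """Collect pattern texts back out of the tree, left to right."""
    if expr.op == "pattern":
        return [expr.text]
    return pattern_leaves(expr.left) + pattern_leaves(expr.right)


token = "(n)-[:R]->() AND (n)-[:T]->() AND (n)-[:S]->()"
items = split_pattern_items(token)
tree = rebuild_and_tree([BooleanExpr(op="pattern", text=t) for t in items])
```

Because each item becomes its own leaf, a later pass that walks top-level AND nodes recovers N pattern leaves instead of one fused token.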
Tests:
- Add `test_gfql_executes_multi_positive_where_pattern_predicates_as_intersected_seed`
for the runtime contract: rows where ALL patterns exist.
- Update the legacy rejection test to assert the new lift + compile path.
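The runtime contract ("rows where ALL patterns exist") can be illustrated with plain Python sets. This is a toy model, not the GFQL executor: the edge list, `sources_with`, and `intersect_seed` are hypothetical names used only to show the intersection semantics.

```python
from typing import Iterable, List, Set, Tuple

Edge = Tuple[str, str, str]  # (source, relationship type, destination)

edges: List[Edge] = [
    ("a", "R", "b"),
    ("a", "T", "c"),
    ("b", "R", "c"),
    ("c", "T", "a"),
]


def sources_with(edge_list: Iterable[Edge], rel_type: str) -> Set[str]:
    """Nodes that have at least one outgoing edge of the given type."""
    return {src for src, rel, _dst in edge_list if rel == rel_type}


def intersect_seed(edge_list: List[Edge], rel_types: List[str]) -> Set[str]:
    """Nodes satisfying every pattern (n)-[:T_i]->() at once."""
    if not rel_types:
        raise ValueError("need at least one relationship type")
    return set.intersection(*(sources_with(edge_list, r) for r in rel_types))


# Toy equivalent of: MATCH (n) WHERE (n)-[:R]->() AND (n)-[:T]->() RETURN n
seed = intersect_seed(edges, ["R", "T"])
```

Here only `a` has both an outgoing `R` edge and an outgoing `T` edge, so it is the only node in `seed`.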
Verified:
- 1574 GFQL tests pass (1573 baseline + 1 new test; 1 existing test updated in place).
- mypy clean on `parser.py` and `ast_normalizer.py`.
Out of scope:
- Slice 2 (NOT-pattern, IC10 unblock): needs anti-semi-join lowering.
Tracked in plan.md; will be a separate stacked PR.
- Slice 4 (OR/XOR-around-pattern): needs row-level pattern-existence
column or AntiSemiApply executor build. Tracked in plan.md.
- Multi-positive against multiple bound aliases (`MATCH (n), (m) WHERE
(n)-[:R]->(m) AND (n)-[:T]->(m) RETURN n, m`): hits a separate
pre-existing engine limit ("repeated MATCH aliases" projection).
Not introduced by this slice; out of scope.
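To make the slice 2 deferral concrete, the toy sketch below contrasts a semi-join (keep nodes where the pattern exists, as positive predicates need) with an anti-semi-join (keep nodes where it does not, as a NOT-pattern would need). The function names and data are illustrative assumptions, not the engine's operators.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str, str]  # (source, relationship type, destination)

nodes: Set[str] = {"a", "b", "c"}
edges: List[Edge] = [("a", "R", "b"), ("b", "R", "c"), ("c", "T", "a")]


def semi_join(node_set: Set[str], edge_list: List[Edge], rel: str) -> Set[str]:
    """Keep nodes with at least one outgoing edge of type rel."""
    matched = {src for src, r, _dst in edge_list if r == rel}
    return node_set & matched


def anti_semi_join(node_set: Set[str], edge_list: List[Edge], rel: str) -> Set[str]:
    """Keep nodes with no outgoing edge of type rel."""
    return node_set - semi_join(node_set, edge_list, rel)


has_r = semi_join(nodes, edges, "R")          # positive pattern
lacks_r = anti_semi_join(nodes, edges, "R")   # NOT-pattern
```

A positive pattern can be expressed by appending a MATCH that narrows the seed, whereas the negated case has to subtract matches, which is why it needs a different lowering.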
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR Review: #1229 - feat(gfql/cypher): multi-positive WHERE pattern predicates (#1031 slice 3)
Branch:
Blockers: None.
Important: None.
Suggestions:
(Wave 1 also flagged the bound-aliases multi-pattern case hitting the "repeated MATCH aliases" projection limit. It was rejected as a slice 3 finding because it is a pre-existing engine limit that also affects single-positive bound-aliases. The PR body documents this.)
Human checks required:
Methodology: Per CI signal + cross-repo pairing
Recommendation: Approve and merge with |
This was referenced Apr 29, 2026
Summary
Slice 3 of #1031. AND-joined positive WHERE pattern predicates now lift into structured `WhereClause.predicates` as N `WherePatternPredicate` entries. Closes the `len(pattern_leaves) > 1` E108 gate that #1217 (slice 1) deferred.

What changed
Parser (`cypher/parser.py`):
- `pattern_atom` now splits the greedy `WHERE_PATTERN` lexer token (which gobbles `pattern AND pattern AND ...` as a single match) back into individual pattern-item texts via `_WHERE_PATTERN_ITEM_RE.finditer`. Emits one `BooleanExpr(op="pattern")` per item, joined by `_rebuild_and_tree`. Upstream `_split_top_level_and_pattern_leaves` then naturally extracts N pattern leaves.
- `_build_where_with_pattern_lift`: drop the `len(pattern_leaves) > 1` E108 gate; build N `WherePatternPredicate`s.
- `_parse_where_pattern_predicate_text` renamed to `_parse_single_where_pattern_predicate_text` and its in-helper multi-item gate removed (caller now splits before invocation).

ast_normalizer (`cypher/ast_normalizer.py`):
- `_rewrite_where_pattern_predicates_to_matches`: drop the matching gate. Run per-predicate validation independently. Pack the N patterns into a single appended `MatchClause` with `patterns: Tuple[Tuple[PatternElement, ...], ...]` of N entries (multi-pattern cartesian within MATCH), preserving the lowering invariant that only the FINAL match is connected; pre-binding seeds remain node-only.

Tests
- `test_gfql_executes_multi_positive_where_pattern_predicates_as_intersected_seed`: runtime contract, rows where ALL patterns exist.
- `test_lower_match_query_supports_multiple_where_pattern_predicates` (was `..._rejects_...`): assert the lift + compile path now succeeds.

Test plan
- `pytest graphistry/tests/compute/gfql/`: 1574 passed, 87 skipped, 15 xfailed, no regressions.
- `mypy graphistry/compute/gfql/cypher/parser.py graphistry/compute/gfql/cypher/ast_normalizer.py`: clean.

Out of scope (deferred, tracked in plan.md)
- `AntiSemiApply`/`SemiApply` executor properly. Stacked PR-4.
- Multi-positive against multiple bound aliases (`MATCH (n), (m) WHERE (n)-[:R]->(m) AND (n)-[:T]->(m) RETURN n, m`) hits a pre-existing engine limit ("Cypher row projection from repeated MATCH aliases is not yet supported"). Not a slice 3 regression: the same limit existed for the single-positive bound-aliases case before this PR. Filed mentally as a separate engine cleanup; out of scope here.

Related