[Optimizer] Consolidate repeated filter-rebuild patterns in PushDownFilter #22056

@kosiew

Summary

PushDownFilter rebuilds filter nodes in many branch-specific paths. This duplication increases maintenance cost and makes rule invariants harder to keep consistent.

Background and Motivation

#21667 introduces work to centralize filter construction, but push/keep/reinsert logic remains duplicated across plan variants.

Today, each branch tends to repeat the same shape:

  1. split predicates into pushable/keep sets
  2. rebuild one or more child filters
  3. reinsert kept predicates above
  4. return transformed plan
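
For illustration, the four steps above could look roughly like this when written out by hand for a single node type. This is a toy model, not code from push_down_filter.rs: `Plan`, `filter_over`, and `push_filter_through_sort` are hypothetical names, predicates are plain strings, and the `can_push` policy is supplied by the caller.

```rust
// Toy model only: hypothetical stand-ins, not DataFusion's LogicalPlan or Expr.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    Sort { input: Box<Plan> },
    Filter { predicates: Vec<String>, input: Box<Plan> },
}

/// Wrap `input` in a Filter, or return it unchanged when there are no predicates.
fn filter_over(predicates: Vec<String>, input: Plan) -> Plan {
    if predicates.is_empty() {
        input
    } else {
        Plan::Filter { predicates, input: Box::new(input) }
    }
}

/// The split / rebuild / reinsert / return shape, hand-written for Sort.
fn push_filter_through_sort(
    predicates: Vec<String>,
    sort_input: Plan,
    can_push: impl Fn(&str) -> bool,
) -> Plan {
    // 1. split predicates into pushable/keep sets
    let (push, keep): (Vec<String>, Vec<String>) =
        predicates.into_iter().partition(|p| can_push(p.as_str()));
    // 2. rebuild the child filter below the Sort
    let new_child = filter_over(push, sort_input);
    let sort = Plan::Sort { input: Box::new(new_child) };
    // 3. reinsert kept predicates above, and 4. return the transformed plan
    filter_over(keep, sort)
}
```

Every other branch would repeat steps 1 to 4 with only the node constructor and the push policy changing, which is the duplication this issue is about.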

This pattern appears in Sort, Distinct, Repartition, Projection, Union, Extension, Aggregate, Window, Unnest, and join-related paths.

Problem Statement

Filter reconstruction mechanics are duplicated across many branches in datafusion/optimizer/src/push_down_filter.rs.

Concrete symptoms:

  • repeated make_filter(...) and Arc::new(...) call patterns
  • repeated split/push/keep control flow
  • repeated branch-local child replacement logic

Why This Matters

  • Correctness risk: duplicated rewrite logic makes subtle behavior divergence more likely across plan-node branches.
  • Invariant drift: filter reconstruction/reinsertion conventions are harder to enforce when spread across many sites.
  • Review burden: future changes require auditing many branches for equivalent behavior.
  • Evolvability: adding new single-input node rewrites repeats boilerplate and invites copy-paste defects.

Proposed Direction

Introduce a small, internal unary-node helper for filter reconstruction within PushDownFilter.
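
A minimal sketch of what such a helper might look like, using a toy plan type rather than DataFusion's real types. Everything here is an assumption for discussion: `Plan`, `filter_over`, `push_through_unary`, and the per-node wrappers are hypothetical names, and the push policies shown are illustrative, not the rule's actual behavior.

```rust
// Toy model only: hypothetical stand-ins, not DataFusion's LogicalPlan or Expr.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    Sort(Box<Plan>),
    Distinct(Box<Plan>),
    Filter(Vec<String>, Box<Plan>),
}

/// Wrap `input` in a Filter, or return it unchanged when there are no predicates.
fn filter_over(predicates: Vec<String>, input: Plan) -> Plan {
    if predicates.is_empty() {
        input
    } else {
        Plan::Filter(predicates, Box::new(input))
    }
}

/// Hypothetical unary-node helper: the only place that knows how filters
/// are split, rebuilt below the node, and reinserted above it.
fn push_through_unary(
    predicates: Vec<String>,
    child: Plan,
    can_push: impl Fn(&str) -> bool,
    rebuild: impl FnOnce(Plan) -> Plan,
) -> Plan {
    let (push, keep): (Vec<String>, Vec<String>) =
        predicates.into_iter().partition(|p| can_push(p.as_str()));
    let node = rebuild(filter_over(push, child));
    filter_over(keep, node)
}

/// Each branch then states only its policy and its node constructor.
fn push_through_sort(predicates: Vec<String>, child: Plan) -> Plan {
    push_through_unary(predicates, child, |_| true, |c| Plan::Sort(Box::new(c)))
}

fn push_through_distinct(predicates: Vec<String>, child: Plan) -> Plan {
    // Illustrative policy: keep anything mentioning rand() above the node.
    push_through_unary(
        predicates,
        child,
        |p| !p.contains("rand"),
        |c| Plan::Distinct(Box::new(c)),
    )
}
```

With this shape, the reconstruction and reinsertion conventions live in one function, so a change to them would be reviewed once rather than per branch.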
