fix: coerce operand types in Interval mul/div/intersect/union/contains#22027
fix: coerce operand types in Interval mul/div/intersect/union/contains#22027adriangb wants to merge 2 commits intoapache:mainfrom
Conversation
86c7d43 to
097259d
Compare
|
@pepijnve @neilconway any interest in reviewing this fix? |
| /// If the two intervals have different data types, both are coerced to a | ||
| /// common comparison type via [`comparison_coercion`] before checking | ||
| /// containment. | ||
| pub fn contains<T: Borrow<Self>>(&self, other: T) -> Result<Self> { |
There was a problem hiding this comment.
Nice improvement overall. Since this PR explicitly relaxes intersect / union / contains, it could be helpful to add a couple of focused unit tests that assert the actual bounds and boolean outcomes for mismatched Decimal128 precision/scale intervals.
Right now the new direct unit test only exercises mul / div and checks result types. Adding small cases for contains returning TRUE / FALSE / TRUE_OR_FALSE would help lock in the comparison coercion behavior at the API layer too, not just through the SQL regression tests.
…`/`contains` Previously these methods asserted that both intervals shared an identical data type, causing internal errors during interval propagation for queries like `numeric / count(*)` where the operands end up as different `Decimal128` precisions/scales (e.g. `Decimal128(38, 10) / Decimal128(20, 0)`). `Interval::add` and `Interval::sub` already used `BinaryTypeCoercer` to find a common arithmetic type. This change brings `mul`/`div` in line, and similarly relaxes `intersect`/`union`/`contains` to coerce via `comparison_coercion`, since CP-solver propagation feeds the result of an arithmetic op into `intersect` with a child interval whose type may differ. Adds a unit test in `interval_arithmetic.rs` for mismatched `Decimal128` mul/div, and an end-to-end `decimal.slt` regression covering the `numeric / bigint` shape from the failing customer query. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…h mismatched Decimal128 types Per PR review feedback, lock in the comparison-coercion behavior at the API layer with focused tests asserting exact bounds and the TRUE / FALSE / TRUE_OR_FALSE outcomes for `contains`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
097259d to
b0e6ae0
Compare
I think the change itself makes sense. There is also a change in postconditions of the functions though. Before the output type of the interval was guaranteed to match the input type. After that change that postcondition is less strict. I'm wondering if that could cause problems for existing code. Relaxing the preconditions will cause existing code to follow its success code path instead of its error code path. The change in postconditions may result in an assumption violation for those success code paths. I don't know if this is mostly a theoretical issue, or something that requires actual consideration. |
|
@pepijnve good question. The honest answer is that I don't like relaxing the precondition, and I'm not familiar enough with this part of the code to be 100% sure it will not cause any issues. My only argument (which I do think is a strong one) for this being mostly a theoretical concern is that the existing add and subtract operations already Interval::add and Interval::sub already handled this via BinaryTypeCoercer and thus would have suffered from the same type mismatch problem. Since there have been no issues reported about this afaik maybe it's safe to assume it's a theoretical problem? In other words: this PR is just making the behavior uniform, not introducing new behavior. I do worry that this statistics / constraint solving should not be crashing queries, we should fail gracefully and proceed without this info. I opened #22028 to track that, but it's beyond the scope of this fix I think. |
I agree this is most likely a non-issue and the fact that |
Which issue does this PR close?
Rationale for this change
Interval::mul,Interval::div,Interval::intersect,Interval::union, andInterval::containsall asserted that both operands had identical data types. This causes internal errors during interval propagation for ordinary SQL queries that mixDecimal128precisions/scales — for example dividingnumeric(which DataFusion maps toDecimal128(38, 10)) by aBIGINTcolumn orcount(*)(coerced toDecimal128(20, 0)):Interval::addandInterval::subalready usedBinaryTypeCoercerto find a common arithmetic type; this PR bringsmulanddivin line. Oncemul/divcoerce, the result of an arithmetic op fed intointersect(e.g. by the CP solver incp_solver.rs) may have a different type than the child interval, sointersect/union/containsare also relaxed to coerce viacomparison_coercion(matching the existing pattern inInterval::contains_value).Reproducer (
datafusion-cli, before this PR)fails with the internal-error message above. After this PR the query returns rows.
What changes are included in this PR?
Interval::mulandInterval::divwith a newcoerce_operandshelper that usesBinaryTypeCoercer::get_result_typeand casts both intervals to the common type when they differ.Interval::intersect,Interval::union, andInterval::containswith a newcoerce_for_comparisonhelper that usescomparison_coercionand casts both intervals to the common type when they differ.The same-type fast path is preserved (no allocation/cast when both intervals already share a type), so this should be a no-op for existing call sites.
Are these changes tested?
Yes:
test_mul_div_mismatched_operand_typesininterval_arithmetic.rsexercisingDecimal128(38, 10)/Decimal128(20, 0)andDecimal128 × Int64for bothmulanddiv.decimal.sltcovering thenumeric / bigintshape from the reproducer above through the optimizer's interval-propagation path. Without this fix the SLT query fails with an internal error; with it, it returns rows.datafusion-expr-common,datafusion-physical-expr,datafusion-physical-plan, and fullsqllogictestssuites all pass.Are there any user-facing changes?
Queries that previously hit
Internal error: Intervals must have the same data type ...(or the equivalent for intersect/union/contains) during interval/stats propagation will now succeed. No public API changes — these methods kept their signatures; the documented preconditions are relaxed.🤖 Generated with Claude Code