Is your feature request related to a problem or challenge?
In #21988, we fixed median to return Float64 for integer inputs (avoiding truncation) while preserving Decimal and floating-point types as-is. This means after #21988:
median(int_col) → Float64
median(float_col) → Float32/Float64 (preserved)
median(decimal_col) → Decimal* (preserved)
However, percentile_cont — which is functionally equivalent to median when the percentile is 0.5 — casts all Numeric inputs (including Decimal) to Float64:
https://github.com/apache/datafusion/blob/main/datafusion/functions-aggregate/src/percentile_cont.rs#L143-L157
So median(decimal_col) and percentile_cont(0.5) WITHIN GROUP (ORDER BY decimal_col) return different types for the same input — Decimal* vs Float64. As @Jefffrey noted in #21988, this divergence between two essentially identical functions is surprising.
Describe the solution you'd like
Per @alamb's suggestion in #21988 (comment), align in two phases:
Phase 1 — extend percentile_cont to preserve Decimal (and Float32)
Update PercentileCont::new so the value-argument coercion accepts Decimal and floating-point types directly, matching median's post-#21988 behavior. The percentile argument (the 0.5) keeps its current Float64 coercion.
After this change:
percentile_cont(0.5) WITHIN GROUP (ORDER BY decimal_col) → Decimal* (preserved)
percentile_cont(0.5) WITHIN GROUP (ORDER BY int_col) → Float64
percentile_cont(0.5) WITHIN GROUP (ORDER BY float_col) → Float32/Float64 (preserved)
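The return-type rules above can be modeled in a small Rust sketch. `ValueType` and the three functions are hypothetical names used only for illustration and are not DataFusion's API. The "current" function encodes the Float64-everywhere behavior described in this issue, and the "phase1" function encodes the proposed behavior.

```rust
// Simplified, hypothetical model of the value-argument types discussed in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ValueType {
    Int64,
    Float32,
    Float64,
    Decimal128 { precision: u8, scale: i8 },
}

// median after #21988: integers widen to Float64, everything else is preserved.
fn median_return_type(input: ValueType) -> ValueType {
    match input {
        ValueType::Int64 => ValueType::Float64,
        other => other,
    }
}

// percentile_cont today, as described above: every numeric input becomes Float64.
fn percentile_cont_return_type_current(_input: ValueType) -> ValueType {
    ValueType::Float64
}

// percentile_cont after Phase 1 (proposed): same rule as median.
fn percentile_cont_return_type_phase1(input: ValueType) -> ValueType {
    median_return_type(input)
}
```

Under this model the two functions disagree today for Decimal and Float32 inputs and agree for every input after Phase 1.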
Phase 2 — fold median into percentile_cont as an alias
Once the signatures and return types align, remove the standalone Median implementation and register median as an alias of percentile_cont with 0.5 as the percentile.
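As a rough illustration of why the alias is natural, here is a minimal sketch of a continuous percentile with linear interpolation, with `median` defined as the 0.5 case. It is not DataFusion's implementation: it works on `f64` only, ignores nulls and type preservation, and the function names are placeholders.

```rust
/// Continuous percentile using linear interpolation between neighboring values.
/// Returns None for empty input or a percentile outside [0, 1].
fn percentile_cont(values: &[f64], p: f64) -> Option<f64> {
    if values.is_empty() || !(0.0..=1.0).contains(&p) {
        return None;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));

    // Fractional position of the requested percentile in the sorted data.
    let rank = p * (sorted.len() - 1) as f64;
    let lo = rank.floor() as usize;
    let hi = rank.ceil() as usize;
    let frac = rank - lo as f64;

    // Interpolate between the two surrounding values.
    Some(sorted[lo] + (sorted[hi] - sorted[lo]) * frac)
}

/// median expressed as percentile_cont with a fixed percentile of 0.5.
fn median(values: &[f64]) -> Option<f64> {
    percentile_cont(values, 0.5)
}
```

For an even number of inputs the 0.5 percentile falls halfway between the two middle values, which is the usual definition of the median.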
Describe alternatives you've considered
Make median match percentile_cont's current Float64-everywhere behavior — discussed in #21988 and not chosen because casting Decimal to Float64 discards the exact-precision guarantee users opt into when picking Decimal.
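The precision concern can be shown with plain Rust integers. This sketch assumes a Decimal128-style value stored as an unscaled `i128` and, as a simplification, round-trips that unscaled integer through `f64`. The helper name is hypothetical, and a real Decimal-to-Float64 cast also divides by the scale factor.

```rust
/// Casts an unscaled decimal value to f64 and back, mimicking a lossy
/// Float64 detour. f64 has a 53-bit significand, so integers above 2^53
/// are not all representable.
fn round_trip_through_f64(unscaled: i128) -> i128 {
    (unscaled as f64) as i128
}
```

For example, the unscaled value 9007199254740993 (2^53 + 1) comes back as 9007199254740992, while a small value such as 12345 survives unchanged.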
Additional context
median returns Float64 for integer inputs to avoid truncation #21988 (review)
median returns Float64 for integer inputs to avoid truncation #21988 (closes bug: Median() truncates integers #19536)