[SPARK-28587][SQL] Route JDBC partition bound literals through JdbcDialect.compileValue #55727

Open
AliaksandrAleksiukCollibra wants to merge 3 commits into apache:master from AliaksandrAleksiukCollibra:SPARK-28587-jdbc-dialect-compileValue

@AliaksandrAleksiukCollibra

What changes were proposed in this pull request?

toBoundValueInWhereClause in JDBCRelation now accepts the resolved JdbcDialect and calls dialect.compileValue for date and timestamp partition bounds instead of hardcoding bare quoted string literals.
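A rough, dependency-free sketch of that routing may help. It is not the PR's code: `SketchDialect`, `BoundType`, and `BoundSketch.toBoundLiteral` are hypothetical stand-ins, and the sketch assumes that date bounds arrive as days since the epoch and timestamp bounds as microseconds since the epoch.

```scala
import java.sql.{Date, Timestamp}
import java.time.{Instant, LocalDate, LocalDateTime, ZoneId}

// Hypothetical stand-in for the slice of JdbcDialect that matters here.
// The default mirrors the described base behaviour: a bare quoted string.
trait SketchDialect {
  def compileValue(value: Any): Any = value match {
    case d: Date      => s"'$d'"
    case t: Timestamp => s"'$t'"
    case other        => other
  }
}

// Hypothetical column kinds a partition column can have in this sketch.
sealed trait BoundType
case object NumericBound   extends BoundType
case object DateBound      extends BoundType
case object TimestampBound extends BoundType

object BoundSketch {
  // Renders one partition bound as a SQL literal, delegating date and
  // timestamp formatting to the dialect instead of hardcoding quotes.
  def toBoundLiteral(
      value: Long,
      tpe: BoundType,
      dialect: SketchDialect,
      zone: ZoneId): String = tpe match {
    case NumericBound => value.toString
    case DateBound =>
      // Assumption: value is a day count since 1970-01-01.
      val date = Date.valueOf(LocalDate.ofEpochDay(value))
      dialect.compileValue(date).toString
    case TimestampBound =>
      // Assumption: value is a microsecond count since the epoch.
      val instant = Instant.ofEpochSecond(
        Math.floorDiv(value, 1000000L),
        Math.floorMod(value, 1000000L) * 1000L)
      val ts = Timestamp.valueOf(LocalDateTime.ofInstant(instant, zone))
      dialect.compileValue(ts).toString
  }
}
```

In this sketch, only the date and timestamp branches consult the dialect; numeric bounds are rendered exactly as before.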

Why are the changes needed?

When partitioning a JDBC table by a date or timestamp column, Spark generates WHERE clauses like col < '2024-01-01'. Strict-typing engines such as Athena and Phoenix reject this with a type mismatch error (Cannot apply operator: date < varchar) because the bare quoted string is treated as VARCHAR, not DATE/TIMESTAMP.

JdbcDialect.compileValue already exists for dialect-specific value formatting and is used in filter pushdown, but was never wired into partition bound generation. Dialects for strict-typing engines can now override compileValue to emit typed literals such as DATE '2024-01-01' or TIMESTAMP '2024-01-01 00:00:00'.
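As an illustration of such an override, a custom dialect could look roughly like the following. This is a sketch rather than the dialect used in the PR's test: the object name `TypedLiteralDialect` and the `jdbc:example:` URL prefix are hypothetical, and it assumes Spark SQL is on the classpath.

```scala
import java.sql.{Date, Timestamp}

import org.apache.spark.sql.jdbc.JdbcDialect

// Hypothetical dialect for an engine that rejects untyped string literals
// in comparisons against DATE/TIMESTAMP columns.
object TypedLiteralDialect extends JdbcDialect {

  // Hypothetical URL prefix; a real dialect would match its driver's scheme.
  override def canHandle(url: String): Boolean =
    url.startsWith("jdbc:example:")

  // Emit typed literals for dates and timestamps and defer everything
  // else (strings, numbers, arrays) to the base implementation.
  override def compileValue(value: Any): Any = value match {
    case d: Date      => s"DATE '$d'"
    case t: Timestamp => s"TIMESTAMP '$t'"
    case other        => super.compileValue(other)
  }
}
```

Registering it with `org.apache.spark.sql.jdbc.JdbcDialects.registerDialect(TypedLiteralDialect)` before reading makes Spark pick it for matching JDBC URLs. Note that `java.sql.Timestamp.toString` includes fractional seconds, so an engine with a stricter timestamp literal syntax may need explicit formatting instead.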

Does this PR introduce any user-facing change?

Yes. Users with a custom JdbcDialect that overrides compileValue will now see that override applied to partition WHERE clauses as well as filter pushdown. The base JdbcDialect.compileValue returns the same bare-quoted string as before, so all built-in dialects and users without a custom dialect are unaffected.

How was this patch tested?

Added the test "SPARK-28587: columnPartition should use dialect.compileValue for date/timestamp bounds" to JDBCSuite. It registers a custom dialect that emits DATE '...' / TIMESTAMP '...' typed literals, calls JDBCRelation.columnPartition directly, and asserts that the generated WHERE clauses contain typed literals rather than bare quoted strings.

Existing tests SPARK-34843 and SPARK-22814 continue to pass, confirming backward compatibility.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Sonnet 4.6

…alect.compileValue

Spark generates partition WHERE clauses like `col < '2024-01-01'` for date/timestamp columns, which strict-typing engines (Athena, Phoenix) reject with a type mismatch since bare quoted strings are VARCHAR, not DATE/TIMESTAMP.

The fix passes the dialect to `toBoundValueInWhereClause` and calls `dialect.compileValue` for date/timestamp bounds. The base implementation returns the same bare-quoted string as before, so existing dialects are unaffected.