[SPARK-57660][SQL] Support casting between TIME(p) and TIMESTAMP_LTZ(q) #56733

Open
MaxGekk wants to merge 1 commit into apache:master from MaxGekk:time-cast-timestamp_ltz

MaxGekk commented Jun 24, 2026

### What changes were proposed in this pull request?

This PR adds bidirectional casts between the `TIME(p)` data type (`p` in `[0, 9]`) and `TIMESTAMP_LTZ(q)` (`q` in `[6, 9]`, where `q=6` is the microsecond `TimestampType` and `q` in `[7, 9]` is the nanosecond `TimestampLTZNanosType`).

It is the `TIMESTAMP_LTZ` counterpart of #56677 / SPARK-57618 (`TIME` <-> `TIMESTAMP_NTZ`) and a sub-task of SPARK-56822.

Semantics follow the SQL standard (section 6.13 `<cast specification>`):
- `CAST(TIMESTAMP_LTZ(q) AS TIME(p))` (rule 15.d): the LTZ value is an absolute instant, so its time-of-day is the local wall-clock time observed in the session time zone, truncated to precision `p`. Unlike `TIMESTAMP_NTZ -> TIME`, this direction depends on the session time zone.
- `CAST(TIME(p) AS TIMESTAMP_LTZ(q))` (rule 17.c): the date fields come from `CURRENT_DATE` and the time fields from the value; the resulting local date-time is interpreted in the session time zone to produce the instant. Since `CURRENT_DATE` is stable within a query, the cast is stabilized via the existing `ComputeCurrentTime` optimizer rule, so it shares the same date literal as `current_date()`.
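
To make the zone dependence of `TIMESTAMP_LTZ -> TIME` concrete, here is a minimal Python model of the extraction step. It is a sketch, not the Spark code: `ltz_to_nanos_of_day` is a hypothetical name, the instant is assumed to be stored as nanoseconds since the Unix epoch, and fixed-offset zones stand in for the session time zone. Truncation to precision `p` is left out.

```python
from datetime import datetime, timedelta, timezone, tzinfo

NANOS_PER_SECOND = 1_000_000_000
NANOS_PER_DAY = 86_400 * NANOS_PER_SECOND
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ltz_to_nanos_of_day(epoch_nanos: int, session_zone: tzinfo) -> int:
    """Hypothetical model: wall-clock time-of-day (in nanoseconds) of an
    absolute instant, as observed in `session_zone`."""
    # Floor division keeps pre-epoch (negative) instants on the right second.
    epoch_seconds = epoch_nanos // NANOS_PER_SECOND
    instant = EPOCH + timedelta(seconds=epoch_seconds)
    # Zone offset in effect at this instant, as whole seconds.
    offset = instant.astimezone(session_zone).utcoffset()
    offset_nanos = (offset // timedelta(seconds=1)) * NANOS_PER_SECOND
    # Python's % is floor-based, so the result is always in [0, NANOS_PER_DAY).
    return (epoch_nanos + offset_nanos) % NANOS_PER_DAY


# The same instant, 2020-05-17 12:34:56.789012 UTC, seen in two session zones.
utc = timezone.utc
minus7 = timezone(timedelta(hours=-7))
instant_nanos = (
    (datetime(2020, 5, 17, 12, 34, 56, tzinfo=utc) - EPOCH) // timedelta(seconds=1)
) * NANOS_PER_SECOND + 789_012_000

in_utc = ltz_to_nanos_of_day(instant_nanos, utc)        # 12:34:56.789012
in_minus7 = ltz_to_nanos_of_day(instant_nanos, minus7)  # 05:34:56.789012
```

The two results differ by exactly seven hours, which is why this direction has to be marked as time-zone dependent, unlike `TIMESTAMP_NTZ -> TIME`.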

Both directions of `TIME` <-> `TIMESTAMP_LTZ` therefore depend on the session time zone (whereas for `TIMESTAMP_NTZ` only `TIME -> TIMESTAMP_NTZ` does). Fractional precision handling is pure truncation (floor toward the precision step; no rounding). Both directions always succeed, so no new nullability or error condition is introduced.
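
The truncation rule can be illustrated with a few lines of Python. This is a sketch under the assumption that the time-of-day is held as a non-negative nanosecond count; `truncate_nanos_of_day` is a hypothetical helper name, not one introduced by this PR.

```python
def truncate_nanos_of_day(nanos_of_day: int, precision: int) -> int:
    """Hypothetical helper: drop fractional digits beyond `precision`
    (0..9) without rounding."""
    if not 0 <= precision <= 9:
        raise ValueError("precision must be in [0, 9]")
    # Size of one precision step in nanoseconds, e.g. 1000 for p = 6.
    step = 10 ** (9 - precision)
    # Subtracting the floor-based remainder snaps down to the step boundary.
    return nanos_of_day - nanos_of_day % step


# 12:34:56.789012345 expressed as nanoseconds since midnight.
value = (12 * 3600 + 34 * 60 + 56) * 1_000_000_000 + 789_012_345

as_time_9 = truncate_nanos_of_day(value, 9)  # 12:34:56.789012345
as_time_6 = truncate_nanos_of_day(value, 6)  # 12:34:56.789012
as_time_0 = truncate_nanos_of_day(value, 0)  # 12:34:56
```

Because Python's `%` is floor-based, the same expression also floors negative inputs, which matches the "floor toward the precision step" wording for pre-epoch timestamp values.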

Implementation notes:
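- New rule-table entries in `Cast.canCast` / `Cast.canAnsiCast` for the four (source, target) pairs: `TIME` to and from `TimestampType`, and `TIME` to and from `TimestampLTZNanosType`. `canTryCast` inherits these for atomic types.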
- All four pairs are marked `needsTimeZone` (both directions read the session zone).
- Interpreted and codegen paths for both directions.
- `ComputeCurrentTime` scans `CAST` nodes and, applying the new `Cast.isTimeToTimestampLTZ` predicate on the resolved plan, rewrites `TIME -> TIMESTAMP_LTZ` into a zone-aware date+time builder (new internal `MakeTimestampLTZ` / `MakeTimestampLTZNanos`) anchored on the query current date. As with the NTZ feature, these casts are intentionally not tagged with `CURRENT_LIKE` (inline-table validation treats `CURRENT_LIKE` as safe to defer). The `Cast` eval/codegen fallback (using `currentDate(zoneId)`) covers direct expression evaluation.
- New helpers: `SparkDateTimeUtils.timestampToNanosOfDay` / `timestampLTZNanosToNanosOfDay` and `DateTimeUtils.makeTimestampLTZNanos`.
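
The idea behind the `TIME -> TIMESTAMP_LTZ` rewrite can be modelled in Python as follows. This is only a sketch of the date+time builder concept: the function below is a hypothetical Python stand-in whose signature is an assumption (it is not the signature of `DateTimeUtils.makeTimestampLTZNanos`), the result is assumed to be nanoseconds since the Unix epoch, and a fixed-offset zone stands in for the session time zone.

```python
from datetime import date, datetime, timedelta, timezone, tzinfo

NANOS_PER_SECOND = 1_000_000_000
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def make_timestamp_ltz_nanos(
    current_date: date, nanos_of_day: int, session_zone: tzinfo
) -> int:
    """Hypothetical model: combine a date and a time-of-day, interpret the
    local date-time in `session_zone`, and return epoch nanoseconds."""
    seconds_of_day, nano_fraction = divmod(nanos_of_day, NANOS_PER_SECOND)
    # Adding a timedelta to an aware datetime is wall-clock arithmetic,
    # so `local` is the local date-time "current_date + time-of-day".
    local_midnight = datetime(
        current_date.year, current_date.month, current_date.day,
        tzinfo=session_zone,
    )
    local = local_midnight + timedelta(seconds=seconds_of_day)
    # Subtracting aware datetimes converts both to UTC first.
    epoch_seconds = (local - EPOCH) // timedelta(seconds=1)
    return epoch_seconds * NANOS_PER_SECOND + nano_fraction


# The date is computed once per query and shared by every cast in the plan.
query_date = date(2020, 5, 17)
session_zone = timezone(timedelta(hours=-7))
time_nanos = (12 * 3600 + 34 * 60 + 56) * NANOS_PER_SECOND + 789_012_345

first = make_timestamp_ltz_nanos(query_date, time_nanos, session_zone)
second = make_timestamp_ltz_nanos(query_date, time_nanos, session_zone)
```

Passing the date in as an argument, rather than reading the clock inside the function, mirrors the point of the rewrite: every cast in a query sees the same date, so a query that runs across midnight cannot produce mixed results.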

Out of scope: Structured Streaming batch-timestamp parity for `TIME -> TIMESTAMP_LTZ` (the cast uses the optimizer-instant current date rather than the micro-batch timestamp).

### Why are the changes needed?

Spark supports `TIME` <-> `TIMESTAMP_NTZ` casts (SPARK-57618) but not `TIME` <-> `TIMESTAMP_LTZ`. These conversions are required by the SQL standard and are a common user need (attaching a time-of-day to a timestamp, or extracting the time-of-day from a timestamp). This is a sub-task of SPARK-56822 (timestamps with nanosecond precision).

### Does this PR introduce _any_ user-facing change?

Yes. Casting between `TIME(p)` and `TIMESTAMP_LTZ(q)` is now allowed (previously it failed analysis with a cast type-mismatch). Examples:

```sql
-- extract the time-of-day in the session time zone
SELECT CAST(TIMESTAMP'2020-05-17 12:34:56.789012' AS TIME(6));
-- 12:34:56.789012

-- attach the current date, interpreted in the session time zone
SELECT CAST(TIME'12:34:56.789012345' AS TIMESTAMP_LTZ(9));
-- <current_date> 12:34:56.789012345
```

This is a new feature on an unreleased branch; there is no behavior change relative to a released version.

### How was this patch tested?

- New unit tests in `CastSuiteBase` (run under ANSI on and off): allowed-pair / `needsTimeZone` matrix, `isTimeToTimestampLTZ` truth table, `TIMESTAMP_LTZ(q) -> TIME(p)` values across all precisions (including pre-epoch and sub-microsecond truncation), interpreted-vs-codegen consistency, and zone-fixed round trips.
- New tests in `DateExpressionsSuite` for `MakeTimestampLTZ` / `MakeTimestampLTZNanos` (including canonicalization on precision).
- New test in `ComputeCurrentTimeSuite` asserting the forward cast is rewritten with a query-stable current-date literal consistent with `current_date()`.
- New unit tests in `DateTimeUtilsSuite` for `makeTimestampLTZNanos` and the time-of-day extraction helpers.
- New deterministic cases in `cast.sql` (and the imported `nonansi/cast.sql`) with regenerated golden files.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor (Claude Opus 4.8)

MaxGekk commented Jun 24, 2026

@cloud-fan @uros-b @stevomitric Could you review this PR, please? It is similar to the recently merged PR for TIME(p) <-> TIMESTAMP_NTZ(q).
