[SPARK-57830][SS] Support event-time watermark on nanosecond-precision timestamp columns #56944

Open
yadavay-amzn wants to merge 2 commits into apache:master from yadavay-amzn:SPARK-57830
yadavay-amzn commented Jul 1, 2026

Contributor

What changes were proposed in this pull request?

Allows nanosecond-precision timestamp columns (TimestampNTZNanosType, TimestampLTZNanosType) to be used as the event-time column for Structured Streaming watermarks. The analyzer type-check for withWatermark now accepts these types. EventTimeWatermarkExec converts the nanosecond event-time value to milliseconds for watermark tracking (epochMicros -> microsToMillis), so it is tracked on the same scale as the existing microsecond-timestamp path. The watermark eviction literal is constructed with the correct nanosecond type.
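For reviewers who want to check the scale arithmetic, here is a minimal sketch of the nanoseconds -> epochMicros -> milliseconds chain described above. It is plain Python, not the code in this PR. The function names are hypothetical, and it assumes the event time is available as a single count of nanoseconds since the epoch, which may not match the internal representation of the nanosecond types. Floor division is used so that pre-epoch (negative) values round toward negative infinity.

```python
# Illustrative only: hypothetical helpers, not Spark APIs.
NANOS_PER_MICRO = 1_000
MICROS_PER_MILLI = 1_000


def nanos_to_epoch_micros(epoch_nanos: int) -> int:
    """Drop the sub-microsecond part (assumed epoch-nanosecond input)."""
    return epoch_nanos // NANOS_PER_MICRO


def micros_to_millis(epoch_micros: int) -> int:
    """Floor epoch microseconds down to epoch milliseconds."""
    return epoch_micros // MICROS_PER_MILLI


def nanos_event_time_to_millis(epoch_nanos: int) -> int:
    """Chain both steps: nanoseconds -> microseconds -> milliseconds."""
    return micros_to_millis(nanos_to_epoch_micros(epoch_nanos))


# 1.500000999 seconds after the epoch lands in millisecond 1500.
after_epoch_ms = nanos_event_time_to_millis(1_500_000_999)
# One nanosecond before the epoch floors to millisecond -1, not 0.
before_epoch_ms = nanos_event_time_to_millis(-1)
```

Flooring in two steps gives the same result as flooring once from nanoseconds to milliseconds, so the intermediate microsecond value does not change the tracked watermark scale.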

Why are the changes needed?

Part of nanosecond-precision timestamp support (SPARK-56822). Streaming queries using a nanosecond-precision timestamp as the event-time column previously failed the withWatermark type-check (EVENT_TIME_IS_NOT_ON_TIMESTAMP_TYPE).

Does this PR introduce any user-facing change?

Yes - withWatermark now accepts nanosecond-precision timestamp event-time columns; the watermark advances identically to an equivalent microsecond-timestamp column.
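The equivalence claim can be illustrated with a small model of watermark advancement. This is a sketch under stated assumptions, not the operator's implementation: it assumes the watermark candidate for a batch is the maximum event time in milliseconds minus the delay threshold, and that the watermark never moves backwards. `advance_watermark` and the sample data are hypothetical.

```python
# Illustrative model only: hypothetical names and sample data.
from typing import Iterable


def advance_watermark(current_ms: int, event_times_ms: Iterable[int], delay_ms: int) -> int:
    """Return the new watermark; it only ever moves forward."""
    times = list(event_times_ms)
    if not times:
        return current_ms
    return max(current_ms, max(times) - delay_ms)


# The same three events (10 s, 25 s, 15 s), once as epoch microseconds...
micros_batch = [10_000_000, 25_000_000, 15_000_000]
# ...and once as epoch nanoseconds with extra sub-microsecond digits.
nanos_batch = [m * 1_000 + 123 for m in micros_batch]

delay_ms = 10_000  # corresponds to a "10 seconds" delay threshold

wm_from_micros = advance_watermark(0, (m // 1_000 for m in micros_batch), delay_ms)
wm_from_nanos = advance_watermark(0, (n // 1_000_000 for n in nanos_batch), delay_ms)
```

In this model both inputs produce a watermark of 15000 ms, because the sub-microsecond digits are discarded before the millisecond value is tracked.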

How was this patch tested?

New EventTimeWatermarkSuite tests for LTZ-nanos and NTZ-nanos event-time columns asserting correct watermark advancement, plus a microsecond/nanosecond equivalence test. All EventTimeWatermarkSuite tests pass.

Limitations / follow-ups

This PR covers the watermark itself: acceptance of the event-time column and watermark advancement. Two areas are tracked as follow-ups (SPARK-57829, SPARK-57843): window / session_window / window_time over nanosecond timestamps, and full nanosecond support in streaming stateful operators (e.g. dropDuplicatesWithinWatermark). Until SPARK-57843 adds support, using a nanosecond-precision event-time column with dropDuplicatesWithinWatermark raises a clear analysis error rather than silently mishandling the column.

Was this patch authored or co-authored using generative AI tooling?

Authored with assistance by Claude Opus 4.8.

MaxGekk commented Jul 2, 2026

Member

@HeartSaVioR Could you review this PR, please?
