build: Enable Spark SQL tests for Spark 4.1.1 #4093
Open
andygrove wants to merge 19 commits into apache:main from
Conversation
Add a spark-4.1 Maven profile (root pom.xml and spark/pom.xml) targeting Spark 4.1.1, generate dev/diffs/4.1.1.diff from the 4.0.1 diff, and add a Spark 4.1.1 entry to the spark_sql_test workflow matrix. The spark-4.1 profile reuses the spark-4.0 shim sources for now. Iceberg and Jetty deps mirror the spark-4.0 profile.
…ment new-version workflow
Spark 4.1 widened IndexShuffleBlockResolver's third constructor parameter from java.util.Map to java.util.concurrent.ConcurrentMap, so reflectively invoking it with Collections.emptyMap() now fails with "argument type mismatch". Pass a ConcurrentHashMap instead, which is assignable to both. Also expand docs/source/contributor-guide/spark-sql-tests.md with the profile/shim setup and local sbt workflow lessons (NOLINT_ON_COMPILE, fresh -Dmaven.repo.local, common API-shape changes between Spark minors).
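A minimal Scala sketch of the idea, not the actual Comet shim code; the helper name, the constructor lookup by arity, and the untyped parameters are assumptions:

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical helper illustrating the fix: select the 3-arg constructor by arity and
// pass a ConcurrentHashMap, which satisfies both the old java.util.Map parameter
// (Spark <= 4.0) and the new ConcurrentMap parameter (Spark 4.1).
def newIndexShuffleBlockResolver(conf: AnyRef, blockManager: AnyRef): AnyRef = {
  val cls = Class.forName("org.apache.spark.shuffle.IndexShuffleBlockResolver")
  val ctor = cls.getDeclaredConstructors
    .find(_.getParameterCount == 3)
    .getOrElse(sys.error("expected a 3-argument constructor"))
  ctor.setAccessible(true)
  // Collections.emptyMap() fails here on 4.1 with "argument type mismatch" because the
  // parameter is now a ConcurrentMap; ConcurrentHashMap is assignable to both.
  ctor.newInstance(conf, blockManager, new ConcurrentHashMap[AnyRef, AnyRef]())
}
```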
Mirror spark-4.0 test shim tree so JVM tests can compile under -Pspark-4.1.
Wire isSpark41Plus into CometPlanStabilitySuite and update regenerate-golden-files.sh to accept --spark-version 4.1. Generated golden files for tpcds v1.4 and v2.7 under approved-plans-{v1_4,v2_7}-spark4_1.
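Version gates like this are typically one-liners; the following is a hedged sketch, not necessarily where or how Comet defines it:

```scala
import org.apache.spark.SPARK_VERSION

// Sketch of a version gate; assumes SPARK_VERSION is "major.minor.patch" and that a
// simple string ordering check is sufficient for the versions currently in play.
def isSpark41Plus: Boolean = SPARK_VERSION >= "4.1"

// Example use: pick the golden-file directory suffix in a plan-stability suite
// (the 4.0 suffix shown here is an assumption for illustration).
def goldenDirSuffix: String = if (isSpark41Plus) "spark4_1" else "spark4_0"
```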
Mirrors the spark-4.0 CometTypeShim helper that apache#4084 added to the scan rule. VariantMetadata.isVariantStruct exists in Spark 4.1.1 (in PushVariantIntoScan.scala), so the implementation is identical to spark-4.0.
Comet's Maven phase resolves the transitive dependency graph and pulls POMs for artifacts whose JARs it never needs. When sbt then resolves Spark's deps, sbt-coursier sees the POM in mavenLocal, declares the artifact found locally, and fails on the missing JAR without falling back to Maven Central. Delete pom-only entries (where packaging is jar/bundle) between Comet's install and sbt's update so sbt re-fetches the full artifact remotely. Hit on Spark 4.1 jobs because the cached scala-xml_2.13:2.1.0 pom blocks Spark's tags subproject from resolving the JAR.
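The cleanup itself runs as a CI shell step; purely to illustrate the rule it applies, here is a hedged Scala sketch (simplified: it only checks for a missing JAR next to a cached POM, whereas the real step also inspects the packaging so that pom-packaged parents are kept):

```scala
import java.nio.file.{Files, Path, Paths}
import scala.jdk.CollectionConverters._

// Walk the local Maven repo and delete POMs whose JAR was never downloaded, so that
// sbt-coursier re-resolves the full artifact from Maven Central instead of trusting
// the partial local entry.
def prunePomOnlyEntries(repo: Path): Unit = {
  val poms = Files.walk(repo).iterator().asScala
    .filter(_.toString.endsWith(".pom"))
    .toList
  poms.foreach { pom =>
    val jar = Paths.get(pom.toString.stripSuffix(".pom") + ".jar")
    if (!Files.exists(jar)) Files.delete(pom)
  }
}

// e.g. prunePomOnlyEntries(Paths.get(sys.props("user.home"), ".m2", "repository"))
```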
Adds a Spark 4.1, JDK 17 entry to lint-java and linux-test in pr_build_linux.yml and a Spark 4.1, Scala 2.13 entry to pr_build_macos.yml so PR builds exercise the new spark-4.1 profile alongside 3.4/3.5/4.0.
semanticdb-scalac_2.13.17:4.13.6 is not published; the most recent semanticdb build for the 2.13 branch is for 2.13.16. Scala 2.13 is binary-stable across patch versions, so Comet compiled with 2.13.16 still links against Spark 4.1.1's 2.13.17 runtime.
Reverts the 2.13.16 pin: Spark 4.1.1 is compiled against 2.13.17 and emits calls into MurmurHash3 stdlib methods that don't exist in 2.13.16, so any TreeNode hashCode at runtime throws NoSuchMethodError. semanticdb-scalac_2.13.17 isn't published yet, so drop the spark-4.1 entry from the lint-java matrix (which runs -Psemanticdb scalafix); the linux-test matrix entry stays. Verified locally: ./mvnw test -Pspark-4.1 -Dsuites=org.apache.comet.CometFuzzMathSuite passes 30/30.
The scala-2.13 profile sets scala.version=2.13.16, overriding the spark-4.1 profile's 2.13.17 pin. Activating both produced a 2.13.16 runtime that hits NoSuchMethodError on Spark 4.1.1's TreeNode.hashCode (calls into MurmurHash3 stdlib methods added in 2.13.17). The spark-4.1 profile already sets scala.binary.version=2.13, so -Pscala-2.13 is redundant.
1. CometNativeWriteExec: Spark 4.1 made newTaskTempFile(taskContext, dir, ext: String) throw mustOverrideOneMethodError by default; the FileNameSpec overload is now the supported one. Switch the call to use FileNameSpec("", ""), which exists from 3.4 onward, so it works across all profiles (a sketch of the call-site shape follows this list).
2. CometExpressionSuite remainder function test: Spark 4.1 introduced REMAINDER_BY_ZERO for the % operator instead of reusing DIVIDE_BY_ZERO. Branch the expected error message on isSpark41Plus.
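A hedged sketch of the call-site shape for item 1; the committer and task context here are placeholders, and only the overload switch mirrors the actual change:

```scala
import org.apache.hadoop.mapreduce.TaskAttemptContext
import org.apache.spark.internal.io.{FileCommitProtocol, FileNameSpec}

// Use the FileNameSpec overload, available from Spark 3.4 onward, instead of the
// (taskContext, dir, ext: String) overload that Spark 4.1 makes throw by default.
def newTempFile(committer: FileCommitProtocol, ctx: TaskAttemptContext): String =
  committer.newTaskTempFile(ctx, None, FileNameSpec("", ""))
```

Item 2 only needs the expected error class branched on the same isSpark41Plus gate sketched earlier.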
manuzhang
reviewed
Apr 27, 2026
import org.apache.spark.sql.types.{DataType, StringType, StructType}

trait CometTypeShim {
  // A `StringType` carries collation metadata in Spark 4.0. Only non-default (non-UTF8_BINARY)
Member
If this is common for Spark 4.0 and Spark 4.1, we can move it from spark-4.0 to spark-4.x.
Member
It looks most shims in Spark 4.1 are identical to Spark 4.0 except for CometExprShim. I added a CometSumShim for Spark 4.1 in #2829 and moved other shims from spark-4.0 to spark-4.x.
Member
@andygrove Has this been superseded by #4097 and #4106?
Member
Author
This PR has been rebased and now just enables the Spark SQL tests.
This was referenced Apr 28, 2026
Comet native scan rejects invalid UTF-8 byte sequences in STRING column (hll.sql on Spark 4.1)
#4121
Open
Patches dev/diffs/4.1.1.diff to handle the four failing CI jobs:
- CollationSuite 'hash agg is not used for non binary collations': recognise CometHashAggregateExec in the binary-collation plan-shape assertion (the existing patch already handled the join-shape cases).
- StreamRealTimeModeAllowlistSuite: mix in IgnoreCometSuite. The suite asserts on hard-coded operator-name strings that Comet replaces, and the allowlist mechanism is orthogonal to Comet.
- Disable Comet for hll.sql, except-all.sql, intersect-all.sql, having-and-order-by-recursive-type-name-resolution.sql via --SET spark.comet.enabled = false, and tag SPARK-53968 in SQLViewSuite with IgnoreComet.

Each carries a TODO link to the follow-up issue tracking the underlying Comet bug:
- issues/4121: native scan rejects invalid UTF-8 in STRING column
- issues/4122: EXCEPT ALL / INTERSECT ALL with GROUP BY incorrect
- issues/4123: native sort lacks row-format support for Struct(Map)
- issues/4124: decimal arithmetic 10x discrepancy through view CTE
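For the IgnoreComet tagging mentioned above, the change in the diff looks roughly like this sketch; the tag definition ships with the existing Comet patches to the Spark test tree, and the suite name, test name, and tag internals shown here are placeholders:

```scala
import org.scalatest.Tag
import org.scalatest.funsuite.AnyFunSuite

// Placeholder stand-in for the tag the Comet diffs add to Spark's test tree; the real
// definition may differ in name string and location.
case class IgnoreComet(reason: String) extends Tag("IgnoreComet")

class ExampleViewSuite extends AnyFunSuite {
  // Tagging a single test keeps it out of Comet-enabled runs while leaving the body
  // untouched; the reason string carries the TODO link to the follow-up issue.
  test("SPARK-53968: example", IgnoreComet("TODO: link to the follow-up Comet issue")) {
    // unchanged test body
  }
}
```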
Which issue does this PR close?
Rationale for this change
We currently run the Spark SQL test suites against 3.4.3, 3.5.8, and 4.0.1 in CI. This PR adds Spark 4.1.1 to that matrix so we can catch incompatibilities introduced by that release as early as possible.
What changes are included in this PR?
New `dev/diffs/4.1.1.diff` generated by applying `dev/diffs/4.0.1.diff` to the `v4.1.1` Spark tag with `git apply --reject` and resolving rejects manually. Most rejects were import-only differences caused by surrounding-context changes in 4.1.1 (for example, `ShuffleExchangeExec` vs `ShuffleExchangeLike` and the additional Comet imports). The `pom.xml` portion of the diff sets `spark.version.short=4.1` so the patched Spark build pulls `comet-spark-spark4.1_2.13`.

Workflow update: `{spark-short: '4.1', spark-full: '4.1.1', java: 17, scan-impl: 'auto'}` entry added to `.github/workflows/spark_sql_test.yml`.

Docs update: expanded `docs/source/contributor-guide/spark-sql-tests.md` with two new sections covering (1) adding a profile and shim tree for a brand-new Spark minor, including the API-shape patterns above, and (2) running the SQL tests locally against a patched Spark clone (`NOLINT_ON_COMPILE=true` to skip Spark's scalastyle, and the `-Dmaven.repo.local=/tmp/spark-m2-repo` workaround for partially-cached `~/.m2` entries that confuse Coursier).

How are these changes tested?
Locally:
- `make release PROFILES=-Pspark-4.1` builds cleanly.
- Against the patched Spark 4.1.1 source with `ENABLE_COMET=true ENABLE_COMET_ONHEAP=true`:
  - `catalyst/test`: 8472 tests passed.
  - `sql/testOnly org.apache.spark.sql.MathFunctionsSuite`: 61 tests passed.

The full SQL/Hive matrix will run as part of Spark SQL Tests on this PR; the run itself is the broader test plan.