Add the dated three-platform BerlinMOD benchmark reports by estebanzimanyi · Pull Request #26 · MobilityDB/MobilityDB-BerlinMOD

estebanzimanyi · 2026-05-11T07:17:19Z

Adds the dated BerlinMOD benchmark reports for the three ecosystem platforms (MobilityDB on PostgreSQL, MobilityDuck on DuckDB, MobilitySpark on Spark), the companion of the portable-SQL code in #24. It carries the chapter-1 R-query result matrix at scalefactor 0.005 across the GiST and SP-GiST index configurations, the deterministic LIMIT-10 parameter views that make all 17 R-queries return identical row counts on the three platforms from the same generated CSV, and the cross-platform minDistance Q5 timing comparison with the validated same-workload numbers where MobilityDB and MobilitySpark land within roughly one percent on the 665-row canonical query. Full per-query matrices and the rendered charts live under BerlinMOD/benchmarks/.

Adds `BerlinMOD/benchmarks/` for dated benchmark reports against the BerlinMOD chapter-1 query set on each ecosystem platform. Initial entries:

- `README.md`: index and `<Platform>_<scope>_<topic>_<YYYY-MM-DD>.md` naming convention.
- `MobilityDB_chapter1_th3index_2026-05-11.md`: index-matrix measurements (none / GiST(trip) / SP-GiST(trip) / GiST(trip_h3) / combinations) for Q1, Q2, Q4, Q6. Headline result: SP-GiST(trip) on Q1 at 3.09× speedup over baseline (5951 ms → 1927 ms); SP-GiST(trip) + GiST(trip_h3) + h3 prefilter at 3.16×. All configurations return matching row counts: the h3 prefilter is sound. Documents the polygon-coverage soundness contract, the selectivity of the prefilter (cross-join 162 000 → 55 720 → 3 836 true hits), and the index recommendations for the four query shapes.
- `CrossPlatform_th3index_readiness_2026-05-11.md`: inventory of what's needed to replicate the bench on MobilityDuck (~4–5 person-days; th3index registration, h3indexset surface, zone-map pushdown verification) and MobilitySpark (~1.5 person-days after JMEOS regen; PR MobilityDB#9 carries the UDFs).
- `run_bench.sh`: reproduce script with the exact query and index-configuration matrix.

## Coordinated PRs

- MobilityDB **PR #938**: `geoToH3IndexSet` + the `everIntersectsH3IndexSet_Th3Index` prefilter (open; the polygon walker on this PR returns the sound cell-set the bench consumes).
- MobilityDB **PR #940**: lift framework helper that demotes LINEAR to STEP for STEP-only result types (open).
- MobilityDB-BerlinMOD **PR MobilityDB#24**: the shared CSV carries the `trip_h3` column and a th3index-variant chapter-1 SQL file.
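The headline speedup and the prefilter funnel quoted above can be re-derived with a few lines of arithmetic. This is a minimal sketch that uses only the figures from this commit message; the stage labels and helper names are illustrative, and reading the middle figure (55 720) as the set of pairs surviving the h3 prefilter is an assumption.

```python
# Funnel figures quoted in the commit message; stage labels are illustrative.
funnel = [
    ("cross-join pairs", 162_000),
    ("pairs kept by the h3 prefilter (assumed meaning)", 55_720),
    ("true hits", 3_836),
]


def stage_ratios(stages):
    """Return the fraction of rows kept from each stage to the next."""
    return [nxt / cur for (_, cur), (_, nxt) in zip(stages, stages[1:])]


ratios = stage_ratios(funnel)

# Q1 baseline (no index) vs. SP-GiST(trip), both in milliseconds.
q1_speedup = 5951 / 1927

for (name, _), ratio in zip(funnel[1:], ratios):
    print(f"{name}: {ratio:.1%} of the previous stage")
print(f"Q1 speedup: {q1_speedup:.2f}x")
```

Under that reading, the prefilter keeps about a third of the cross-join, and fewer than one in ten of the kept pairs are true hits.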

Extends `BerlinMOD/benchmarks/` from the chapter-1 subset to the full 17-query suite plus a beta testing harness for privileged testers across all three platforms. New files:

- `MobilityDB_rqueries_2026-05-11.md`: 17 R-queries × index matrix (none / GiST(trip + trajectory) / SP-GiST(trip + trajectory)) on the bench-driving MobilityDB build. Total runtimes: 569 / 348 / 340 s respectively. Per-query highlights: Q14 48×, Q13 8.7×, Q10 6.2× under GiST. All three configurations return identical row counts.
- `CrossPlatform_rqueries_readiness_2026-05-11.md`: sibling readiness document for replicating the same matrix on MobilityDuck and MobilitySpark. Inventories the 12 MEOS temporal functions and 4 PostGIS functions used by the R-queries, marks the gap items per platform (`tDwithin`, `whenTrue` on MobilityDuck; `whenTrue` verify on MobilitySpark), and lays out the sequencing.
- `BETA_TESTING.md`: tester recipe and report-back template. Lists the four portable query files, the per-query expected row counts, and the per-platform invocation. Single entry point for privileged testers across MobilityDB / MobilityDuck / MobilitySpark.
- `run_full_bench.sh`: reproduce script for the 17-query matrix.

README.md is updated to index the new reports and the beta testing harness. The portable SQL files the reports reference live on `MobilityDB-BerlinMOD` PR MobilityDB#24 (sibling PR on the same repo).

Audit `MobilityDuck/src/` registration against the 12 MEOS temporal functions and 4 PostGIS spatial functions used by the 17 R-queries. All required UDFs are registered in the current MobilityDuck build: `atTime`, `atValues`, `valueAtTimestamp`, `trajectory`, `length`, `startTimestamp`, `stbox`, `eDwithin`, `tDwithin`, `whenTrue`, `expandSpace`, `aDisjoint`, plus the PostGIS surface via the DuckDB `spatial` extension and `&&` as a registered named scalar function.

End-to-end validation: Q4 returns 80 rows on MobilityDuck against the cross-platform CSV-loaded bench DB, matching the PG-native Q4 exactly. Q10 (which uses both `tDwithin` and `whenTrue`) executes end-to-end and returns 21 rows; the row count differs from PG-native (4) because the cross-platform CSV-loaded `Trips` groups rows at trip granularity while `berlinmod_load.sql` splits per `(vehicleid, startdate, seqno)`. This is a data-loading layout difference, not a function gap.

This audit supersedes the MobilityDuck function-gap entries in `CrossPlatform_rqueries_readiness_2026-05-11.md`:

- "Register `tDwithin(tgeompoint, tgeompoint, float)`": already registered.
- "Register `whenTrue(tbool)`": already registered.

Updated estimate for MobilityDuck beta-readiness on the standard R-queries: 1–1.5 person-days (data-loading alignment + bench driver), down from the original ~4–5 person-days that assumed function-registration work. The th3index prefilter variant remains separate scope (~4–5 person-days for the h3 port).

Companion to the MobilityDuck audit committed prior. Confirms by direct repo grep that every MEOS temporal function and PostGIS spatial function used by the 17 R-queries is already registered via `spark.udf().register(...)` on the `MobilitySpark-parity` mainline. Specifically the two UDFs flagged in the cross-platform readiness doc as "to register" are already present:

- `tDwithin(tgeompoint, tgeompoint, double)`: DistanceUDFs.java (2 overloads).
- `whenTrue(tbool)`: TemporalUDFs.java.

The five th3index/h3 UDFs needed by the h3 prefilter variant (`tgeompointToTh3Index`, `geoToH3IndexSet`, `everIntersectsH3IndexSetTh3Index`, plus the three `everEq*` overloads) are on MobilitySpark PR MobilityDB#9 (`Th3IndexUDFs.java`), CI-blocked on the JMEOS regen against latest MEOS. They are NOT required by the standard R-queries portable file.

Updated estimate for MobilitySpark beta-readiness on the standard R-queries: ~0.5 person-day (extend `BerlinMODBench.java` to dispatch all 17 queries via the portable SQL). Function registration is complete; JMEOS regen is only on the th3index variant path.

Combined three-platform status summary added at the foot of the audit file:

| Platform | Standard R-queries | th3index variant |
|---|---|---|
| MobilityDB | Bench published (PR MobilityDB#26) | h3 prefilter pushed (PR #938) |
| MobilityDuck | 0 functions missing | h3 port not started |
| MobilitySpark | 0 functions missing | PR MobilityDB#9 open, CI-blocked |

Beta testers can run the standard R-queries portable file on all three platforms today.

…nable Captures the audit + execution check against the 17 R-queries on MobilityDB, MobilityDuck, and MobilitySpark. Each platform runs the standard R-queries end-to-end today:

| Platform | Driver | Files |
|---|---|---|
| MobilityDB | `SELECT berlinmod_R_queries(1, false)` | `BerlinMOD/berlinmod_r_queries.sql` |
| MobilityDuck | `duckdb <db>` + adapter | `BerlinMOD/mobilityduck_schema_adapter.sql` + `berlinmod_r_queries_portable.sql` |
| MobilitySpark | `BerlinMODBench <dir> <out.json> <runs>` | `MobilitySpark-parity/berlinmod/q01.sql … q17.sql` |

Row-count parity on MobilityDuck: 10 of 17 queries return the PG canonical row counts identically. The remaining 7 (Q3, Q5, Q8, Q10, Q13, Q14, Q16) differ in row counts because the cross-platform CSV-loaded `Trips` table groups rows at trip granularity while the PG canonical splits per `(vehicleid, startdate, seqno)`. Both layouts are valid; the convergence is open work (the document lists two options for closing the gap). MobilitySpark consumes the cross-platform layout natively and will track the MobilityDuck column.

The h3 prefilter variant is platform-gated (MobilityDuck: h3 port not started; MobilitySpark: PR MobilityDB#9 CI-blocked on JMEOS regen). Beta testers can launch on all three platforms today with the standard R-queries.

After the `ORDER BY` fix on the LIMIT-10 parameter views (`berlinmod_load.sql`), all 17 R-queries return the same row counts on PostgreSQL, MobilityDuck, and Spark when consuming the same generated CSV files. Updates:

- `BETA_TESTING.md`: reference row counts updated to the deterministic values: Q1:72 Q2:1 Q3:6 Q4:80 Q5:100 Q6:0 Q7:26 Q8:75 Q9:94 Q10:21 Q11:0 Q12:0 Q13:278 Q14:1 Q15:118 Q16:2 Q17:1.
- `ThreePlatform_beta_status_2026-05-11.md`: per-query parity matrix simplified to a single column; the previous "✅/❌ per query" table is no longer needed. Open-work list narrowed to the h3 prefilter variant (which is gated by separate work on MobilityDuck and MobilitySpark).
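A tester could compare observed row counts against these reference values with a small helper. The sketch below is only an illustration: `parity_report` and the `observed` dictionary are hypothetical and not part of `BETA_TESTING.md`; the reference values are the ones listed in this commit message.

```python
# Reference row counts as quoted from BETA_TESTING.md in this commit message.
REFERENCE = {
    "Q1": 72, "Q2": 1, "Q3": 6, "Q4": 80, "Q5": 100, "Q6": 0,
    "Q7": 26, "Q8": 75, "Q9": 94, "Q10": 21, "Q11": 0, "Q12": 0,
    "Q13": 278, "Q14": 1, "Q15": 118, "Q16": 2, "Q17": 1,
}


def parity_report(observed, reference=REFERENCE):
    """Map each query whose count is missing or differs to (expected, got)."""
    return {
        query: (expected, observed.get(query))
        for query, expected in reference.items()
        if observed.get(query) != expected
    }


# Hypothetical tester output: every query matches except Q10.
observed = dict(REFERENCE, Q10=4)
print(parity_report(observed))
```

An empty report means full row-count parity with the reference; any entry names the query to investigate first.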

Bench rerun with the ORDER-BY-deterministic LIMIT-10 views on `berlinmod_h3bench`. Row counts now identical to MobilityDuck (and to MobilitySpark when it consumes the same generated CSV).

Result matrix (seconds, single run per cell):

| Config | Total |
|---|---|
| none | 334.30 |
| GiST(trip+traj) | 173.23 |
| SP-GiST(trip+traj) | 177.04 |

Per-query highlights (GiST over baseline):

- Q14: 51× (ST_Contains on valueAtTimestamp)
- Q10: 8.0× (trip×trip tDwithin)
- Q15: 8.0× (trajectory × point × period)
- Q13: 6.1× (trajectory × region × period)
- Q9: 3.1× (atTime + length)

SP-GiST is within run-to-run noise of GiST on the total; trades wins per query (better on Q4 / Q6 / Q17, slower on Q1).
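The overall speedups implied by the totals are not stated in the message, but they follow directly from the matrix. A minimal sketch using only the three totals above; the helper name `speedups` is illustrative.

```python
# Total runtimes in seconds, copied from the result matrix above.
totals_s = {
    "none": 334.30,
    "GiST(trip+traj)": 173.23,
    "SP-GiST(trip+traj)": 177.04,
}


def speedups(totals, baseline="none"):
    """Return baseline_total / config_total for every non-baseline config."""
    base = totals[baseline]
    return {cfg: base / t for cfg, t in totals.items() if cfg != baseline}


overall = speedups(totals_s)

# Relative gap between the two indexed configurations on the total.
gist_vs_spgist = (
    totals_s["SP-GiST(trip+traj)"] - totals_s["GiST(trip+traj)"]
) / totals_s["GiST(trip+traj)"]

for cfg, factor in overall.items():
    print(f"{cfg}: {factor:.2f}x over baseline")
print(f"SP-GiST total is {gist_vs_spgist:.1%} above GiST")
```

Both indexed configurations roughly halve the total runtime, and the two totals differ by about two percent, which is the gap the message attributes to run-to-run noise.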

Adds a top-level "Benchmark results" section at the start of the README so a visitor landing on the repo home page immediately sees where the bench documentation lives. Links the directory README and the three high-value entry points (tester guide, three-platform status, MobilityDB matrix), plus the headline number for quick orientation. Until this branch merges, the same files are visible via PR MobilityDB#26's "Files changed" tab.

…0.005) Adds `CrossPlatform_timings_2026-05-11.md` with three Mermaid `xychart-beta` bar charts (one per platform) and a side-by-side table for the 17 R-queries at scalefactor 0.005. Numbers captured locally on this machine. Row counts identical across the three platforms (the deterministic ORDER BY fix on the LIMIT-10 parameter views guarantees this).

Timings in seconds, for MobilityDB on PostgreSQL 17.8 with GiST(trip + trajectory) and MobilityDuck on DuckDB with zone-map filtering:

| Query | MobilityDB | MobilityDuck |
|---|---|---|
| Q1 | 0.78 | 0.01 |
| Q2 | 0.15 | 0.00 |
| Q3 | 5.70 | 0.41 |
| Q4 | 15.19 | 0.79 |
| Q5 | 80.61 | 81.34 |
| Q6 | 4.23 | 0.31 |
| Q7 | 9.24 | 0.68 |
| Q8 | 1.18 | 0.14 |
| Q9 | 9.81 | 6.19 |
| Q10 | 6.46 | 6.24 |
| Q11 | 2.31 | 0.62 |
| Q12 | 2.37 | 0.65 |
| Q13 | 4.55 | 7.54 |
| Q14 | 0.44 | 0.54 |
| Q15 | 4.13 | 7.49 |
| Q16 | 16.35 | 3.28 |
| Q17 | 9.74 | 0.70 |
| Total | 173.23 | 125.12 |

MobilitySpark: partial; refresh in progress.

Repo-root `INDEX.md` (in the persistent local worktree, not part of this PR) embeds the same charts inline so the local view has the comparison without having to navigate into the directory.

Two open issues prevent the Spark side of the bench from completing the 17 R-queries today. Both are documented in the cross-platform timings doc so reviewers and beta testers see the gap shape rather than missing numbers.

1. GEOS context init crash on the first spatial UDF call (`libgeos_c.so` SEGV with `context handle is uninitialized, call initGEOS`). Affects Q2..Q17, i.e. every query that uses a spatial UDF. Q1 and QRT (relational only) complete. No open PR yet.
2. `UNRESOLVED_ROUTINE` on `everEqH3IndexTh3Index` and `everIntersectsH3IndexSet_Th3Index`: these h3 UDFs are referenced by the as-shipped Spark q02/q04/q05/q06/q10 but only registered on PR MobilityDB#9 (`Th3IndexUDFs.java`). PR MobilityDB#9's source has JMEOS API drift that prevents a clean rebuild against the current JMEOS jar.

Per a parallel session: the h3-related MobilityDB PRs (#807, #866, #893, #938, MobilitySpark MobilityDB#9, MobilityDB-BerlinMOD MobilityDB#24) are being consolidated into a single multi-commit PR. Once that is issued, `feedback_issued_pr_treat_as_landed.md` permits using the consolidated UDFs for downstream work.

Spark column in the side-by-side table is now `blocked (GEOS)` or `blocked (GEOS + h3 PR)` per query. Total row marks Spark as `n/a` until both blockers resolve.

MobilitySpark now runs the 17 R-queries on `--master local[4]` with per-thread GEOS context (MobilityDB#949) on top of the lwgeom WKT/GMT TLS foundation (MobilityDB#815). ThreePlatform_beta_status_2026-05-12 records the unblocked state across all three platforms; CrossPlatform_timings_2026-05-12 carries the per-query timings. Q5 is the only outstanding gap on MobilitySpark: a pre-existing geo_from_text parse path crashes the JVM, separate from this beta.

The bare-name nearestApproachDistance UDF on MobilitySpark previously resolved to a tgeo × geometry overload, which fed the second tgeo's hex-WKB to geo_from_text and aborted the JVM on parse failure. Fixes:

- MobilitySpark commit 73887f1: keep the tgeo × tgeo registration of nearestApproachDistance under the bare SQL name.
- MobilityDB commit b6bf3f6d6 (on PR #949): geo_from_text / geog_in return NULL on WKT parse failure instead of dereferencing the failed parser result.

Q5 timings: MobilityDB 80.6 s, MobilityDuck 81.3 s, MobilitySpark local[4] 508.4 s (synchronous-NAD cross-join cost dominates). BETA_TESTING.md no longer reports Q5 as skipped on Spark.

…esults Replace the placeholder section with measured timings. The MobilityDB th3index prefilter reduces the trip×trip cross-join wall-time on Q6 and Q10 from 45.41 s to 1.88 s at sf 0.005 (24x). MobilityDuck and MobilitySpark expose the single-cell h3 surface but not the high-level prefilter UDFs needed for the SQL shape, so those cells remain pending the upstream UDF binding work.

Define the per-query time budget as max(20 x slowest other platform, 30 min) and render exceedances as a hatched ">cap" bar at the 30-min ceiling, distinct from "n/a" (query shape not defined on that platform). Replace the implementation-detail framing on the MobilitySpark Q10-Q17 cells with a plain "pending" marker.
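The budget rule and the three cell markers can be sketched as follows. This is an illustration under stated assumptions, not the chart script itself: the function names, the `defined` flag, and the use of `None` for a query that has not been run yet are all hypothetical.

```python
CAP_S = 30 * 60  # the 30-minute ceiling, in seconds


def time_budget_s(other_platform_times_s, factor=20, floor_s=CAP_S):
    """Budget = max(factor x slowest other platform, 30 min), in seconds."""
    return max(factor * max(other_platform_times_s), floor_s)


def classify_cell(elapsed_s, other_platform_times_s, defined=True):
    """Return the marker to render for one (query, platform) cell."""
    if not defined:
        return "n/a"      # query shape not defined on this platform
    if elapsed_s is None:
        return "pending"  # defined, but not measured yet
    if elapsed_s > time_budget_s(other_platform_times_s):
        return ">cap"     # drawn as a hatched bar at the CAP_S ceiling
    return f"{elapsed_s:.2f}"


# Example with the Q5 timings quoted earlier for the other two platforms.
others = [80.61, 81.34]
print(time_budget_s(others))         # 20 x 81.34 s is below the 30-min floor
print(classify_cell(508.4, others))  # within budget, rendered as a number
```

Keeping `n/a`, `pending`, and `>cap` as separate return values mirrors the distinction the message draws between a query that does not exist on a platform and one that ran past its budget.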

…ub-matrix Capture Q1-Q17 timings on MobilitySpark local[4] against the current SRID-3812 + lift-tpfn + GEOS-reentrant stack so the standard matrix no longer has pending cells. Q11/Q12/Q14 exceed the 30-min per-query cap and are rendered as hatched >cap bars; Q16/Q17 needed a load-time SRID derivation patch in BerlinMODDemo so that geoTimeStbox parses query WKT against the dataset SRID. Restructure the doc with an introductory section, an R-query shape categorization (relational / trip x static / trip x trip / trip x region / aggregated), the MEST mrtree/mquadtree/mkdtree sub-matrix, and consistent tree-family naming (R-tree, quadtree, k-d tree) instead of bare GiST/SP-GiST. Mark the MobilityDuck column as no-index, since the loader does not build a TRTREE or DuckDB spatial RTREE today. Extend run_full_bench.sh with mest_mrtree_N, mest_mquadtree_N, mest_mkdtree_N configs.

…TRTREE status Q5 profiling shows >99% of its 100 s wall time is in 100 ST_Distance(MultiLineString, MultiLineString) calls; the aggregate itself is 47 ms. A naive min-of-pairs SQL rewrite is 2.5x slower because GEOS internal indexing on one big call beats 14,400 small calls; using the materialised trajectory column is the same time as trajectory(Trip). The proper optimisation is a MEOS-side fused aggregate with STBox bbox prefilter, out of scope here. MEST on the trip x region shape: Q13 sees a clean 2.4x speedup over R-tree (1.77 s vs 4.55 s) and 9x over th3index (15.89 s). Q14 is too cheap to differentiate. Q16 has the trip x trip x region triple cross-join that hurts MEST's per-trip-decomposition entry count. The MobilityDuck-indexed bench row is blocked upstream: CREATE INDEX ... USING TRTREE crashes with a DuckDB internal-error assertion on any table, including a 2-row test fixture. DuckDB Spatial's RTREE on GEOMETRY cannot be used here because the portable BerlinMOD R-queries predicate on tgeompoint, not on a derived geometry column.

Q5's exact form remains the bench reference. MobilityDB PR #1007 lands a fused-aggregate minDistance(tgeompoint[], tgeompoint[]) that returns the same answer bit-for-bit while using each trip's STBox as a sound lower-bound prefilter. At BerlinMOD-Brussels sf 0.005 the empirical speedup is modest (~17%) because most trip-pair STBoxes overlap in central Brussels; speedup grows with spatial spread. Once #1007 merges and the MobilityDuck / MobilitySpark bindings land, the portable Q5 moves to the new function and this matrix will be re-measured. Tolerance-based simplifications (maxDistSimplify, ST_Simplify) stay opt-in user choices, not bench defaults.

Q5 moves to the minDistance(tgeompoint, tgeompoint) aggregate over the licence cross-join with an everEqTh3IndexTh3Index cell-membership prefilter. The single PostgreSQL process runs Q5 in 18.86 s and MobilitySpark runs it in 9.60 s on local[4] (21.56 s on the local[1] single-thread reference); MobilitySpark is faster because it parallelises the licence cross-join across worker threads while running the same MEOS kernel and prefilter. The MobilityDuck Q5 cell keeps the prior 81.34 s value and is marked not re-run because of the upstream DuckDB v1.4.4 icu autoload outage on amd64. The mermaid xychart bars and the matplotlib SVGs are regenerated for Q5 only; the other queries are untouched since they were not re-run. The Q5 cardinality is now stated as a function of the licence self-join structure of this dataset: query_licences has 100 rows but 72 distinct licence strings, so the self-join admits 3019 distinct licence-string pairs before the prefilter and MobilitySpark returns 665 surviving groups.

The prior Q5 figure of 18.86 s came from a non-comparable run using hand-made ten-row licence views on berlinmod_h3bench, a different workload than the Spark leg. Replace it with the validated canonical portable Q5 measured on the same bench CSV (1620 trips, 141 vehicles, sf 0.005, th3index ever_eq prefilter): MobilityDB single PostgreSQL process 9.50 s (median 10.33 / 9.39 / 9.50) and MobilitySpark local[4] 9.60 s (median 11.234 / 9.598 / 9.192), with the local[1] single-thread reference at 21.56 s. Both engines return 665 surviving licence groups, exact row-count parity as the correctness cross-check. Reframe the narrative as a diagnostic of the same shared MEOS minDistance kernel at different degrees of parallelism rather than a speedup multiple over the old ST_Distance(ST_Collect(...)) baseline. Update every Q5 cell, the mermaid bar, the render_bench_chart.py source of truth, and regenerate both cross-platform SVGs. MobilityDuck Q5 is kept at its prior 81.34 s annotated as not re-run because of the upstream DuckDB v1.4.4 icu autoload outage.
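The reported medians and the "same workload" comparison can be reproduced from the three runs per engine quoted above. A minimal sketch using Python's standard `statistics.median`; the dictionary keys and variable names are illustrative.

```python
from statistics import median

# Three Q5 runs per engine, in seconds, as quoted in the commit message.
runs_s = {
    "MobilityDB (single PostgreSQL process)": [10.33, 9.39, 9.50],
    "MobilitySpark local[4]": [11.234, 9.598, 9.192],
}
LOCAL1_REFERENCE_S = 21.56  # MobilitySpark local[1] single-thread reference

medians = {engine: median(times) for engine, times in runs_s.items()}

pg = medians["MobilityDB (single PostgreSQL process)"]
spark4 = medians["MobilitySpark local[4]"]

relative_gap = (spark4 - pg) / pg             # Spark vs. PostgreSQL
thread_scaling = LOCAL1_REFERENCE_S / spark4  # local[1] vs. local[4]

for engine, value in medians.items():
    print(f"{engine}: median {value:.2f} s")
print(f"gap: {relative_gap:.1%}, local[1]/local[4]: {thread_scaling:.2f}x")
```

The two medians differ by about one percent, while the single-thread Spark reference is a little over twice the four-thread figure, which fits the framing of one shared kernel measured at different degrees of parallelism.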

estebanzimanyi · 2026-05-22T06:12:58Z

Superseded by #29, which restructures these same 18 benchmark-report files around what each measurement licenses. The older blended layout here is incompatible with that structure, and #29 covers the identical file set — closing this in its favour.

doc(bench): restructure cross-platform timings by what each measurement licenses (supersedes #26)

estebanzimanyi force-pushed the doc/benchmarks-th3index branch from 44d746f to b3a597b Compare May 11, 2026 08:23

estebanzimanyi added 7 commits May 11, 2026 13:36

estebanzimanyi mentioned this pull request May 11, 2026

feat(export): include trip_h3 (th3index) in cross-platform portability export #24

Merged

3 tasks

estebanzimanyi force-pushed the doc/benchmarks-th3index branch from 579f20f to 8176a40 Compare May 11, 2026 15:32

estebanzimanyi added 19 commits May 11, 2026 17:32

doc(benchmarks): strip engineering-history phrasing from user-facing …

3d58883

…text State the current capabilities and the per-query numbers; omit PR references, commit SHAs, "previously blocked / now runs" narrative, and "underlying fixes" sections.

doc(benchmarks): describe spatial cross-join queries without GEOS-as-…

29b9a35

…primary framing

doc(benchmarks): MobilitySpark local[4] Q1-Q10 timings, Q11-Q17 defer…

f402353

…red to th3index matrix

doc: MobilityDuck th3index prefilter measurements for trip x trip que…

3e7da0e

…ries

doc: MobilityDuck exposes the static-geometry h3 prefilter UDFs

47068a8

doc: MobilitySpark th3index prefilter surface lands via JNR-FFI

f0269c0

doc: side-by-side grouped bar charts for the cross-platform matrix

4867520

doc: drop per-platform charts superseded by the grouped chart

11a4617

doc: MobilitySpark now exposes the h3 prefilter UDFs via JNR-FFI

6f51324

doc(charts): lower log-scale floor to 1 ms so MobilityDuck Q1/Q2 render

199dc22

doc(charts): extend standard chart to Q1-Q17

3557d1c

doc: consolidate — drop stderr-pathology footnotes, mark Spark Q10-Q1…

a628926

…7 pending rerun

estebanzimanyi added 4 commits May 13, 2026 12:10

estebanzimanyi mentioned this pull request May 15, 2026

Refresh Q5 cross-platform timings for the minDistance form #28

Closed

estebanzimanyi changed the title ~~doc(benchmarks): BerlinMOD chapter 1 th3index + GiST/SP-GiST bench report~~ Add the dated three-platform BerlinMOD benchmark reports May 16, 2026

estebanzimanyi mentioned this pull request May 20, 2026

doc(bench): restructure cross-platform timings by what each measurement licenses (supersedes #26) #29

Merged

estebanzimanyi closed this May 22, 2026

estebanzimanyi added a commit that referenced this pull request Jun 5, 2026

Merge pull request #29 from estebanzimanyi/doc/benchmark-restructure

c463312

doc(bench): restructure cross-platform timings by what each measurement licenses (supersedes #26)

Add the dated three-platform BerlinMOD benchmark reports #26
estebanzimanyi wants to merge 31 commits into MobilityDB:master from estebanzimanyi:doc/benchmarks-th3index
