Skip to content

Initialize MEOS per worker thread#137

Closed
estebanzimanyi wants to merge 1 commit into
fix/amd64-icu-test-autoloadfrom
fix/meos-per-thread-init
Closed

Initialize MEOS per worker thread#137
estebanzimanyi wants to merge 1 commit into
fix/amd64-icu-test-autoloadfrom
fix/meos-per-thread-init

Conversation

@estebanzimanyi
Copy link
Copy Markdown
Member

The MEOS 1.4 uplift keeps the session timezone, errno, PROJ context and RNGs in thread-local storage and requires every thread that calls into MEOS to run meos_initialize() before its first call. The extension only initialized MEOS once on the load thread via std::call_once, so DuckDB TaskScheduler worker threads executed scalar, cast and aggregate bodies with a NULL session_timezone and segfaulted in pg_next_dst_boundary on the first timestamp parse (reproduced locally as a SIGSEGV in 042_tgeogpoint_parity, previously masked because amd64 CI never reached the tests and arm64 runs no test step). A thread-local guard now runs the per-thread init and re-installs the process-global MobilityDuck error handler (meos_initialize() resets it to the exit-on-error default), invoked from the scalar exec wrapper and through a cast registration trampoline that covers every cast entry point. Locally the full suite goes from unrunnable to 58/59 files and 1311/1312 assertions with no segfaults; the one remaining assertion is an unrelated pre-existing ln() output drift from the MEOS bump.

@estebanzimanyi estebanzimanyi changed the base branch from fix/bump-meos-pin to fix/amd64-icu-test-autoload May 16, 2026 06:15
@estebanzimanyi estebanzimanyi force-pushed the fix/meos-per-thread-init branch from 54e6181 to f896b95 Compare May 16, 2026 06:15
@estebanzimanyi
Copy link
Copy Markdown
Member Author

Reviewer's quickstart — ~8 minutes

What this PR does in one sentence: moves meos_initialize() from process-once to per-DuckDB-worker-thread, fixing the data-race where MEOS's thread-local state (timezone, error handler, lwgeom session) was uninitialised in workers spawned after the initial LoadInternal.

Files (14): mostly src/geo/*.cpp and src/temporal/*.cpp — each scalar/cast/cast-trampoline now starts with MeosThread::ensureReady() (a TLS-guarded one-time init).

Read first:

  • src/include/mobilityduck/meos_thread.hpp (new) — the TLS guard + ensureReady() definition.
  • src/include/mobilityduck/meos_exec_serial.hpp (new) — the serial-execution wrapper for paths MEOS itself isn't thread-safe on (pg-derived TZ, legacy GEOS).
  • src/mobilityduck_extension.cpp — registers the worker thread's MEOS state.
  • 11 callsite files just add MeosThread::ensureReady(); at the top of each UDF lambda body.

Verification:

# Repro pre-PR: SIGSEGV on multi-threaded queries
duckdb -unsigned -c "LOAD mobilityduck; SET threads=4; SELECT count(*) FROM (SELECT atTime(traj::VARCHAR, '2026-01-01'::TIMESTAMP) FROM ...);"
# Pre-PR: occasionally crashes. Post-PR: clean.

Cross-link: Linux arm64 needs #161 for the MeosType build error.

Why it's safe to merge: pure additive guards; valid-input paths are unaffected. The change correctly diagnoses MEOS's process-singleton initialization model as a multi-worker bug.

DuckDB runs scalar and cast bodies on TaskScheduler worker threads; MEOS keeps
the session timezone, errno, PROJ context and RNGs in thread-local storage and
needs meos_initialize() on each thread before first use. A thread-local guard
in the scalar exec wrapper and a cast trampoline run the per-thread init (and
re-install the error handler) at every entry point, so timestamp formatting on
a worker no longer dereferences a NULL session_timezone.
@estebanzimanyi estebanzimanyi force-pushed the fix/meos-per-thread-init branch from 6900085 to 7b85d06 Compare May 30, 2026 06:29
@estebanzimanyi estebanzimanyi force-pushed the fix/amd64-icu-test-autoload branch from 202034a to 880d8d9 Compare May 30, 2026 06:29
@estebanzimanyi
Copy link
Copy Markdown
Member Author

Folded into #134: the MEOS pin bump and the changes that keep CI green (graceful icu degradation + stage_icu, per-thread MEOS init, deterministic ln/exp/log10, macOS int64 forwarder, wasm pg_config) are jointly required for the extension to build, load, and pass — they ship as one foundation PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant