ENH: Add JupyterLite CI/CD infrastructure #13925

Draft
natinew77-creator wants to merge 63 commits into mne-tools:main from natinew77-creator:jupyterlite-gh-actions

natinew77-creator commented May 27, 2026

Tracking Issue: #13929

What does this implement/fix?
This PR integrates JupyterLite into the MNE-Python documentation build, allowing users to run tutorials interactively directly in their browser without a local Python environment.

Key technical implementations:

  • Integrates jupyterlite-sphinx into the Sphinx-Gallery pipeline, automatically generating "Try in JupyterLite" buttons for tutorials and examples.
  • Injects a hidden setup cell into the generated notebooks via conf.py to automatically handle Pyodide-specific browser quirks:
    • Installs mne and pyodide-http natively via micropip.
    • Patches Pyodide networking (pyodide_http.patch_all()) so MNE's pooch downloader can successfully fetch datasets from the browser.
    • Monkey-patches mne.viz.utils.plt_show to correctly render MNE's Matplotlib figures inline within the WebAssembly environment.
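
As a rough illustration of the injection step listed above, the sketch below prepends a collapsed code cell to a notebook's JSON. The helper name `inject_setup_cell`, the contents of `SETUP_SOURCE`, and the use of the `jupyter.source_hidden` metadata flag are assumptions made for this example, not the code in this PR's conf.py. The setup source itself is only meaningful inside a Pyodide kernel.

```python
import json

# Hypothetical setup-cell source. It only runs inside a Pyodide kernel,
# where ``micropip`` and top-level ``await`` are available.
SETUP_SOURCE = """\
import micropip
await micropip.install(["mne", "pyodide-http"])
import pyodide_http
pyodide_http.patch_all()  # patch urllib/requests for use in the browser
"""


def inject_setup_cell(notebook: dict, source: str = SETUP_SOURCE) -> dict:
    """Return a copy of ``notebook`` with a collapsed setup cell first."""
    setup_cell = {
        "cell_type": "code",
        "execution_count": None,
        # Assumption: the front end honours this flag and collapses the cell.
        "metadata": {"jupyter": {"source_hidden": True}},
        "outputs": [],
        "source": source,
    }
    patched = json.loads(json.dumps(notebook))  # simple deep copy
    patched["cells"] = [setup_cell] + patched.get("cells", [])
    return patched


example_nb = {
    "cells": [{"cell_type": "markdown", "metadata": {}, "source": "# Tutorial"}],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}
patched_nb = inject_setup_cell(example_nb)
```

The original notebook dictionary is left untouched, so the same helper could be applied to every generated notebook without side effects.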

Additional information

  • This represents the completion of the first major GSoC milestone.
  • CircleCI will now automatically build the JupyterLite assets and provide a live preview link in the CI checks.
  • Note on limitations: During testing, I identified two architectural edge cases for future discussion: browser RAM limitations when tutorials attempt to download massive (>1GB) datasets, and occasional PyPI vs. main branch version mismatches since JupyterLite currently pulls the stable MNE release.

natinew77-creator commented Jun 15, 2026

Hi @teonbrooks, Status Update: This PR is now ready for your review!

As the first step in my GSoC roadmap, this PR successfully introduces the core JupyterLite infrastructure to the MNE-Python documentation. Here is what has been achieved:

  • Integrated jupyterlite-sphinx to automatically build interactive JupyterLite instances for the examples/tutorials.
  • Added a GitHub Actions CI/CD workflow to build and deploy the JupyterLite site.
  • Patched MNEBrowseFigure by fixing the pyodide_plt_show argument signature to accept multiple positional arguments.
  • Fixed an extension execution race condition in doc/conf.py to ensure JupyterLite successfully bundles the generated notebooks during the Sphinx build-finished event.

I also looked deeply into the [Errno 26] Operation in progress error we hit during the 10_overview tutorial. It turns out to be a fundamental limitation of Pyodide and JupyterLite. The 10_overview tutorial uses mne.datasets.sample.data_path() to download a 1.45 GB dataset from osf.io. First, osf.io does not send the CORS headers that cross-origin browser requests require, so fetches from the WebAssembly kernel are blocked. Second, even if we could work around the CORS restriction, downloading a 1.45 GB file directly into browser RAM via Pyodide crashes the browser tab with an out-of-memory error.

To gracefully handle this and prevent user confusion, I've written a custom pooch.Pooch.fetch interceptor in our doc/conf.py configuration. Now, whenever a JupyterLite user tries to download these massive OSF datasets natively in the browser, it intercepts the request and prints a polite error message advising them to download the dataset locally and upload it directly into the JupyterLite file browser!

For smaller datasets that are CORS-friendly, it automatically falls back to Pyodide's native pyfetch via urllib so they work seamlessly without any patching needed in the tutorial code itself.

I think this is the best architectural approach for handling the massive tutorials. I'm ready to mark this PR as complete so we can move down the GSoC checklist and start tackling xeus-python and the 3D PyVista rendering! Let me know what you think! Looking forward to your feedback!

@natinew77-creator

Quick Follow-up:
I also tracked down and fixed the bug that prevented the notebooks from loading correctly in the JupyterLite UI.

It turned out to be a race condition during the Sphinx build-finished event—jupyterlite_sphinx was executing before sphinx_gallery had finished generating the example notebooks. I've reordered the extensions in conf.py so they execute in the correct sequence, and all the notebooks are now successfully populating!
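
The reordering can be sketched with a small list helper. This is illustrative only: `ensure_loaded_before` is a hypothetical function, and the sketch assumes that both extensions register their build-finished handlers at the same priority, so that the order in which they appear in `extensions` decides which handler runs first.

```python
def ensure_loaded_before(extensions, first, second):
    """Return a copy of ``extensions`` in which ``first`` precedes ``second``.

    The list is returned unchanged if either name is missing or the
    order is already correct.
    """
    ordered = list(extensions)
    if first in ordered and second in ordered:
        i, j = ordered.index(first), ordered.index(second)
        if i > j:
            # Move ``first`` to sit directly in front of ``second``.
            ordered.insert(j, ordered.pop(i))
    return ordered


extensions = ["numpydoc", "jupyterlite_sphinx", "sphinx_gallery.gen_gallery"]
extensions = ensure_loaded_before(
    extensions, "sphinx_gallery.gen_gallery", "jupyterlite_sphinx"
)
```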

Integrates jupyterlite-sphinx into the MNE-Python doc build so every
sphinx-gallery example gets a 'Try in Browser' button backed by a
Pyodide/WebAssembly kernel.

- doc/conf.py: configure jupyterlite_sphinx; build a local MNE dev
  wheel with relaxed Pyodide constraints; copy required MNE sample-data
  subset into JupyterLite's virtual filesystem; inject a setup cell that
  installs MNE via micropip (keep_going=True bypasses version conflicts),
  mocks missing stdlib modules (lzma, multiprocessing), patches pooch to
  block large OSF downloads, and sets MNE_DATA paths
- .circleci/config.yml: ensure MNE sample data is on disk before the
  doc build so conf.py can copy it into jupyterlite_contents/
- .github/workflows/jupyterlite.yml: standalone GH Actions workflow on
  the jupyterlite-gh-actions branch that builds and uploads the site
- pyproject.toml: add jupyterlite-pyodide-kernel and jupyterlite-sphinx
  to the [doc] extras
- .gitignore: exclude jupyterlite_contents build artifacts
- mne/parallel.py: return False early in _running_in_joblib_context()
  on emscripten; joblib parallel backends are unavailable in the browser
- mne/utils/config.py: catch Exception (not just ValueError) when
  loading the MNE config JSON; Pyodide's json parser raises SyntaxError
  on a corrupt or absent config file

Tutorials and examples that call interactive Qt backends (raw.plot(),
epochs.plot(), ica.plot_sources(), etc.) or depend on large datasets not
bundled in JupyterLite will hang or error in Pyodide. Wrap them with
sys.platform guards so they are skipped when running in the browser.

Interactive Qt plots (skip on emscripten):
- tutorials/intro/10_overview.py: raw.plot(), stc.plot()
- tutorials/intro/15_inplace.py: original_raw.plot(), rereferenced_raw.plot()
- tutorials/intro/20_events_from_raw.py: raw.copy().pick().plot(), raw.plot()
- tutorials/intro/40_sensor_locations.py: mne.viz.plot_alignment()
- tutorials/evoked/40_whitened.py: raw.plot(), epochs.plot()
- examples/preprocessing/muscle_ica.py: all ica.plot_* calls

Large datasets unavailable in the browser (raise RuntimeError on emscripten):
- tutorials/io/60_ctf_bst_auditory.py: BST auditory dataset (~2.9 GB)
- tutorials/io/70_reading_eyetracking_data.py: EyeLink misc dataset
- examples/visualization/eyetracking_plot_heatmap.py: EyeLink dataset
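
The two guard styles listed above (skipping interactive plots, and raising for unavailable datasets) can be sketched as follows. The helper names `maybe_plot` and `require_dataset` are hypothetical and only illustrate the pattern; the tutorials themselves use inline `sys.platform == "emscripten"` checks.

```python
import sys

# True when running under Pyodide/JupyterLite.
IN_BROWSER = sys.platform == "emscripten"


def maybe_plot(plot, in_browser=IN_BROWSER):
    """Call ``plot()`` unless running in the browser; return its result."""
    if in_browser:
        print("Skipping interactive plot in the browser build.")
        return None
    return plot()


def require_dataset(name, size_hint, in_browser=IN_BROWSER):
    """Fail early with a clear message when a dataset is unavailable."""
    if in_browser:
        raise RuntimeError(
            f"The {name} dataset ({size_hint}) is not bundled with "
            "JupyterLite; run this tutorial locally instead."
        )


# Demonstration with the platform forced, so it runs anywhere.
desktop_result = maybe_plot(lambda: "figure", in_browser=False)
browser_result = maybe_plot(lambda: "figure", in_browser=True)
try:
    require_dataset("BST auditory", "~2.9 GB", in_browser=True)
    raised = False
except RuntimeError:
    raised = True
```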
natinew77-creator and others added 13 commits June 26, 2026 09:35
…erLite

- doc/conf.py: Fix lzma mock to use real stdlib lzma when available in
  Pyodide instead of LZMAFile=object which broke joblib's compressor
  registration
- 10_overview.py: Guard ica.plot_properties() which opens an interactive
  Qt window
- 15_inplace.py: Guard set_eeg_reference block which fails under
  Python 3.13 in Pyodide
- 20_events_from_raw.py: Guard STIM channel plot and all EEGLAB sections
  that require the unavailable testing dataset
- 40_sensor_locations.py: Guard ssvep dataset loading and sphere plot
  that require the unavailable ssvep dataset
- 50_configure_mne.py: Guard KIT test data loading whose test files are
  stripped from the Pyodide wheel
- 70_report.py: Skip Report.save() file-writing in browser, guard
  nibabel-dependent add_bem, 3D methods (add_trans/add_stc/add_forward/
  add_inverse_operator), missing ECG/events files, pandas-dependent
  make_metadata, and the HDF5 round-trip section

All intro tutorials (10, 15, 20, 30, 40, 50, 70) now run cleanly in
JupyterLite/Pyodide without errors.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…mples

Add a short comment above each `sys.platform == "emscripten"` guard so
reviewers understand why interactive/3D code is skipped in the browser
build, without needing to dig through PR history.
…Lite

70_point_spread needed a fixed-orientation forward/inverse pair not in
the bundled sample-data subset; generalize the lazy-fetch pattern by
wrapping read_forward_solution/read_inverse_operator so any sample-data
file is fetched on first use instead of hand-listing every variant.
Also guard the unguarded 3D brain plots in both tutorials.
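
The lazy-fetch wrapping described in this commit message might look roughly like the sketch below. `fetch_on_first_use` and the `fetch` callback are hypothetical names; the real patch in doc/conf.py may differ. The idea is that a reader is wrapped so a missing file is fetched right before it is first read.

```python
import functools
import tempfile
from pathlib import Path


def fetch_on_first_use(reader, fetch):
    """Wrap ``reader(fname, ...)`` so missing files are fetched first."""

    @functools.wraps(reader)
    def wrapper(fname, *args, **kwargs):
        path = Path(fname)
        if not path.exists():
            fetch(path)  # expected to create the file at ``path``
        return reader(path, *args, **kwargs)

    return wrapper


# Stand-ins for a real reader and a real downloader.
fetched = []


def fake_fetch(path):
    fetched.append(path.name)
    path.write_text("fwd")


def read_text(path):
    return Path(path).read_text()


lazy_read = fetch_on_first_use(read_text, fake_fetch)
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "sample-fwd.fif"
    first = lazy_read(target)   # triggers the fetch
    second = lazy_read(target)  # file already present, no fetch
```

Under these assumptions the same wrapper would be applied to readers such as `mne.read_forward_solution` and `mne.read_inverse_operator`, which is what removes the need to hand-list every file variant.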
…on/70_point_spread

The forward/inverse fix wasn't the whole story: once those loaded,
read_labels_from_annot needed nibabel (not installed) plus the
lh/rh.aparc.annot and lh/rh.white files (not bundled). All are small
(nibabel is a pure-Python wheel; the four extra files total ~13.7MB)
and eager-fetched alongside the existing subject anatomy data.
Verified end-to-end locally: the full 70_point_spread data pipeline
(fwd/inv/labels/cov/simulate_stc/simulate_evoked/apply_inverse) now
runs using only the files being bundled.

natinew77-creator commented Jul 1, 2026

Hello @teonbrooks,

Tutorials (37/85 working):
intro 7/7 done
raw 4/4 done
epochs 7/7 done
simulation 3/3 done
stats-sensor-space 6/6 done
evoked 3/4, 1 needs 3D
io 4/5, 1 needs 3D
machine-learning 1/2, 1 needs 3D
time-freq 1/3, 2 need fix
clinical 0/3, 2 need fix, 1 needs 3D
stats-source-space 0/3, 3 need fix
forward 0/8, 2 need fix, 5 need 3D, 1 can't fix (300MB+ download)
inverse 0/12, 3 need fix, 4 need 3D, 5 can't fix (brainstorm/phantom datasets, all huge)
preprocessing 1/16, 12 need fix, 3 can't fix (fnirs/opm/eyetracking datasets, unmeasured but likely huge)
visualization 0/2, 2 need 3D

Examples (36/120 working):
stats 5/6, 1 can't fix (needs real R, not just a package)
decoding 8/12, 1 needs 3D, 3 can't fix (699MB–6GB datasets)
preprocessing 9/23, 7 need fix, 1 needs 3D, 6 can't fix (brainstorm/misc/openneuro datasets, huge or need a browser-incompatible download flow)
visualization 6/17, 1 needs fix, 10 need 3D
time_frequency 5/10, 4 need fix, 1 can't fix (opm dataset, unbundled)
inverse 1/32, 8 need fix, 18 need 3D, 5 can't fix (somato dataset + one 360MB file)
simulation 1/5, 3 need fix, 1 needs 3D
forward 0/3, 2 need fix, 1 needs 3D
io 1/6, 3 need fix, 2 can't fix (one needs a package with no browser build, another's dataset host doesn't allow browser fetches)
datasets 0/6, 6 can't fix (each one's whole point is a different huge dataset)

"needs 3D" = the PyVista-JS work.

Route SourceEstimate.plot() through pyvista-js (vtk.js) in the browser,
since MNE's normal Brain/VTK stack can't load in WASM. Renders a static
activation map on the inflated surface. Wired into 10_overview's browser
branch; fully guarded so any failure just prints and the notebook still
completes. Non-browser path unchanged.

pyvista-js 0.15 doesn't apply scalars/cmap in its renderer, so the brain
came out solid black. Work around it with a gray base surface plus solid
orange/red overlays for the supra-threshold activation, and add scene
lights so it isn't black when rotated.

Add curvature shading (light gyri, dark sulci) and a 10-band hot
gradient for activation instead of flat gray + 2 solid colors, plus a
black background. Still solid-color meshes under the hood (pyvista-js
has no scalar colormap), but reads as a smooth heatmap on a real brain
now instead of a flat gray blob with a couple of colored patches.

Harden the pyvista-js stc.plot patch to return a stub brain (safe no-op
add_foci/show_view/etc.) and read subject positionally, then drop the
browser guards on the two simulation tutorials so their stc plots render
via pyvista-js instead of being skipped.

The activation threshold used the 90th percentile, which is zero when
most of the surface is zero (e.g. simulate_stc point sources), so the
whole brain got painted. Fall back to a fraction of the max in that case
so only the active spots are colored. Smooth stcs are unaffected.

Point sources (one active vertex, e.g. simulate_stc) only colored a tiny
Voronoi cell before. Color surface vertices within 12mm of the nearest
active source instead, so they show as visible blobs; dense sources are
unchanged since every vertex is within 12mm of one anyway.
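
The threshold fallback and the 12mm rule from the last two commit messages can be sketched in plain Python. This is an illustration under assumptions, not the patch itself: the function names are hypothetical, `max_fraction=0.1` is an arbitrary choice (the commit only says "a fraction of the max"), the percentile uses a simple nearest-rank definition, and coordinates are assumed to be in millimetres.

```python
import math


def activation_threshold(values, percentile=90.0, max_fraction=0.1):
    """Nearest-rank percentile of |values|, with a fallback for sparse data."""
    mags = sorted(abs(v) for v in values)
    if not mags:
        return 0.0
    rank = max(1, math.ceil(percentile * len(mags) / 100))
    threshold = mags[rank - 1]
    if threshold <= 0.0:
        # Mostly-zero data (e.g. point sources): use a fraction of the max.
        threshold = max_fraction * mags[-1]
    return threshold


def vertices_near_sources(vertices_mm, sources_mm, radius_mm=12.0):
    """Indices of surface vertices within ``radius_mm`` of any source."""
    return [
        i
        for i, vertex in enumerate(vertices_mm)
        if any(math.dist(vertex, src) <= radius_mm for src in sources_mm)
    ]


sparse = [0.0] * 99 + [5.0]       # one active vertex
dense = list(range(1, 101))       # smooth activation
sparse_threshold = activation_threshold(sparse)
dense_threshold = activation_threshold(dense)
near = vertices_near_sources(
    [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (30.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0)],
)
```

For the sparse case the percentile is zero, so the fallback applies; for the dense case the percentile is used directly, which matches the statement that smooth stcs are unaffected.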

teonbrooks left a comment

Just adding these to have, will provide more feedback on this PR tomorrow

Comment thread doc/conf.py Outdated
# The full version, including alpha/beta/rc tags.
release = mne.__version__
release = mne.__version__ or "1.9.0"
if release == "None":

Member:
are there cases where the release is "None"?

Contributor Author:
No, there aren't; that was leftover debugging code. Reverted it to just release = mne.__version__.

Comment thread doc/conf.py Outdated
# contrib
"matplotlib.sphinxext.plot_directive",
"numpydoc",
"sphinxcontrib.bibtex",

Member:
is this re-ordering because of a linter like ruff or black?

Contributor Author:
Not a linter, that was an accidental manual reorder on my end. I've put the order back to match main; the only extension this PR adds is jupyterlite_sphinx.

Comment thread doc/conf.py
dst_sample_data.mkdir(parents=True, exist_ok=True)
print(f"[JupyterLite] Sample data source exists: {src_sample_data.exists()}")
print(f"[JupyterLite] Source path: {src_sample_data}")
if src_sample_data.exists():

Member:
instead of including the dataset in the build, we might need to explore streaming it directly from its storage (OSF). it would reduce the need to bundle it with the documentation. also it's unclear what the download rate is on GitHub for this repo to serve those files directly

Contributor Author:
I actually started with streaming from OSF, but it doesn't work in JupyterLite. Pyodide runs in a web worker and OSF doesn't send CORS headers, so the fetch fails. That's why I serve a slim subset from the docs origin instead. I agree bundling isn't ideal long-term though. I think the cleanest fix is the small "lite-data" dataset idea, a minimal curated set hosted somewhere CORS-friendly (e.g. raw.githubusercontent, which does send the right headers), fetched on demand so nothing gets bundled into the docs. Happy to scope that out.
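
To make the "lite-data" idea concrete, here is a hedged sketch of an on-demand fetch with checksum verification and a local cache. Everything named here is hypothetical: `fetch_lite_file`, the `example-org/lite-data` URL, and the injected `download` callable (which in the browser would wrap whatever fetch mechanism ends up being used).

```python
import hashlib
import tempfile
from pathlib import Path

# Hypothetical CORS-friendly host for a small curated dataset.
BASE_URL = "https://raw.githubusercontent.com/example-org/lite-data/main/"


def fetch_lite_file(name, sha256, cache_dir, download, base_url=BASE_URL):
    """Return the cached path for ``name``, downloading it on first use."""
    target = Path(cache_dir) / name
    if not target.exists():
        data = download(base_url + name)  # must return bytes
        if hashlib.sha256(data).hexdigest() != sha256:
            raise ValueError(f"Checksum mismatch for {name}")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
    return target


# Demonstration with a fake downloader, so no network access is needed.
payload = b"demo"
digest = hashlib.sha256(payload).hexdigest()
requested = []


def fake_download(url):
    requested.append(url)
    return payload


with tempfile.TemporaryDirectory() as tmp:
    path = fetch_lite_file("sample_raw.fif", digest, tmp, fake_download)
    fetch_lite_file("sample_raw.fif", digest, tmp, fake_download)  # cached
    cached_bytes = path.read_bytes()
```

Nothing is bundled into the docs in this scheme; each file is pulled only when a notebook first asks for it.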

Comment thread doc/conf.py
[
sys.executable,
"-m",
"pip",

Member:
instead of doing this at the notebook build stage, can we do this earlier in the setup for jupyterlite?

Contributor Author:
Agreed, building the wheel in conf.py means it re-runs on every sphinx invocation. I can move it to a one-time step before the docs build (a CI step + a small script for local builds) and have conf.py just verify the wheel exists. Would that work for you, or do you have a preferred spot for it?
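
For the "conf.py just verifies the wheel exists" half of that proposal, a minimal sketch could be the following. `find_prebuilt_wheel`, the directory layout, and the error message are assumptions for illustration; the build step itself would live in CI or a separate script.

```python
import tempfile
from pathlib import Path


def find_prebuilt_wheel(dist_dir, project="mne"):
    """Return a prebuilt wheel from ``dist_dir`` or raise a clear error.

    If several wheels match, the last one in name order is returned.
    """
    wheels = sorted(Path(dist_dir).glob(f"{project}-*.whl"))
    if not wheels:
        raise FileNotFoundError(
            f"No {project} wheel found in {dist_dir}; run the wheel build "
            "step before building the docs."
        )
    return wheels[-1]


# Demonstration with a temporary directory standing in for the dist folder.
with tempfile.TemporaryDirectory() as tmp:
    try:
        find_prebuilt_wheel(tmp)
        missing_detected = False
    except FileNotFoundError:
        missing_detected = True
    (Path(tmp) / "mne-1.0.0-py3-none-any.whl").touch()
    found_name = find_prebuilt_wheel(tmp).name
```

A check like this keeps repeated Sphinx invocations cheap, since the expensive build happens once outside conf.py.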

Revert release to `mne.__version__` (drop the unused None fallback),
restore the original sphinx extension order and only add
jupyterlite_sphinx, and cap the pyvista-js activation colormap at orange
instead of white so single-point sources read better.