Work in progress — This project is an experiment in "vibe coding" with AI assistance. It comes with no guarantee of correctness, completeness, or stability. Use at your own risk.
Auto-sync PRIM (Île-de-France Mobilités) OpenAPI/Swagger specs, generate Python clients, and sync/validate Opendatasoft datasets.
This repository maintains up-to-date interface contracts from PRIM APIs and dataset exports from the IDFM Opendatasoft portal. Everything is manifest-driven and idempotent.
- Syncs OpenAPI/Swagger specs from PRIM APIs (supports direct URLs and PRIM page scraping)
- Generates Python clients from specs using OpenAPI Generator
- Downloads dataset exports from Opendatasoft (JSONL format, full exports without pagination limits)
- Validates datasets against JSON Schema
- Runs nightly via GitHub Actions and opens PRs when updates are detected
The sync pipeline runs in 4 steps:
- sync_specs: downloads OpenAPI/Swagger specs from manifests/apis.yml, resolves PRIM page URLs, and caches with ETag/Last-Modified/sha256
- generate_clients: regenerates Python clients in generated/clients/ when specs change
- sync_datasets: downloads dataset exports from the Opendatasoft portal as defined in manifests/datasets.yml
- validate_datasets: retrieves the JSON Schema for each dataset, validates records, and generates reports
Each step is conditional: resources are only re-fetched or regenerated when changes are detected.
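The change detection described above can be sketched as follows. This is a minimal illustration, not the repository's implementation: the helper names `conditional_headers` and `content_changed` and the shape of the `cached` metadata dict are assumptions made for the example.

```python
import hashlib


def conditional_headers(cached: dict) -> dict:
    """Build conditional-request headers from previously stored validators."""
    headers = {}
    if cached.get("etag"):
        headers["If-None-Match"] = cached["etag"]
    if cached.get("last_modified"):
        headers["If-Modified-Since"] = cached["last_modified"]
    return headers


def content_changed(body: bytes, cached: dict) -> bool:
    """Compare the sha256 of a freshly downloaded body with the stored digest."""
    digest = hashlib.sha256(body).hexdigest()
    return digest != cached.get("sha256")


# Hypothetical metadata saved after an earlier download of an empty JSON document.
cached = {"etag": '"abc123"', "sha256": hashlib.sha256(b"{}").hexdigest()}

print(conditional_headers(cached))
print(content_changed(b"{}", cached))
print(content_changed(b'{"openapi": "3.0.0"}', cached))
```

A server that honors these validators can answer 304 Not Modified and skip the body. The sha256 comparison covers servers that send no validators at all.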
prim_api/ # Python SDK (IdFMPrimAPI, dataset sync, background updater)
samples/ # Runnable usage examples (update when adding endpoints/data)
manifests/ # YAML manifests (apis.yml, datasets.yml, urls_of_interest.yml)
specs/ # Downloaded OpenAPI/Swagger specs (committed)
generated/clients/ # Generated Python clients (committed)
data/schema/ # JSON Schemas for datasets (committed)
data/raw/ # Dataset exports in JSONL (gitignored)
data/reports/ # Validation reports (gitignored)
tools/ # CLI scripts
docs/site/ # Generated API docs (gitignored)
.github/workflows/ # CI and nightly sync workflows
Committed:
- Manifests (manifests/*.yml)
- Tools (tools/*.py)
- Tests (tests/)
- CI workflows (.github/workflows/)
- Project config (pyproject.toml, .gitignore)
- OpenAPI specs + metadata (specs/), updated by nightly sync
- Generated Python clients (generated/clients/), regenerated when specs change
- Dataset schemas (data/schema/), kept in sync with portal metadata
Gitignored (downloaded on demand by devs):
- data/raw/: dataset exports (JSONL, can be large)
- data/reports/: validation reports
- Python 3.12+
- uv (recommended) or pip
- Docker (for client generation)
uv sync

The prim_api package provides a high-level Python interface to PRIM APIs and datasets.
from prim_api import IdFMPrimAPI
api = IdFMPrimAPI(api_key="your-prim-key")
# Query real-time next passages at a stop
passages = api.get_passages("IDFM:473921")
# Filter by line
passages = api.get_passages("IDFM:473921", line_id="IDFM:C01742")
# Access downloaded datasets
zones = api.get_zones_darrets()
lignes = api.get_referentiel_lignes()
# Cleanup (stops background dataset updater)
api.stop()

IdFMPrimAPI(
    api_key="...",       # Required. PRIM API key.
    auto_sync=True,      # Download missing datasets on init.
    sync_interval=3600,  # Background refresh interval in seconds.
)

| Method | Description |
|---|---|
| get_passages(stop_id, *, line_id=None) | Real-time next passages at a stop/area |
| get_zones_darrets() | Load zones-d-arrets dataset as list of dicts |
| get_referentiel_lignes() | Load referentiel-des-lignes dataset as list of dicts |
| get_arrets_lignes() | Load arrets-lignes (stop-line associations) as list of dicts |
| ensure_datasets() | Download datasets if missing or stale |
| refresh_datasets() | Force re-check all datasets |
| stop() | Stop the background updater thread |

The prim_api.refs module provides helpers to convert between IDFM and STIF identifier formats:
| Helper | Description |
|---|---|
| parse_stop_ref(idfm_id) | Auto-detect StopPointRef or StopAreaRef from an IDFM ID |
| parse_line_ref(idfm_id) | Parse an IDFM line ID into a LineRef |
| StopPointRef / StopAreaRef / LineRef | Dataclasses with .to_stif() and .from_idfm() |

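To make the shape of these dataclasses concrete, here is a minimal standalone sketch of a line reference with the two documented methods. `LineRefSketch` is a hypothetical stand-in written for illustration, not the code in prim_api.refs. The only detail it takes from this README is the `STIF:Line::<code>:` output format. The validation rule is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineRefSketch:
    """Hypothetical stand-in for a line reference (not the real LineRef)."""

    code: str

    @classmethod
    def from_idfm(cls, idfm_id: str) -> "LineRefSketch":
        # Split "IDFM:C01742" into the "IDFM" prefix and the line code.
        prefix, _, code = idfm_id.partition(":")
        if prefix != "IDFM" or not code:
            raise ValueError(f"not an IDFM identifier: {idfm_id!r}")
        return cls(code)

    def to_stif(self) -> str:
        # Format matches the STIF line identifier shown in this README.
        return f"STIF:Line::{self.code}:"


line = LineRefSketch.from_idfm("IDFM:C01742")
print(line.to_stif())
```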
from prim_api.refs import parse_stop_ref, parse_line_ref
stop = parse_stop_ref("IDFM:473921")
print(stop.to_stif()) # "STIF:StopPoint:Q:473921:"
line = parse_line_ref("IDFM:C01742")
print(line.to_stif()) # "STIF:Line::C01742:"

Datasets are open data and don't require authentication:
from prim_api.datasets import ensure_all_datasets, load_dataset
ensure_all_datasets()
zones = load_dataset("zones-d-arrets")
lignes = load_dataset("referentiel-des-lignes")
arrets_lignes = load_dataset("arrets-lignes")

See samples/ for runnable examples.
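For readers unfamiliar with the export format: a JSONL file holds one JSON object per line, so it can be read with the standard library alone. The sketch below is an illustration under stated assumptions, not the package's loader. `read_jsonl` and `missing_required` are hypothetical helpers, the field names `id` and `name` are invented, and the required-key check is a simplified stand-in for full JSON Schema validation.

```python
import json
import tempfile
from pathlib import Path


def read_jsonl(path: Path) -> list[dict]:
    """Parse one JSON object per non-empty line."""
    records = []
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def missing_required(record: dict, required: list[str]) -> list[str]:
    """Return the required keys that are absent from a record."""
    return [key for key in required if key not in record]


# Write a tiny sample file with invented fields, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.jsonl"
    sample.write_text('{"id": "1", "name": "A"}\n\n{"id": "2"}\n', encoding="utf-8")
    records = read_jsonl(sample)

problems = {i: missing_required(r, ["id", "name"]) for i, r in enumerate(records)}
print(len(records))
print(problems)
```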
# Sync OpenAPI/Swagger specs
uv run sync-specs
# Generate Python clients
uv run generate-clients
# Download datasets
uv run sync-datasets
# Validate datasets
uv run validate-datasets

uv run sync-all

Most tools support --dry-run to preview changes without modifying files:
uv run sync-all --dry-run

# Run tests
uv run pytest
# Lint
uv run ruff check .
# Format
uv run ruff format .

- PRIM_TOKEN: Bearer token for authenticated PRIM spec exports (optional, required only if the API enforces auth)
Set in GitHub repository secrets for CI.
Runs on every PR and push to main:
- Install dependencies
- Lint with ruff
- Run tests
- Dry-run the sync pipeline
Runs nightly at 01:00 UTC (≈ 02:00 Europe/Paris):
- Syncs OpenAPI/Swagger specs from PRIM
- Regenerates Python clients if specs changed
- Opens a PR automatically if anything changed
Dataset sync is not part of the nightly — devs download data locally on demand via uv run sync-datasets.
Builds API documentation with pdoc and deploys to GitHub Pages on push to main.
CI runs pytest --cov and pushes a dynamic badge to a GitHub gist. See setup instructions below.
Defines APIs to sync. Supports two types:
- type: direct, where the URL returns OpenAPI/Swagger JSON directly
- type: prim_page, where the URL is a PRIM page and the script scrapes its HTML to find the spec export link
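The prim_page resolution step can be pictured with a small standard-library sketch. It is illustrative only: the real PRIM page markup is not described here, so the sample HTML, the `/hypothetical/export/swagger.json` path, the `SpecLinkFinder` class, and the "href mentions swagger or openapi" heuristic are all assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class SpecLinkFinder(HTMLParser):
    """Collect anchor hrefs that look like a spec export link."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        # Assumed heuristic: the export link mentions "swagger" or "openapi".
        if "swagger" in href.lower() or "openapi" in href.lower():
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, href))


page_url = "https://prim.iledefrance-mobilites.fr/fr/apis/idfm-ivtr-requete_unitaire"
# Invented markup standing in for a downloaded page.
html = '<a href="/fr/about">About</a><a href="/hypothetical/export/swagger.json">Export</a>'

finder = SpecLinkFinder(page_url)
finder.feed(html)
print(finder.links)
```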
Example:
apis:
  idfm_ivtr_requete_unitaire:
    type: prim_page
    page_url: "https://prim.iledefrance-mobilites.fr/fr/apis/idfm-ivtr-requete_unitaire"

Defines Opendatasoft datasets to download and validate.
Example:
datasets:
  - dataset_id: "zones-d-arrets"
    portal_base: "https://data.iledefrance-mobilites.fr"
    export_format: "jsonl"
    validate: true

Curated list of useful URLs (docs, consoles, examples).
Example:
urls:
  prim_api_example: "https://prim.iledefrance-mobilites.fr/fr/apis/idfm-ivtr-requete_unitaire"
  dataset_zones_d_arrets: "https://data.iledefrance-mobilites.fr/explore/dataset/zones-d-arrets/"
  explore_api_docs: "https://help.opendatasoft.com/apis/ods-explore-v2/"

API reference is auto-generated from docstrings and published to GitHub Pages:
https://MaxLeb.github.io/idfm-prim-api/
This project's own code is licensed under the MIT License.
- IDFM data: datasets and API responses from Île-de-France Mobilités are published under the Open Database License (ODbL 1.0).
- Generated clients: Python clients in generated/clients/ are produced by OpenAPI Generator, licensed under Apache 2.0.