Add local benchmark runner with GROBID baseline comparison #614

Merged: de-code merged 1 commit into main from grobid-comparison-locally on May 27, 2026
@de-code (Collaborator) commented May 27, 2026

part of https://github.com/eLifePathways/ScienceBeam2.0/issues/73

Adds benchmarks/run_local.py, which orchestrates a full local benchmark run: it starts baseline tools (e.g. GROBID) via Docker when their predictions are absent, scores all tools, and produces a side-by-side comparison report. Predictions are stored under baselines/{tool}/{version}/{split}/, so a changed tool version points at a directory with no predictions yet and automatically triggers regeneration. Exposed via make dev-benchmark-with-baselines.
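A minimal sketch of how the versioned prediction layout could drive regeneration. This is not the code in benchmarks/run_local.py: the function names predictions_dir and needs_regeneration are hypothetical, and the sketch assumes that "absent" means the directory is missing or empty.

```python
from pathlib import Path


def predictions_dir(root: Path, tool: str, version: str, split: str) -> Path:
    """Build baselines/{tool}/{version}/{split}/ under root (hypothetical helper)."""
    return root / "baselines" / tool / version / split


def needs_regeneration(root: Path, tool: str, version: str, split: str) -> bool:
    """Return True when no predictions exist for this tool version and split.

    Assumption: predictions count as absent when the directory is missing
    or contains no entries. The real runner may apply a stricter check.
    """
    target = predictions_dir(root, tool, version, split)
    return not target.is_dir() or not any(target.iterdir())
```

Because the version is part of the path, bumping a tool's version yields a directory that does not exist yet, so the check reports that predictions must be generated without any explicit cache invalidation.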

@de-code marked this pull request as ready for review May 27, 2026 19:35
@de-code enabled auto-merge (squash) May 27, 2026 19:35
@de-code merged commit 2b4d988 into main May 27, 2026
6 checks passed
@de-code deleted the grobid-comparison-locally branch May 27, 2026 19:36