Add GROBID baseline comparison to benchmark workflow by de-code · Pull Request #613 · eLifePathways/sciencebeam-parser

de-code · 2026-05-27T10:11:05Z

Part of https://github.com/eLifePathways/ScienceBeam2.0/issues/73

Inline GROBID prediction generation into benchmark.yml (run only when predictions are absent), score baseline predictions via a new --split override in score.py, and produce a side-by-side comparison report. Add a variant field to the eval.yml corpus config and a baselines section declaring GROBID 0.9.0-crf as the reference tool.
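For readers unfamiliar with the override pattern, here is a minimal sketch of how a --split flag could take precedence over a configured default. It is not the actual score.py: the function name `resolve_split`, the config key `"split"`, and the values `"parser"` and `"baseline"` are all hypothetical placeholders, and what a split names in this repository is not stated in this PR.

```python
import argparse
from typing import Optional, Sequence


def resolve_split(config: dict, argv: Optional[Sequence[str]] = None) -> str:
    """Return the split to score: the --split flag if given, else the config value."""
    parser = argparse.ArgumentParser(description="Score predictions (sketch)")
    # A default of None distinguishes "flag omitted" from an explicit value.
    parser.add_argument("--split", default=None,
                        help="override the split taken from the config")
    args = parser.parse_args(argv)
    return args.split if args.split is not None else config["split"]


# Hypothetical config: the key name and both values are placeholders.
config = {"split": "parser"}
print(resolve_split(config, []))                       # falls back to the config
print(resolve_split(config, ["--split", "baseline"]))  # the flag wins
```

Keeping the flag optional means existing invocations behave as before, while a baseline run can point the same scoring code at a different set of predictions.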

github-actions · 2026-05-27T10:18:50Z

ScienceBeam Parser Evaluation

biorxiv

grobid 0.9.0-crf: 10 docs | sciencebeam-parser:main-f87246c2-20260526.2020: 9 docs | sciencebeam-parser:pr-613-3c18e6c0-20260527.1023: 9 docs

Field (method)	Type	grobid 0.9.0-crf	sciencebeam-parser:main-f87246c2-20260526.2020	sciencebeam-parser:pr-613-3c18e6c0-20260527.1023	Δ vs grobid 0.9.0-crf	Δ vs sciencebeam-parser:main-f87246c2-20260526.2020
title (exact)	string	0.889	0.941	0.941	+0.052	+0.000
title (levenshtein)	string	0.947	0.941	0.941	-0.006	+0.000
title (edit_sim)	string	0.939	0.944	0.944	+0.005	+0.000
abstract (levenshtein)	string	0.947	0.364	0.364	-0.584	+0.000
abstract (edit_sim)	string	0.947	0.541	0.541	-0.406	+0.000
author_full_names (levenshtein)	partial_ulist	0.970	0.962	0.962	-0.008	+0.000
author_full_names (edit_sim)	partial_ulist	0.933	0.926	0.926	-0.007	+0.000
affiliation_text (levenshtein)	partial_ulist	0.000	0.907	0.907	+0.907	+0.000
affiliation_text (edit_sim)	partial_ulist	0.000	0.891	0.891	+0.891	+0.000
keywords (levenshtein)	partial_ulist	0.901	0.000	0.000	-0.901	+0.000
keywords (edit_sim)	partial_ulist	0.907	0.000	0.000	-0.907	+0.000
body_section_titles (levenshtein)	partial_list	0.516	0.819	0.819	+0.303	+0.000
body_section_titles (edit_sim)	partial_list	0.472	0.764	0.764	+0.293	+0.000
acknowledgement (levenshtein)	string	0.750	0.875	0.875	+0.125	+0.000
acknowledgement (edit_sim)	string	0.726	0.871	0.871	+0.145	+0.000
first_reference_text (levenshtein)	string	0.000	0.875	0.875	+0.875	+0.000
first_reference_text (edit_sim)	string	0.000	0.870	0.870	+0.870	+0.000
reference_title (levenshtein)	partial_list	0.766	0.291	0.291	-0.475	+0.000
reference_title (edit_sim)	partial_list	0.727	0.343	0.343	-0.385	+0.000
reference_doi (levenshtein)	partial_ulist	0.954	0.860	0.860	-0.094	+0.000
reference_doi (edit_sim)	partial_ulist	0.871	0.775	0.775	-0.096	+0.000
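The two Δ columns are consistent with "PR score minus reference score", where the reference is either GROBID or the main build. A small sketch of that arithmetic, using the title (exact) row; the helper name `format_delta` is an assumption for illustration and not part of the benchmark code:

```python
def format_delta(candidate: float, reference: float) -> str:
    """Signed difference between a candidate score and a reference score, 3 decimals."""
    return f"{candidate - reference:+.3f}"


# Scores from the title (exact) row of the table above.
grobid, main, pr = 0.889, 0.941, 0.941

print(format_delta(pr, grobid))  # +0.052
print(format_delta(pr, main))    # +0.000
```

Recomputing from the rounded scores can differ from the reported delta in the last digit (for example, abstract (levenshtein) gives 0.364 - 0.947 = -0.583 against a reported -0.584), presumably because the report takes the difference before rounding.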

de-code added the benchmark:smoke label May 27, 2026

de-code temporarily deployed to benchmark May 27, 2026 10:12, with GitHub Actions (Inactive)

Remove accept header as that was rejected by GROBID

73e3641
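As an illustration of the commit above, the following sketch builds a GROBID full-text request that sets no Accept header. It assumes GROBID's documented `/api/processFulltextDocument` endpoint with a multipart `input` field and the default local port 8070; the workflow's actual client, endpoint, and the header value that was rejected are not shown in this PR, and `build_grobid_request` is a hypothetical helper.

```python
import urllib.request
import uuid


def build_grobid_request(
    pdf_bytes: bytes, base_url: str = "http://localhost:8070"
) -> urllib.request.Request:
    """Build (but do not send) a multipart POST to GROBID without an Accept header."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        b'Content-Disposition: form-data; name="input"; filename="document.pdf"\r\n',
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    # Only Content-Type is set explicitly; no Accept header is added.
    return urllib.request.Request(
        f"{base_url}/api/processFulltextDocument",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


request = build_grobid_request(b"%PDF-1.4 placeholder")
print(request.get_method(), request.full_url)
print(request.has_header("Accept"))  # False
```

Sending the request would be `urllib.request.urlopen(request)` against a running GROBID instance; that step is left out so the sketch runs without a server.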

de-code temporarily deployed to benchmark May 27, 2026 10:23, with GitHub Actions (Inactive)

de-code marked this pull request as ready for review May 27, 2026 10:33

de-code merged commit 3d8b253 into main May 27, 2026
7 checks passed

de-code deleted the grobid-baseline branch May 27, 2026 10:33

de-code merged 2 commits into main from grobid-baseline
