
Fix ILIAS/INQUIRE evaluation workflow: configurable paths, format mismatch, and sample data generator#5

Open
Copilot wants to merge 2 commits into main from copilot/create-sample-dataset-for-evaluation

Conversation


Copilot AI commented Apr 11, 2026

Users had no clear path to evaluate on ILIAS/INQUIRE: compute_embeds.py had hardcoded researcher-specific paths, eval_retrieval.py silently broke when loading distractor embeddings due to a key mismatch, and there was no way to test the pipeline without downloading large datasets.

compute_embeds.py — remove hardcoded paths

  • --shard_dir and --out_dir are now required CLI arguments (previously hardcoded to /u/ericx003/data/ilias/yfcc100m and ./yfcc_embeds)
  • Added module docstring with output format spec and example commands for both ILIAS (YFCC100M) and INQUIRE (iNaturalist)
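The change above can be sketched with argparse. The argument names --shard_dir and --out_dir come from the PR; the surrounding parser structure is an illustrative assumption, not the actual compute_embeds.py code.

```python
"""Sketch of the compute_embeds.py CLI change: --shard_dir and --out_dir
become required arguments with no hardcoded defaults."""
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Embed WebDataset shards.")
    # Previously these defaulted to researcher-specific paths; now the
    # caller must supply them explicitly.
    parser.add_argument("--shard_dir", required=True,
                        help="Directory containing input .tar shards")
    parser.add_argument("--out_dir", required=True,
                        help="Directory to write embedding files")
    return parser


# Example invocation with explicit paths:
args = build_parser().parse_args(
    ["--shard_dir", "./sample/distractors", "--out_dir", "./embeds"]
)
print(args.shard_dir, args.out_dir)
```

With required=True, omitting either flag makes argparse print a usage error and exit, rather than silently falling back to someone else's filesystem layout.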

eval_retrieval.py — fix distractor loading format mismatch

compute_embeds.py writes {"keys": ..., "embeddings": ...} but the distractor loader called data['image_embeddings'], causing a KeyError at eval time.

Added load_distractor_embeddings_from_dir() that accepts both formats:

# compute_embeds.py output → "embeddings" key
# precompute_embeddings.py output → "image_embeddings" key

Also added an explicit .float() cast on distractor tensors, since compute_embeds.py defaults to fp16 storage.
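The key fallback and the upcast can be sketched as follows. The real load_distractor_embeddings_from_dir() operates on torch tensors loaded from disk; this sketch uses numpy and an in-memory dict so it stays dependency-light, and the helper name load_embeddings is hypothetical.

```python
"""Illustrative sketch of the dual-format fallback described above."""
import numpy as np


def load_embeddings(data: dict) -> np.ndarray:
    # compute_embeds.py writes {"keys": ..., "embeddings": ...};
    # precompute_embeddings.py writes {"image_embeddings": ...}.
    if "embeddings" in data:
        emb = data["embeddings"]
    elif "image_embeddings" in data:
        emb = data["image_embeddings"]
    else:
        raise KeyError("no 'embeddings' or 'image_embeddings' key found")
    # compute_embeds.py stores fp16; cast up front so downstream
    # similarity math runs in fp32 (the numpy analog of .float()).
    return np.asarray(emb, dtype=np.float32)
```

Raising a descriptive KeyError for unknown formats keeps the failure at load time with a clear message, instead of the opaque KeyError: 'image_embeddings' that surfaced mid-evaluation before the fix.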

create_sample_dataset.py — new synthetic data generator

Generates minimal WebDataset tar shards for smoke-testing the pipeline end-to-end without large downloads:

# Image-only shards for compute_embeds.py (mimics YFCC100M / iNaturalist)
python create_sample_dataset.py --mode distractors --out_dir ./sample/distractors --n_images 200

# Image+caption shards for precompute_embeddings.py (mimics ILIAS-core / INQUIRE queries)
python create_sample_dataset.py --mode pairs --out_dir ./sample/pairs --n_images 50 --captions_per_image 3
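A WebDataset shard is just a tar archive in which each sample's files share a basename and differ by extension (000000.jpg, 000000.txt, ...). The sketch below is a hypothetical miniature of create_sample_dataset.py under that assumption, with placeholder bytes standing in for real JPEG data so it needs only the standard library.

```python
"""Hypothetical mini-generator for WebDataset-style tar shards."""
import io
import tarfile


def write_shard(path: str, n_samples: int, with_captions: bool = False) -> None:
    with tarfile.open(path, "w") as tar:
        for i in range(n_samples):
            base = f"{i:06d}"
            # Placeholder bytes, not a decodable JPEG; a real generator
            # would synthesize an actual image here.
            img = b"\xff\xd8\xff" + bytes(16)
            info = tarfile.TarInfo(name=f"{base}.jpg")
            info.size = len(img)
            tar.addfile(info, io.BytesIO(img))
            if with_captions:  # pairs mode: add a .txt caption per image
                cap = f"synthetic caption {i}".encode()
                info = tarfile.TarInfo(name=f"{base}.txt")
                info.size = len(cap)
                tar.addfile(info, io.BytesIO(cap))


write_shard("shard-000000.tar", n_samples=3, with_captions=True)
```

With with_captions=False this mirrors the --mode distractors layout (image-only), and with_captions=True mirrors --mode pairs.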

README.md — evaluation documentation

Expanded evaluation section with a staged walkthrough (distractor embeddings → paired embeddings → retrieval eval) for both ILIAS and INQUIRE, plus a complete smoke-test example using create_sample_dataset.py.

Copilot AI linked an issue Apr 11, 2026 that may be closed by this pull request
Copilot AI changed the title from "[WIP] Add sample dataset for evaluating ILIAS and INQUIRE" to "Fix ILIAS/INQUIRE evaluation workflow: configurable paths, format mismatch, and sample data generator" Apr 11, 2026
Copilot AI requested a review from jacobsn April 11, 2026 11:36
@jacobsn jacobsn requested review from EricX003 and removed request for jacobsn April 11, 2026 12:39
@jacobsn jacobsn marked this pull request as ready for review April 11, 2026 12:39


Development

Successfully merging this pull request may close these issues.

Evaluation on ILIAS and INQUIRE
