Benchmark: Verify LoCoMo category assignments match source code, not paper #4

## Context

MemMachine's benchmark blog post reports that the LoCoMo category assignments described in the paper differ from those in the source code:

> 'This finding suggests that some public LoCoMo results might be presenting misclassified data, making a direct and fair comparison challenging.'

They use the source code assignments as ground truth, not the paper's descriptions.

## Action

1. Compare our category assignments against the LoCoMo source code (github.com/snap-research/LoCoMo)
2. Document any discrepancies with the paper
3. Ensure our per-category results use the correct assignments
4. If our categories were wrong, re-run and report corrected numbers
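
Steps 1 and 2 could be sketched roughly as follows. This is a minimal illustration, not our actual harness: it assumes both sets of assignments have already been loaded into plain dicts mapping a question ID to a category label. The ID scheme, the integer labels, and the function names are all hypothetical, and extracting the reference mapping from the LoCoMo repo is left out.

```python
from collections import Counter


def find_category_mismatches(ours, reference):
    """Return {question_id: (our_category, reference_category)} for each disagreement.

    `reference` is treated as ground truth (the source code assignments).
    A question missing from `ours` is reported with our_category = None.
    """
    mismatches = {}
    for qid, ref_cat in reference.items():
        our_cat = ours.get(qid)  # None if we never labeled this question
        if our_cat != ref_cat:
            mismatches[qid] = (our_cat, ref_cat)
    return mismatches


def summarize(mismatches):
    """Count how often each (ours, reference) pair occurs, to spot systematic swaps."""
    return Counter(mismatches.values())


# Hypothetical toy data: the IDs and category numbers are placeholders.
reference = {"q1": 1, "q2": 2, "q3": 3, "q4": 4}
ours = {"q1": 2, "q2": 1, "q3": 3}

mismatches = find_category_mismatches(ours, reference)
for qid, (our_cat, ref_cat) in sorted(mismatches.items()):
    print(f"{qid}: ours={our_cat} reference={ref_cat}")
```

If the summary shows the same pair of labels swapped consistently, that points to a mapping error on our side rather than isolated mislabeled questions, and the per-category numbers need to be regrouped and re-reported.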

This matters for credibility: if we publish per-category numbers based on the wrong assignments, competitors will call it out.

## Related
- basicmachines-co/basic-memory-benchmarks#9, basicmachines-co/basic-memory-benchmarks#8
- MemMachine blog: memmachine.ai/blog/2025/12/memmachine-v0.2-delivers-top-scores-and-efficiency-on-locomo-benchmark/

## Milestone
v0.19.0
