## Context
QMD (github.com/tobi/qmd, 10.7K stars, by Tobi Lütke, Shopify's CEO) is a local-first CLI search engine for markdown knowledge bases. It combines BM25, vector search, and local LLM reranking via node-llama-cpp GGUF models, and is MCP-native.
QMD is the most relevant comparison for BM because:
- Same philosophy: local-first, markdown files, on-device
- Same search techniques: BM25 + vector + hybrid
- Same ecosystem: MCP tools for Claude Code/Cursor
- But fundamentally different architecture: flat document search (QMD) vs knowledge graph with semantic relations (BM)
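Both projects describe "hybrid" BM25 + vector search. One common way to fuse the two rankings is Reciprocal Rank Fusion (RRF); neither project's docs confirm this exact method, so treat the sketch below as illustrative only (doc IDs and the `k=60` constant are conventional placeholders):

```python
# Illustrative sketch of hybrid-search rank fusion via Reciprocal Rank
# Fusion (RRF). Not necessarily what QMD or BM actually implement.
def rrf_fuse(rankings, k=60):
    """rankings: list of ranked doc-ID lists, best first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1/(k + rank) for every doc it ranks.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical rankings from a BM25 pass and a vector-similarity pass.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_a", "doc_d"]
print(rrf_fuse([bm25_ranking, vector_ranking]))
```

Documents ranked highly by both passes float to the top, which is the behavior a hybrid benchmark is implicitly testing.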
## Benchmark Design
### Retrieval metrics (existing benchmark)
- Ingest LoCoMo conversations into QMD collections
- Run the same queries through `qmd query` (hybrid + reranking mode)
- Measure R@5, R@10, MRR against the same ground truth
- Compare: BM hybrid search vs QMD hybrid+reranking
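The metrics above are standard; a minimal sketch of how they would be computed per query (doc IDs are hypothetical):

```python
# Recall@k and MRR for one query, given a ranked list of retrieved doc IDs
# and a ground-truth set of relevant doc IDs (as in the LoCoMo benchmark).
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant docs that appear in the top-k results."""
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)

def reciprocal_rank(retrieved, relevant):
    """1/rank of the first relevant doc, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0

retrieved = ["d3", "d7", "d1", "d9", "d2"]
relevant = {"d1", "d2"}
print(recall_at_k(retrieved, relevant, 5))   # 1.0 (both relevant docs in top 5)
print(reciprocal_rank(retrieved, relevant))  # 1/3 (first hit at rank 3)
```

MRR for the benchmark is then the mean of `reciprocal_rank` over all queries.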
### LLM-as-Judge (once #9 lands)
- Same eval: retrieve via BM MCP tools vs QMD MCP tools
- Same eval LLM, same judge, same questions
- Direct answer accuracy comparison
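The comparison loop could look roughly like this. `retrieve`, `answer_with_context`, and `judge` are hypothetical stand-ins for the real MCP and LLM calls, which issue #9 will define; the point is only that both systems share the same questions, answering LLM, and judge:

```python
# Hedged sketch of the LLM-as-judge comparison: answer each question once per
# retrieval system, grade both answers with the same judge, tally accuracy.
def compare_systems(questions, retrievers, answer_with_context, judge):
    """questions: dicts with 'question' and 'gold' keys.
    retrievers: {system_name: fn(question) -> context} (e.g. BM vs QMD MCP tools).
    answer_with_context: fn(question, context) -> answer (the shared eval LLM).
    judge: fn(question, gold, answer) -> bool (the shared judge LLM)."""
    scores = {name: 0 for name in retrievers}
    for q in questions:
        for name, retrieve in retrievers.items():
            context = retrieve(q["question"])
            answer = answer_with_context(q["question"], context)
            if judge(q["question"], q["gold"], answer):
                scores[name] += 1
    return scores
```

Holding the eval LLM and judge constant isolates retrieval quality, which is the variable this benchmark is after.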
## What we expect to learn
- Multi-hop: BM should win — our knowledge graph connects concepts across documents. QMD does flat retrieval.
- Single-hop: QMD may win — their local LLM reranker adds precision for direct fact lookup.
- Open domain: Interesting — QMD's context tree feature vs our semantic relations.
- Temporal: Both probably weak here (neither has temporal-specific indexing yet).
## What we learn either way
- If QMD's reranking beats our hybrid search → validates basic-memory#618 (add a reranking step to BM's search pipeline via a local cross-encoder)
- If BM's knowledge graph beats QMD on multi-hop → proves the value of semantic relations over flat search
- If results are close → the differentiator is UX, graph, and bidirectional human+AI access, not raw retrieval
## Installation

```shell
npm install -g @tobilu/qmd
```

QMD MCP tools: `qmd_search`, `qmd_vector_search`, `qmd_deep_search`, `qmd_get`, `qmd_multi_get`
## Notes
- Be respectful. Tobi has a massive audience. A fair comparison that acknowledges QMD's strengths (reranking, simplicity, speed) while showing BM's advantages (knowledge graph, relations, bidirectional access) is the right tone.
- Publish methodology and results openly.
## Related
- Benchmark: Add LLM-as-Judge evaluation (GPT-4.1) for LoCoMo #9 (LLM-as-Judge)
- Benchmark: Adopt Backboard's LoCoMo methodology for reproducible comparison #8 (methodology)
- Add reranking step to search pipeline (local cross-encoder) basic-memory#618 (local reranking — QMD already has this)
- Benchmark: Test with multiple eval LLMs to isolate memory quality from model capability #7 (multi-eval LLM comparison)
Milestone: v0.19.0