Stroma is a neutral corpus and indexing substrate.
It owns the lowest-level operations needed to ingest text artifacts, chunk them, embed them, persist them in SQLite plus sqlite-vec, retrieve semantically close sections, and call OpenAI-compatible embedding and chat completion endpoints over a shared HTTP substrate. Callers consume Stroma through its APIs and treat the SQLite snapshot as an opaque local artifact. It does not own governance, specifications, compliance, drift analysis, prompt templates, product-specific output semantics, MCP, or CLI workflows.
Stroma is for products that need a reusable text corpus layer with:
- canonical records with deterministic content fingerprints
- pluggable chunking strategies (chunk.Policy: MarkdownPolicy by default, KindRouterPolicy for per-record-kind dispatch, LateChunkPolicy for parent/leaf hierarchy)
- pluggable embedders (Embedder / ContextualEmbedder) with a deterministic fixture and an OpenAI-compatible HTTP embedder
- OpenAI-compatible chat completion client (chat.OpenAI) sharing the same substrate as embed.OpenAI: retry with Retry-After (capped), classified failures (auth / rate_limit / timeout / server / transport / schema_mismatch / dependency_unavailable), preserved lower-level causes on provider errors, APIToken redaction, and a product-neutral structured JSON helper
- hybrid retrieval: dense vector + FTS5, fused via a pluggable FusionStrategy (RRFFusion by default) with per-arm provenance surfaced to downstream rerankers
- quantization knobs: float32 (default), int8 (4× smaller), binary (1-bit sign-packed vec0 prefilter that is 32× smaller for the prefilter representation; full-precision vectors are retained in a companion table for cosine rescoring, so total snapshot size is not 32× smaller)
- optional Matryoshka prefilter at a truncated dimension with full-dim cosine rescore (SearchParams.SearchDimension)
- atomic rebuilds and incremental Update with embedding reuse at the section level, chaining schema migrations v2 → v3 → v4 → v5 in one transaction
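The storage arithmetic behind the quantization bullet can be sketched as follows. This is an illustration only: packSigns and bytesPerVector are hypothetical helpers, the bit order is an assumption, "full precision" is assumed to mean float32, and per-row overhead is ignored. None of it is Stroma's storage code.

```go
package main

import "fmt"

// packSigns is a hypothetical helper: it packs one bit per dimension,
// setting the bit when the component is strictly positive.
func packSigns(v []float32) []byte {
	out := make([]byte, (len(v)+7)/8)
	for i, x := range v {
		if x > 0 {
			out[i/8] |= 1 << (uint(i) % 8)
		}
	}
	return out
}

// bytesPerVector returns the raw payload size of one vector in each
// representation: 4 bytes, 1 byte, and 1 bit per dimension.
func bytesPerVector(dim int) (f32, i8, bin int) {
	return dim * 4, dim, (dim + 7) / 8
}

func main() {
	f32, i8, bin := bytesPerVector(1024)
	fmt.Println(f32, i8, bin)    // 4096 1024 128
	fmt.Println(f32/i8, f32/bin) // 4 32

	// Binary mode also keeps the full-precision vector for rescoring,
	// so the per-vector total is prefilter + float32, not float32 / 32.
	fmt.Println(bin + f32) // 4224

	fmt.Println(packSigns([]float32{0.5, -1, 2, -3, 0, 0, 0, 1})) // [133]
}
```

The 32× figure therefore describes only the prefilter rows that the first-pass scan touches, which is where the speedup comes from.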
Stroma is not for:
- spec governance
- source discovery or repository scanning
- code compliance or doc drift analysis
- prompt templates, system prompts, or semantic interpretation of structured chat responses
- product-specific adapters and transports
- corpus: canonical record model, NewRecord helper, Normalize, deterministic Fingerprint
- chunk: Policy interface with MarkdownPolicy, KindRouterPolicy, LateChunkPolicy; MarkdownWithOptions returns ErrTooManySections when a body exceeds the DoS cap
- embed: Embedder and ContextualEmbedder interfaces; deterministic Fixture; OpenAI-compatible HTTP embedder with MaxBatchSize batching, deadline scaling across batches, and APIToken redaction in String / GoString / LogValue
- chat: OpenAI-compatible chat completion client (chat.OpenAI, chat.Message, ChatCompletionText, ChatCompletionJSON); tolerates string and multi-part array content; structured JSON responses decode into caller-owned targets and malformed JSON returns schema_mismatch; APIToken redaction parity with embed.OpenAIConfig
- provider: shared HTTP substrate used by embed and chat: retry with capped Retry-After, response-size bounding, negative MaxRetries normalization to zero, and a stable FailureClass taxonomy surfaced via *provider.Error. Callers branch on FailureClass to retry / degrade / propagate, and can unwrap lower-level transport/decode causes where available
- store: SQLite readiness probes, sqlite-vec readiness, quantization blob helpers (QuantizationFloat32 / QuantizationInt8 / QuantizationBinary)
- index: atomic Rebuild with embedding reuse and explicit reuse diagnostics, incremental Update with MaxPlannedRecords batching guard, long-lived Snapshot readers, Stats, hybrid Search with provenance and explicit MaxSearchLimit, ExpandContext for parent/neighbor walks
Use OpenSnapshot when issuing many searches against one built index. A Snapshot is safe for concurrent reads; callers own the concurrency limit, so use a bounded worker pool or semaphore sized for the host and workload, then close the snapshot after all searches and context expansions finish.
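A caller-owned concurrency limit can be sketched with a buffered channel used as a semaphore. The searcher interface and fakeSnapshot below are hypothetical stand-ins so the example runs on its own; the real Snapshot methods take different arguments.

```go
package main

import (
	"fmt"
	"sync"
)

// searcher is a hypothetical stand-in for an opened snapshot.
type searcher interface {
	Search(query string) ([]string, error)
}

type fakeSnapshot struct{}

func (fakeSnapshot) Search(q string) ([]string, error) { return []string{"hit:" + q}, nil }

// searchAll runs every query with at most limit searches in flight and
// returns results and errors in query order.
func searchAll(s searcher, queries []string, limit int) ([][]string, []error) {
	if limit < 1 {
		limit = 1 // an unbuffered semaphore would deadlock
	}
	results := make([][]string, len(queries))
	errs := make([]error, len(queries))
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup
	for i, q := range queries {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit searches are running
		go func(i int, q string) {
			defer wg.Done()
			defer func() { <-sem }()
			results[i], errs[i] = s.Search(q)
		}(i, q)
	}
	wg.Wait() // only after this point is it safe to close the snapshot
	return results, errs
}

func main() {
	res, _ := searchAll(fakeSnapshot{}, []string{"a", "b", "c"}, 2)
	fmt.Println(res) // [[hit:a] [hit:b] [hit:c]]
}
```

Each goroutine writes to its own slice index, so no extra locking is needed around the results.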
For durable evidence handles, persist at least:
- Stats.ContentFingerprint from the opened snapshot, identifying the indexed content generation
- SearchHit.ChunkID, identifying a chunk only within that snapshot generation
- SearchHit.Ref plus any caller-needed record metadata or SourceRef
ChunkID is not a cross-rebuild identity. Before expanding a previously saved hit, reopen the snapshot, compare Stats.ContentFingerprint with the saved value, and rerun search if it differs. SearchHit.Score and HitProvenance are ranking evidence for the query that produced the hit; keep them for audit/debugging, but do not use them as identity fields.
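One way to model a saved handle and the fingerprint check is sketched below. EvidenceHandle and usable are hypothetical caller-side names, and the int64 type for the chunk ID is an assumption; only the three persisted values come from the list above.

```go
package main

import "fmt"

// EvidenceHandle is a hypothetical caller-side record of a saved hit.
type EvidenceHandle struct {
	ContentFingerprint string // Stats.ContentFingerprint at save time
	ChunkID            int64  // valid only within that generation (type assumed)
	Ref                string // SearchHit.Ref, stable across rebuilds
}

// usable reports whether a saved handle may be expanded against a
// snapshot whose current content fingerprint is current. On a mismatch
// the caller should rerun the search instead of trusting ChunkID.
func usable(h EvidenceHandle, current string) bool {
	return h.ContentFingerprint != "" && h.ContentFingerprint == current
}

func main() {
	h := EvidenceHandle{ContentFingerprint: "gen-1", ChunkID: 42, Ref: "widget-overview"}
	fmt.Println(usable(h, "gen-1")) // true
	fmt.Println(usable(h, "gen-2")) // false: content changed, rerun search
}
```

Score and provenance are deliberately left out of the identity fields; they can be stored alongside the handle for audit purposes.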
ExpandContext(hit.ChunkID, opts) returns the hit chunk plus requested parent/neighbor sections in document order. On flat snapshots, parent expansion is a no-op and neighbors are same-record chunks. On hierarchical snapshots, parent expansion follows parent_chunk_id one level and neighbors stay in the same sibling group. A missing chunk returns an empty slice and nil error, which lets wrappers treat stale handles as "not found" after they have already checked the content fingerprint.
Update chunks, contextualizes, reuse-plans, and embeds added/replaced records before opening its SQLite write transaction. That keeps external embedder latency out of the transaction and preserves stale-plan rollback semantics, but the pre-transaction plan retains each added record's chunks, reuse decisions, and new vectors until the write phase.
For large ingests, split added records into caller-sized batches and set UpdateOptions.MaxPlannedRecords to that batch size. A batch above the cap fails before embedding and before the write transaction starts with an error wrapping index.ErrUpdatePlanTooLarge, so callers can retry smaller batches without changing the on-disk snapshot. MaxChunkSections still bounds per-record section expansion.
ctx := context.Background()

records := []corpus.Record{
	corpus.NewRecord(
		"widget-overview",
		"Widget Overview",
		"# Overview\n\nWidgets are synchronized in batches.",
	),
}

fixture, err := embed.NewFixture("fixture-demo", 16)
if err != nil {
	log.Fatal(err)
}

if _, err := index.Rebuild(ctx, records, index.BuildOptions{
	Path:     "stroma.db",
	Embedder: fixture,
}); err != nil {
	log.Fatal(err)
}

hits, err := index.Search(ctx, index.SearchQuery{
	Path: "stroma.db",
	SearchParams: index.SearchParams{
		Text:     "synchronized batches",
		Limit:    5,
		Embedder: fixture,
		// Fusion / Reranker / SearchDimension are optional; zero values
		// give hybrid RRF over vector+FTS with the full stored dimension.
	},
})
if err != nil {
	log.Fatal(err)
}
// Guard the index: a query with no matches returns an empty slice.
if len(hits) > 0 {
	fmt.Println(hits[0].Ref)
}

See the v2.0.0 release notes for the full API surface.
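The quick-start comment mentions hybrid RRF over the vector and FTS arms. As a rough illustration of reciprocal rank fusion in general, the sketch below fuses two ranked ID lists. It is not Stroma's RRFFusion implementation: the function name, the k = 60 constant (a common default in the literature), and the tie-break rule are all assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// rrf fuses ranked ID lists with reciprocal rank fusion:
// score(id) = sum over lists of 1 / (k + rank), with rank starting at 1.
func rrf(k float64, lists ...[]string) []string {
	scores := map[string]float64{}
	for _, list := range lists {
		for i, id := range list {
			scores[id] += 1.0 / (k + float64(i+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b] // deterministic tie-break by ID
	})
	return ids
}

func main() {
	vector := []string{"a", "b", "c"} // dense-vector arm, best first
	fts := []string{"b", "d", "a"}    // full-text arm, best first
	fmt.Println(rrf(60, vector, fts)) // [b a d c]
}
```

Because only ranks enter the score, the two arms do not need comparable raw scores, which is why rank fusion suits mixing cosine similarity with full-text relevance.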
v2.0.0 (current) ships the stable substrate: hybrid retrieval, pluggable fusion, quantization, Matryoshka prefiltering, contextual retrieval, adaptive chunking, and incremental update. Higher-order products should consume the library rather than reimplementing their own indexing substrate.