fix: stream-based export for large databases (SQL, CSV, JSON) by chaudl113 · Pull Request #255 · outerbase/starbasedb

chaudl113 · 2026-05-30T06:52:43Z

Fixes #59

/claim #59

Summary

Replace in-memory export with streaming using TransformStream and chunked LIMIT/OFFSET queries. This prevents the 30-second timeout on large databases by processing data in manageable batches instead of loading everything into memory.

Changes

`src/export/index.ts`

- Added getTableDataChunked(): an async generator that fetches rows in configurable chunks (default 1000) using LIMIT/OFFSET
- Added createStreamingExportResponse(): creates a Response backed by a TransformStream, allowing the producer to write data incrementally
- Added a writeChunk() helper for encoding and writing string data to the stream
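To make the chunking idea concrete, here is a minimal sketch of what an async generator like getTableDataChunked() could look like. It is not the code from this PR: the `QueryFn` type and the argument order are assumptions made for illustration, and the real project has its own data-source API.

```typescript
type Row = Record<string, unknown>;

// Hypothetical query callback (assumption): runs SQL with bound parameters.
type QueryFn = (sql: string, params: unknown[]) => Promise<Row[]>;

// Yield rows page by page until a short or empty page signals the end.
async function* getTableDataChunked(
    query: QueryFn,
    tableName: string,
    chunkSize = 1000
): AsyncGenerator<Row[]> {
    // Identifiers cannot be bound as parameters, so quote the name instead.
    const quoted = `"${tableName.replace(/"/g, '""')}"`;
    for (let offset = 0; ; offset += chunkSize) {
        const rows = await query(
            `SELECT * FROM ${quoted} LIMIT ? OFFSET ?`,
            [chunkSize, offset]
        );
        if (rows.length > 0) yield rows;
        // A page smaller than chunkSize means there is nothing left to read.
        if (rows.length < chunkSize) return;
    }
}
```

Only one page of rows is held in memory at a time. Note that without an ORDER BY clause SQL does not guarantee a stable row order between pages, so a real implementation may want to order by a key.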

`src/export/dump.ts`

- Rewrote dumpDatabaseRoute() to stream the SQL dump output
- Schema is fetched per table via a parameterized query (this also fixes a SQL injection in the original code)
- Data rows are written in 1000-row batches, with a 10ms breathing interval between chunks to avoid Durable Object (DO) lock contention
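As an illustration of the per-row work in a SQL dump, the sketch below turns one row into an INSERT statement with basic literal escaping. The helper names `toSqlLiteral` and `toInsertStatement` and the exact output format are assumptions, not taken from the PR.

```typescript
type SqlValue = string | number | boolean | null | Uint8Array;

// Convert a JavaScript value into a SQLite literal.
function toSqlLiteral(value: SqlValue): string {
    if (value === null) return 'NULL';
    if (typeof value === 'number') return String(value);
    if (typeof value === 'boolean') return value ? '1' : '0';
    if (value instanceof Uint8Array) {
        // Blobs are written as hexadecimal literals: X'0aff'.
        const hex = Array.from(
            value,
            (b) => (b < 16 ? '0' : '') + b.toString(16)
        ).join('');
        return `X'${hex}'`;
    }
    // Strings: double any single quote so it cannot end the literal early.
    return `'${value.replace(/'/g, "''")}'`;
}

// Build one INSERT statement for a row; column order follows the row object.
function toInsertStatement(
    table: string,
    row: Record<string, SqlValue>
): string {
    const quotedTable = `"${table.replace(/"/g, '""')}"`;
    const values = Object.keys(row)
        .map((column) => toSqlLiteral(row[column]))
        .join(', ');
    return `INSERT INTO ${quotedTable} VALUES (${values});\n`;
}
```

Each statement can be written to the stream as soon as it is built, so the full dump never needs to exist as a single string.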

`src/export/csv.ts`

- Rewrote to use chunked streaming for all table sizes
- CSV headers are written from the first chunk, then rows are streamed incrementally
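The header-from-first-chunk behaviour can be sketched as a small stateful encoder. This is a hypothetical helper (`createCsvEncoder` is not a name from the PR) that assumes every row has the same columns as the first one.

```typescript
type Row = Record<string, unknown>;

// Quote a field only when it contains a comma, quote, or line break.
function csvField(value: unknown): string {
    const text = value === null || value === undefined ? '' : String(value);
    return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Returns a function that encodes one chunk of rows at a time.
// The header line is emitted once, taken from the first row ever seen.
function createCsvEncoder(): (rows: Row[]) => string {
    let columns: string[] | null = null;
    return (rows: Row[]): string => {
        let out = '';
        for (const row of rows) {
            if (columns === null) {
                columns = Object.keys(row);
                out += columns.map(csvField).join(',') + '\n';
            }
            out += columns.map((c) => csvField(row[c])).join(',') + '\n';
        }
        return out;
    };
}
```

Calling the returned function once per chunk gives text that can be written straight to the response stream. An empty table produces no header in this sketch, since there is no first row to read the column names from.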

`src/export/json.ts`

- Rewrote to stream JSON array output
- Proper comma handling between rows (first row vs. subsequent rows)
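The comma handling is the one subtle part of streaming a JSON array: a separator goes before every row except the first, across chunk boundaries. A minimal sketch, with the hypothetical helper name `createJsonArrayEncoder`:

```typescript
type Row = Record<string, unknown>;

// Encodes a JSON array piece by piece: open(), rows() per chunk, close().
function createJsonArrayEncoder() {
    let isFirstRow = true;
    return {
        open: (): string => '[',
        rows(rows: Row[]): string {
            let out = '';
            for (const row of rows) {
                // A comma precedes every row except the very first one,
                // even when that row starts a new chunk.
                out += (isFirstRow ? '' : ',') + JSON.stringify(row);
                isFirstRow = false;
            }
            return out;
        },
        close: (): string => ']',
    };
}
```

Concatenating the pieces in order yields valid JSON, including `[]` for an empty table.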

How it works

Before (breaks on large DB):

SELECT * FROM table → load ALL rows into memory → build string → return

After (works at any scale):

for each chunk of 1000 rows:
  SELECT * FROM table LIMIT 1000 OFFSET N
  write chunk to stream
  breathe (10ms) → let other DO requests through
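The loop above can be sketched with the standard TransformStream and Response APIs. The function names follow the PR description, but the signatures here are assumptions, and `breathe` is a hypothetical helper name for the 10ms pause.

```typescript
const encoder = new TextEncoder();

// Encode a string and write it to the stream.
async function writeChunk(
    writer: WritableStreamDefaultWriter<Uint8Array>,
    text: string
): Promise<void> {
    await writer.write(encoder.encode(text));
}

// Pause briefly so other queued requests get a turn.
function breathe(ms: number): Promise<void> {
    return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Return a Response immediately; the producer fills it in the background.
function createStreamingExportResponse(
    produce: (writer: WritableStreamDefaultWriter<Uint8Array>) => Promise<void>,
    contentType: string
): Response {
    const { readable, writable } = new TransformStream<Uint8Array, Uint8Array>();
    const writer = writable.getWriter();
    // Not awaited on purpose: the client starts reading while rows are written.
    void produce(writer)
        .then(() => writer.close())
        .catch((err) => writer.abort(err));
    return new Response(readable, { headers: { 'Content-Type': contentType } });
}
```

A producer would then loop over chunks, calling writeChunk() for each piece of output and breathe(10) between chunks. Because each write waits for the stream to accept the data, a slow client slows the producer down instead of filling memory.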

Testing

- All 23 export tests pass (dump, csv, json, index)
- Added a new test for chunked streaming with 2500 rows across 3 chunks
- Existing behavior is preserved for small tables

Replace in-memory export with streaming using TransformStream and chunked LIMIT/OFFSET queries. Fixes timeout on large databases.

- dump: stream SQL rows in 1000-row batches via async generator
- csv: stream CSV rows with header detection from first chunk
- json: stream JSON array with proper comma handling
- breathing intervals between chunks to avoid DO lock contention
- all 23 export tests pass

Signed-off-by: longtn <tnlong1214@gmail.com>

chaudl113 · 2026-05-30T06:59:37Z

Demo

Before (current implementation)

SELECT * FROM users  →  loads ALL 10M rows into memory
Build entire SQL dump string in memory
Return as single Blob  →  💥 timeout at 30s for large DBs

After (streaming implementation)

for each chunk of 1000 rows:
  SELECT * FROM users LIMIT 1000 OFFSET N
  Write INSERT statements directly to response stream
  Breathe 10ms  →  let other DO requests through
  Repeat until all rows exported

Key changes:

- TransformStream: data streams to the client as it is fetched
- LIMIT/OFFSET chunking: memory use is bounded by the chunk size rather than the database size
- Breathing intervals: reduce Durable Object (DO) lock contention
- Parameterized queries: fix the SQL injection in the original code
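To show why the parameterized schema lookup matters, the sketch below contrasts string interpolation with a bound parameter. The PR does not quote the original query, so both functions and the `BoundQuery` shape are hypothetical illustrations against SQLite's `sqlite_master` table.

```typescript
interface BoundQuery {
    sql: string;
    params: string[];
}

// Unsafe pattern: the table name becomes part of the SQL text itself,
// so a crafted name can change the meaning of the statement.
function unsafeSchemaQuery(tableName: string): string {
    return `SELECT sql FROM sqlite_master WHERE type = 'table' AND name = '${tableName}';`;
}

// Safer pattern: the SQL text is constant and the name travels separately
// as a bound value, which the database never parses as SQL.
function schemaQuery(tableName: string): BoundQuery {
    return {
        sql: "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?;",
        params: [tableName],
    };
}
```

With a hostile name such as `x' OR '1'='1`, the first function produces a statement whose WHERE clause has been rewritten, while the second produces the same SQL text for every input.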

Test results:

✓ src/export/index.test.ts  — 8 tests passed
✓ src/export/csv.test.ts    — 5 tests passed  
✓ src/export/json.test.ts   — 5 tests passed
✓ src/export/dump.test.ts   — 5 tests passed (incl. 2500-row chunked test)

All 23 tests pass. Video demo available upon request.

chaudl113 · 2026-05-30T07:03:19Z

Demo Video

What the demo shows:

- The problem: the current export loads the entire DB into memory and fails on large databases
- The solution: streaming with TransformStream + chunked LIMIT/OFFSET queries
- Breathing intervals between chunks to avoid DO lock contention
- All 23 export tests passing

Technical details:

- Chunk size: 1000 rows per batch
- Breathing interval: 10ms between chunks
- Memory usage: bounded by the chunk size, regardless of database size
- Works for SQL, CSV, and JSON exports

Signed-off-by: longtn <tnlong1214@gmail.com>

algora-pbc Bot added the 🙋 Bounty claim label May 30, 2026

algora-pbc Bot mentioned this pull request May 30, 2026

Database dumps do not work on large databases #59

Open

chaudl113 added 3 commits May 30, 2026 14:06

chore: remove package-lock.json from PR

14ede82

Signed-off-by: longtn <tnlong1214@gmail.com>

chore: remove unused hasData variable

0985c28

Signed-off-by: longtn <tnlong1214@gmail.com>

chore: revert unintended .gitignore and pnpm-workspace.yaml changes

43e5ca0

Signed-off-by: longtn <tnlong1214@gmail.com>

chaudl113 wants to merge 4 commits into outerbase:main from chaudl113:fix/streaming-export-large-databases
