Local-first CLI for converting MP3/MP4 lectures into searchable, timestamped Markdown study notes.
ScribeFlow converts local MP3 and MP4 files into:
- normalized WAV audio
- transcript JSON
- timestamped Markdown transcripts
- optional enhanced study notes
- local searchable index
It is designed for local lecture, meeting, and training-material workflows where reproducible processing and private-by-default file handling matter.
- MP3 and MP4 ingestion
- SHA-256 duplicate detection
- SQLite processing ledger
- FFmpeg audio extraction and normalization
- faster-whisper transcription
- Markdown transcript export
- local deterministic study enhancements
- retry, reprocess, clean, and archive commands
- local SQLite FTS5 search
- no cloud API required
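As a rough illustration of the timestamped Markdown export listed above, the sketch below renders transcript segments under `HH:MM:SS` headings in the style of the sample output shown in this README. The `Segment` type and both function names are hypothetical and are not ScribeFlow's internal API.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical transcript segment: start/end in seconds plus text."""
    start: float
    end: float
    text: str


def format_timestamp(seconds: float) -> str:
    """Format non-negative seconds as HH:MM:SS (fractions are truncated)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render_segments(segments: list[Segment]) -> str:
    """Render each segment as a level-3 heading followed by its text."""
    blocks = []
    for seg in segments:
        heading = f"### {format_timestamp(seg.start)} - {format_timestamp(seg.end)}"
        blocks.append(f"{heading}\n{seg.text.strip()}")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(render_segments([Segment(0.0, 5.2, "Transcript text here.")]))
```

Truncating rather than rounding keeps a segment's heading from pointing past the moment the speech actually starts.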
macOS/Linux:
python3.11 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -e '.[dev,stt]'
Windows PowerShell:
py -3.11 -m venv .venv
.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
pip install -e ".[dev,stt]"
FFmpeg is required for media processing. ScribeFlow cannot extract or normalize MP3/MP4 audio without it.
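For orientation, an extraction and normalization step of this kind could be driven from Python roughly as follows. The 16 kHz mono 16-bit PCM target is an assumption (a common input format for Whisper-family models), and the function names are hypothetical. This README does not state the exact FFmpeg arguments ScribeFlow uses.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(source: Path, target: Path, sample_rate: int = 16000) -> list[str]:
    """Build an FFmpeg command that writes mono 16-bit PCM WAV audio."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the target if it already exists
        "-i", str(source),        # MP3 or MP4 input
        "-vn",                    # ignore any video stream
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample to the assumed target rate
        "-c:a", "pcm_s16le",      # 16-bit little-endian PCM
        str(target),
    ]


def normalize_audio(source: Path, target: Path) -> None:
    """Run FFmpeg; raises subprocess.CalledProcessError on a non-zero exit."""
    target.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ffmpeg_command(source, target), check=True, capture_output=True)
```

Keeping command construction separate from execution makes the arguments easy to inspect or test without FFmpeg installed.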
Check FFmpeg:
ffmpeg -version

scribeflow init
# Add safe test files:
# inbox/mp3/example.mp3
# inbox/mp4/example.mp4
scribeflow scan
scribeflow status
scribeflow process --limit 1 --model small --language en --enhance
scribeflow status
Generated files appear in:
working/audio/
output/raw_json/
output/markdown/
Use short safe files first, ideally 30 seconds to 2 minutes.
- Place MP3 files in `inbox/mp3/`.
- Place MP4 files in `inbox/mp4/`.
- Do not commit media files to GitHub.
- Run `git status` before committing.
- Only process recordings you have the right to process.
- Do not commit private, copyrighted, FERPA-protected, HIPAA-protected, or confidential recordings.
- `scribeflow version` prints the installed version.
- `scribeflow init` creates workspace folders and local SQLite databases.
- `scribeflow scan` scans inbox folders and registers new MP3/MP4 files.
- `scribeflow status` shows ledger totals and pending files.
- `scribeflow process` runs audio extraction, transcription, JSON export, and Markdown export.
- `scribeflow retry` retries failed jobs.
- `scribeflow reprocess --file <filename>` regenerates outputs for one tracked file.
- `scribeflow clean` removes selected generated working files.
- `scribeflow archive` moves completed source media into archive folders.
- `scribeflow index` builds a local search index.
- `scribeflow search "query"` searches indexed transcript content.
scribeflow process --limit 1 --model small --language en
scribeflow process --limit 1 --model small --language en --enhance
scribeflow process --file "lecture.mp3" --model medium --language en
scribeflow process --dry-run

scribeflow process --enhance
scribeflow process --summary
scribeflow process --terms
scribeflow process --questions
scribeflow process --notes
`--enhance` enables summary, study notes, key terms, and study questions.
Enhancement is deterministic and local. It does not use an LLM yet.
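To illustrate what "deterministic and local" can mean in practice, here is a minimal sketch of frequency-based key-term extraction. It is not ScribeFlow's actual enhancement algorithm, which this README does not describe. The stopword list and the `key_terms` function are assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a practical one would be much longer.
STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "are",
    "it", "that", "this", "for", "on", "with",
}


def key_terms(text: str, limit: int = 5) -> list[str]:
    """Return the most frequent non-stopword tokens, ties broken alphabetically."""
    tokens = re.findall(r"[a-z][a-z'-]+", text.lower())
    counts = Counter(token for token in tokens if token not in STOPWORDS)
    # Sorting on (-count, term) makes the ordering fully reproducible.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:limit]]
```

Because the ranking depends only on token counts and an explicit tie-break, the same transcript always yields the same terms, with no model or network call involved.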
scribeflow index --rebuild --include-markdown
scribeflow search "machine learning" --limit 10
scribeflow search "sensitivity" --source "machine-learning" --limit 5
scribeflow search "machine learning" --json
Search is lexical SQLite FTS5 search, not semantic search.
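The practical difference shows up in a small sketch using Python's built-in `sqlite3` module: FTS5 matches the tokens in the query, so a synonym or paraphrase is not found. The table and column names are hypothetical, not ScribeFlow's schema, and the example assumes the local SQLite build includes FTS5.

```python
import sqlite3


def build_index(rows: list[tuple[str, str]]) -> sqlite3.Connection:
    """Create an in-memory FTS5 table of (source, text) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(source, text)")
    conn.executemany("INSERT INTO transcripts (source, text) VALUES (?, ?)", rows)
    return conn


def search(conn: sqlite3.Connection, query: str, limit: int = 10) -> list[str]:
    """Return matching sources, best match first (FTS5 rank is BM25-based)."""
    cursor = conn.execute(
        "SELECT source FROM transcripts WHERE transcripts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [row[0] for row in cursor]


conn = build_index([
    ("lecture-01", "Machine learning models are evaluated on held-out data."),
    ("lecture-02", "Sensitivity measures the true positive rate."),
])
hits = search(conn, '"machine learning"')  # phrase query; matches lecture-01
misses = search(conn, "recall")            # synonym of sensitivity; no token match
```

A query for "recall" returns nothing even though lecture-02 discusses the same concept, which is the limitation that a semantic index would address.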
- `inbox/mp3/`: MP3 input files
- `inbox/mp4/`: MP4 input files
- `working/audio/`: normalized WAV files
- `working/temp/`: temporary working files
- `working/logs/`: local processing logs
- `output/raw_json/`: transcript JSON files
- `output/markdown/`: Markdown transcript and study note files
- `output/subtitles/`: reserved subtitle output
- `archive/completed/`: archived completed source media
- `archive/failed/`: archived failed source media
- `.scribeflow/`: local SQLite ledger and search index
ScribeFlow uses a local SQLite ledger. It tracks source path, filename, file type, size, SHA-256 hash, status, output paths, retry count, and errors.
Duplicates are detected by SHA-256 content hash rather than by filename, so a renamed copy of an already registered file is not processed again.
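One way to get this behavior is to key the ledger on the content hash rather than the path. The sketch below is an assumption about how such a ledger could work, using a hypothetical `files` table and `register` helper. It is not ScribeFlow's actual schema.

```python
import hashlib
import sqlite3
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large media files are never fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def open_ledger(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a ledger whose primary key is the content hash."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS files (
            sha256      TEXT PRIMARY KEY,
            source_path TEXT NOT NULL,
            size_bytes  INTEGER NOT NULL,
            status      TEXT NOT NULL DEFAULT 'pending'
        )
        """
    )
    return conn


def register(conn: sqlite3.Connection, path: Path) -> bool:
    """Insert the file if its hash is new; return False for a duplicate."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO files (sha256, source_path, size_bytes) VALUES (?, ?, ?)",
        (sha256_of(path), str(path), path.stat().st_size),
    )
    conn.commit()
    return cursor.rowcount == 1
```

With the hash as primary key, registering `lecture.mp3` and then a byte-identical `lecture-copy.mp3` inserts only one row, regardless of the filenames.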
- `pending`
- `audio_extracted`
- `transcribed`
- `markdown_exported`
- `completed`
- `failed_audio`
- `failed_transcription`
- `failed_export`
# Lecture Title
**Source file:** example.mp3
**Media type:** MP3
**Processed:** YYYY-MM-DD HH:MM
**Model:** faster-whisper-small
**Status:** Completed
---
## Summary
- Example summary point.
---
## Study Notes
- Example study note.
---
## Key Terms
- Example term
---
## Study Questions
1. Example question?
---
## Timestamped Transcript
### 00:00:00 - 00:00:05
Transcript text here.

pip install -e '.[dev,stt]'
pytest
git diff --check

Planned, not yet implemented:
- semantic search
- local RAG/ask
- local LLM summarization
- speaker diarization
- web UI
- course/project folders
- export to Obsidian/Notion-compatible notes
Known limitations:
- Transcription accuracy depends on audio quality.
- First faster-whisper run may download model files.
- Search is lexical, not semantic.
- Study enhancements are deterministic and may be basic.
- Requires FFmpeg for media processing.
- Requires SQLite FTS5 for search.
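Both requirements can be checked before processing anything. The following is a sketch only. The helper names are hypothetical, and ScribeFlow may perform these checks differently or not at all.

```python
import shutil
import sqlite3


def ffmpeg_available() -> bool:
    """True if an `ffmpeg` executable is found on PATH."""
    return shutil.which("ffmpeg") is not None


def fts5_available() -> bool:
    """True if this Python's SQLite build can create an FTS5 virtual table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE VIRTUAL TABLE probe USING fts5(content)")
        return True
    except sqlite3.OperationalError:
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    print("FFmpeg:", "found" if ffmpeg_available() else "missing")
    print("SQLite FTS5:", "available" if fts5_available() else "missing")
```

Probing FTS5 by creating a throwaway virtual table tests the SQLite library that Python is actually linked against, which may differ from the `sqlite3` command-line tool on the same machine.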
ScribeFlow is released under the MIT License. See LICENSE.
