fix(core): handle SQLite and Windows semantic regressions#655
fix(core): handle SQLite and Windows semantic regressions#655phernandez merged 4 commits intomainfrom
Conversation
|
Claude finished @phernandez's task in 2m 15s —— View job PR Review: Fix SQLite and Windows semantic regressions
SummaryThis PR addresses three distinct regressions: graceful degradation when Code Quality & Standards
|
| Check | Status |
|---|---|
| Unit tests for new functions | ✅ Good coverage of _build_chunk_records dedup, setup_logging behavior |
| Duplicate chunk key collapse test | ✅ test_duplicate_rows_collapse_to_unique_chunk_keys is clear and targeted |
sqlite-vec graceful degradation test |
✅ Mocks the OperationalError correctly; tests the exact returned EmbeddingStatus fields |
| Windows log file per-process test | ✅ Uses monkeypatching of os.name and os.getpid cleanly |
| 100% coverage maintained | setup_logging is still # pragma: no cover — new tests won't count |
| Edge case: duplicate rows with different text |
Basic Memory Architecture
| Check | Status |
|---|---|
| Preserves local-first architecture | ✅ |
| Knowledge graph consistency | ✅ |
| MCP tool atomicity | ✅ (not changed) |
| No DB migrations needed | ✅ (no schema changes) |
| Markdown file integrity | ✅ |
Security & Performance
| Check | Status |
|---|---|
| No hardcoded secrets | ✅ |
| Error message exposes only safe details | ✅ (str(exc).lower() string check is safe) |
| Exception is re-raised for Postgres | ✅ (if is_postgres or ... : raise) |
| No emojis/unicode in log output | ✅ |
| Windows path comparison compatible | ✅ |
Checklist Results
| Item | Status |
|---|---|
| Coding conventions (CLAUDE.md) | ✅ |
| Python 3.12+ types & async | ✅ |
| SQLAlchemy 2.0 patterns | ✅ |
| 100-char line limit | ✅ |
| No commented-out code blocks | ✅ |
| Decision-point comments (literate style) | ✅ Excellent — all three fixes have proper Trigger/Why/Outcome comments |
| 100% test coverage | setup_logging # pragma: no cover should be removed |
| Windows compatibility | ✅ |
| No emoji/unicode in console/log output | ✅ |
Overall Assessment
Approve with minor suggestions. The three fixes are well-motivated and correctly implemented. The main nit is removing # pragma: no cover from setup_logging now that it has real tests. The Windows log accumulation is worth a follow-up issue. The broad try-except scope is acceptable given the string-match guard.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 975504410b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| # Trigger: Windows does not allow renaming an open file held by another process. | ||
| # Why: multiple basic-memory processes can share the same log directory at once. | ||
| # Outcome: use per-process log files on Windows so log rotation stays local. | ||
| log_filename = f"basic-memory-{os.getpid()}.log" if os.name == "nt" else "basic-memory.log" |
There was a problem hiding this comment.
Avoid unbounded PID-specific log file growth on Windows
Using basic-memory-{pid}.log makes every process write to a brand-new filename, so Loguru's retention=5 no longer limits the overall log directory size across runs. On Windows hosts where CLI/MCP processes start frequently, ~/.basic-memory will accumulate stale per-PID logs indefinitely and can eventually consume significant disk space.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Addressed in 445d009 by trimming stale per-process Windows log files during setup while keeping the current-process file path unchanged.
Bound the per-process Windows log directory so the PID-based rotation workaround does not accumulate stale files indefinitely. Signed-off-by: phernandez <paul@basicmachines.co>
|
Claude finished @phernandez's task in 2m 22s —— View job PR Review: Fix SQLite and Windows semantic regressions
SummaryThis PR closes three regressions with well-scoped, well-commented fixes. The latest commit (445d009) correctly addresses the Windows log accumulation issue raised in the previous review. One outstanding item from the prior review remains unaddressed. Code Quality & Standards
|
| Check | Status |
|---|---|
| Unit tests for new functions/methods | ✅ Good coverage of _build_chunk_records dedup, setup_logging Windows behavior, and _cleanup_windows_log_files retention |
| Duplicate chunk key collapse test | ✅ test_duplicate_rows_collapse_to_unique_chunk_keys — clear and targeted |
sqlite-vec graceful degradation test |
✅ Mocks OperationalError correctly; verifies the exact returned EmbeddingStatus fields |
| Windows log per-process file test | ✅ test_setup_logging_uses_per_process_log_file_on_windows — clean monkeypatching of os.name and os.getpid |
| Windows log retention test | ✅ test_setup_logging_trims_stale_windows_pid_logs — verifies the 4-file retention ceiling correctly (6 stale → 4 kept, 2 removed) |
| Windows log accumulation (prior review) | ✅ Fixed in 445d009 via _cleanup_windows_log_files |
| 100% coverage maintained | setup_logging # pragma: no cover — new tests won't count toward coverage |
| Edge case: duplicate rows with different text |
Basic Memory Architecture
| Check | Status |
|---|---|
| Preserves local-first architecture | ✅ |
| Knowledge graph consistency | ✅ |
| MCP tool atomicity | ✅ (not changed) |
| No DB migrations needed | ✅ (no schema changes) |
| Markdown file integrity | ✅ |
| Decision-point comments (literate style) | ✅ All three fixes have proper Trigger/Why/Outcome comments |
Security & Performance
| Check | Status |
|---|---|
| No hardcoded secrets | ✅ |
| Error message exposes only safe details | ✅ (str(exc).lower() string check is safe) |
| Exception re-raised for Postgres | ✅ (if is_postgres or ... : raise) |
| No emoji/unicode in log output | ✅ |
| Windows path comparison compatible | ✅ |
| Stale PID log OSError silenced safely | ✅ Another process may hold the file; logger.debug is appropriate |
Checklist Results
| Item | Status |
|---|---|
| Coding conventions (CLAUDE.md) | ✅ |
| Python 3.12+ types & async | ✅ |
| SQLAlchemy 2.0 patterns | ✅ |
| 100-char line length | ✅ |
| No commented-out code blocks | ✅ |
| Decision-point comments (literate style) | ✅ |
| 100% test coverage | setup_logging # pragma: no cover should be removed now that tests exist |
| Windows compatibility | ✅ (improved by this PR) |
| No emoji/unicode in console/log output | ✅ |
Overall Assessment
Approve with minor suggestion. The three regression fixes are correct and well-tested. The Windows log accumulation concern raised in the prior review is now resolved. The one remaining item is removing # pragma: no cover from setup_logging now that three real tests exercise it — this would require either adding tests for the remaining untested branches (log_to_stdout, structured_context, env == "test") or moving the pragma to those specific lines.
Signed-off-by: phernandez <paul@basicmachines.co>
|
Claude finished @phernandez's task in 2m 40s —— View job PR Review: Fix SQLite and Windows semantic regressions
SummaryThree well-scoped fixes — graceful Prepared Fix: Remove
|
| Item | Status |
|---|---|
| Coding conventions (CLAUDE.md) | ✅ |
| Python 3.12+ types & async | ✅ |
| SQLAlchemy 2.0 patterns | ✅ |
| 100-char line length | ✅ |
| No commented-out code blocks | ✅ |
| Decision-point comments (literate style) | ✅ Excellent Trigger/Why/Outcome comments throughout |
| 100% test coverage | # pragma: no cover + 4 new tests (fix prepared above) |
| Windows compatibility | ✅ (improved by this PR) |
_cleanup_windows_log_files f-string nit |
logger.debug(f"Failed to delete...") should use lazy format: logger.debug("Failed to delete stale Windows log file: {path}", path=stale_log) |
| sqlite-vec graceful degradation | ✅ |
| Duplicate chunk key dedup | ✅ |
| No hardcoded secrets | ✅ |
| Windows path comparisons compatible | ✅ |
Overall Assessment
Approve with minor follow-up. All three regression fixes are correct and well-tested. The only items remaining from prior reviews are:
# pragma: no coveronsetup_logging(fix prepared above — apply manually)_cleanup_windows_log_filesf-string nit — changelogger.debug(f"Failed to delete stale Windows log file: {stale_log}")tologger.debug("Failed to delete stale Windows log file: {path}", path=stale_log)to match loguru's lazy-format style used elsewhere in the file
Signed-off-by: phernandez <paul@basicmachines.co>
|
Claude finished @phernandez's task in 1m 26s —— View job PR Review: Fix SQLite and Windows semantic regressions
SummaryAll three regression fixes are correctly implemented and all outstanding items from prior reviews have been addressed in the latest commit ( Resolved Items (vs. prior reviews)
Final Checklist
Overall AssessmentApprove. All three regression fixes are correct and well-tested. All items raised in the two prior reviews have been resolved. The |
Summary
bm project infoinstead of crashing on vec0 queriesbm reindex --embeddingscannot enqueue duplicate rowidsTesting
uv run ruff check src/basic_memory/services/project_service.py src/basic_memory/utils.py src/basic_memory/repository/search_repository_base.py tests/services/test_project_service_embedding_status.py tests/repository/test_semantic_search_base.py tests/utils/test_setup_logging.pyuv run pytest tests/services/test_project_service_embedding_status.py tests/repository/test_semantic_search_base.py tests/utils/test_setup_logging.pyuv run pytest tests/repository/test_sqlite_vector_search_repository.pyCloses #652
Closes #653
Closes #654