
feat: concat chunked Telegram messages for multi-part pastes (#186) #187

Open

IliyaBrook wants to merge 2 commits into RichardAtCT:main from IliyaBrook:fix/186-concat-chunked-telegram-messages

Conversation

@IliyaBrook

Summary

When a user pastes text longer than 4096 characters, the Telegram client silently splits it into multiple messages. Previously, each fragment triggered a separate Claude run, which was wasteful, slow, and produced poor results, since Claude saw only incomplete text fragments.

This PR adds a per-user debounce buffer that detects Telegram-split chunks (messages near the 4096-char limit), accumulates them, and submits the combined text as a single Claude request.

How it works

  • Messages >= 4000 chars (configurable) are considered potential chunks and buffered
  • A 500ms debounce timer (configurable) waits for more chunks to arrive
  • Short "tail" chunks flush the buffer immediately — no timer wait needed
  • Chunks are joined with no delimiter ("".join()) since Telegram splits at character boundaries
  • User sees "Receiving message…" while buffering, then the normal "Working..." flow
  • Stop button cancels pending buffers
  • Per-user lock prevents concurrent Claude runs from timer-fired flushes
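The flow above can be sketched as a small asyncio class. This is a minimal illustration of the debounce-buffer idea, not the PR's actual `src/bot/utils/message_buffer.py`; the method names, `flush_cb` callback, and internal dicts are assumptions for the sketch.

```python
import asyncio

class MessageBuffer:
    """Per-user debounce buffer for Telegram-split message chunks (sketch)."""

    def __init__(self, threshold=4000, timeout=0.5):
        self.threshold = threshold   # min length to treat a message as a chunk
        self.timeout = timeout       # debounce window in seconds
        self._chunks = {}            # user_id -> buffered chunk texts
        self._timers = {}            # user_id -> pending flush task
        self._locks = {}             # user_id -> lock guarding flushes

    async def add(self, user_id, text, flush_cb):
        """Buffer `text`; await `flush_cb(combined)` once the message is complete."""
        self._chunks.setdefault(user_id, []).append(text)
        if timer := self._timers.pop(user_id, None):
            timer.cancel()
        if len(text) < self.threshold:
            # Short "tail" chunk: the split sequence is over, flush now.
            await self._flush(user_id, flush_cb)
        else:
            # Potential mid-sequence chunk: wait briefly for more to arrive.
            self._timers[user_id] = asyncio.create_task(
                self._flush_later(user_id, flush_cb)
            )

    async def _flush_later(self, user_id, flush_cb):
        await asyncio.sleep(self.timeout)
        await self._flush(user_id, flush_cb)

    async def _flush(self, user_id, flush_cb):
        lock = self._locks.setdefault(user_id, asyncio.Lock())
        async with lock:  # prevent concurrent runs from timer-fired flushes
            self._timers.pop(user_id, None)
            chunks = self._chunks.pop(user_id, [])
            if chunks:
                # Telegram splits at character boundaries: join with no delimiter.
                await flush_cb("".join(chunks))
```

A short tail chunk flushes synchronously inside `add()`, so the common case (last fragment of a paste) never pays the debounce delay; only a paste whose final fragment happens to exceed the threshold waits out the full timer.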

New settings

  • CHUNK_BUFFER_TIMEOUT (default 0.5) — seconds to wait for more chunks
  • CHUNK_BUFFER_THRESHOLD (default 4000) — min message length to trigger buffering
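For illustration, the two settings might be wired up as environment-variable overrides with the stated defaults; this is a hypothetical sketch, and the actual plumbing in `src/config/settings.py` and `src/utils/constants.py` may differ.

```python
import os

# Hypothetical wiring: env vars override the documented defaults.
CHUNK_BUFFER_TIMEOUT = float(os.environ.get("CHUNK_BUFFER_TIMEOUT", "0.5"))
CHUNK_BUFFER_THRESHOLD = int(os.environ.get("CHUNK_BUFFER_THRESHOLD", "4000"))
```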

Changed files

  • src/bot/utils/message_buffer.py — new MessageBuffer class
  • src/bot/orchestrator.py — extracted _process_agentic_text(), added buffer logic to agentic_text()
  • src/config/settings.py — new settings
  • src/utils/constants.py — new defaults
  • tests/unit/test_bot/test_message_buffer.py — 20 unit tests
  • tests/unit/test_bot/test_middleware.py — fixture update for new settings

Test plan

  • All 550 existing tests pass
  • 20 new unit tests for MessageBuffer (chunk detection, buffering, timer flush, cancel, key independence, concatenation)
  • Manually tested with ~27k char paste (7 chunks) — single "Receiving message…" → single Claude response
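The manual test's core invariant can be checked in miniature: splitting a long string at plain character boundaries (as the Telegram client does) and rejoining with no delimiter must reproduce the original. A self-contained sketch of that invariant, not the PR's actual test file:

```python
TELEGRAM_LIMIT = 4096  # Telegram's per-message character limit

def split_like_telegram(text, limit=TELEGRAM_LIMIT):
    """Split at plain character boundaries, as the Telegram client does."""
    return [text[i:i + limit] for i in range(0, len(text), limit)]

original = "x" * 27_000             # ~27k-char paste, as in the manual test
chunks = split_like_telegram(original)
assert len(chunks) == 7             # ceil(27000 / 4096) == 7 chunks
assert "".join(chunks) == original  # no-delimiter join restores the paste
```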

Closes #186

@MatveyF

MatveyF commented Apr 16, 2026

Hey, thank you for the PR!
I have one suggestion: the default CHUNK_BUFFER_THRESHOLD of 4000 may be too high. In my testing, Telegram's client splits at paragraph/line boundaries rather than at the 4096-char limit itself, so chunks can land well below 4000. I hit a real case where the first chunk was 3397 chars, which slipped under a 3500 threshold I was testing locally and fired as a separate Claude run.

Or maybe at least add a note to the docs.

@IliyaBrook
Author

Good catch, thanks! You're right — I had the same thought after reading your example. I've lowered the default CHUNK_BUFFER_THRESHOLD from 4000 to 3000, which should cover the paragraph/line-boundary splits you described (including your 3397-char case).


Development

Successfully merging this pull request may close these issues.

Bug: Concat messages that were chunked by Telegram

2 participants