
perf(slack): skip LLM + cache config + drop reaction — thread+Pensando+trigger in ~1 RTT#442

Merged: decobot merged 1 commit into main from slack-skip-llm-cache-config on May 14, 2026

@JonasJesus42 (Contributor) commented May 14, 2026

Summary

Fast-path the slack-mcp webhook: send thread + "Pensando..." in ~1 Slack RTT, publish the enriched trigger right after, and stop calling the broken LLM path entirely.

Hot path before

webhook → config lookup (Supabase/Redis/KV)
        → addReaction 👀
        → sendThinkingMessage
        → removeReaction 👀
        → buildLLMMessages (with its own getThreadReplies)
        → isLLMAvailable
        → handleLLMCall (broken in prod — always failed)
        → catch → publishMessageReceived (with ANOTHER getThreadReplies)

Hot path after

webhook → memCache hit (sub-ms after first call)
        → sendThinkingMessage           [thread + Pensando]
        → processAttachedFiles
        → publishMessageReceived        [trigger with thinking_message_ts]
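
As a rough illustration of the ordering in the "after" path, here is a minimal sketch of one handler. Only the three step names, `thinking_message_ts`, and the `showThinking` check come from this PR; the `SlackEvent`, `PublishExtras`, and `HandlerDeps` shapes and all function signatures are assumptions made for the example, not the actual code in eventHandler.ts.

```typescript
// Hypothetical event and dependency shapes; only the three step names,
// thinking_message_ts and showThinking come from the PR description.
interface SlackEvent {
  channel: string;
  ts: string;
  text: string;
}

interface PublishExtras {
  thinking_message_ts?: string;
}

interface HandlerDeps {
  showThinking: boolean;
  // Posts the "Pensando..." placeholder in the thread and resolves to its ts.
  sendThinkingMessage(channel: string, threadTs: string): Promise<string>;
  // Downloads/transcribes attachments and resolves to the full message text.
  processAttachedFiles(event: SlackEvent): Promise<string>;
  publishMessageReceived(
    event: SlackEvent,
    fullText: string,
    extras: PublishExtras,
  ): Promise<void>;
}

async function handleDirectMessage(event: SlackEvent, deps: HandlerDeps): Promise<void> {
  // 1. Visible feedback first: a single Slack round-trip.
  const extras: PublishExtras = {};
  if (deps.showThinking) {
    extras.thinking_message_ts = await deps.sendThinkingMessage(event.channel, event.ts);
  }
  // 2. Slow work (file download, transcription) runs once the placeholder is up.
  const fullText = await deps.processAttachedFiles(event);
  // 3. Publish the trigger, carrying the placeholder ts so the agent can edit it.
  await deps.publishMessageReceived(event, fullText, extras);
}
```

Passing the Slack and publisher calls in as dependencies is only a convenience for the sketch; it makes the call order easy to check with stubs.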

Changes

  • config-cache.ts: in-process Map<connectionId, { config, expiresAt }> with 24h TTL on top of the existing Supabase → Redis → KV chain. First webhook for a connection still hits storage; subsequent webhooks are sub-ms. Write-through on cacheConnectionConfig, invalidated on removeCachedConnectionConfig.
  • eventHandler.ts: handleAppMention / handleDirectMessage / handleThreadReply are now just sendThinkingMessage → processAttachedFiles → publishMessageReceived/publishAppMention(..., { thinking_message_ts }). Removed: 👀 reactions, buildLLMMessages, isLLMAvailable, handleLLMCall, per-handler user info lookup. handleMessage no longer pre-processes files, so the thinking message fires before Whisper/file-download work.
  • event-publisher.ts: publishMessageReceived / publishAppMention accept extras.thinking_message_ts. Payload carries thinking_message_ts and a context-aware reply_instruction telling the agent to SLACK_EDIT_MESSAGE the placeholder (no more stacking another reply via SLACK_REPLY_IN_THREAD). thread_messages is still only fetched for thread continuations.
  • trigger-store.ts: trigger descriptions document the edit-vs-reply rule + the new thinking_message_ts field.
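
To make the config-cache.ts item concrete, here is a minimal sketch of an in-process TTL cache in front of a slower loader. The `Map` of `{ config, expiresAt }` and the 24h TTL come from the description; the `ConnectionConfig` type, the helper names (`readMemCache`, `writeMemCache`, `dropMemCache`, `getConfig`), and the injected `loadFromStorage` function are hypothetical stand-ins, not the module's real exports.

```typescript
// Hypothetical config shape; the real connection config type is not shown in this PR.
type ConnectionConfig = Record<string, unknown>;

interface CacheEntry {
  config: ConnectionConfig;
  expiresAt: number; // epoch milliseconds
}

const TTL_MS = 24 * 60 * 60 * 1000; // 24h, as stated in the description

const memCache = new Map<string, CacheEntry>();

// Returns the cached config, or undefined on a miss or an expired entry.
function readMemCache(connectionId: string, now: number = Date.now()): ConnectionConfig | undefined {
  const entry = memCache.get(connectionId);
  if (!entry) return undefined;
  if (entry.expiresAt <= now) {
    memCache.delete(connectionId); // lazily evict expired entries
    return undefined;
  }
  return entry.config;
}

// Write-through: call whenever a config is saved to the backing stores.
function writeMemCache(connectionId: string, config: ConnectionConfig, now: number = Date.now()): void {
  memCache.set(connectionId, { config, expiresAt: now + TTL_MS });
}

// Invalidation: call whenever a config is removed.
function dropMemCache(connectionId: string): void {
  memCache.delete(connectionId);
}

// Read path: memory first, then the slower chain (a stand-in for Supabase → Redis → KV).
async function getConfig(
  connectionId: string,
  loadFromStorage: (id: string) => Promise<ConnectionConfig | null>,
): Promise<ConnectionConfig | null> {
  const hit = readMemCache(connectionId);
  if (hit) return hit;
  const loaded = await loadFromStorage(connectionId);
  if (loaded) writeMemCache(connectionId, loaded);
  return loaded;
}
```

One consequence of a per-process cache like this is that another instance only sees a config change after its own entry expires or is invalidated, which is presumably why the write-through and invalidation hooks matter.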

Latency

  • Top-level DM: ~200 ms (single sendThinkingMessage RTT) — thread + "Pensando..." appear, trigger published right after.
  • Thread continuation: same + ~500 ms getThreadReplies inside the publisher.

What's NOT in this PR

  • llm.ts, llm-handler.ts, and the unused parts of context-builder.ts (buildContextMessages, formatMessagesForLLM, buildCurrentContent) stay as dead code. Leaving them in keeps the path easy to rewire if/when the decopilot endpoint is fixed. A follow-up commit can prune them.

Test plan

  • Deploy; send a top-level DM → thread + "Pensando..." appear within ~300 ms; logs show [Triggers] Notified slack.message.received: channel=<id> shortly after; no [LLM] streamAgent / decopilot stream failed in logs.
  • Subscriber agent reads thinking_message_ts from the trigger payload and calls SLACK_EDIT_MESSAGE to replace "Pensando..." with the final answer — only one bot message in the thread.
  • Send a reply inside the bot's thread → handler routes to handleThreadReply; trigger payload includes thread_messages with full history.
  • Send a DM with an audio file (Whisper enabled) → "Pensando..." still appears before transcription; final trigger carries the transcribed text.
  • In triggerOnly mode: same flow, no "Pensando..." (preserved by showThinking check).
  • Supabase connection_configs query rate drops to ~1 per connection per 24h.
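
For the subscriber side of the test plan, the edit-vs-reply rule might look like the sketch below. The tool names and `thinking_message_ts` come from this PR; the `TriggerPayload` subset, the `ToolCall` argument names, and `planReply` are assumptions, since the actual tool signatures are not shown here.

```typescript
// Hypothetical payload subset and tool-call shapes; the real tool argument
// names are not shown in this PR.
interface TriggerPayload {
  channel: string;
  thread_ts: string;
  thinking_message_ts?: string;
}

type ToolCall =
  | { tool: "SLACK_EDIT_MESSAGE"; channel: string; ts: string; text: string }
  | { tool: "SLACK_REPLY_IN_THREAD"; channel: string; thread_ts: string; text: string };

// Edit the placeholder when one exists; otherwise post a new threaded reply.
function planReply(payload: TriggerPayload, answer: string): ToolCall {
  if (payload.thinking_message_ts) {
    return {
      tool: "SLACK_EDIT_MESSAGE",
      channel: payload.channel,
      ts: payload.thinking_message_ts,
      text: answer,
    };
  }
  return {
    tool: "SLACK_REPLY_IN_THREAD",
    channel: payload.channel,
    thread_ts: payload.thread_ts,
    text: answer,
  };
}
```

With this rule the thread ends up with a single bot message whenever the placeholder was sent, and the triggerOnly case falls through to a normal threaded reply.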

Summary by cubic

Make the Slack webhook fast: send the thread plus a “Pensando...” placeholder in ~1 Slack RTT, then publish the enriched trigger. Drop the broken direct LLM flow and add an in-process config cache to cut storage round-trips.

  • New Features

    • Trigger payload adds thinking_message_ts and a context-aware reply_instruction: edit the placeholder via SLACK_EDIT_MESSAGE when present; otherwise use SLACK_REPLY_IN_THREAD. Agents should follow this to avoid stacked replies.
    • thread_messages is fetched only for thread continuations to avoid extra Slack RTTs.
  • Refactors

    • Simplified handlers in eventHandler.ts: sendThinkingMessage → processAttachedFiles → publishMessageReceived/publishAppMention. Removed 👀 reactions and all LLM-related calls.
    • Added a 24h in-process mem cache in config-cache.ts on top of Supabase → Redis → KV (write-through on save; invalidated on delete).
    • File processing runs after sending the placeholder so feedback is instant; if audio needs Whisper and it’s disabled, a warning is sent and the placeholder is removed.
    • Updated trigger docs in trigger-store.ts to document the edit-vs-reply rule.
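
On the publisher side, the payload assembly could be sketched as follows. The field names `thinking_message_ts`, `reply_instruction`, and `thread_messages` come from this PR; the instruction wording, the `IncomingMessage` and `PublishedPayload` shapes, the injected `getThreadReplies` signature, and the "thread_ts differs from ts" test for a thread continuation are assumptions made for the example.

```typescript
// Hypothetical shapes and wording; only the field names thinking_message_ts,
// reply_instruction and thread_messages come from the PR description.
interface IncomingMessage {
  channel: string;
  ts: string;
  thread_ts?: string;
  text: string;
}

interface PublishedPayload {
  channel: string;
  ts: string;
  text: string;
  reply_instruction: string;
  thinking_message_ts?: string;
  thread_messages?: string[];
}

// Context-aware instruction: edit the placeholder if there is one, else reply.
function buildReplyInstruction(thinkingTs?: string): string {
  return thinkingTs
    ? `Use SLACK_EDIT_MESSAGE on ts=${thinkingTs} to replace the placeholder.`
    : "Use SLACK_REPLY_IN_THREAD to answer in the thread.";
}

async function buildPayload(
  msg: IncomingMessage,
  thinkingTs: string | undefined,
  getThreadReplies: (channel: string, threadTs: string) => Promise<string[]>,
): Promise<PublishedPayload> {
  const payload: PublishedPayload = {
    channel: msg.channel,
    ts: msg.ts,
    text: msg.text,
    reply_instruction: buildReplyInstruction(thinkingTs),
  };
  if (thinkingTs) payload.thinking_message_ts = thinkingTs;
  // Assumption: a reply inside a thread has a thread_ts different from its own ts.
  const threadTs = msg.thread_ts;
  if (threadTs !== undefined && threadTs !== msg.ts) {
    // Only thread continuations pay the extra Slack round-trip for history.
    payload.thread_messages = await getThreadReplies(msg.channel, threadTs);
  }
  return payload;
}
```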

Written for commit b689b47. Summary will update on new commits.

…trigger in ~1 RTT

Hot path used to do four sequential Slack RTTs and a Supabase round-trip
on every webhook before publishing the trigger:

  config lookup (Supabase/Redis/KV)
  -> addReaction
  -> sendThinkingMessage
  -> removeReaction
  -> buildLLMMessages (with its own getThreadReplies)
  -> isLLMAvailable
  -> handleLLMCall (broken in prod -- always failed)
  -> catch -> publishMessageReceived (with another getThreadReplies)

The LLM call has been failing in production across several incident PRs
(taskId, Thread not found, etc.), and the trigger-driven agent is the
actual response mechanism, so this commit cuts everything in the middle.

Changes:

- config-cache.ts: in-process memCache Map with 24h TTL on top of
  Supabase/Redis/KV. First webhook for a connection still hits storage;
  every subsequent webhook is sub-ms. Write-through on
  cacheConnectionConfig; invalidation on removeCachedConnectionConfig.

- eventHandler.ts (handleAppMention, handleDirectMessage,
  handleThreadReply): each handler is now just
    1. sendThinkingMessage (creates the thread and Pensando placeholder)
    2. processAttachedFiles (transcriptions + text-file inlining)
    3. publishMessageReceived/publishAppMention with the resolved
       fullText and thinking_message_ts in extras.
  Removed: addReaction/removeReaction (Pensando is the feedback),
  buildLLMMessages, isLLMAvailable, handleLLMCall, resolveUserName,
  per-handler user-info lookup. handleMessage no longer pre-processes
  files; each handler owns its own file pipeline so the thinking
  message fires before Whisper/file-download work.

- event-publisher.ts: publishMessageReceived and publishAppMention
  accept extras.thinking_message_ts. The payload now carries
  thinking_message_ts and a context-aware reply_instruction telling
  the agent to SLACK_EDIT_MESSAGE the placeholder (instead of stacking
  another reply via SLACK_REPLY_IN_THREAD). thread_messages is still
  only fetched when the event is a thread continuation.

- trigger-store.ts: trigger descriptions updated to document the
  edit-vs-reply rule and the new thinking_message_ts field.

The LLM-related modules (llm.ts, llm-handler.ts, context-builder.ts's
buildContextMessages / formatMessagesForLLM / buildCurrentContent) stay
in place as dead code, leaving the path easy to rewire if/when the LLM
endpoint situation is resolved. A follow-up commit can prune them.

Expected latency:
- top-level DM: ~200ms (single sendThinkingMessage RTT) -> thread and
  Pensando visible, trigger published right after.
- thread continuation: same + ~500ms getThreadReplies inside the
  publisher.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@decobot merged commit 6ea0cb4 into main on May 14, 2026
2 checks passed