perf(slack): skip LLM + cache config + drop reaction — thread+Pensando+trigger in ~1 RTT#442
Merged
Hot path used to do four sequential Slack RTTs and a Supabase round-trip
on every webhook before publishing the trigger:
  config lookup (Supabase/Redis/KV)
  -> addReaction
  -> sendThinkingMessage
  -> removeReaction
  -> buildLLMMessages (with its own getThreadReplies)
  -> isLLMAvailable
  -> handleLLMCall (broken in prod -- always failed)
  -> catch -> publishMessageReceived (with another getThreadReplies)
The direct LLM path has been broken in production across several
incident PRs (taskId, Thread not found, etc.), and the trigger-driven
agent is the actual response mechanism. Cut everything in the middle.
Changes:
- config-cache.ts: in-process memCache Map with 24h TTL on top of
  Supabase/Redis/KV. First webhook for a connection still hits storage;
  every subsequent webhook is sub-ms. Write-through on
  cacheConnectionConfig; invalidation on removeCachedConnectionConfig.
- eventHandler.ts (handleAppMention, handleDirectMessage,
  handleThreadReply): each handler is now just
    1. sendThinkingMessage (creates the thread and Pensando placeholder)
    2. processAttachedFiles (transcriptions + text-file inlining)
    3. publishMessageReceived/publishAppMention with the resolved
       fullText and thinking_message_ts in extras.
  Removed: addReaction/removeReaction (Pensando is the feedback),
  buildLLMMessages, isLLMAvailable, handleLLMCall, resolveUserName,
  per-handler user-info lookup. handleMessage no longer pre-processes
  files; each handler owns its own file pipeline so the thinking
  message fires before Whisper/file-download work.
- event-publisher.ts: publishMessageReceived and publishAppMention
  accept extras.thinking_message_ts. The payload now carries
  thinking_message_ts and a context-aware reply_instruction telling
  the agent to SLACK_EDIT_MESSAGE the placeholder (instead of stacking
  another reply via SLACK_REPLY_IN_THREAD). thread_messages is still
  only fetched when the event is a thread continuation.
- trigger-store.ts: trigger descriptions updated to document the
  edit-vs-reply rule and the new thinking_message_ts field.
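
The edit-vs-reply rule from the event-publisher.ts item can be sketched roughly as below. This is a minimal illustration, not the code in this PR: the TriggerExtras and TriggerPayload shapes, the helper names buildReplyInstruction and buildTriggerPayload, and the instruction wording are all assumptions. Only thinking_message_ts, reply_instruction, SLACK_EDIT_MESSAGE, and SLACK_REPLY_IN_THREAD come from the description above.

```typescript
// Hypothetical shapes: only thinking_message_ts and reply_instruction are
// field names taken from the PR description; everything else is illustrative.
interface TriggerExtras {
  thinking_message_ts?: string;
}

interface TriggerPayload {
  channel: string;
  text: string;
  thread_ts?: string;
  thinking_message_ts?: string;
  reply_instruction: string;
}

// Pick the instruction based on whether a placeholder message exists.
// The exact wording is an assumption; the rule (edit when a placeholder
// exists, otherwise reply in thread) is the one described in the PR.
function buildReplyInstruction(extras: TriggerExtras): string {
  if (extras.thinking_message_ts) {
    return (
      `Call SLACK_EDIT_MESSAGE with ts=${extras.thinking_message_ts} ` +
      "to replace the placeholder; do not post another reply."
    );
  }
  return "Call SLACK_REPLY_IN_THREAD to answer in the thread.";
}

// Assemble the payload; optional fields are only included when present.
function buildTriggerPayload(
  event: { channel: string; text: string; thread_ts?: string },
  extras: TriggerExtras = {},
): TriggerPayload {
  return {
    channel: event.channel,
    text: event.text,
    ...(event.thread_ts ? { thread_ts: event.thread_ts } : {}),
    ...(extras.thinking_message_ts
      ? { thinking_message_ts: extras.thinking_message_ts }
      : {}),
    reply_instruction: buildReplyInstruction(extras),
  };
}
```

Keeping the instruction in the payload means the agent does not need to know whether the placeholder was shown; it only follows reply_instruction.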
The LLM-related modules (llm.ts, llm-handler.ts, context-builder.ts's
buildContextMessages / formatMessagesForLLM / buildCurrentContent) stay
in place as dead code, leaving the path easy to rewire if/when the LLM
endpoint situation is resolved. A follow-up commit can prune them.
Expected latency:
- top-level DM: ~200ms (single sendThinkingMessage RTT) -> thread and
Pensando visible, trigger published right after.
- thread continuation: same + ~500ms getThreadReplies inside the
publisher.
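
A rough sketch of why the top-level case costs about one Slack RTT before anything is visible: the placeholder is sent first, and the slower file work happens afterwards. The SlackEvent and HandlerDeps types and every signature below are assumptions made for illustration; only the function names and the showThinking check are taken from this PR.

```typescript
// Hypothetical event and dependency shapes; the real signatures in
// eventHandler.ts are not shown in this PR description.
interface SlackEvent {
  channel: string;
  ts: string;
  text: string;
  thread_ts?: string;
}

interface HandlerDeps {
  showThinking: boolean;
  // Assumed to return the ts of the placeholder message.
  sendThinkingMessage(channel: string, threadTs: string): Promise<string>;
  // Assumed to return the message text with transcriptions/files inlined.
  processAttachedFiles(event: SlackEvent): Promise<string>;
  publishMessageReceived(
    event: SlackEvent,
    fullText: string,
    extras: { thinking_message_ts?: string },
  ): Promise<void>;
}

async function handleDirectMessage(
  event: SlackEvent,
  deps: HandlerDeps,
): Promise<void> {
  // A top-level message starts its own thread, so fall back to its own ts.
  const threadTs = event.thread_ts ?? event.ts;

  // 1. Single Slack RTT: the thread and placeholder become visible here.
  const thinkingTs = deps.showThinking
    ? await deps.sendThinkingMessage(event.channel, threadTs)
    : undefined;

  // 2. Slower work (downloads, transcription) runs after the user has feedback.
  const fullText = await deps.processAttachedFiles(event);

  // 3. Publish the trigger, passing the placeholder ts so the agent can edit it.
  await deps.publishMessageReceived(
    event,
    fullText,
    thinkingTs ? { thinking_message_ts: thinkingTs } : {},
  );
}
```

Injecting the three steps as dependencies is only a convenience for the sketch; it makes the call order easy to check in isolation.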
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Fast-path the slack-mcp webhook: send thread + "Pensando..." in ~1 Slack RTT, publish the enriched trigger right after, and stop calling the broken LLM path entirely.
Hot path before
Hot path after
Changes
- config-cache.ts: in-process Map<connectionId, { config, expiresAt }> with 24h TTL on top of the existing Supabase → Redis → KV chain. First webhook for a connection still hits storage; subsequent webhooks are sub-ms. Write-through on cacheConnectionConfig, invalidated on removeCachedConnectionConfig.
- eventHandler.ts: handleAppMention / handleDirectMessage / handleThreadReply are now just: sendThinkingMessage → processAttachedFiles → publishMessageReceived / publishAppMention(..., { thinking_message_ts }). Removed: 👀 reactions, buildLLMMessages, isLLMAvailable, handleLLMCall, per-handler user info lookup. handleMessage no longer pre-processes files, so the thinking message fires before Whisper/file-download work.
- event-publisher.ts: publishMessageReceived / publishAppMention accept extras.thinking_message_ts. Payload carries thinking_message_ts and a context-aware reply_instruction telling the agent to SLACK_EDIT_MESSAGE the placeholder (no more stacking another reply via SLACK_REPLY_IN_THREAD). thread_messages is still only fetched for thread continuations.
- trigger-store.ts: trigger descriptions document the edit-vs-reply rule + the new thinking_message_ts field.
Latency
- Top-level DM: ~200ms (single sendThinkingMessage RTT); thread + "Pensando..." appear, trigger published right after.
- Thread continuation: same + ~500ms getThreadReplies inside the publisher.
What's NOT in this PR
llm.ts, llm-handler.ts, and the unused parts of context-builder.ts (buildContextMessages, formatMessagesForLLM, buildCurrentContent) stay as dead code. Leaving them in keeps the path easy to rewire if/when the decopilot endpoint is fixed. A follow-up commit can prune them.
Test plan
- … [Triggers] Notified slack.message.received: channel=<id> shortly after; no [LLM] streamAgent / decopilot stream failed in logs.
- … thinking_message_ts from the trigger payload and calls SLACK_EDIT_MESSAGE to replace "Pensando..." with the final answer; only one bot message in the thread.
- … handleThreadReply; trigger payload includes thread_messages with full history.
- triggerOnly mode: same flow, no "Pensando..." (preserved by showThinking check).
- … connection_configs query rate drops to ~1 per connection per 24h.
Summary by cubic
Make the Slack webhook fast: send the thread plus a “Pensando...” placeholder in ~1 Slack RTT, then publish the enriched trigger. Drop the broken direct LLM flow and add an in-process config cache to cut storage round-trips.
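
For illustration, the in-process config cache could look something like the sketch below: a Map keyed by connection ID with a 24h expiry, filled on a miss and cleared on delete, as the change list describes. ConfigMemCache, ConnectionConfig, getConnectionConfig, and the loadFromStorage callback are hypothetical names, and the injectable clock exists only to make expiry testable; the actual config-cache.ts may be structured differently.

```typescript
// Hypothetical config shape; the real connection config type is not shown.
type ConnectionConfig = Record<string, unknown>;

interface CacheEntry {
  config: ConnectionConfig;
  expiresAt: number; // epoch milliseconds
}

const TTL_MS = 24 * 60 * 60 * 1000; // 24h, as stated in the PR

class ConfigMemCache {
  private readonly entries = new Map<string, CacheEntry>();
  private readonly now: () => number;

  // The clock is injectable so expiry can be exercised without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  get(connectionId: string): ConnectionConfig | undefined {
    const entry = this.entries.get(connectionId);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      // Expired: drop it so the next lookup falls through to storage.
      this.entries.delete(connectionId);
      return undefined;
    }
    return entry.config;
  }

  // Called on save (write-through) and after a storage hit.
  set(connectionId: string, config: ConnectionConfig): void {
    this.entries.set(connectionId, {
      config,
      expiresAt: this.now() + TTL_MS,
    });
  }

  // Called when a connection config is removed.
  delete(connectionId: string): void {
    this.entries.delete(connectionId);
  }
}

// Read path: memory first, then the slower storage chain
// (represented here by the hypothetical loadFromStorage callback).
async function getConnectionConfig(
  cache: ConfigMemCache,
  connectionId: string,
  loadFromStorage: (id: string) => Promise<ConnectionConfig | null>,
): Promise<ConnectionConfig | null> {
  const hit = cache.get(connectionId);
  if (hit) return hit;
  const loaded = await loadFromStorage(connectionId);
  if (loaded) cache.set(connectionId, loaded);
  return loaded;
}
```

One trade-off of a per-process Map is that each server instance keeps its own copy, so a config changed through another instance can stay stale here for up to the TTL unless it is invalidated locally.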
New Features
- … thinking_message_ts and a context-aware reply_instruction: edit the placeholder via SLACK_EDIT_MESSAGE when present; otherwise use SLACK_REPLY_IN_THREAD. Agents should follow this to avoid stacked replies.
- thread_messages is fetched only for thread continuations to avoid extra Slack RTTs.
Refactors
- … eventHandler.ts: sendThinkingMessage → processAttachedFiles → publishMessageReceived / publishAppMention. Removed 👀 reactions and all LLM-related calls.
- … config-cache.ts on top of Supabase → Redis → KV (write-through on save; invalidated on delete).
- … trigger-store.ts to document the edit-vs-reply rule.
Written for commit b689b47. Summary will update on new commits.