fix(telegram): MarkdownV2 rendering + telegram-chat reference example #407

Merged
dancer merged 7 commits into vercel:main from serejke:fix/telegram-markdownv2
Apr 21, 2026


@serejke serejke commented Apr 20, 2026

Summary

Two things bundled in one PR:

  1. Fix: the Telegram adapter rendered messages in one dialect and declared another. It emitted standard markdown (**bold**) but sent it with parse_mode: "Markdown" (legacy, single-asterisk), so every LLM-generated message containing ., !, (, ), or - was rejected with can't parse entities. Fixes #226 (Telegram adapter: parse_mode not set for markdown messages).
  2. Example: a new examples/telegram-chat/ reference bot that exercises the adapter end-to-end: MarkdownV2 rendering, interactive cards, reactions, file uploads, streaming edits. Doubles as a manual regression harness for adapter changes.

The fix (efae68f)

Root cause

The adapter's fromAst() delegated to the SDK's generic stringifyMarkdown(), which emits standard markdown (**bold**, no escaping). That output shipped with parse_mode: "Markdown" — Telegram's legacy parser that uses *bold* (single asterisk) and has no escape rules. Two incompatible dialects glued together.

Changes to packages/adapter-telegram

  • Switch TELEGRAM_MARKDOWN_PARSE_MODE to "MarkdownV2".
  • Replace fromAst() with a dedicated AST → MarkdownV2 renderer (markdown.ts):
    • Single *bold* / _italic_ / ~strike~ markers.
    • Context-aware escaping: 20-char matrix for normal text; only ` and \ inside code blocks; only ) and \ inside link URLs.
    • Headings rendered as bold (MarkdownV2 has no heading syntax).
    • Ordered/unordered lists with escaped dashes/periods.
    • Blockquotes with per-line > prefix.
    • Tables pre-empted and rendered as ASCII code blocks (existing behavior preserved).
    • Explicit handlers for linkReference, imageReference, definition, html so nothing is silently dropped.
  • Route card fallback text through fromMarkdown (not raw escape), with boldFormat: "**" passed to @chat-adapter/shared's cardToFallbackText. Default boldFormat is "*" (Slack mrkdwn) which, fed back through a markdown parser, becomes italic — not bold — on Telegram.
  • Fix resolveParseMode so every message routed through the format converter ({markdown}, {ast}, cards, JSX) gets parse_mode: "MarkdownV2". Previously only {markdown} and cards were covered, so {ast} messages shipped without parse_mode and rendered asterisks literally.
  • Clarify inbound-vs-outbound: docstrings on applyTelegramEntities / escapeMarkdownInEntity note they're the inbound path (Telegram entities → standard markdown for parseMarkdown) and are distinct from the new outbound MarkdownV2 renderer.
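The context-aware escaping described above can be sketched roughly as follows. This is an illustration, not the code in markdown.ts: the function names are hypothetical, and the normal-text set here is the 18 characters listed in Telegram's Bot API formatting docs plus the backslash itself. The PR counts a 20-char matrix without enumerating it here, so the exact set is an assumption.

```typescript
// Hypothetical helpers, one per escaping context in MarkdownV2.

// Normal text: escape every reserved character (assumed set, see lead-in).
const NORMAL_TEXT_SPECIALS = /[_*\[\]()~`>#+\-=|{}.!\\]/g;

function escapeText(text: string): string {
  return text.replace(NORMAL_TEXT_SPECIALS, "\\$&");
}

// Inside inline code and code blocks: only ` and \ need escaping.
function escapeCode(code: string): string {
  return code.replace(/[`\\]/g, "\\$&");
}

// Inside the (...) part of an inline link: only ) and \ need escaping.
function escapeLinkUrl(url: string): string {
  return url.replace(/[)\\]/g, "\\$&");
}
```

Keeping the three contexts as separate functions makes it harder to over-escape a URL or a code span by accident, which would show stray backslashes to the user.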

Tests (74 → 148)

  • Full 20-char MarkdownV2 escape matrix.
  • Context-escape tests: code block (only `, \), link URL (only ), \).
  • Nested formatting: bold-in-italic, code-in-link, list-with-bold.
  • Edge cases: empty input, whitespace-only, raw HTML.
  • End-to-end LLM-output corpus test with a MarkdownV2 validity invariant — strips entities and code blocks, asserts every remaining special char is escaped.
  • Regression guards in index.test.ts for the AST / plain-string / raw parse_mode paths and for card-title MarkdownV2 bold rendering.
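A simplified version of the validity invariant might look like the sketch below. It is not the test from the PR: `findUnescaped` is a hypothetical name, and to stay short it only checks characters that never act as entity delimiters (# + - = { } . !), so delimiters such as * or _ are out of scope here.

```typescript
// Characters that are never entity delimiters, so any unescaped occurrence
// outside code is invalid MarkdownV2 (assumed subset for this sketch).
const NEVER_DELIMITERS = new Set(["#", "+", "-", "=", "{", "}", ".", "!"]);

// Returns offsets (in the code-stripped string) of unescaped reserved chars.
function findUnescaped(rendered: string): number[] {
  const withoutCode = rendered
    .replace(/```[\s\S]*?```/g, "") // drop fenced code blocks
    .replace(/`(?:\\.|[^`\\])*`/g, ""); // drop inline code spans
  const offenders: number[] = [];
  for (let i = 0; i < withoutCode.length; i++) {
    const ch = withoutCode.charAt(i);
    if (ch === "\\") {
      i++; // the next character is escaped, skip it
      continue;
    }
    if (NEVER_DELIMITERS.has(ch)) offenders.push(i);
  }
  return offenders;
}
```

A corpus test could then assert that `findUnescaped(render(sample))` is empty for every sample.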

Changeset

patch bump on @chat-adapter/telegram.

The example (12d5435)

A polling-mode Telegram bot at examples/telegram-chat/ that exercises the adapter end-to-end. One command to run (pnpm --filter example-telegram-chat start), no webhook, no public URL, no external API keys.

Menu structure — three categorized sub-menus, inline-keyboard navigation:

  • Text & Markdown — 6 curated markdown demos + realistic LLM corpus + streaming edit loop.
  • Cards & Actions — interactive approval card (edits in-place on press), callback_data size probe (see limitation note below), LinkButton.
  • Media & Reactions — on-demand reaction one-shot (briefly subscribes), generated 1×1 PNG upload, generated minimal PDF upload.

Zero new runtime deps. PNG/PDF are hand-rolled in memory (lib/png.ts / lib/pdf.ts) rather than pulled from a binary-processing library.

Excluded from npm release via .changeset/config.json.

Known limitation (not fixed in this PR)

The size-probe demo surfaces a DX gap worth flagging for a future PR:

  • @chat-adapter/shared's button encoder wraps every Button.id in a chat:{"a":"<id>","v":"<value>"} JSON envelope before writing callback_data.
  • That envelope eats ~13 bytes of Telegram's 64-byte limit, leaving ~51 bytes for id+value combined.
  • This is documented nowhere; developers discover it by hitting ValidationError: Callback payload too large for Telegram (max 64 bytes) — which is exactly what the size-probe demo teaches.
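The byte budget can be checked with a small sketch. `encodeCallback` below is a stand-in for the shared encoder, not its real implementation: it assumes the envelope is `chat:` followed by `JSON.stringify` output, and that an undefined value is simply omitted. Under that assumption the envelope costs 13 bytes without a value and 20 bytes once a value is present.

```typescript
const TELEGRAM_CALLBACK_LIMIT = 64; // bytes, per the error message quoted above

// Assumed envelope shape; JSON.stringify drops `v` when value is undefined.
function encodeCallback(id: string, value?: string): string {
  return "chat:" + JSON.stringify({ a: id, v: value });
}

// callback_data is limited in bytes, not characters, so measure UTF-8 length.
function byteLength(text: string): number {
  return new TextEncoder().encode(text).length;
}

function fitsCallbackLimit(id: string, value?: string): boolean {
  return byteLength(encodeCallback(id, value)) <= TELEGRAM_CALLBACK_LIMIT;
}
```

With these helpers, an empty id measures the envelope overhead, and a 51-character ASCII id without a value is the largest payload that still fits.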

Possible follow-ups in a separate PR: document the effective budget in the adapter README; consider a pluggable CallbackEncoder hook so apps with short action ids can opt into a leaner encoding (drop chat: prefix, single-string instead of JSON) when they don't need round-trip value. Happy to propose this separately if maintainers agree.

Test plan

  • pnpm --filter @chat-adapter/telegram test — 148/148 pass
  • pnpm --filter @chat-adapter/telegram typecheck clean
  • pnpm --filter example-telegram-chat typecheck clean
  • pnpm dlx ultracite check on both packages clean
  • pnpm knip clean
  • Manual end-to-end against a real bot token: main menu renders, every category opens, every demo posts without 400 can't parse entities, approval card edits in-place on button press, size-probe shows safe card + teaching error for oversize, streaming demo visibly edits a single message through 8 frames

Fixes #226

serejke added 2 commits April 20, 2026 15:28
The Telegram adapter hardcoded `parse_mode: "Markdown"` (legacy) but
rendered messages via the SDK's generic `stringifyMarkdown()`, which
emits standard markdown. Two incompatible dialects glued together:

- Standard markdown uses `**bold**`, Telegram legacy uses `*bold*`
- Legacy Markdown has no escape rules — any message with `.`, `!`,
  `(`, `)`, `-`, `_` in unexpected positions was rejected with
  `can't parse entities`, which is virtually every LLM-generated
  response
- Legacy Markdown is deprecated by Telegram and lacks support for
  underline, strikethrough, spoiler, and blockquote

This commit:

- Switches TELEGRAM_MARKDOWN_PARSE_MODE to "MarkdownV2"
- Replaces fromAst() with a proper AST → MarkdownV2 renderer:
  - Single `*bold*`, `_italic_`, `~strike~` markers
  - Context-aware escaping: 20-char matrix for normal text, only
    `` ` `` and `\` inside code blocks, only `)` and `\` inside link
    URLs
  - Headings rendered as bold (MarkdownV2 has no heading syntax)
  - Ordered/unordered lists with escaped dashes and periods
  - Blockquotes with per-line `>` prefix
  - Tables pre-empted and rendered as ASCII code blocks
  - Explicit handlers for reference-style links, images, HTML, and
    definitions so nothing is silently dropped
- Routes card fallback text through `fromMarkdown` (not raw escape)
  with `boldFormat: "**"` — @chat-adapter/shared's cardToFallbackText
  defaults `boldFormat` to "*" (Slack mrkdwn), which would render as
  italic on Telegram. Explicit "**" keeps the card title rendered as
  real MarkdownV2 bold.
- Fixes resolveParseMode so every message routed through the format
  converter (`{markdown}`, `{ast}`, cards, JSX) gets
  `parse_mode: "MarkdownV2"`. Previously only `{markdown}` and cards
  were covered, so `{ast}` messages shipped without parse_mode and
  rendered asterisks literally.
- Documents inbound vs outbound dialects on applyTelegramEntities /
  escapeMarkdownInEntity (inbound entities → standard markdown)
  versus the new outbound MarkdownV2 renderer, so future
  contributors don't confuse the two.
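To make the renderer changes concrete, here is a minimal sketch of an AST → MarkdownV2 walk. `MiniNode` is a hypothetical stand-in for the mdast node types the real renderer handles, and the escape set for normal text is assumed from Telegram's Bot API formatting docs.

```typescript
// Hypothetical, reduced node union; the real renderer covers mdast's node types.
type MiniNode =
  | { type: "text"; value: string }
  | { type: "strong"; children: MiniNode[] }
  | { type: "emphasis"; children: MiniNode[] }
  | { type: "inlineCode"; value: string }
  | { type: "link"; url: string; children: MiniNode[] }
  | { type: "heading"; children: MiniNode[] };

// Assumed normal-text escape set (18 documented characters plus backslash).
function escapeNormalText(text: string): string {
  return text.replace(/[_*\[\]()~`>#+\-=|{}.!\\]/g, "\\$&");
}

function renderMini(node: MiniNode): string {
  const renderChildren = (children: MiniNode[]): string =>
    children.map(renderMini).join("");
  switch (node.type) {
    case "text":
      return escapeNormalText(node.value);
    case "strong":
      return `*${renderChildren(node.children)}*`; // single asterisk, not **
    case "emphasis":
      return `_${renderChildren(node.children)}_`;
    case "inlineCode":
      return "`" + node.value.replace(/[`\\]/g, "\\$&") + "`";
    case "link":
      return `[${renderChildren(node.children)}](${node.url.replace(/[)\\]/g, "\\$&")})`;
    case "heading":
      return `*${renderChildren(node.children)}*`; // no heading syntax, use bold
  }
}
```

Escaping happens only at the leaves, so the markers the renderer emits itself are never escaped. The sketch does not handle nesting conflicts, such as a strong node inside a heading.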

Tests: full 20-char MarkdownV2 escape matrix, context-escape tests
for code blocks and link URLs, nested-formatting tests, edge cases
(empty, whitespace-only, raw HTML), and an end-to-end LLM-output
corpus test that asserts MarkdownV2 validity (no unescaped special
chars outside entities or code blocks). Regression guards added in
index.test.ts for the AST / plain-string / raw parse_mode paths and
for card-title MarkdownV2 bold rendering.

Fixes vercel#226

Polling-mode Telegram bot that exercises the adapter end-to-end:
MarkdownV2 rendering, interactive cards with inline keyboards,
reactions, file uploads, and streaming edits. Runs with a single
`pnpm --filter example-telegram-chat start`; no webhook, no public
URL, no external API keys.

Menu structure — three categorized sub-menus reached from any DM text:

- Text & Markdown: plain, inline emphasis, code block, links, list+table,
  20-char torture string, LLM-style corpus, streaming editMessage loop
- Cards & Actions: interactive approval card (edits in-place on press),
  callback_data size probe demonstrating the 64-byte limit, LinkButton
- Media & Reactions: on-demand reaction one-shot (briefly subscribes),
  generated 1×1 PNG upload, generated minimal PDF upload

Zero new runtime deps. PNG/PDF are hand-rolled in memory
(lib/png.ts, lib/pdf.ts) rather than pulled from a binary-processing
library. Failure handling is consistent: every demo runner is
try/catch-wrapped and posts an inline ❌ line with the error message.

Excluded from npm release via .changeset/config.json.

vercel Bot commented Apr 20, 2026

@serejke is attempting to deploy a commit to the Vercel Team on Vercel.

A member of the Team first needs to authorize it.


serejke commented Apr 20, 2026

image


socket-security Bot commented Apr 20, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff  | Package                  | Supply Chain Security | Vulnerability | Quality | Maintenance | License
Added | npm/@types/node@22.19.17 | 100                   | 100           | 81      | 95          | 100

View full report

serejke added 2 commits April 21, 2026 11:20
The MarkdownV2 migration widened a latent truncation bug into a reliable
400. The previous truncator sliced at 4096/1024 chars and appended
literal "..." — but in MarkdownV2 `.` is a reserved character, the slice
can leave an orphan trailing `\`, and it can cut through a paired
entity (`*bold*`, `` `code` ``) leaving it unclosed.

Unify the two truncate methods into one `truncateForTelegram(text,
limit, parseMode)` that appends `\.\.\.` for MarkdownV2 and walks back
past unbalanced entity delimiters or orphan backslashes. Plain text
keeps literal `...`. Adds 8 length-limit tests.
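The approach can be sketched as below. Only the function name and parameter order come from the commit message; the body is an assumption and is simpler than a real implementation would need to be. In particular it treats `*`, `_`, `~` and `` ` `` as delimiters everywhere, even inside code spans, and it assumes the limit is larger than the ellipsis.

```typescript
// Positions of every unescaped occurrence of `delimiter` in `text`.
function unescapedPositions(text: string, delimiter: string): number[] {
  const positions: number[] = [];
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    if (ch === "\\") {
      i++; // skip the escaped character
      continue;
    }
    if (ch === delimiter) positions.push(i);
  }
  return positions;
}

function truncateForTelegram(
  text: string,
  limit: number,
  parseMode: "MarkdownV2" | "plain",
): string {
  if (text.length <= limit) return text;
  if (parseMode === "plain") return text.slice(0, limit - 3) + "...";

  const ellipsis = "\\.\\.\\."; // escaped dots, 6 characters on the wire
  let cut = text.slice(0, limit - ellipsis.length);

  // An odd run of trailing backslashes means the last one escapes nothing.
  let run = 0;
  while (run < cut.length && cut.charAt(cut.length - 1 - run) === "\\") run++;
  if (run % 2 === 1) cut = cut.slice(0, -1);

  // Walk back past any delimiter left unclosed by the slice.
  let changed = true;
  while (changed) {
    changed = false;
    for (const delimiter of ["`", "*", "_", "~"]) {
      const positions = unescapedPositions(cut, delimiter);
      const last = positions[positions.length - 1];
      if (positions.length % 2 === 1 && last !== undefined) {
        cut = cut.slice(0, last);
        changed = true;
      }
    }
  }
  return cut + ellipsis;
}
```

Cutting at the last unclosed delimiter drops the partial entity entirely, which loses a little text but keeps the message parseable.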

Related cleanup:
- Move MarkdownV2 string utilities and Bot API limits to markdown.ts.
- Type renderMarkdownV2 exhaustively on mdast's `Nodes` union with a
  `never` assertion so new node kinds fail the build. Replaces the
  hand-rolled `AstNode` interface. Adds explicit cases for table /
  tableRow / tableCell (throw — preprocessed by fromAst),
  footnoteDefinition, footnoteReference, yaml.
- Introduce `TelegramParseMode = "MarkdownV2" | "plain"` replacing
  `string | undefined`. `toBotApiParseMode` handles the wire mapping.
- Re-export `Nodes` from the chat package; re-export
  `TelegramReactionType` from the adapter entry.
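The parse-mode typing and the `never` assertion can be illustrated together. The type and the name `toBotApiParseMode` come from the commit message; the body, the return type, and `buildSendMessagePayload` are assumptions made for this sketch, including the choice to omit `parse_mode` for plain text.

```typescript
type TelegramParseMode = "MarkdownV2" | "plain";

// Maps the internal mode to the Bot API `parse_mode` field.
function toBotApiParseMode(mode: TelegramParseMode): "MarkdownV2" | undefined {
  switch (mode) {
    case "MarkdownV2":
      return "MarkdownV2";
    case "plain":
      return undefined; // assumption: plain text sends no parse_mode at all
    default: {
      // Fails to compile if a new mode is added without a case above.
      const unreachable: never = mode;
      throw new Error(`Unhandled parse mode: ${String(unreachable)}`);
    }
  }
}

// Hypothetical helper showing where the mapping would be applied.
function buildSendMessagePayload(
  chatId: number,
  text: string,
  mode: TelegramParseMode,
): Record<string, unknown> {
  const parseMode = toBotApiParseMode(mode);
  return parseMode === undefined
    ? { chat_id: chatId, text }
    : { chat_id: chatId, text, parse_mode: parseMode };
}
```

The same `never` pattern is what lets an exhaustive switch over a node union fail the build when a new node kind appears.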
Three new menu entries exercise the MarkdownV2 truncation path that the
prior commit fixed:

- Long (5000 plain) — basic truncation, verifies escaped `\.\.\.` ellipsis
- Long (bold crosses 4096) — entity-balancing heuristic for unclosed `*`
- Long (code crosses 4096) — entity-balancing heuristic for unclosed `` ` ``

Each button posts a message whose rendered length exceeds Telegram's
4096-char limit and would have produced `can't parse entities` 400s
against the previous truncator. Serves as an interactive smoke test
alongside the unit tests in packages/adapter-telegram.

serejke commented Apr 21, 2026

While working on this I noticed the truncator was producing bad MarkdownV2 — unescaped dots, orphan \, unclosed *bold* or `code` when the slice landed mid-entity. That's what the new truncateForTelegram handles now.

Also had a look at the other adapters while I was here — Discord has the same truncation bug, WhatsApp silently splits into multiple messages, Slack/GChat/Teams don't check at all. Filed #408 to clean that up separately.


vercel Bot commented Apr 21, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project              | Deployment | Actions                      | Updated (UTC)
chat                 | Ready      | Preview, Comment, Open in v0 | Apr 21, 2026 0:27am
chat-sdk-nextjs-chat | Ready      | Preview, Comment, Open in v0 | Apr 21, 2026 0:27am


dancer commented Apr 21, 2026

ty @serejke amazing work, will be in next release cc: @bensabic

@dancer dancer merged commit b9a1961 into vercel:main Apr 21, 2026
9 checks passed
@serejke serejke deleted the fix/telegram-markdownv2 branch April 21, 2026 12:35