feat(renderer): font ligature support via GSUB shaping + segmentation cache by patrick-andrew-anchor · Pull Request #129 · junkdog/beamterm

patrick-andrew-anchor · 2026-06-15T17:52:56Z

Programming ligatures (=>, ->, !=, ===, <==>, ...) now render in the dynamic atlas for fonts that ship ligature tables (Fira Code, JetBrains Mono, Cascadia Code, Monaspace Neon).

The dynamic atlas rasterized one grapheme per cell, so the font shaper never saw adjacent codepoints and ligatures never formed. This adds a ligature shaper plus N-cell glyph support end to end:

beamterm-core (new ligatures feature):
- shaper.rs: rustybuzz-based detection. Compares each shaped glyph to the nominal cmap glyph, so it detects the calt "spacer" approach (Fira Code et al. keep glyph count == char count) as well as classic GSUB ligature merges.
- GlyphSlot::Ligature(id, cells) + cell_span(); size-classed ligature pools (widths 3..=8) in the glyph cache with O(1) alloc + LRU eviction. Two-cell ligatures continue to use the existing wide path.
- dynamic atlas: generic split_glyph_n + N consecutive slot uploads; texture layers derived from the region layout so they can't drift.
- terminal grid: 2-cell placement generalized to N cells across all update paths; segment_run + place_ligature helpers.

beamterm-renderer:
- canvas rasterizer sizes each glyph to cell_w * unicode-width, so ligature substrings render at their full width.
- BeamtermRenderer.setFontBytes(Uint8Array) builds the shaper from raw sfnt bytes; Batch.text segments runs into ligature glyphs. Ligatures activate automatically when the supplied font advertises them.
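
The N-cell pieces listed above could look roughly like the sketch below. It is not the PR's code: the variant payload types, the one-byte-per-pixel bitmap layout, and the `split_glyph_n` signature are all assumptions made for illustration. It only shows how a slot can report its cell span and how a bitmap `n` cells wide can be cut into `n` cell-sized slices for consecutive slot uploads.

```rust
/// Hypothetical, simplified stand-in for the glyph slot type named in the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GlyphSlot {
    /// Ordinary single-cell glyph.
    Normal(u16),
    /// Existing two-cell path (wide glyphs and two-cell ligatures).
    Wide(u16),
    /// Ligature occupying `cells` consecutive cells (3..=8 per the PR text).
    Ligature(u16, u8),
}

impl GlyphSlot {
    /// Number of terminal cells this slot covers.
    pub fn cell_span(self) -> usize {
        match self {
            GlyphSlot::Normal(_) => 1,
            GlyphSlot::Wide(_) => 2,
            GlyphSlot::Ligature(_, cells) => cells as usize,
        }
    }
}

/// Splits a row-major bitmap that is `n * cell_w` pixels wide and `cell_h`
/// pixels tall into `n` bitmaps of `cell_w * cell_h` pixels each.
/// Assumes one byte per pixel (an assumption of this sketch).
pub fn split_glyph_n(pixels: &[u8], cell_w: usize, cell_h: usize, n: usize) -> Vec<Vec<u8>> {
    assert_eq!(pixels.len(), cell_w * cell_h * n, "bitmap size mismatch");
    let stride = cell_w * n; // pixels per source row
    (0..n)
        .map(|i| {
            let mut cell = Vec::with_capacity(cell_w * cell_h);
            for row in 0..cell_h {
                // Copy the i-th cell-wide slice of this row.
                let start = row * stride + i * cell_w;
                cell.extend_from_slice(&pixels[start..start + cell_w]);
            }
            cell
        })
        .collect()
}
```

With a slot that knows its span, grid placement can advance by `cell_span()` instead of hard-coding one or two cells.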

The shaper only detects/segments; the browser canvas still rasterizes, preserving color emoji and font fallback. WOFF/WOFF2 must be decompressed to sfnt before setFontBytes (documented in js/README.md).
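
As a rough illustration of the detection idea (compare each shaped glyph to the nominal cmap glyph), the sketch below works on plain glyph-ID slices rather than on rustybuzz output. The function name `ligature_spans` is hypothetical, and it assumes the `calt` "spacer" case where glyph count equals char count; classic GSUB merges, where the shaper returns fewer glyphs, would need cluster information that this sketch does not model.

```rust
use std::ops::Range;

/// Returns the index ranges where the shaped glyph differs from the nominal
/// (cmap) glyph. Each range is a candidate ligature substring.
///
/// Assumes `nominal[i]` and `shaped[i]` describe the same character, i.e.
/// the font keeps glyph count == char count (the "spacer" approach).
pub fn ligature_spans(nominal: &[u16], shaped: &[u16]) -> Vec<Range<usize>> {
    assert_eq!(nominal.len(), shaped.len(), "sketch only covers 1:1 shaping");
    let mut spans = Vec::new();
    let mut start: Option<usize> = None;
    for (i, (n, s)) in nominal.iter().zip(shaped).enumerate() {
        match (n != s, start) {
            // A substituted glyph opens a new span.
            (true, None) => start = Some(i),
            // An unchanged glyph closes the open span.
            (false, Some(begin)) => {
                spans.push(begin..i);
                start = None;
            }
            _ => {}
        }
    }
    // Close a span that runs to the end of the text.
    if let Some(begin) = start {
        spans.push(begin..nominal.len());
    }
    spans
}
```

One known limitation of this simplification: two ligatures that sit directly next to each other would be reported as a single span.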

Testing:
With this change and my client support, I can render ligatures:

user@pandrew-dev:~$ printf '%s\n' '-> => != == === !== <==> --> |> :: // /* */ >>= <=>'
-> => != == === !== <==> --> |> :: // /* */ >>= <=>

shows as (screenshot in the original PR).

Follow-up: memoize ligature run segmentation

This PR also includes a perf commit on top of the feature.

Shaper::segment() built a rustybuzz Face from the raw font bytes and ran the shaper for every text run on every call. The renderer re-shapes the whole screen each frame, so a static screen was re-segmented ~60×/s — measured at ~38ms/frame of pure shaping on a full screen (render p50 48.8ms with ligatures on vs ~10.5ms off).

segment() is now memoized in an LRU keyed on the run text. Segmentation depends only on the characters and the font, and a font change constructs a fresh Shaper (hence a fresh cache), so no explicit invalidation is needed. A static screen pays the shaping cost once; repeated runs become an O(len) map lookup. lru was already a dependency.

Adds a cache-correctness test asserting the memoized path returns segments identical to the uncached path on both miss and hit (gated on BEAMTERM_LIGATURE_TEST_FONT). Confined to beamterm-core/src/gl/shaper.rs; no new dependencies.
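
The memoization described above can be sketched as follows. This is not the code in shaper.rs: the PR uses the `lru` crate, while this std-only stand-in tracks recency with a counter and evicts with a linear scan so the example stays self-contained. `MemoShaper`, `Segment`, and the `misses` counter are hypothetical names introduced for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical segment record: a byte range of the run and whether it
/// should be drawn as a ligature.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Segment {
    pub start: usize,
    pub len: usize,
    pub ligature: bool,
}

/// Wraps an uncached segmenter with a small least-recently-used cache
/// keyed on the run text.
pub struct MemoShaper<F: Fn(&str) -> Vec<Segment>> {
    segment_uncached: F,
    cache: HashMap<String, (u64, Vec<Segment>)>, // value: (last use, segments)
    capacity: usize,
    tick: u64,
    pub misses: usize,
}

impl<F: Fn(&str) -> Vec<Segment>> MemoShaper<F> {
    pub fn new(segment_uncached: F, capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self { segment_uncached, cache: HashMap::new(), capacity, tick: 0, misses: 0 }
    }

    pub fn segment(&mut self, run: &str) -> Vec<Segment> {
        self.tick += 1;
        // Hit: refresh recency and return the stored segments.
        if let Some((last_used, segs)) = self.cache.get_mut(run) {
            *last_used = self.tick;
            return segs.clone();
        }
        // Miss: run the expensive path once, then remember the result.
        self.misses += 1;
        let segs = (self.segment_uncached)(run);
        if self.cache.len() >= self.capacity {
            // Evict the least recently used entry (O(n) scan in this sketch).
            let oldest = self
                .cache
                .iter()
                .min_by_key(|(_, (t, _))| *t)
                .map(|(k, _)| k.clone());
            if let Some(key) = oldest {
                self.cache.remove(&key);
            }
        }
        self.cache.insert(run.to_owned(), (self.tick, segs.clone()));
        segs
    }
}
```

Because the cache lives inside the shaper value, dropping the shaper on a font change drops the cache with it, which matches the "no explicit invalidation" reasoning above. A correctness check in the spirit of the PR's test would call `segment` twice on the same run and assert both results equal the uncached output.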

is_emoji() treated any pure-ASCII string with len > 1 and width >= 2 as an emoji, in order to catch ASCII-led keycap sequences (e.g. "1️⃣"). That heuristic also matched 2-char programming ligatures like "->", "=>", "==", "<-", "&&".

Once ligature shaping landed, 2-cell ligature substrings were passed to GlyphCache::resolve_glyph_slot, so the false positive promoted these glyphs to GlyphSlot::Emoji(idx | DYNAMIC_EMOJI_FLAG). The set emoji bit (15) makes the fragment shader sample the glyph texture color directly instead of tinting with the cell foreground, so the white glyph mask rendered untinted. The bug was invisible on dark themes (white ≈ light fg) but rendered the ligature white on light themes. Ligatures of 3+ cells use the separate Ligature slot pool, which never consults is_emoji, so they were unaffected.

The fix requires a non-ASCII continuation code point (U+FE0F / U+20E3), which real keycap sequences always carry and ASCII ligature runs never do. Adds regression tests covering keycaps (still emoji) and the ligature substrings (not emoji).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
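
A minimal sketch of the tightened keycap check, assuming it only needs to decide whether an ASCII-led string is a keycap sequence. The function name `is_ascii_led_keycap` is hypothetical; the real is_emoji() presumably also handles non-ASCII emoji, which this sketch leaves out.

```rust
/// U+FE0F VARIATION SELECTOR-16 (requests emoji presentation).
const VARIATION_SELECTOR_16: char = '\u{FE0F}';
/// U+20E3 COMBINING ENCLOSING KEYCAP.
const COMBINING_ENCLOSING_KEYCAP: char = '\u{20E3}';

/// Returns true only when `s` starts with an ASCII character and is followed
/// by at least one keycap continuation code point. Plain ASCII runs such as
/// "->" or "&&" never contain those code points, so they are rejected.
pub fn is_ascii_led_keycap(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(first) if first.is_ascii() => {
            chars.any(|c| c == VARIATION_SELECTOR_16 || c == COMBINING_ENCLOSING_KEYCAP)
        }
        _ => false,
    }
}
```

The design point is that the check keys on a code point ligature runs cannot contain, rather than on length or display width, which both keycaps and 2-cell ligatures share.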

patrick-andrew-anchor marked this pull request as ready for review June 15, 2026 18:40

patrick-andrew-anchor marked this pull request as draft June 15, 2026 18:40

patrick-andrew-anchor force-pushed the feat/font-ligatures branch 3 times, most recently from 6783ff5 to 40709c3 Compare June 15, 2026 19:06

patrick-andrew-anchor force-pushed the feat/font-ligatures branch from 40709c3 to 60627d3 Compare June 15, 2026 19:27

patrick-andrew-anchor marked this pull request as ready for review June 15, 2026 20:16

patrick-andrew-anchor added 2 commits June 17, 2026 20:07

style(core): rustfmt — drop blank line after segment_uncached brace

a2eaa63

patrick-andrew-anchor changed the title ~~feat(renderer): font ligature support via GSUB shaping (#128)~~ feat(renderer): font ligature support via GSUB shaping + segmentation cache Jun 17, 2026

patrick-andrew-anchor wants to merge 4 commits into junkdog:main from patrick-andrew-anchor:feat/font-ligatures