Skip to content

filter: fix silent truncation at 131072 bytes for inputs >ZSTD_CStreamInSize#49

Open
t0mtaylor wants to merge 1 commit into
tokers:masterfrom
t0mtaylor:fix/issue-25-truncation-at-131072
Open

filter: fix silent truncation at 131072 bytes for inputs >ZSTD_CStreamInSize#49
t0mtaylor wants to merge 1 commit into
tokers:masterfrom
t0mtaylor:fix/issue-25-truncation-at-131072

Conversation

@t0mtaylor

Copy link
Copy Markdown

Closes #25.

When ZSTD_compressStream returns rc>0 (a hint that ~rc bytes are still pending in libzstd's internal buffer, the natural value being ZSTD_CStreamInSize() == 131072), the body filter flips action to FLUSH and re-enters. After ZSTD_flushStream returns rc=0 (drained), the existing block

if (rc == 0 && (ctx->flush || ctx->last)) {
    b->last_buf = ctx->last;
    ctx->done   = ctx->last;
}

unconditionally marks the chain buffer as last and the context as done — even though ctx->buffer_in.pos < ctx->buffer_in.size and more chain links may still be queued. The next filter sees last_buf=1, stops reading, and ZSTD_endStream is never invoked on the remaining input.

Symptom: any single-buf body filter call carrying a chunk larger than ZSTD_CStreamInSize and last_buf=1 silently terminates the zstd frame at exactly 131072 decompressed bytes. Multi-buf streams (where the first chunk has last_buf=0) dodge the bug because the FLUSH-rc=0 branch is gated on ctx->last, so the action is restored to COMPRESS and the next chain link is consumed.

Reproduced live with nginx 1.30.0 + libzstd 1.4.4 against a 141186-byte CSS file via Accept-Encoding: zstd: wire 84969 bytes, decompressed to exactly 131072 bytes. After patch: wire 92265 bytes, decompressed to the full 141186. Verified across input sizes 5 KB / 141 KB / 208 KB / 290 KB / 1.5 MB; all decompress to the full original size.

Two-part fix:

  1. Gate the END transition on the current ZSTD_inBuffer being fully drained AND no more chain links queued, so we don't finalize the frame while libzstd still has unconsumed input:

    } else if (ctx->last && ctx->action != END
               && ctx->buffer_in.pos >= ctx->buffer_in.size
               && ctx->in == NULL) {
        ctx->action = END;
        return NGX_AGAIN;
    } else if (ctx->action != END) {
        ctx->action = COMPRESS; /* restore */
    }
    

    The new else if (... != END) clause is required because the original unconditional action = COMPRESS restore would reset the END action immediately after ZSTD_endStream returned rc=0, defeating the done-flag gate below and causing infinite output (observed: 422 MB output for 141 KB input during patch development).

  2. Gate the EOF marker on action == END so b->last_buf=1 and ctx->done=1 only fire after ZSTD_endStream has emitted the frame epilogue:

    if (rc == 0
        && (ctx->flush
            || (ctx->last && ctx->action == END))) {
        ...
    }
    

    The mid-stream flush case (ctx->flush) is unaffected — that is the client-flush path, not the EOF path.

…mInSize

Closes tokers#25.

When `ZSTD_compressStream` returns rc>0 (a hint that ~rc bytes are still
pending in libzstd's internal buffer, the natural value being
`ZSTD_CStreamInSize() == 131072`), the body filter flips action to FLUSH
and re-enters. After `ZSTD_flushStream` returns rc=0 (drained), the
existing block

    if (rc == 0 && (ctx->flush || ctx->last)) {
        b->last_buf = ctx->last;
        ctx->done   = ctx->last;
    }

unconditionally marks the chain buffer as last and the context as done —
even though `ctx->buffer_in.pos < ctx->buffer_in.size` and more chain
links may still be queued. The next filter sees `last_buf=1`, stops
reading, and `ZSTD_endStream` is never invoked on the remaining input.

Symptom: any single-buf body filter call carrying a chunk larger than
`ZSTD_CStreamInSize` and `last_buf=1` silently terminates the zstd
frame at exactly 131072 decompressed bytes. Multi-buf streams (where
the first chunk has `last_buf=0`) dodge the bug because the FLUSH-rc=0
branch is gated on `ctx->last`, so the action is restored to COMPRESS
and the next chain link is consumed.

Reproduced live with nginx 1.30.0 + libzstd 1.4.4 against a 141186-byte
CSS file via `Accept-Encoding: zstd`: wire 84969 bytes, decompressed
to exactly 131072 bytes. After patch: wire 92265 bytes, decompressed
to the full 141186. Verified across input sizes 5 KB / 141 KB / 208 KB /
290 KB / 1.5 MB; all decompress to the full original size.

Two-part fix:

1. Gate the END transition on the current `ZSTD_inBuffer` being fully
   drained AND no more chain links queued, so we don't finalize the
   frame while libzstd still has unconsumed input:

       } else if (ctx->last && ctx->action != END
                  && ctx->buffer_in.pos >= ctx->buffer_in.size
                  && ctx->in == NULL) {
           ctx->action = END;
           return NGX_AGAIN;
       } else if (ctx->action != END) {
           ctx->action = COMPRESS; /* restore */
       }

   The new `else if (... != END)` clause is required because the
   original unconditional `action = COMPRESS` restore would reset the
   END action immediately after `ZSTD_endStream` returned rc=0,
   defeating the done-flag gate below and causing infinite output
   (observed: 422 MB output for 141 KB input during patch development).

2. Gate the EOF marker on `action == END` so `b->last_buf=1` and
   `ctx->done=1` only fire after `ZSTD_endStream` has emitted the
   frame epilogue:

       if (rc == 0
           && (ctx->flush
               || (ctx->last && ctx->action == END))) {
           ...
       }

   The mid-stream flush case (`ctx->flush`) is unaffected — that is
   the client-flush path, not the EOF path.
eilandert added a commit to eilandert/zstd-nginx-module that referenced this pull request May 9, 2026
…amInSize

ISSUE: Data is silently truncated to exactly 131072 bytes for responses larger
than libzstd's internal buffer size (ZSTD_CStreamInSize). This affects any
single-buffer response with last_buf=1 and size >131K.

ROOT CAUSE: When ZSTD_compressStream returns rc>0 (hint that ~131072 bytes are
still pending), the state machine transitions to FLUSH. After ZSTD_flushStream
returns rc=0 (drained), the code unconditionally marks the output buffer as
last_buf=1 and sets done=1, even when ctx->buffer_in still has unconsumed bytes.
The next filter sees last_buf=1 and stops reading. ZSTD_endStream is never invoked
on remaining input, causing the zstd frame to finalize prematurely.

SYMPTOMS:
- Single-buf static file responses sized 131073..buffer_size truncate to 131072
- Multi-buf responses (first chunk last_buf=0) unaffected because FLUSH-rc=0 is
  gated on ctx->last
- Reproduced: 141186-byte CSS file decompresses to 131072; after patch: full 141186

TWO-PART FIX:
1. Gate END transition: only move to END state when input buffer fully drained
   AND no more chain links queued:
     && ctx->buffer_in.pos >= ctx->buffer_in.size && ctx->in == NULL

2. Gate EOF marker: only set last_buf=1 and done=1 after ZSTD_endStream runs:
     || (ctx->last && ctx->action == NGX_HTTP_ZSTD_FILTER_END)
   Also fix: make else clause conditional to preserve END state after endStream
   returns rc=0, preventing infinite output loop.

REFERENCE: tokers#49
           tokers#25

IMPACT: Eliminates data truncation for all response sizes. Fixes critical
data loss bug affecting production deployments.
eilandert referenced this pull request in eilandert/zstd-nginx-module May 10, 2026
Fixes GitHub issue #41 - worker processes stuck at 100% CPU usage

Applied two critical upstream fixes:

PR #23 (drawing, Jun 2023): Fix infinite loop when upstream returns
abnormal content-length by allocating fresh buffer during flush cycles.
Prevents memory recycling without re-buffering that caused CPU spinning.

PR #49 (t0mtaylor, May 2026): Introduce action state machine (COMPRESS,
FLUSH, END) to prevent premature END transitions and 131072-byte
truncation. Gates END transition on:
  - Input buffer fully drained
  - No more chain links queued
  - Last flag set
Also gates last_buf=1 on action==END to prevent early frame finalization.

Impact: Resolves infinite loop where empty output buffers get re-processed
without progress, causing 100% CPU burn. Prevents silent data truncation
at 131072-byte boundaries on large streams.

Verified: Both fixes integrate cleanly with existing codebase.
Next: Build and test with nginx/angie packages.
hsw added a commit to hsw/zstd-nginx-module that referenced this pull request May 16, 2026
PR tokers#49 description references multiplex conditions — single-stream
test_h2_truncation doesn't surface that. Open N concurrent streams on
ONE httpx.AsyncClient (max_connections=1 forces multiplex via the
shared HTTPCore connection) and byte-compare each decoded body against
its origin file.

Three sub-tests:
  * roundtrip[10] / roundtrip[20] — 10 / 20 streams, sizes round-robin
    across the 4 PR tokers#49 boundaries (200000 / 131071 / 131072 / 131073)
  * mixed_sizes_round_trip — 12 streams, all 4 sizes 3× each, mixed
    concurrency on the multiplex layer

All 3 PASS on test1 (master baseline) in ~0.7s. PR tokers#49 bug not triggered
by simple multiplex; reproduction likely needs specific buffer / socket
settings — V2 investigation. Current value: regression net for any
refactor that breaks H2 multiplex handling (compressStream2 migration,
worker-pool, etc.).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hsw added a commit to hsw/zstd-nginx-module that referenced this pull request May 16, 2026
…okers#49

Existing test_h2_truncation.py passed on TDD baseline despite the bug
being present on master — default output_buffers (2 32k) split the
fixture file into ~32 KiB chain links, so no single body-filter call
ever saw a chunk > ZSTD_CStreamInSize (131072) with last_buf=1, which
is the precondition for the truncation-on-FLUSH path documented in
upstream PR tokers#49.

Fix: render with `output_buffers 1 1m; sendfile on; aio off;` so the
whole file flows through as one in_file chain link. With this, the
bug fires cleanly on sizes > 131072 (decoded body truncated to exactly
131072 bytes).

Also add test_h1_truncation.py — the bug is not HTTP/2-specific
(HTTP/2 framing originally just made the truncation more visible as
a short DATA frame). Verified both axes fail symmetrically on test1:
sizes 131073 and 200000 decompress to exactly 131072; 131071 and
131072 (at-or-below boundary) round-trip cleanly.

After PR tokers#49 lands, all 8 parametrized cases (4 sizes × 2 protocols)
will pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@eilandert

Copy link
Copy Markdown
Contributor

somehow this works for me in proxy mode but not in php-fpm fcgi mode, any idea?

eilandert referenced this pull request in eilandert/zstd-nginx-module May 18, 2026
…49

Existing test_h2_truncation.py passed on TDD baseline despite the bug
being present on master — default output_buffers (2 32k) split the
fixture file into ~32 KiB chain links, so no single body-filter call
ever saw a chunk > ZSTD_CStreamInSize (131072) with last_buf=1, which
is the precondition for the truncation-on-FLUSH path documented in
upstream PR #49.

Fix: render with `output_buffers 1 1m; sendfile on; aio off;` so the
whole file flows through as one in_file chain link. With this, the
bug fires cleanly on sizes > 131072 (decoded body truncated to exactly
131072 bytes).

Also add test_h1_truncation.py — the bug is not HTTP/2-specific
(HTTP/2 framing originally just made the truncation more visible as
a short DATA frame). Verified both axes fail symmetrically on test1:
sizes 131073 and 200000 decompress to exactly 131072; 131071 and
131072 (at-or-below boundary) round-trip cleanly.

After PR #49 lands, all 8 parametrized cases (4 sizes × 2 protocols)
will pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
eilandert referenced this pull request in eilandert/zstd-nginx-module May 18, 2026
Add gated ngx_log_debug at the output emit-decision point in
ngx_http_zstd_filter_compress:

  zstd emit?: outsz:N rc0:N last:N ctx_last:N ctx_flush:N action:N
  zstd emit: suppressed empty non-terminal buffer

NGX_DEBUG-gated (the ngx_log_debug* macros compile to nothing in
release builds), so zero runtime cost when off; visible with
`error_log ... debug`. Behaviour is unchanged — this only observes the
existing guard, it does not alter it.

Rationale: this module has a recurring truncation / zero-size-buffer /
terminal-frame bug class (a209f96, 2af5889, PR#23/#49, the 2026-05
bug B). Every prior diagnosis required temporarily patching in debug
lines and rebuilding, repeatedly. This makes the single most diagnostic
probe permanent: the exact state the emit guard saw. Verified during
bug-B work — it immediately exposed the forced-flush mechanism without
a patch/rebuild loop.

Builds clean under the project -Werror flags (nginx 1.31.0).
@hsw

hsw commented May 31, 2026

Copy link
Copy Markdown

#52 also fixes #25, via a different route. Instead of patching the legacy API, it migrates the filter to ZSTD_compressStream2 (ngx_brotli per-call-op), which drops the input-drain bookkeeping and also closes #23 / #41 in the same change.

Verified the same 131072 case you did. Thanks for the repro write-up — flagging the overlap so the two don't get merged on top of each other.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

endless loop

3 participants