Skip to content

Add fault-injection CI workflow and OOM-guard test for NGX_NTLM_TEST_CLEANUP_NULL#3

Draft
Copilot wants to merge 3 commits intomainfrom
copilot/add-ci-workflow-for-nginx
Draft

Add fault-injection CI workflow and OOM-guard test for NGX_NTLM_TEST_CLEANUP_NULL#3
Copilot wants to merge 3 commits intomainfrom
copilot/add-ci-workflow-for-nginx

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented May 8, 2026

TEST 6 in t/001-sanity.t — which validates that a failed ngx_pool_cleanup_add() prevents caching a stale upstream connection (ABA/session-hijack guard) — has always been unconditionally skipped because it requires nginx to be built with -DNGX_NTLM_TEST_CLEANUP_NULL. This flag was never set in CI, so the OOM guard was never exercised.

New files

  • .github/workflows/test-cleanup-null.yml — mirrors test-nginx-1.30.yml but builds nginx with the fault-injection flag and exports NGX_NTLM_TEST_CLEANUP_NULL=1 before running prove:

    ./auto/configure \
      --prefix="$RUNNER_TEMP/nginx" \
      --add-module="$GITHUB_WORKSPACE" \
      --with-cc-opt="-DNGX_NTLM_TEST_CLEANUP_NULL"
    export NGX_NTLM_TEST_CLEANUP_NULL=1
    prove -r "$GITHUB_WORKSPACE/t"
  • t/004-cleanup-null.t — self-contained test file containing TEST 6 (without --- SKIP), guarded by the env var so the normal workflow skips it gracefully:

    plan(skip_all => 'requires NGX_NTLM_TEST_CLEANUP_NULL build')
        unless $ENV{NGX_NTLM_TEST_CLEANUP_NULL};

The result: normal workflow continues unchanged; the new fault-injection workflow actually exercises the ngx_pool_cleanup_add → NULL error path and asserts no upstream session is cached.

Original prompt

Goal

Add a dedicated CI workflow that builds nginx with the fault-injection flag -DNGX_NTLM_TEST_CLEANUP_NULL and runs the OOM-guard test (TEST 6) that is currently always skipped.


Background

ngx_http_upstream_ntlm_module.c contains a fault-injection guard:

#ifdef NGX_NTLM_TEST_CLEANUP_NULL
#  undef  ngx_pool_cleanup_add
#  define ngx_pool_cleanup_add(pool, size)  NULL
#endif

This makes ngx_pool_cleanup_add() always return NULL inside the module, exercising the OOM guard in ngx_http_upstream_free_ntlm_peer that prevents caching the upstream connection with a stale client-connection pointer (ABA / session-hijack).

t/001-sanity.t TEST 6 covers this path but has --- SKIP because it requires that special build. It is therefore never executed in CI.


Changes required

1. New file: .github/workflows/test-cleanup-null.yml

Create a new GitHub Actions workflow (modelled on the existing .github/workflows/test-nginx-1.30.yml) with the following differences:

  • name: test cleanup-null (fault injection)
  • nginx version matrix: 1.30.0 (same as existing workflow)
  • Build step: pass --with-cc-opt="-DNGX_NTLM_TEST_CLEANUP_NULL" to ./auto/configure
    e.g.:
    ./auto/configure \
      --prefix="$RUNNER_TEMP/nginx" \
      --add-module="$GITHUB_WORKSPACE" \
      --with-cc-opt="-DNGX_NTLM_TEST_CLEANUP_NULL"
  • Test step: run prove -r "$GITHUB_WORKSPACE/t" as usual; it will pick up the new t/004-cleanup-null.t

Triggers: push to main and pull_request (same as the existing workflow).

2. New file: t/004-cleanup-null.t

Create a new Perl test file that is a standalone version of TEST 6 from t/001-sanity.t, without --- SKIP.

Copy the prologue (fork + backend start + cleanup handler + run_tests()) from t/001-sanity.t and include only TEST 6:

=== TEST 1: OOM in cleanup handler must not cache the upstream connection
When ngx_pool_cleanup_add returns NULL the upstream session MUST NOT be
inserted into the cache. A subsequent request on the same client connection
must therefore NOT inherit the old authenticated upstream session.
This test requires a build with -DNGX_NTLM_TEST_CLEANUP_NULL.
--- http_config
    upstream backend {
        server localhost:19841;
        ntlm;
    }
--- config
    location /t {
        proxy_pass http://backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
--- pipelined_requests eval
["GET /t", "GET /t"]
--- more_headers eval
["Authorization: NTLM " . $::random_token, "Authorization: NTLM " . $::random_token]
--- response_body eval
["OK", "OK"]
--- error_log
ntlm: failed to allocate cleanup handler
--- raw_response_headers_unlike eval
["========", "X-NGX-NTLM-AUTH: "]

The test must not have --- SKIP.

Use String::Random and generate our $random_token the same way t/001-sanity.t does. Fork and start node t/backend/index.js and register add_cleanup_handler to kill it, exactly as in t/001-sanity.t.

Important: t/004-cleanup-null.t must be self-contained (does not depend on t/001-sanity.t running first). The regular workflow (.github/workflows/test-nginx-1.30.yml) also runs prove -r t, which means it will also pick up t/004-cleanup-null.t. Since that workflow builds without the flag, the test will fail under the normal build — guard against this by checking for the NGX_NTLM_TEST_CLEANUP_NULL environment variable or by using a --- SKIP conditional at the top of the file:

use Test::Nginx::Socket 'no_plan';
...
# Only run when nginx was built with -DNGX_NTLM_TEST_CLEANUP_NULL
plan(skip_all => 'requires NGX_NTLM_TEST_CLEANUP_NULL build')
    unless $ENV{NGX_NTLM_TEST_CLEANUP_NULL};

Then in the new workflow, export NGX_NTLM_TEST_CLEANUP_NULL=1 before running prove.

This way:

  • Normal workflow: t/004-cleanup-null.t is skipped gracefully.
  • New fault-injection workflow: the env var is set, the test runs and validates the OOM guard.

The following is the prior conversation context from the user's chat exploration (may be truncated):

User: what is the maximum value of c->requests

Assistant: The maximum value of c->requests is controlled by the ntlm_requests directive, which sets max_requests. The connection is considered invalid (and not reused) once:

if (c->requests >= hndp->conf->max_requests) {
    goto invalid;
}

So the effective maximum allowed value of c->requests before a connection is retired is:

Setting Value
Default (ntlm_requests) 1000
Configurable via directive Any positive integer

That is, c->requests can reach up to max_requests - 1 (i.e., 999 by default) while still being reused. Once `c->requests >= max_...

This pull request was created from Copilot chat.

…leanup-null.t)

Agent-Logs-Url: https://github.com/Securepoint/nginx-ntlm-modulev2/sessions/be466b04-5a70-4721-a318-386dd299eb74

Co-authored-by: matthias-lay <163420385+matthias-lay@users.noreply.github.com>
Copilot AI changed the title [WIP] Add CI workflow to build nginx with fault-injection flag Add fault-injection CI workflow and OOM-guard test for NGX_NTLM_TEST_CLEANUP_NULL May 8, 2026
Copilot AI requested a review from matthias-lay May 8, 2026 08:55
…M cache assertion

Agent-Logs-Url: https://github.com/Securepoint/nginx-ntlm-modulev2/sessions/60b8cb2e-4d95-44e4-9503-c0459c86962e

Co-authored-by: matthias-lay <163420385+matthias-lay@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants