Conversation
|
There are two critical problems in the implementation. I'm interested to see this implemented immediately. I was distrustful at first so I pulled the PR and conducted a security audit. Here's the gist of this PR: full transcript of codebase investigation with Q&A Further questioning after initial analysis revealed a few critical vulnerabilities which should be rectified before merge.
Future improvements
|
|
Great feedback! Let me get into it |
…ecurity hardening Implements full LDAP/Active Directory authentication for nginx-proxy-manager: Core features: - Low-level LDAP client with connection pooling (ldapjs + tarn) - JIT user sync: first-login provisioning with group-to-role mapping - LDAP Settings API + Admin UI (get/put/test-connection/test-auth endpoints) - Phase 2-6: DB schema, login flow integration, group-based access control - 2FA support for LDAP users, Objection.js patchById compatibility Security hardening: - TCP keep-alive + idle connection reaper (dead socket prevention) - Semaphore-based pool DoS protection (LDAP_POOL_MAX_QUEUE env var) - Account hijacking prevention: auth_source column enforces per-source identity - OOM prevention: configurable paged LDAP search (page_size, max_entries) Schema & docs: - OpenAPI schema for /settings/ldap endpoints with examples - auth_source field added to User object schema - docs/ldap-authentication.md with full configuration reference - Docker integration test environment (OpenLDAP + bootstrap LDIFs)
Adds full Jest test coverage for the LDAP authentication module: Test suites (196 tests total, all passing): - ldap-client.test.js: connection pool, TCP keep-alive, idle reaper, semaphore DoS protection, bind/search/unbind lifecycle - ldap-env.test.js: env var parsing, defaults, LDAP_POOL_MAX_QUEUE wiring - ldap-internal.test.js: account hijacking prevention (auth_source isolation), 2FA enforcement for LDAP users, group membership resolution - ldap-sync.test.js: JIT user provisioning, paged search OOM bounds, memory-bounding on large result sets Test infrastructure: - Jest ESM config (--experimental-vm-modules) with manual mocks - Mock modules: bcrypt, config, db, lodash, moment, node-rsa, objection, signale, tarn - ESM-compatible resolver (jest.resolver.cjs) - backend/__tests__/README.md with test run instructions
5047c66 to
82f669a
Compare
- Remove spurious backend/biome.json (nested root config conflict) - Fix __mocks__: noUselessConstructor, noThisInStatic, useArrowFunction, hasOwnProperty → Object.hasOwn, forEach callback returns - Fix ldap-client.js: unused parameter in searchPaged page handler - Fix ldap-sync.test.js: forEach callback return value - Fix jest.resolver.cjs: unused import - Suppress false positive: nextCalled used inside setImmediate callback
|
@adamoutler feel free to check it out once again ;-) |
|
Hello, I am Adam's agent. I have reviewed the While most issues like OOM on Mass Sync, Dead Sockets, and Account Hijacking have robust mechanisms in place now, my analysis found that CRITICAL and HIGH issues still remain:
There are also a couple of follow-up questions/observations:
Please address the CRITICAL semaphore leak and HIGH socket exhaustion issues. Let us know if you have any questions! |
…uarantee returnToPool
authenticateUser() now routes through a dedicated login semaphore (LDAP_MAX_LOGIN_CONNECTIONS, default 10) to prevent socket exhaustion from concurrent login bursts. Includes 5 new tests (204 total).
… patches - Migration adds UNIQUE index on user.email (MySQL, SQLite, PostgreSQL) - provisionUser catches unique violation on INSERT, retries as UPDATE - _updateExistingLdapUser skips DB write when attributes unchanged - Cross-source hijack still prevented under race conditions - 16 new tests (216 total)
# Conflicts: # backend/package.json
…ft-delete compatibility The plain UNIQUE constraint on user.email conflicted with soft-deletes: re-creating a user with the same email as a soft-deleted row (is_deleted=1) caused a unique violation. This broke CI tests that use resetUsers(). Changes: - Migration: use a partial unique index (WHERE is_deleted = 0) for SQLite/PostgreSQL, and a virtual generated column with UNIQUE for MySQL - user.js create(): wrap insert in try/catch — on unique violation, hard-delete the soft-deleted duplicate and retry - user.js create(): default auth_source to 'local' when not provided - setup.js: add explicit auth_source: 'local' to default admin user Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
MySQL CI DB persists between builds. Previous run may have partially applied the migration (column added but migration not recorded as complete). Use INFORMATION_SCHEMA checks before ADD COLUMN/CREATE INDEX to make the migration safe to re-run. Also adds IF NOT EXISTS for SQLite/PostgreSQL partial index and drops any stale plain unique index from the earlier migration version.
VIRTUAL generated columns cannot have indexes in MySQL. Changed to STORED. Also simplified idempotency: always drop+recreate the column to handle leftover VIRTUAL columns from prior failed migration runs.
MySQL STORED generated columns appear in SELECT * results. The email_active column is an internal implementation detail for the partial unique constraint — it must not leak into API responses or Swagger schema validation will fail.
The previous omission fix only covered direct user API responses. Endpoints like streams, proxy hosts, and certificates JOIN the user table for the owner field — email_active leaked through those paths. Stripping in $parseDatabaseJson ensures the column never reaches application code regardless of how the user is queried.
|
Ready for next test round Mr bot account @MrMeatikins ! |
|
Automated follow-up review for LDAP authentication fixes (Semaphore Leak & Socket Exhaustion). Please find the final security audit report below: I will begin by reading Security Analysis SummaryThe investigation confirms that the reported vulnerabilities have been systematically addressed. The codebase now employs a defensive architecture utilizing structured concurrency, mandatory timeouts, and explicit resource cleanup paths. Vulnerability Assessment1. Semaphore Leak (CRITICAL) — STATUS: FIXED
2. Socket Exhaustion (HIGH) — STATUS: MITIGATED
3. Security & Edge Cases
Prioritized Findings
Strategic Recommendations
Conclusion: The LDAP implementation is now robust and follows industry best practices for resource safety and concurrency control. The previously reported CRITICAL and HIGH vulnerabilities have been effectively neutralized. |
|
Automated follow-up review for LDAP authentication fixes (Semaphore Leak & Socket Exhaustion). Please find the final security audit report below: I will start by reading The security investigation into the LDAP Authentication implementation in LDAP Security Investigation Summary
Detailed Vulnerability Analysis1. Semaphore Leak (CRITICAL) — STATUS: FIXEDThe previous vulnerability involved semaphores being acquired but not released during error conditions, eventually deadlocking the LDAP subsystem.
2. Socket Exhaustion (HIGH) — STATUS: FIXEDThe previous vulnerability allowed unbounded socket growth, leading to process-level file descriptor exhaustion.
3. Plaintext Credentials (MEDIUM) — STATUS: REMAININGThe
4. Residual Socket Leak in
|
|
I think we are on good enough state unless Project Maintainer says otherwise. Will park it now and wait for more community feedback! |
|
@jc21 Would you please grant access to the build so that I may perform dynamic testing? I tried earlier and could not access it. The docker hub repo is private. https://hub.docker.com/repository/docker/nginxproxymanager/nginx-proxy-manager-dev I am Anything I can do to expedite this, let me know. I was very disappointed when the last LDAP PR attempt failed last year and I don't want to see that happen again. I am of the opinion this is high quality code. I'm very interested in getting this feature in production |
|
So am I to understand that AI was used to both create this PR and Review it? And am I also to understand that all users in the LDAP directory would be created in the NPM user table regardless of their "need" to login? |
Do you mean you can't pull the built image referred to in The docker image is definitely NOT private: https://hub.docker.com/layers/nginxproxymanager/nginx-proxy-manager-dev/pr-5345/images/sha256-a2d2c0a9c4cc3ac4c5de9ef4d2d2dc05da93f0f1884aa62ea08651a095578ebd |
Thank you! That's just what I need!
My initial review was based on a mix of human and automated review with AI formatting at the end. If AI were capable of doing what I do there wouldn't be so much slop :D. Think of it as lint checks on steroids. I'll never let untrusted code run on my servers. AI can establish at least some level of trust. It's way fast to do quality and security checks. MrMeatikins' post was fully AI. I was working on another project when he notified me he responded. He is my bot. There's a free one called CodeRabbit which can triage PRs for open source projects for quality and security and it's free.
I'm testing. This will be first on my list when I can get it running. One SHOULD typically be able to set the since the User Filter is used to determine valid objects in Active Directory and populate Users, you could set a group of Nginx Proxy Manager users, or several groups. However there may be a big problem here. @Wadera I'm entering
Based on the 10 errors shown: And the 9 text boxes + verify/use bools, I'm thinking the form data is using wrong field names. |
|
@jc21 — On AI Usage and User Provisioning Great questions, and totally fair to ask. On AI involvement: This is a human–AI collaboration. I designed the architecture, wrote the initial implementation, and make all the decisions on what goes in and how it works. My AI assistant helps with the heavy lifting — writing boilerplate, running through test suites, catching edge cases I might miss, and iterating on code faster than I could solo. Every commit is reviewed by me before it's pushed. Think of it as pair programming where one of the partners never gets tired. Adam's initial review was his own analysis with AI-assisted formatting (as he described). The follow-up from MrMeatikins was Adam's automated security bot doing a focused audit on the fixes. Having that kind of fast, thorough security feedback loop is what let us turn around all four CRITICAL/HIGH fixes in under a day. On user provisioning: No, this does not bulk-import your entire LDAP directory into NPM. It uses Just-In-Time (JIT) provisioning — a user is only created in NPM's database the first time they successfully authenticate via LDAP. If someone in your directory never logs into NPM, they never appear in the user table. The "Sync" feature (available as a button in the UI and via API) only re-checks users that already exist in NPM with So for a typical AD environment with thousands of users, only the handful who actually use NPM will appear in the user table. The I'd love to hear any other concerns or suggestions you might have — this is a feature a lot of people have been asking for, and getting it right matters. Thanks to @adamoutlerHuge thanks for the thorough security audit and hands-on testing. Having someone actually pull the image, spin it up against a real AD environment and report back with screenshots is incredibly valuable. The security review from both your initial analysis and MrMeatikins' follow-up gave us real confidence in the implementation. The "serverUrl" / Form Field BugGreat catch. You're right — the form is sending wrong field names. I found the root cause: The frontend uses This means:
The fix is straightforward — aligning the frontend TypeScript interfaces to use LDAP User Provisioning — JIT Not BulkTo clarify the question about whether all LDAP users get created in the NPM user table: No — this implementation uses Just-In-Time (JIT) provisioning. Users are only created in the NPM database when they actually log in. There is no bulk import or directory sync that pre-creates all LDAP users. The Your User Filter approach is exactly right: This filter is applied during the login search — only users matching the filter can authenticate. Combined with the group-to-role mapping ( Remaining Items from Security ReviewThe final security audit from MrMeatikins confirmed all CRITICAL and HIGH issues are resolved. Two minor items remain:
Future ImprovementsBoth of Adam's suggestions are great candidates for follow-up work:
These would be separate PRs to keep scope manageable. Next push will include the But regardless of errors you see - you should be able to test logins in field bellow as well as successfully login as LDAP accounts into NPM. |
Frontend was using bindDn/groupDn (lowercase n) but the API PUT schema defines bindDN/groupDN (uppercase DN) with additionalProperties:false, causing 'data must NOT have additional properties' validation errors. Also fixes existing configs not loading into the form: GET response returns bindDN/groupDN but form was reading bindDn/groupDn. Files changed: - getLdapSettings.ts: bindDn→bindDN, groupDn→groupDN in interface - LdapSettings.tsx: all form state, setField, and value bindings updated No backend changes needed — schema and rowToConfig are correct.
From MrMeatikins security audit (LOW severity). If STARTTLS or bind fails after the TCP connection is established, the raw ldapjs socket was not explicitly destroyed. The leak is bounded by the semaphore (max 10 connections) and the OS eventually cleans up, but explicit cleanup is proper resource management. Fix: wrap post-connection operations (STARTTLS, bind) in try/catch that calls rawClient.destroy() before re-throwing on any failure. Tests: add two new unit tests to ldap-client.test.js - verify rawClient.destroy() is called when STARTTLS fails mid-connection - verify rawClient.destroy() is called when bind fails after TCP connect Full suite: 218/218 tests pass.
- Log on reaper startup: pool key, interval, and idleTimeout config - Log when reaper fires: pool key + number of connections being checked - Log each reaped connection: pool key, idle duration, destroyed flag, and remaining pool size after removal All logs use debug level via the existing ldap logger — no noise in normal operation. Addresses MrMeatikins security audit (INFORMATIONAL). 🧪 218/218 Jest tests passing
…, dark theme styling - Human-readable API validation errors (replaces cryptic AJV output) 'Unknown fields: bindDn. Expected: bindDN, serverUrl, ...' instead of 'data must NOT have additional properties' x10 - Bypass humps camelCase/snake_case transform for LDAP API calls humps mangled bindDN→bind_d_n breaking all LDAP save/test operations - Environment variable override indicators on LDAP settings form Lock icons, ENV badges, orange dashed borders, read-only fields Warning banner when env overrides are active Full dark theme contrast (white text, no browser dimming) - Enabled/Disabled badge: white text on green/red, visible in all states - 30 new validator unit tests
The formatValidationErrors() pattern case was producing 'Invalid format for field "value"' but upstream Cypress tests (CVE-2024-46256) assert on the original AJV format 'data/field must match pattern'. Restored AJV-compatible output for pattern errors to avoid breaking existing test expectations.
|
Ok thanks for explaining. Something else to consider here, after your manual testing is working, we need to consider automated integration testing. In v3 (which is now abandoned) I implemented LDAP, albeit in a much simpler form. As a nice result, I've backported the Cypress test suite for it in v2. Take a look at This is a test suite that only runs on the Postgres stack because it also spins up Authentik, preconfigured with LDAP AD. |
|
Thanks for testing! Make sure you specify
How It WorksCheck https://github.com/Wadera/nginx-proxy-manager/blob/feature/ldap-auth/docs/ldap-authentication.md for more details |
|
@Wadera thanks for the hint. The users synchronisation works but the groups are wrong because of my false I will have to fiddle with this tomorrow. I set the ldap user
In case anyone is wondering, i use synologys ldap server. |
This was my mistake while testing late at night because this e-mail was already in use for another user. HowTo for my setup
docker-compose.yml services:
db:
image: 'mariadb:10.11'
container_name: npm-backend
env_file:
- ./db.env
volumes:
- ./data/mysql:/var/lib/mysql
restart: always
app:
image: nginxproxymanager/nginx-proxy-manager-dev:pr-5345
container_name: npm-frontend
ports:
- '80:80'
- '82:81'
- '443:443'
env_file:
- ./app.env
dns:
- 192.168.0.90 # Pi-hole
- 1.1.1.1 # Cloudflare
volumes:
- ./data:/data
- ./letsencrypt:/etc/letsencrypt
restart: alwaysapp.env DB_MYSQL_HOST=db
DB_MYSQL_PORT=3306
DB_MYSQL_USER=supersecret
DB_MYSQL_PASSWORD=supersecret
DB_MYSQL_NAME=supersecret
LDAP_ENABLED=true
LDAP_SERVER_URL=ldap://ldap.corp.com
LDAP_BIND_DN=uid=root,cn=users,dc=corp,dc=com
LDAP_BIND_PASSWORD=supersecret2
LDAP_SEARCH_BASE=dc=corp,dc=com
LDAP_GROUP_DN=cn=groups,dc=corp,dc=com
LDAP_USER_ATTR=uid
LDAP_ADMIN_GROUP=npm-admins
LDAP_TLS_VERIFY=false
LDAP_STARTTLS=true |
|
I've been testing this PR against an Active Directory environment and found a few significant issues that should be addressed before merging: an identifier conflict, asynchronous execution bugs, and conflicting state checks. 1. Primary Identifier & Email Uniqueness ConflictCurrently, the sync relies on the email address as the primary identifier. This causes a Furthermore, because of how the sync loop is currently executing, this specific error triggered three separate times within the exact same second for a single user during my test, implying the error is either being caught in a loop or multiple parallel async operations are fetching and throwing it simultaneously. Recommendation: The system should use 2. Out-of-Order Sync Logic and Unresolved PromisesThe sync operation appears to process disables, print the summary report, and then process group-based re-enables. The When I ran a sync, the summary reported: However, the logs show the script disabling the users, reporting the summary, and then immediately re-enabling those same users. Based on the actual end-state of the database, the summary should read: 3. Conflicting State ChecksThere are conflicting evaluations fighting over the database state. For example, looking at the (sanitized) user
If the user was genuinely not found in the directory scan, they shouldn't be re-enabled via a group policy pass milliseconds later. Errors and disabled states should also remain strictly distinct in the final count, as an error does not automatically mean a successful disable. Relevant Log Snippet: |
… email PR reviewer feedback on NginxProxyManager#5345: using email as primary identifier causes cross-source binding errors when an LDAP user shares an email with a local account. This change uses the stable directory GUID as the internal key. Changes: - ldap.js: add objectGUID/entryUUID to search attributes in searchUser() and searchAllUsers(); extract both in normalizeUser() → ldapGuid field; add searchByDN() helper; support multi-attribute OR login filter via buildUserFilter(userAttribute, username, loginAttributes) - ldap-env.js: wire LDAP_LOGIN_ATTRS env var → loginAttributes camelCase; expose loginAttributes in rowToLdapClientConfig() - ldap-sync.js: rewrite provisionUser() to use GUID-first lookup via auth table; generate synthetic {guid}@ldap.local email when real email absent or collides with a local account; add _provisionNewLdapUser() helper; add _provisionByEmail() legacy fallback for servers without GUID support; add backfillGuids() for one-time migration of existing LDAP users; update syncAllUsers() to track seenGuids for absence detection - migrations/20260301100000_ldap_guid.js: add ldap_guid column to auth table with partial unique index (SQLite/PG) / generated column (MySQL) - tests: new GUID-based identity test suite + email collision regression tests covering local+LDAP same email scenario Architectural impact: - LDAP users are now isolated by GUID, not email → no cross-source binding possible even if emails collide - Existing users without GUID are migrated on first login (email fallback + GUID backfill) or via the backfillGuids() admin function - Breaking: auth table gains ldap_guid column (migration required)
…hPaged
Root cause: searchPaged used Promise.resolve().then(pageHandler) for the
'page' event handler but the 'end' event could fire synchronously in the
same tick (ldapjs emits page+end together on the last page), resolving
the outer promise before the pageHandler microtask ran.
This caused syncAllUsers step 4 (disable-absent-users) to start while
step 3 (provision/update) was still in-flight, producing:
- disable-then-reenable races for users on the last page
- inaccurate synced/provisioned summary counts
Fix: track the in-flight page handler promise in inFlightHandler and
await it in the 'end' handler before proceeding. With pagePause=true
only one page handler is ever in-flight at a time, so a single variable
(initialised to Promise.resolve()) is sufficient.
Tests added:
- ldap-client.test.js: regression test for end firing synchronously
in the same tick as page (simulates the ldapjs race condition)
- ldap-sync.test.js: summary counts match actual DB state after
searchPaged resolves (end-to-end count accuracy)
Refs: PR NginxProxyManager#5345 review (task c806588a)
… error In syncAllUsers, when provisionUser throws 'User is not a member of the required LDAP group', the user was previously counted as an error and left enabled (their GUID/email was in the seen sets so step 4 skipped them, but no disable happened). This fixes the unified decision point: - User seen in LDAP + passes group check → enabled (provisioned/synced) - User seen in LDAP + fails group check → disabled (not error) - User NOT seen in LDAP → disabled by step 4 The two disable paths are mutually exclusive: step 3 (group check) and step 4 (directory absence) can never conflict because seen-in-step-3 entries are in seenGuids/seenEmails and have isAbsent=false in step 4. Changes: - Catch 'not in required group' error separately; disable existing user or skip new users — never increment errors counter - Track existingId before provisionUser so catch can disable correctly - Add explicit comments making the two-path contract visible in step 4 - Update JSDoc to document disabled/errors mutual exclusivity - Update @returns JSDoc to describe disabled vs errors semantics Tests added: - User in LDAP but not in group → result.disabled=1, result.errors=0 - User in LDAP but not in group, already disabled → skipped (no double-disable) - Error and disabled counts are mutually exclusive in the same sync run Ref: PR NginxProxyManager#5345 review, task 90ff6f6a
…vars, template literals, escapes - Move backfillGuids method inside ldapSync object (was outside, causing parse error) - Prefix unused variables with underscore (_existingId, _entry) - Remove unused const assignments in tests - Use template literal instead of string concatenation (validator) - Use optional chain (schema?.properties) - Remove useless backtick escapes in migration SQL strings Fixes CI build NginxProxyManager#19 biome lint failures.
AD objectGUID is stored as 16-byte mixed-endian binary (first 3 groups
little-endian, last 2 big-endian). The old code did a raw Buffer.toString('hex')
which produced incorrect hex strings that didn't match the standard Microsoft
GUID format. This caused GUID lookup failures, unnecessary synthetic emails,
and users being flagged as removed on re-sync.
Changes:
- Add parseObjectGUID(buf) helper with correct endian byte-swapping
- Add guidToLdapFilter(guid) for AD binary search filter encoding
- Add searchByGUID() method for GUID-based LDAP lookups
- Update normalizeUser() to use parseObjectGUID() with fallback
- Add reencodeGuids() migration helper for old raw-hex format
- Import parseObjectGUID in ldap-sync.js for backfill operations
- Fix pre-existing undefined variable bug in ldap-sync test
- Add mock for parseObjectGUID in ldap-sync test module
Test coverage: 31 new tests in objectguid.test.js covering:
- 5 binary-to-GUID test vectors (Buffer + binary string inputs)
- Edge cases (wrong length, empty, non-buffer)
- guidToLdapFilter round-trip verification
- normalizeUser integration (Buffer, string, entryUUID, fallback)
- Full sync cycle integration (idempotency, filter generation, ldapjs compat)
Fixes: adamoutler AD sync report (0 synced, 4 disabled, 4 errors)
|
Docker Image for build 23 is available on DockerHub: Note Ensure you backup your NPM instance before testing this image! Especially if there are database changes. Warning Changes and additions to DNS Providers require verification by at least 2 members of the community! |
|
Idea is to only provide access to management database, but be in your mind that NPM have support for both: Standard users (who can only create proxy and and see it's own entries): And Administrator type accounts: Which can manage settings, see audit logs, can create Users and see ALL entries from all users (all proxies, redirects, etc) What you expect (auth proxy access) is cool idea, but not under scope of this PR. |
|
That makes sense regarding the scope. Access lists are definitely something for a followup PR. Authelia is a horrible DX; I'd rather put an Apache Server LDAP authorization in front of sites than use Authelia ever again. I've verified that login worked properly with a test account. My plan is to keep a local admin which never gets used, alongside a user account from LDAP which I use for day-to-day management. However, I found a new breaking bug when modifying group permissions after an initial sync. Group Restriction Causes Fatal Sync Crash (
|





















Add support for AD / LDAP authentication.