Skip to content

fix: prevent daemon privilege mismatch broken state#1316

Open
svarlamov wants to merge 18 commits into
mainfrom
fix/daemon-privilege-mismatch
Open

fix: prevent daemon privilege mismatch broken state#1316
svarlamov wants to merge 18 commits into
mainfrom
fix/daemon-privilege-mismatch

Conversation

@svarlamov
Copy link
Copy Markdown
Member

@svarlamov svarlamov commented May 9, 2026

Summary

  • Partially addresses [Bug]: Daemon lock file not recovered after macOS sleep/wake kills daemon (async mode) #1041
  • Adds privilege de-escalation at daemon startup: Unix drops root via setuid/setgid when SUDO_UID/SUDO_GID detected; Windows respawns with linked (non-elevated) token via CreateProcessWithTokenW
  • Adds daemon_allow_root feature flag (default false) for CI/container environments that legitimately run as root
  • Relaxes daemon socket permissions from 0o600 to 0o660 so same-user processes at different privilege levels can connect
  • Replaces opaque "lock held" errors with smart diagnostics: detects dead processes (auto-cleans stale files) and privilege mismatches (shows platform-specific fix commands)

Problem

When the daemon is started as a privileged user (root/admin), non-privileged users get stuck in an unrecoverable loop: can't connect (socket perms), can't start new daemon (lock held), can't kill old daemon (insufficient privileges).

Solution

Four-layer defense:

  1. De-escalation — daemon drops privileges before acquiring lock
  2. Permission relaxation — sockets accessible across privilege levels
  3. Detection — actionable error with platform-specific fix command
  4. Recovery — auto-cleans stale locks from dead processes

Test plan

  • Unit tests for PID status detection (dead process, alive process)
  • Unit test for privilege check when not elevated
  • Integration test for stale lock auto-recovery
  • Full test suite passes
  • Lint and format pass
  • CI: Ubuntu checks pass
  • CI: Windows checks (linked token path exercised)

🤖 Generated with Claude Code


Open in Devin Review

svarlamov and others added 12 commits May 8, 2026 20:02
Addresses the broken state where a privileged daemon (root/admin) blocks
non-privileged users from connecting, starting, or killing the daemon.
Four-layer solution: de-escalation, permission relaxation, actionable
errors, and dead-process auto-recovery.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Nine-task plan covering: feature flag, privilege module (Unix+Windows),
daemon integration, socket permission relaxation, mismatch detection
with auto-recovery, and integration tests.

Also updates spec to use feature flag instead of CLI --allow-root flag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Moves windows-sys from dev-only to runtime cfg(windows) dependency
with Win32_Security feature for elevation detection and linked token
de-escalation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Calls check_and_deescalate_privileges() before lock acquisition in
run_daemon(). This ensures the daemon always runs as the real user
when invoked via sudo, and refuses to run as true root unless
daemon_allow_root is enabled.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…cess

Since ~/.git-ai/ is already 0700, expanding socket perms to include
group allows same-user processes at different privilege levels to
connect (belt-and-suspenders alongside de-escalation).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add unit tests for privilege module:
- check_pid_status_dead_process: validates PID detection for dead processes
- check_pid_status_alive_process: validates PID detection for current process
- privilege_check_not_elevated: validates non-root privilege checking

Add integration test for daemon stale lock recovery:
- stale_lock_cleaned_up_on_start: ensures daemon auto-recovers when lock
  file points to a dead process

Also fix missing daemon_allow_root field in FeatureFlags test initializers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The ~/.git-ai/ directory isn't explicitly set to 0700; access is
bounded by parent directory permissions in the home dir tree.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@svarlamov svarlamov requested a review from heapwolf May 9, 2026 02:22
Shorter, cleaner name. Env var is now GIT_AI_ALLOW_ROOT=true.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 2 potential issues.

View 4 additional findings in Devin Review.

Open in Devin Review

Comment thread src/privilege.rs
Comment on lines +61 to +69
// Clear SUDO_* env vars now that we've dropped privileges
unsafe {
std::env::remove_var("SUDO_UID");
std::env::remove_var("SUDO_GID");
std::env::remove_var("SUDO_USER");
std::env::remove_var("SUDO_COMMAND");
}

return PrivilegeAction::Continue;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Privilege de-escalation doesn't update HOME or recompute DaemonConfig, causing daemon failure on Linux

After setuid() drops root privileges, the HOME environment variable is not updated and the DaemonConfig (already computed before de-escalation) still points to root's home directory paths. On Linux, sudo changes HOME to /root by default. The flow is: handle_run() resolves DaemonConfig using home_dir() which reads HOME=/root (src/commands/daemon.rs:182), then passes it to run_daemon(config). Inside run_daemon, privileges are dropped (src/daemon.rs:8436-8441), but config still contains paths like /root/.git-ai/internal/daemon/. The subsequent DaemonLock::acquire(&config.lock_path) at src/daemon.rs:8444 tries to create/open a file in root's directory as the now-non-root user, which fails with permission denied. This defeats the entire purpose of the privilege de-escalation feature on Linux systems.

Prompt for agents
The privilege de-escalation in check_and_deescalate_privileges() (src/privilege.rs) drops root via setuid/setgid and clears SUDO_* env vars, but does NOT update the HOME environment variable to the real user's home directory. Since DaemonConfig is computed in handle_run() (src/commands/daemon.rs:182) BEFORE run_daemon() calls check_and_deescalate_privileges() (src/daemon.rs:8436), the config paths are already resolved using root's HOME.

Two things need to happen:
1. After setuid(), update HOME to the real user's home directory. You can look it up via libc::getpwuid_r(uid) to get the passwd entry, or simply use the dirs crate which internally calls getpwuid after the UID has changed.
2. The architectural issue is that DaemonConfig is computed before de-escalation. Either: (a) move privilege de-escalation to happen before DaemonConfig resolution in handle_run(), or (b) have run_daemon() recompute the DaemonConfig after de-escalation by calling DaemonConfig::from_env_or_default_paths() again. Option (a) is cleaner - call check_and_deescalate_privileges() at the start of handle_run() before daemon_config_from_env_or_default_paths().
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment thread src/privilege.rs Outdated
Comment on lines +226 to +227
let exe_path =
std::env::current_exe().map_err(|e| format!("failed to get current exe: {}", e))?;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Windows respawn uses current_exe() instead of current_git_ai_exe(), risking git shim fork bomb

The respawn_deescalated_windows() function at line 227 uses std::env::current_exe() to determine the binary path for respawning. However, the codebase explicitly warns against this pattern in spawn_daemon_run_detached (src/commands/daemon.rs:414-417): when the current exe is the git shim (e.g. ~/.local/bin/git), current_exe() returns the shim path, causing the respawned process to enter handle_git() instead of handle_git_ai(), which can cause a fork bomb. The respawn should use crate::utils::current_git_ai_exe() to resolve through symlinks.

Suggested change
let exe_path =
std::env::current_exe().map_err(|e| format!("failed to get current exe: {}", e))?;
let exe_path = crate::utils::current_git_ai_exe()
.map_err(|e| format!("failed to get current git-ai exe: {}", e))?;
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

svarlamov and others added 2 commits May 8, 2026 22:32
windows-sys 0.61 uses *mut c_void for HANDLE, not integer.
Use std::ptr::null_mut() for handle initialization and .is_null()
for null checks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Move privilege check before DaemonConfig resolution so paths use real
  user's HOME instead of root's
- Add lookup_home_dir() to update HOME env var after setuid
- Use current_git_ai_exe() instead of current_exe() to avoid git shim
  fork bomb on Windows respawn

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comment thread src/privilege.rs Fixed
svarlamov and others added 3 commits May 8, 2026 22:46
CodeQL flagged the uid value as cleartext logging of sensitive
information. Remove the numeric values from the message.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Windows CI runs as elevated and the new privilege check refuses to
start the daemon. Pass the env var so tests can proceed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The previous run failed on pre-existing timing-dependent tests
(concurrent burst race, graphite session timeouts) unrelated to
this PR's changes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants