fix: prevent daemon privilege mismatch broken state#1316
Conversation
Addresses the broken state where a privileged daemon (root/admin) blocks non-privileged users from connecting, starting, or killing the daemon. Four-layer solution: de-escalation, permission relaxation, actionable errors, and dead-process auto-recovery. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Nine-task plan covering: feature flag, privilege module (Unix+Windows), daemon integration, socket permission relaxation, mismatch detection with auto-recovery, and integration tests. Also updates spec to use feature flag instead of CLI --allow-root flag. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Moves windows-sys from dev-only to runtime cfg(windows) dependency with Win32_Security feature for elevation detection and linked token de-escalation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Calls check_and_deescalate_privileges() before lock acquisition in run_daemon(). This ensures the daemon always runs as the real user when invoked via sudo, and refuses to run as true root unless daemon_allow_root is enabled. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…cess Since ~/.git-ai/ is already 0700, expanding socket perms to include group allows same-user processes at different privilege levels to connect (belt-and-suspenders alongside de-escalation). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add unit tests for privilege module: - check_pid_status_dead_process: validates PID detection for dead processes - check_pid_status_alive_process: validates PID detection for current process - privilege_check_not_elevated: validates non-root privilege checking Add integration test for daemon stale lock recovery: - stale_lock_cleaned_up_on_start: ensures daemon auto-recovers when lock file points to a dead process Also fix missing daemon_allow_root field in FeatureFlags test initializers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The ~/.git-ai/ directory isn't explicitly set to 0700; access is bounded by parent directory permissions in the home dir tree. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Shorter, cleaner name. Env var is now GIT_AI_ALLOW_ROOT=true. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| // Clear SUDO_* env vars now that we've dropped privileges | ||
| unsafe { | ||
| std::env::remove_var("SUDO_UID"); | ||
| std::env::remove_var("SUDO_GID"); | ||
| std::env::remove_var("SUDO_USER"); | ||
| std::env::remove_var("SUDO_COMMAND"); | ||
| } | ||
|
|
||
| return PrivilegeAction::Continue; |
There was a problem hiding this comment.
🔴 Privilege de-escalation doesn't update HOME or recompute DaemonConfig, causing daemon failure on Linux
After setuid() drops root privileges, the HOME environment variable is not updated and the DaemonConfig (already computed before de-escalation) still points to root's home directory paths. On Linux, sudo changes HOME to /root by default. The flow is: handle_run() resolves DaemonConfig using home_dir() which reads HOME=/root (src/commands/daemon.rs:182), then passes it to run_daemon(config). Inside run_daemon, privileges are dropped (src/daemon.rs:8436-8441), but config still contains paths like /root/.git-ai/internal/daemon/. The subsequent DaemonLock::acquire(&config.lock_path) at src/daemon.rs:8444 tries to create/open a file in root's directory as the now-non-root user, which fails with permission denied. This defeats the entire purpose of the privilege de-escalation feature on Linux systems.
Prompt for agents
The privilege de-escalation in check_and_deescalate_privileges() (src/privilege.rs) drops root via setuid/setgid and clears SUDO_* env vars, but does NOT update the HOME environment variable to the real user's home directory. Since DaemonConfig is computed in handle_run() (src/commands/daemon.rs:182) BEFORE run_daemon() calls check_and_deescalate_privileges() (src/daemon.rs:8436), the config paths are already resolved using root's HOME.
Two things need to happen:
1. After setuid(), update HOME to the real user's home directory. You can look it up via libc::getpwuid_r(uid) to get the passwd entry, or simply use the dirs crate which internally calls getpwuid after the UID has changed.
2. The architectural issue is that DaemonConfig is computed before de-escalation. Either: (a) move privilege de-escalation to happen before DaemonConfig resolution in handle_run(), or (b) have run_daemon() recompute the DaemonConfig after de-escalation by calling DaemonConfig::from_env_or_default_paths() again. Option (a) is cleaner - call check_and_deescalate_privileges() at the start of handle_run() before daemon_config_from_env_or_default_paths().
Was this helpful? React with 👍 or 👎 to provide feedback.
| let exe_path = | ||
| std::env::current_exe().map_err(|e| format!("failed to get current exe: {}", e))?; |
There was a problem hiding this comment.
🔴 Windows respawn uses current_exe() instead of current_git_ai_exe(), risking git shim fork bomb
The respawn_deescalated_windows() function at line 227 uses std::env::current_exe() to determine the binary path for respawning. However, the codebase explicitly warns against this pattern in spawn_daemon_run_detached (src/commands/daemon.rs:414-417): when the current exe is the git shim (e.g. ~/.local/bin/git), current_exe() returns the shim path, causing the respawned process to enter handle_git() instead of handle_git_ai(), which can cause a fork bomb. The respawn should use crate::utils::current_git_ai_exe() to resolve through symlinks.
| let exe_path = | |
| std::env::current_exe().map_err(|e| format!("failed to get current exe: {}", e))?; | |
| let exe_path = crate::utils::current_git_ai_exe() | |
| .map_err(|e| format!("failed to get current git-ai exe: {}", e))?; |
Was this helpful? React with 👍 or 👎 to provide feedback.
windows-sys 0.61 uses *mut c_void for HANDLE, not integer. Use std::ptr::null_mut() for handle initialization and .is_null() for null checks. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Move privilege check before DaemonConfig resolution so paths use real user's HOME instead of root's - Add lookup_home_dir() to update HOME env var after setuid - Use current_git_ai_exe() instead of current_exe() to avoid git shim fork bomb on Windows respawn Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CodeQL flagged the uid value as cleartext logging of sensitive information. Remove the numeric values from the message. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Windows CI runs as elevated and the new privilege check refuses to start the daemon. Pass the env var so tests can proceed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The previous run failed on pre-existing timing-dependent tests (concurrent burst race, graphite session timeouts) unrelated to this PR's changes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
setuid/setgidwhenSUDO_UID/SUDO_GIDdetected; Windows respawns with linked (non-elevated) token viaCreateProcessWithTokenWdaemon_allow_rootfeature flag (default false) for CI/container environments that legitimately run as rootProblem
When the daemon is started as a privileged user (root/admin), non-privileged users get stuck in an unrecoverable loop: can't connect (socket perms), can't start new daemon (lock held), can't kill old daemon (insufficient privileges).
Solution
Four-layer defense:
Test plan
🤖 Generated with Claude Code