Skip to content

Add disk quota support#171

Open
parkervcp wants to merge 17 commits intopelican-dev:mainfrom
parkervcp:feature/disk_quotas
Open

Add disk quota support#171
parkervcp wants to merge 17 commits intopelican-dev:mainfrom
parkervcp:feature/disk_quotas

Conversation

@parkervcp
Copy link
Contributor

@parkervcp parkervcp commented Mar 1, 2026

This adds disk quota support for the EXT4 filesystem. May also work on XFS but that has not been tested yet.

Requires setup called out in the docs. https://pelican.dev/docs/guides/disk-quotas/about

Quotas will hard stop any file writes when they would go beyond the allocated space.

This also replaces the "syscall" package with "golang.org/x/sys/unix". Syscall was deprecated in 2014.

Summary by CodeRabbit

  • New Features

    • Filesystem quota management added for supported filesystems; servers can register and enforce quotas when enabled.
    • Server PID exposed for quota identification.
  • Configuration

    • New optional "quotas" toggle to enable/disable quota management; system validates filesystem quota support automatically.
  • Bug Fixes / Stability

    • Improved quota-aware disk checks, error handling and logging; minor wording and formatting cleanups.

This uses the built in quota management services to manage quotas for EXT4. May potentially work on XFS but has not been tested.

See the docs for more information.

https://pelican.dev/docs/guides/disk-quotas/about
builds were failing due to un-used err
This uses the built in quota management services to manage quotas for EXT4. May potentially work on XFS but has not been tested.

See the docs for more information.

https://pelican.dev/docs/guides/disk-quotas/about
builds were failing due to un-used err
This uses the built in quota management services to manage quotas for EXT4. May potentially work on XFS but has not been tested.

See the docs for more information.

https://pelican.dev/docs/guides/disk-quotas/about
builds were failing due to un-used err
update internal name for the panel id to Pid to avoid overlap with ID as the uuid

moves quota checking earlier to avoid an out of order problem

add logging to quota management

update detection of supported filesystems

update how directory xattr is handled
@parkervcp parkervcp requested a review from a team as a code owner March 1, 2026 17:52
@coderabbitai
Copy link

coderabbitai bot commented Mar 1, 2026

📝 Walkthrough

Walkthrough

Adds a new filesystem quota subsystem (EXT4/XFS) and integrates it across server lifecycle: configuration flag, startup validation, project/xattr management, quota APIs (Add/Del/Set/Get), and quota-aware disk checks. Also migrates syscall usages to golang.org/x/sys/unix and updates module versions.

Changes

Cohort / File(s) Summary
CLI & Config
cmd/root.go, config/config.go, go.mod
Adds system quotas config flag and startup validation; bumps Go/tooling deps and adds quota-related modules.
Quota Subsystem
server/filesystem/quotas/functions.go, server/filesystem/quotas/exfs.go, server/filesystem/quotas/syscall_xattr.go
New quota module: FS detection, project management, xattr/ioctl helpers, and public Add/Del/Set/Get quota APIs (EXT4/XFS-focused).
Server Integration
server/manager.go, server/server.go, server/configuration.go, server/resources.go, server/power.go
Wires quotas into init/sync/resource reporting/pre-start checks; adds Configuration.Pid, Configuration.GetPID(), and Server.PID() to support quota operations.
Router & Deletion Flow
router/router_server.go, router/router_server_files.go, router/router_transfer.go
On server deletion, attempts asynchronous quota cleanup (DelQuota); other router files only have minor formatting tweaks.
Syscall → unix Migration
internal/ufs/error.go, system/system.go, server/filesystem/stat_linux.go
Replaces syscall types/usages (Errno, Stat_t, Statfs) with golang.org/x/sys/unix equivalents across codebase.
Misc Formatting
server/activity.go, server/crash.go, server/transfer/archive.go
Whitespace/formatting and small comment/text adjustments only.

Sequence Diagram(s)

sequenceDiagram
    participant Manager as Server Manager
    participant Server as Server Instance
    participant Quotas as quotas package
    participant FS as Filesystem (EXT4/XFS)

    Manager->>Server: InitServer()
    Server->>Quotas: AddQuota(pid, uuid)
    Quotas->>FS: Detect FS / ensure supported
    FS-->>Quotas: FS supported / unsupported
    Quotas-->>Server: success / error
    Server->>Quotas: SetQuota(limit, uuid) note right of Quotas: on Sync()
    Quotas->>FS: write projid/projects, set xattr, apply quota
    FS-->>Quotas: applied / error
    Router->>Quotas: DelQuota(uuid) note right of Quotas: on deletion (async)
    Quotas-->>Router: success / error
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • QuintenQVD0

Poem

🐰
I nibbled projids late at night,
Set xattrs till the moon was bright.
EXT4 and XFS in line,
Quotas snug — the disk is fine. 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 76.19% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title "Add disk quota support" directly and accurately describes the main change in the pull request, which introduces comprehensive disk quota functionality across multiple files.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Nitpick comments (2)
server/filesystem/stat_linux.go (1)

18-21: Remove duplicated Stat_t branch in CTime().

The second type-assertion block repeats the first one exactly and is unreachable as a distinct fallback.

♻️ Proposed cleanup
 func (s *Stat) CTime() time.Time {
 	if st, ok := s.Sys().(*unix.Stat_t); ok {
 		// Do not remove these "redundant" type-casts, they are required for 32-bit builds to work.
 		return time.Unix(int64(st.Ctim.Sec), int64(st.Ctim.Nsec))
 	}
-	if st, ok := s.Sys().(*unix.Stat_t); ok {
-		// Do not remove these "redundant" type-casts, they are required for 32-bit builds to work.
-		return time.Unix(int64(st.Ctim.Sec), int64(st.Ctim.Nsec))
-	}
 	return time.Time{}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/filesystem/stat_linux.go` around lines 18 - 21, In CTime() remove the
redundant second type-assertion branch that repeats the first (the block
checking s.Sys().(*unix.Stat_t) and returning time.Unix(int64(st.Ctim.Sec),
int64(st.Ctim.Nsec))); keep the original single branch (including the existing
32-bit build comment) and any fallback behavior after it, so only one Stat_t
type-assertion and return remains in the function.
router/router_server.go (1)

287-290: Use a goroutine-local error variable for quota cleanup.

Avoid assigning to outer err from inside the goroutine.

♻️ Proposed refactor
 		if config.Get().System.Quotas.Enabled {
-			if err = quotas.DelQuota(s.Config().Uuid); err != nil {
-				log.WithFields(log.Fields{"server_id": s.Config().Pid, "error": err}).
+			if qerr := quotas.DelQuota(s.Config().Uuid); qerr != nil {
+				log.WithFields(log.Fields{"server_id": s.Config().Pid, "error": qerr}).
 					Warn("failed to remove quota during deletion process")
 			}
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router/router_server.go` around lines 287 - 290, The goroutine is assigning
to the outer variable err when calling quotas.DelQuota(s.Config().Uuid); change
that to use a goroutine-local variable (e.g., qerr :=
quotas.DelQuota(s.Config().Uuid)) and log or handle qerr inside the goroutine
(log.WithFields(..., "error": qerr).Warn(...)); if the caller needs the result,
send qerr back via a channel or sync.WaitGroup rather than reassigning the outer
err to avoid race/variable-shadowing on err.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/filesystem/quotas/exfs.go`:
- Around line 15-16: exfsProjects (type exfsProject) is shared mutable state and
must be synchronized: add a package-level sync.RWMutex (e.g. exfsMu) and use
exfsMu.RLock/RUnlock around readers and exfsMu.Lock/Unlock around writers that
mutate exfsProjects; for file writes (the code paths that write /etc/projid and
/etc/projects) take the lock, make a shallow copy/snapshot of exfsProjects while
holding the lock, then release the lock and perform the actual disk I/O using
the snapshot to avoid holding the mutex during blocking file operations; update
all places that read or modify exfsProjects to use the new mutex.
- Around line 153-167: The writeTemplate function currently returns early on
t.Execute failure and leaks the open file f; modify writeTemplate (which accepts
*template.Template, file string, data interface{}) to always close f by
deferring a close right after successful os.Create, and ensure any Close error
is propagated or merged with the existing err (e.g., if Execute fails, preserve
that error but still attempt to Close and return the first non-nil error). Use a
defer that captures and assigns the Close error into the named return variable
so f is closed on all paths.

In `@server/filesystem/quotas/functions.go`:
- Around line 35-79: In IsSupportedFS/getFSType logging, fix the typos in
filesystem labels: change the FSBTRFS case's log field value from "brtfs" to
"btrfs" and update the FSZFS branch error message (currently "btrfs is not
supported") to correctly state "zfs is not supported"; ensure the log.WithField
keys remain "fs-type" and "path" and update only the string literals in the
FSBTRFS case and the FSZFS error branch to keep messages consistent with
FSEXT4/FSXFS branches.
- Around line 107-115: SetQuota currently casts the int64 limit to uint64 which
will wrap negative values into huge quotas; before calling getProject or
fsProject.setQuota, validate that the incoming limit is non-negative and return
a sensible error (e.g., invalid argument) if limit < 0. Update SetQuota to check
the limit, reject negatives, and only call fsProject.setQuota(uint64(limit))
when the check passes; reference SetQuota, getProject and fsProject.setQuota to
locate where to add the guard.

In `@server/filesystem/quotas/syscall_xattr.go`:
- Around line 49-55: The fsXAttr struct is too small compared to the kernel ABI
(28 bytes) and is missing the CowExtSize field; update the struct definition for
fsXAttr by adding a CowExtSize uint32 field and expanding FSXPad to provide the
remaining padding so the total struct size is 28 bytes (match kernel ABI used by
FS_IOC_FSGETXATTR and FS_IOC_FSSETXATTR), ensuring the field order remains
XFlags, ExtSize, CowExtSize, NextENTs, ProjectID, and then padding to reach 28
bytes.

In `@server/power.go`:
- Around line 190-203: The quota branch wrongly treats unlimited-disk servers as
over quota because it compares used >= s.DiskSpace() when DiskSpace() <= 0;
modify the quota check in the block guarded by
config.Get().System.Quotas.Enabled so that you first handle the unlimited-disk
case (if s.DiskSpace() <= 0) by skipping the quota check and
returning/continuing normally, or after getting used from
quotas.GetQuota(s.Config().Uuid) explicitly ignore the comparison when
s.DiskSpace() <= 0; ensure you keep calls like s.PublishConsoleOutputFromDaemon
and the error logging around quotas.GetQuota unchanged except for this
additional guard so unlimited servers are not blocked from starting.

In `@server/resources.go`:
- Around line 38-44: The code currently calls quotas.GetQuota and
unconditionally calls atomic.StoreInt64(&s.resources.Disk, used), which
overwrites the disk value with 0 on errors; change the flow in the block guarded
by config.Get().System.Quotas.Enabled so that you only update s.resources.Disk
(via atomic.StoreInt64) when quotas.GetQuota returns nil error (i.e., on
success), and on error leave the existing s.resources.Disk value unchanged (or
optionally use a documented fallback) while still logging the error from
quotas.GetQuota with s.ID() for diagnostics.

---

Nitpick comments:
In `@router/router_server.go`:
- Around line 287-290: The goroutine is assigning to the outer variable err when
calling quotas.DelQuota(s.Config().Uuid); change that to use a goroutine-local
variable (e.g., qerr := quotas.DelQuota(s.Config().Uuid)) and log or handle qerr
inside the goroutine (log.WithFields(..., "error": qerr).Warn(...)); if the
caller needs the result, send qerr back via a channel or sync.WaitGroup rather
than reassigning the outer err to avoid race/variable-shadowing on err.

In `@server/filesystem/stat_linux.go`:
- Around line 18-21: In CTime() remove the redundant second type-assertion
branch that repeats the first (the block checking s.Sys().(*unix.Stat_t) and
returning time.Unix(int64(st.Ctim.Sec), int64(st.Ctim.Nsec))); keep the original
single branch (including the existing 32-bit build comment) and any fallback
behavior after it, so only one Stat_t type-assertion and return remains in the
function.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ecf678f and f977c45.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (20)
  • cmd/root.go
  • config/config.go
  • go.mod
  • internal/ufs/error.go
  • router/router_server.go
  • router/router_server_files.go
  • router/router_transfer.go
  • server/activity.go
  • server/configuration.go
  • server/crash.go
  • server/filesystem/quotas/exfs.go
  • server/filesystem/quotas/functions.go
  • server/filesystem/quotas/syscall_xattr.go
  • server/filesystem/stat_linux.go
  • server/manager.go
  • server/power.go
  • server/resources.go
  • server/server.go
  • server/transfer/archive.go
  • system/system.go
💤 Files with no reviewable changes (1)
  • server/activity.go
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Analyze (go)
  • GitHub Check: Build and Test (ubuntu-22.04, 1.25.1, linux, amd64)
  • GitHub Check: Build and Test (ubuntu-22.04, 1.24.7, linux, amd64)
🧰 Additional context used
🧬 Code graph analysis (9)
server/filesystem/quotas/functions.go (1)
config/config.go (1)
  • Get (445-453)
server/server.go (3)
config/config.go (1)
  • Get (445-453)
system/system.go (1)
  • System (61-68)
server/filesystem/quotas/functions.go (1)
  • SetQuota (107-117)
router/router_server.go (2)
config/config.go (1)
  • Get (445-453)
server/filesystem/quotas/functions.go (1)
  • DelQuota (95-104)
cmd/root.go (3)
config/config.go (1)
  • Get (445-453)
system/system.go (1)
  • System (61-68)
server/filesystem/quotas/functions.go (1)
  • IsSupportedFS (55-82)
server/power.go (2)
config/config.go (1)
  • Get (445-453)
server/filesystem/quotas/functions.go (1)
  • GetQuota (120-129)
server/manager.go (2)
config/config.go (1)
  • Get (445-453)
server/filesystem/quotas/functions.go (1)
  • AddQuota (85-92)
server/filesystem/quotas/syscall_xattr.go (1)
internal/ufs/error.go (1)
  • NewSyscallError (54-56)
server/resources.go (5)
config/config.go (1)
  • Get (445-453)
system/system.go (1)
  • System (61-68)
server/filesystem/quotas/functions.go (1)
  • GetQuota (120-129)
server/filesystem/errors.go (1)
  • Error (25-37)
server/filesystem/filesystem.go (1)
  • Filesystem (24-34)
server/filesystem/quotas/exfs.go (1)
server/filesystem/quotas/syscall_xattr.go (1)
  • FS_XFLAG_PROJINHERIT (24-24)
🔇 Additional comments (12)
server/crash.go (1)

102-102: No action needed — formatting-only change.

This hunk does not introduce behavioral changes.

router/router_server_files.go (1)

243-243: No action needed — formatting-only change.

No semantic delta in this hunk.

router/router_transfer.go (1)

149-151: No action needed — formatting/alignment-only changes.

No functional behavior changed in these segments.

Also applies to: 210-210, 219-219, 222-222, 230-230, 238-238, 243-243, 245-245

server/transfer/archive.go (1)

58-77: Backup selection flow looks good.

The explicit set-based filtering and early return when no matching backup files exist improves clarity and avoids unnecessary work.

server/configuration.go (1)

49-52: PID wiring is clean and thread-safe.

Adding the Pid field and exposing it via a locked accessor is consistent with the configuration access pattern in this file.

Also applies to: 115-119

config/config.go (1)

217-221: Quotas config block is well-scoped.

The new quotas.enabled setting is explicit and defaults to disabled, which is a safe rollout posture.

system/system.go (1)

188-190: Statfs migration preserves behavior.

The updated calls keep the existing disk-to-mountpoint matching logic intact.

Also applies to: 194-195

server/server.go (1)

273-279: Quota-aware disk limit switching in Sync() looks good.

The enabled/disabled branch keeps behavior explicit and preserves the non-quota fallback.

cmd/root.go (1)

171-180: Good startup preflight for quota capability.

Failing fast before manager initialization is the right behavior here.

server/resources.go (1)

44-47: Good fallback path for non-quota environments.

Preserving the cached usage path when quotas are disabled keeps previous runtime behavior stable.

server/manager.go (2)

204-209: Quota provisioning placement looks good.

Conditionally calling quotas.AddQuota during init and failing fast on error is a solid integration point.


258-277: Closure capture is safe — Go 1.24 handles loop variable semantics correctly.

The codebase targets Go 1.24.0, which uses the Go 1.22+ range semantics that re-creates loop variables on each iteration. The data variable captured in the worker closure at line 259 is safe without explicit rebinding.

Comment on lines +15 to +16
var exfsProjects []exfsProject

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Protect exfsProjects and project-file writes with synchronization.

exfsProjects is shared mutable state across add/remove/get/write flows. With concurrent server operations, this can race and produce inconsistent in-memory state and malformed /etc/projid//etc/projects writes.

🔒 Proposed fix (mutex + snapshot for writes)
 import (
 	"fmt"
 	"html/template"
 	"os"
+	"sync"
 	"strings"
@@
-var exfsProjects []exfsProject
+var (
+	exfsProjects   []exfsProject
+	exfsProjectsMu sync.RWMutex
+)
@@
 func (q exfsProject) addProject() (err error) {
@@
+	exfsProjectsMu.Lock()
 	exfsProjects = append(exfsProjects, q)
+	exfsProjectsMu.Unlock()
@@
 func (q exfsProject) removeProject() (err error) {
+	exfsProjectsMu.Lock()
+	defer exfsProjectsMu.Unlock()
 	for pos, project := range exfsProjects {
 		if project.Name == q.Name {
 			exfsProjects = append(exfsProjects[:pos], exfsProjects[pos+1:]...)
+			break
 		}
 	}
@@
 func getProject(serverUUID string) (serverProject exfsProject, err error) {
+	exfsProjectsMu.RLock()
+	defer exfsProjectsMu.RUnlock()
 	for _, project := range exfsProjects {
@@
 func writeEXFSProjects() (err error) {
+	exfsProjectsMu.RLock()
+	projects := append([]exfsProject(nil), exfsProjects...)
+	exfsProjectsMu.RUnlock()
@@
-	if err = writeTemplate(idtmpl, projidFile, exfsProjects); err != nil {
+	if err = writeTemplate(idtmpl, projidFile, projects); err != nil {
@@
-	if err = writeTemplate(projtmpl, projectFile, exfsProjects); err != nil {
+	if err = writeTemplate(projtmpl, projectFile, projects); err != nil {

Also applies to: 95-99, 110-114, 120-127, 130-151

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/filesystem/quotas/exfs.go` around lines 15 - 16, exfsProjects (type
exfsProject) is shared mutable state and must be synchronized: add a
package-level sync.RWMutex (e.g. exfsMu) and use exfsMu.RLock/RUnlock around
readers and exfsMu.Lock/Unlock around writers that mutate exfsProjects; for file
writes (the code paths that write /etc/projid and /etc/projects) take the lock,
make a shallow copy/snapshot of exfsProjects while holding the lock, then
release the lock and perform the actual disk I/O using the snapshot to avoid
holding the mutex during blocking file operations; update all places that read
or modify exfsProjects to use the new mutex.

Comment on lines +153 to +167
func writeTemplate(t *template.Template, file string, data interface{}) (err error) {
f, err := os.Create(file)
if err != nil {
return err
}

err = t.Execute(f, data)
if err != nil {
return err
}

err = f.Close()
if err != nil {
return err
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Always close the file in writeTemplate, including execute-failure paths.

If t.Execute fails, the current code returns without closing f.

🛠️ Proposed fix
 func writeTemplate(t *template.Template, file string, data interface{}) (err error) {
 	f, err := os.Create(file)
 	if err != nil {
 		return err
 	}
+	defer func() {
+		if cerr := f.Close(); err == nil && cerr != nil {
+			err = cerr
+		}
+	}()
 
-	err = t.Execute(f, data)
-	if err != nil {
+	if err = t.Execute(f, data); err != nil {
 		return err
 	}
-
-	err = f.Close()
-	if err != nil {
-		return err
-	}
 
 	return nil
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func writeTemplate(t *template.Template, file string, data interface{}) (err error) {
f, err := os.Create(file)
if err != nil {
return err
}
err = t.Execute(f, data)
if err != nil {
return err
}
err = f.Close()
if err != nil {
return err
}
func writeTemplate(t *template.Template, file string, data interface{}) (err error) {
f, err := os.Create(file)
if err != nil {
return err
}
defer func() {
if cerr := f.Close(); err == nil && cerr != nil {
err = cerr
}
}()
if err = t.Execute(f, data); err != nil {
return err
}
return nil
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/filesystem/quotas/exfs.go` around lines 153 - 167, The writeTemplate
function currently returns early on t.Execute failure and leaks the open file f;
modify writeTemplate (which accepts *template.Template, file string, data
interface{}) to always close f by deferring a close right after successful
os.Create, and ensure any Close error is propagated or merged with the existing
err (e.g., if Execute fails, preserve that error but still attempt to Close and
return the first non-nil error). Use a defer that captures and assigns the Close
error into the named return variable so f is closed on all paths.

Comment on lines +107 to +115
func SetQuota(limit int64, serverUUID string) (err error) {
log.WithField("server", serverUUID).Debug("setting quota")
if fsType == FSEXT4 || fsType == FSXFS {
fsProject, err := getProject(serverUUID)
if err != nil {
return err
}
return fsProject.setQuota(uint64(limit))
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n server/filesystem/quotas/functions.go | head -130 | tail -30

Repository: pelican-dev/wings

Length of output: 958


🏁 Script executed:

# Search for SetQuota function and its callers
rg -nB3 -A10 'func SetQuota' --type=go

Repository: pelican-dev/wings

Length of output: 1001


🏁 Script executed:

# Search for calls to SetQuota
rg 'SetQuota\s*\(' --type=go -B2 -A2

Repository: pelican-dev/wings

Length of output: 693


🏁 Script executed:

# Check for disk space limit validation in related files
fd . server/filesystem --type f -name '*.go' | head -20

Repository: pelican-dev/wings

Length of output: 291


🏁 Script executed:

# Search for DiskSpace() function definition
rg -nB2 -A10 'func.*DiskSpace\s*\(' --type=go

Repository: pelican-dev/wings

Length of output: 747


🏁 Script executed:

# Search for DiskSpace field or property in server struct
rg -nB5 -A5 'DiskSpace' --type=go | head -60

Repository: pelican-dev/wings

Length of output: 3617


🏁 Script executed:

# Check if there's validation in server initialization or update
rg -nB3 -A3 'diskSpace|DiskSpace' server/server.go | head -40

Repository: pelican-dev/wings

Length of output: 396


🏁 Script executed:

# Find Build struct definition
rg -nB3 -A15 'type Build struct' --type=go

Repository: pelican-dev/wings

Length of output: 43


🏁 Script executed:

# Search for DiskSpace field definition in config
rg -nB3 -A3 'DiskSpace\s+' --type=go | grep -v 'func\|test\|//'

Repository: pelican-dev/wings

Length of output: 1756


🏁 Script executed:

# Check configuration parsing/validation
rg -nB2 -A10 'type.*Config.*struct|Build.*DiskSpace' --type=go | head -80

Repository: pelican-dev/wings

Length of output: 4711


🏁 Script executed:

# Find Build struct definition in environment package
rg -nB5 -A20 'type Build struct' server/environment --type=go

Repository: pelican-dev/wings

Length of output: 119


🏁 Script executed:

# Check for validation in config unmarshaling
rg -nB3 -A10 'UnmarshalJSON.*Build|Build.*Unmarshal' --type=go

Repository: pelican-dev/wings

Length of output: 43


🏁 Script executed:

# Check if DiskSpace has validation constraints in documentation or code
rg -nB5 -A5 'DiskSpace.*int64|disk.*space.*valid' --type=go -i

Repository: pelican-dev/wings

Length of output: 1994


🏁 Script executed:

# Find Build struct definition - search more broadly
rg -nB5 -A20 'type Build struct' --type=go

Repository: pelican-dev/wings

Length of output: 43


🏁 Script executed:

# Look for where Build struct is populated/unmarshaled
rg -nB3 -A10 'Build.*DiskSpace|\.DiskSpace\s*=' --type=go

Repository: pelican-dev/wings

Length of output: 756


🏁 Script executed:

# Check remote types that define server configuration
rg -nB5 -A25 'ServerConfigurationResponse|ProcessConfiguration' remote/types.go

Repository: pelican-dev/wings

Length of output: 4047


🏁 Script executed:

# Find where Build is populated - check server configuration parsing
rg -nB10 -A10 '\.Build\s*=' server/configuration.go | head -50

Repository: pelican-dev/wings

Length of output: 43


🏁 Script executed:

# Check the full Configuration struct definition to understand Build field
rg -nA100 'type Configuration struct' server/configuration.go | head -120

Repository: pelican-dev/wings

Length of output: 2824


🏁 Script executed:

# Look for any validation functions or constraints
rg -n 'func.*valid|Valid.*DiskSpace|negative|>= 0' --type=go server/ | head -20

Repository: pelican-dev/wings

Length of output: 232


🏁 Script executed:

# Find environment.Limits struct definition
rg -nB5 -A20 'type Limits struct' --type=go

Repository: pelican-dev/wings

Length of output: 1828


🏁 Script executed:

# Check if there's any configuration validation when loading from remote
rg -nB5 -A15 'ServerConfigurationResponse.*json|UnmarshalJSON' remote/types.go

Repository: pelican-dev/wings

Length of output: 873


🏁 Script executed:

# Look for any validation or bounds checking on DiskSpace in whole codebase
rg -n 'DiskSpace.*<|DiskSpace.*negative|if.*DiskSpace' --type=go

Repository: pelican-dev/wings

Length of output: 412


🏁 Script executed:

# Check the context around line 203 in power.go where DiskSpace <= 0 is checked
rg -nB10 -A10 'if s.DiskSpace.*<= 0' server/power.go

Repository: pelican-dev/wings

Length of output: 840


🏁 Script executed:

# Check if there are other guards or assertions on DiskSpace before it's used
rg -nB5 -A5 'DiskSpace.*<|DiskSpace.*negative' server/

Repository: pelican-dev/wings

Length of output: 715


Guard against negative limits before casting to uint64.

uint64(limit) will wrap negative values into huge quotas. While DiskSpace() is typically positive, there is no validation in the Limits struct or configuration loading to prevent negative values, and SetQuota receives the raw int64 without guards.

✅ Proposed fix
 func SetQuota(limit int64, serverUUID string) (err error) {
 	log.WithField("server", serverUUID).Debug("setting quota")
+	if limit < 0 {
+		return errors.New("quota limit must be non-negative")
+	}
 	if fsType == FSEXT4 || fsType == FSXFS {
 		fsProject, err := getProject(serverUUID)
 		if err != nil {
 			return err
 		}
 		return fsProject.setQuota(uint64(limit))
 	}
 	return
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/filesystem/quotas/functions.go` around lines 107 - 115, SetQuota
currently casts the int64 limit to uint64 which will wrap negative values into
huge quotas; before calling getProject or fsProject.setQuota, validate that the
incoming limit is non-negative and return a sensible error (e.g., invalid
argument) if limit < 0. Update SetQuota to check the limit, reject negatives,
and only call fsProject.setQuota(uint64(limit)) when the check passes; reference
SetQuota, getProject and fsProject.setQuota to locate where to add the guard.

Comment on lines +49 to +55
type fsXAttr struct {
XFlags uint32
ExtSize uint32
NextENTs uint32
ProjectID uint32
FSXPad byte
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cat -n server/filesystem/quotas/syscall_xattr.go

Repository: pelican-dev/wings

Length of output: 3617


🏁 Script executed:

rg -i "FS_IOC_FSGETXATTR|FS_IOC_FSSETXATTR" -A 2 -B 2

Repository: pelican-dev/wings

Length of output: 2063


🏁 Script executed:

# Check the ioctl constant values and how they're defined
rg "0x[0-9a-fA-F]{1,8}" server/filesystem/quotas/syscall_xattr.go

Repository: pelican-dev/wings

Length of output: 1563


🌐 Web query:

Linux kernel struct fsxattr XFS_IOC_FSGETXATTR ioctl payload size

💡 Result:

XFS_IOC_FSGETXATTR (and the generic FS_IOC_FSGETXATTR) uses a userspace buffer of type struct fsxattr as its ioctl payload (_IOR('X', 31, struct fsxattr)).[1][2]

From the UAPI definition:

  • struct fsxattr = 5 × __u32 fields (20 bytes) + fsx_pad[8] (8 bytes) = 28 bytes total (sizeof(struct fsxattr) == 28).[1]

Fix fsXAttr struct to match kernel ABI (28 bytes).

The struct is missing the CowExtSize field and FSXPad is undersized. The ioctl constants encode a 28-byte payload size, but the current struct is only ~20 bytes. This causes FS_IOC_FSGETXATTR and FS_IOC_FSSETXATTR to read/write beyond the allocated struct, corrupting memory.

Fix
 type fsXAttr struct {
 	XFlags    uint32
 	ExtSize   uint32
 	NextENTs  uint32
 	ProjectID uint32
+	CowExtSize uint32
-	FSXPad    byte
+	FSXPad    [8]byte
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/filesystem/quotas/syscall_xattr.go` around lines 49 - 55, The fsXAttr
struct is too small compared to the kernel ABI (28 bytes) and is missing the
CowExtSize field; update the struct definition for fsXAttr by adding a
CowExtSize uint32 field and expanding FSXPad to provide the remaining padding so
the total struct size is 28 bytes (match kernel ABI used by FS_IOC_FSGETXATTR
and FS_IOC_FSSETXATTR), ensuring the field order remains XFlags, ExtSize,
CowExtSize, NextENTs, ProjectID, and then padding to reach 28 bytes.

Comment on lines +190 to +203
if config.Get().System.Quotas.Enabled {
s.PublishConsoleOutputFromDaemon("checking disk space via quota, just a second")
// get used disk space
used, err := quotas.GetQuota(s.Config().Uuid)
if err != nil {
log.WithField("error", err).Error("failed to get quota for server")
return err
}
// used space is greater than the configured disk space
if used >= s.DiskSpace() {
return errors.New("currently used disk space is more than allocated")
}
} else {
if s.DiskSpace() <= 0 {
Copy link

@coderabbitai coderabbitai bot Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Unlimited-disk servers can be blocked from starting when quotas are enabled.

The unlimited-disk guard is bypassed in the quota path. For s.DiskSpace() <= 0, used >= s.DiskSpace() will usually be true, causing incorrect startup failure.

Suggested fix
-	if config.Get().System.Quotas.Enabled {
+	if s.DiskSpace() <= 0 {
+		s.Filesystem().HasSpaceAvailable(true)
+	} else if config.Get().System.Quotas.Enabled {
 		s.PublishConsoleOutputFromDaemon("checking disk space via quota, just a second")
 		// get used disk space
 		used, err := quotas.GetQuota(s.Config().Uuid)
 		if err != nil {
 			log.WithField("error", err).Error("failed to get quota for server")
 			return err
 		}
 		// used space is greater than the configured disk space
 		if used >= s.DiskSpace() {
 			return errors.New("currently used disk space is more than allocated")
 		}
 	} else {
-		if s.DiskSpace() <= 0 {
-			s.Filesystem().HasSpaceAvailable(true)
-		} else {
-			s.PublishConsoleOutputFromDaemon("checking server disk space usage, this could take a few seconds...")
-			if err := s.Filesystem().HasSpaceErr(false); err != nil {
-				return err
-			}
-		}
+		s.PublishConsoleOutputFromDaemon("checking server disk space usage, this could take a few seconds...")
+		if err := s.Filesystem().HasSpaceErr(false); err != nil {
+			return err
+		}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/power.go` around lines 190 - 203, The quota branch wrongly treats
unlimited-disk servers as over quota because it compares used >= s.DiskSpace()
when DiskSpace() <= 0; modify the quota check in the block guarded by
config.Get().System.Quotas.Enabled so that you first handle the unlimited-disk
case (if s.DiskSpace() <= 0) by skipping the quota check and
returning/continuing normally, or after getting used from
quotas.GetQuota(s.Config().Uuid) explicitly ignore the comparison when
s.DiskSpace() <= 0; ensure you keep calls like s.PublishConsoleOutputFromDaemon
and the error logging around quotas.GetQuota unchanged except for this
additional guard so unlimited servers are not blocked from starting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignore this. It works as expected.

The disk quota is managed by the filesystem.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@parkervcp, understood — thank you for the clarification! If the filesystem quota management ensures this scenario is handled correctly, then the current implementation is fine.


✏️ Learnings added
Learnt from: parkervcp
Repo: pelican-dev/wings PR: 171
File: server/power.go:190-203
Timestamp: 2026-03-02T13:53:08.995Z
Learning: In `server/power.go`, when quotas are enabled via `config.Get().System.Quotas.Enabled`, the disk space check using `used >= s.DiskSpace()` does not need a special guard for unlimited-disk servers (DiskSpace() <= 0) because the disk quota is managed by the filesystem itself, which handles such cases appropriately.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Comment on lines +38 to +44
if config.Get().System.Quotas.Enabled {
used, err := quotas.GetQuota(s.ID())
if err != nil {
log.WithFields(log.Fields{"server-uuid": s.ID(), "error": err.Error()}).Error("there was an issue getting the used disk space")
}
atomic.StoreInt64(&s.resources.Disk, used)
} else {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don’t overwrite Disk with zero when quota lookup fails.

On quota read failure, used remains 0 and is still stored, which can falsely report empty disk usage. Keep the previous value or use a fallback only on success.

Suggested fix
 	if config.Get().System.Quotas.Enabled {
 		used, err := quotas.GetQuota(s.ID())
 		if err != nil {
 			log.WithFields(log.Fields{"server-uuid": s.ID(), "error": err.Error()}).Error("there was an issue getting the used disk space")
-		}
-		atomic.StoreInt64(&s.resources.Disk, used)
+		} else {
+			atomic.StoreInt64(&s.resources.Disk, used)
+		}
 	} else {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if config.Get().System.Quotas.Enabled {
used, err := quotas.GetQuota(s.ID())
if err != nil {
log.WithFields(log.Fields{"server-uuid": s.ID(), "error": err.Error()}).Error("there was an issue getting the used disk space")
}
atomic.StoreInt64(&s.resources.Disk, used)
} else {
if config.Get().System.Quotas.Enabled {
used, err := quotas.GetQuota(s.ID())
if err != nil {
log.WithFields(log.Fields{"server-uuid": s.ID(), "error": err.Error()}).Error("there was an issue getting the used disk space")
} else {
atomic.StoreInt64(&s.resources.Disk, used)
}
} else {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/resources.go` around lines 38 - 44, The code currently calls
quotas.GetQuota and unconditionally calls atomic.StoreInt64(&s.resources.Disk,
used), which overwrites the disk value with 0 on errors; change the flow in the
block guarded by config.Get().System.Quotas.Enabled so that you only update
s.resources.Disk (via atomic.StoreInt64) when quotas.GetQuota returns nil error
(i.e., on success), and on error leave the existing s.resources.Disk value
unchanged (or optionally use a documented fallback) while still logging the
error from quotas.GetQuota with s.ID() for diagnostics.

    adds coxextsize and limits fsxpad to 8bytes

corrects some errors on filesystem type detection

handles when limits are lower than 0. Treats them as unlimited.

now locks exfs projects when working with them to avoid any issues
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

♻️ Duplicate comments (1)
server/filesystem/quotas/exfs.go (1)

170-177: ⚠️ Potential issue | 🟠 Major

Propagate Close errors in writeTemplate.

Line 170 closes the file via defer but ignores close failures, which can hide partial write/flush errors for /etc/projid and /etc/projects.

Suggested patch
 func writeTemplate(t *template.Template, file string, data interface{}) (err error) {
 	f, err := os.Create(file)
 	if err != nil {
 		return err
 	}
 
-	defer f.Close()
+	defer func() {
+		if cerr := f.Close(); err == nil && cerr != nil {
+			err = cerr
+		}
+	}()
 
 	err = t.Execute(f, data)
 	if err != nil {
 		return err
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/filesystem/quotas/exfs.go` around lines 170 - 177, writeTemplate
currently defers f.Close() and ignores its error which can hide write/flush
failures for /etc/projid and /etc/projects; modify writeTemplate so the deferred
close captures its error and is propagated: capture the error returned by
t.Execute into a local variable (err), then in a deferred closure call cerr :=
f.Close() and if err == nil && cerr != nil set/return cerr, else if err != nil
and cerr != nil wrap both (e.g., using fmt.Errorf("%v; close error: %w", err,
cerr)) so the function returns the most relevant error; reference the
writeTemplate function, the file variable f, and the t.Execute call when making
the change.
🧹 Nitpick comments (1)
server/filesystem/quotas/syscall_xattr.go (1)

56-65: Add a compile-time ABI size check for fsXAttr.

This struct is ioctl ABI-sensitive; a compile-time size assertion prevents accidental regressions.

💡 Proposed hardening
 type fsXAttr struct {
 	XFlags     uint32
 	ExtSize    uint32
 	NextENTs   uint32
 	ProjectID  uint32
 	CowExtSize uint32
 	FSXPad     [8]byte
 }
+
+const fsXAttrSize = 28
+
+var (
+	_ [fsXAttrSize - int(unsafe.Sizeof(fsXAttr{}))]struct{}
+	_ [int(unsafe.Sizeof(fsXAttr{})) - fsXAttrSize]struct{}
+)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/filesystem/quotas/syscall_xattr.go` around lines 56 - 65, Add a
compile-time ABI size check to ensure fsXAttr remains the expected size (28
bytes): add a build-time assertion using unsafe.Sizeof(fsXAttr{}) compared
against 28 (for example via a const/var zero-length array or a type alias trick)
so the compiler will fail if fsXAttr's size changes; reference the fsXAttr type
in the assertion to locate the code.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@go.mod`:
- Line 3: The go.mod declares "go 1.25.0" but CI and Docker files are pinned to
1.24.7 (and push matrix contains outdated/wrong entries); update the toolchain
refs to match go.mod by changing the release workflow's go-version to 1.25.0 (or
1.25.1), update the Dockerfile FROM to golang:1.25.0-alpine (or 1.25.1-alpine),
and fix the push workflow matrix to ["1.25.0","1.25.1"] (or a single consistent
1.25.x value) so that go.mod's "go 1.25.0" aligns with CI and image builds.

In `@server/filesystem/quotas/exfs.go`:
- Around line 114-124: The deletion routine takes exfs.lock, mutates
exfs.projects, then calls writeEXFSProjects() while still holding exfs.lock,
which deadlocks because writeEXFSProjects also tries to lock the same mutex; fix
by releasing the lock before calling writeEXFSProjects (unlock exfs.lock right
after the append/remove and before calling writeEXFSProjects), or alternatively
add an internal unexported helper (e.g., writeEXFSProjectsLocked) that assumes
the lock is already held and have writeEXFSProjects call that under its own
lock—use exfs.lock, exfs.projects, writeEXFSProjects and ensure no path attempts
to re-acquire the same mutex.
- Around line 98-103: addProject currently appends q to exfs.projects while
holding exfs.lock, then releases the lock and calls writeEXFSProjects; if
writeEXFSProjects fails the in-memory state remains inconsistent with /etc. Fix
by performing the persistence first or by rolling back the in-memory change on
write failure: inside addProject (the function that manipulates exfs.projects),
keep the lock while appending, call writeEXFSProjects while still holding or
immediately after appending but detect an error and if writeEXFSProjects returns
an error remove the appended entry from exfs.projects (or restore the previous
slice) before returning, ensuring exfs.lock protects both mutation and rollback
so memory and on-disk state stay consistent.

In `@server/filesystem/quotas/functions.go`:
- Around line 109-112: The code currently resets limit to 0 before logging so
requested_limit is always 0; preserve the original negative value by capturing
it into a temporary variable (e.g., originalLimit) or log the existing limit
before assigning limit = 0, then call
log.WithFields(log.Fields{"requested_limit": originalLimit}).Error(...) so the
log shows the actual negative input; update the block around the limit check
(the variable limit and the log.WithFields call) to use the preserved value.
- Around line 65-68: The call to
fsquota.ProjectQuotasSupported(config.Get().System.Data) returns an error that
is immediately returned with no logging; update the error path so it logs the
error with context before returning. Specifically, in the block handling the
result of fsquota.ProjectQuotasSupported (the supported, err = ...; if err !=
nil { ... } section) log a descriptive message including the data path
(config.Get().System.Data) and the returned err (for example via the file's
existing logger/processLogger) and then return the error as before.

In `@server/filesystem/quotas/syscall_xattr.go`:
- Around line 89-97: The setXAttr function is casting projectID (int) to uint32
without validation which can wrap negative or out-of-range values; update
setXAttr to validate projectID before assigning to fxattr.ProjectID (e.g., check
projectID >= 0 and <= math.MaxUint32) and return a descriptive error when
invalid so fxattr.ProjectID is only set with a safe uint32 value; reference
symbols: function setXAttr, parameter projectID, and field fxattr.ProjectID.

---

Duplicate comments:
In `@server/filesystem/quotas/exfs.go`:
- Around line 170-177: writeTemplate currently defers f.Close() and ignores its
error which can hide write/flush failures for /etc/projid and /etc/projects;
modify writeTemplate so the deferred close captures its error and is propagated:
capture the error returned by t.Execute into a local variable (err), then in a
deferred closure call cerr := f.Close() and if err == nil && cerr != nil
set/return cerr, else if err != nil and cerr != nil wrap both (e.g., using
fmt.Errorf("%v; close error: %w", err, cerr)) so the function returns the most
relevant error; reference the writeTemplate function, the file variable f, and
the t.Execute call when making the change.

---

Nitpick comments:
In `@server/filesystem/quotas/syscall_xattr.go`:
- Around line 56-65: Add a compile-time ABI size check to ensure fsXAttr remains
the expected size (28 bytes): add a build-time assertion using
unsafe.Sizeof(fsXAttr{}) compared against 28 (for example via a const/var
zero-length array or a type alias trick) so the compiler will fail if fsXAttr's
size changes; reference the fsXAttr type in the assertion to locate the code.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f977c45 and f55b2a8.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (4)
  • go.mod
  • server/filesystem/quotas/exfs.go
  • server/filesystem/quotas/functions.go
  • server/filesystem/quotas/syscall_xattr.go
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Build and Test (ubuntu-22.04, 1.24.7, linux, amd64)
  • GitHub Check: Analyze (go)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: parkervcp
Repo: pelican-dev/wings PR: 171
File: server/power.go:190-203
Timestamp: 2026-03-02T13:53:08.995Z
Learning: In `server/power.go`, when quotas are enabled via `config.Get().System.Quotas.Enabled`, the disk space check using `used >= s.DiskSpace()` does not need a special guard for unlimited-disk servers (DiskSpace() <= 0) because the disk quota is managed by the filesystem itself, which handles such cases appropriately.
📚 Learning: 2026-03-02T13:53:08.995Z
Learnt from: parkervcp
Repo: pelican-dev/wings PR: 171
File: server/power.go:190-203
Timestamp: 2026-03-02T13:53:08.995Z
Learning: In the server package, when quotas are enabled via config.Get().System.Quotas.Enabled, the disk space check using used >= s.DiskSpace() does not require a special guard for unlimited-disk scenarios (DiskSpace() <= 0). The filesystem handles such cases, so the existing check is sufficient. Apply this pattern to similar quota-related disk checks in the server package and ensure tests/docs reflect that unlimited-disk behavior is governed by the filesystem, not by an extra guard in code.

Applied to files:

  • server/filesystem/quotas/exfs.go
  • server/filesystem/quotas/syscall_xattr.go
  • server/filesystem/quotas/functions.go
🔇 Additional comments (2)
server/filesystem/quotas/syscall_xattr.go (2)

58-65: fsXAttr layout update looks correct.

Including CowExtSize and [8]byte padding aligns with the intended struct fsxattr shape.


42-45: The ioctl request codes are actually architecture-independent. FS_IOC_FSGETXATTR and FS_IOC_FSSETXATTR encode to identical values (0x801c581f and 0x401c5820) across amd64, arm64, arm, and ppc64le because the underlying struct fsxattr is composed of fixed-width types (__u32 fields with explicit padding), not variable-width types like long. Unlike FS_IOC_GETFLAGS/SETFLAGS (which use long and thus differ between 32-bit and 64-bit ABIs), these constants are stable across all architectures. The hard-coded values are correct and require no changes.

module github.com/pelican-dev/wings

go 1.24.0
go 1.25.0
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "go.mod directive:"
rg -n '^go\s+' go.mod

echo
echo "GitHub Actions Go version pins:"
if [ -d .github/workflows ]; then
  fd -HI '.ya?ml$' .github/workflows | xargs -r rg -n 'go-version|setup-go|golang:' || echo "No matches found"
else
  echo "No .github/workflows directory found"
fi

echo
echo "Dockerfile Go base images:"
fd -HI 'Dockerfile*' | xargs -r rg -n 'FROM\s+golang:' || echo "No matches found"

echo
echo "Other explicit Go version pins:"
fd -HI 'Makefile*' | xargs -r rg -n 'GO_VERSION|GOLANG_VERSION|go1\.[0-9]+' || echo "No matches found"

Repository: pelican-dev/wings

Length of output: 508


🏁 Script executed:

rg -B 10 'matrix.go' .github/workflows/push.yaml | head -30

Repository: pelican-dev/wings

Length of output: 934


Fix Go toolchain version mismatches—go.mod declares 1.25.0 but CI/Docker pinned to 1.24.7.

Current state:

  • go.mod line 3: go 1.25.0
  • .github/workflows/release.yaml line 23: go-version: "1.24.7"
  • Dockerfile line 2: FROM golang:1.24.7-alpine
  • .github/workflows/push.yaml matrix: ["1.24.7", "1.25.1"] (includes stale 1.24.7 and wrong 1.25.1) ❌

Required fixes:

  • Update release.yaml to go-version: "1.25.0" or "1.25.1"
  • Update Dockerfile to FROM golang:1.25.0-alpine or 1.25.1-alpine
  • Update push.yaml matrix to ["1.25.0", "1.25.1"] (or specify one version if only 1.25.0 is supported)

Release builds and Docker image builds will fail with mismatched module and toolchain versions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go.mod` at line 3, The go.mod declares "go 1.25.0" but CI and Docker files
are pinned to 1.24.7 (and push matrix contains outdated/wrong entries); update
the toolchain refs to match go.mod by changing the release workflow's go-version
to 1.25.0 (or 1.25.1), update the Dockerfile FROM to golang:1.25.0-alpine (or
1.25.1-alpine), and fix the push workflow matrix to ["1.25.0","1.25.1"] (or a
single consistent 1.25.x value) so that go.mod's "go 1.25.0" aligns with CI and
image builds.

Comment on lines +98 to +103
exfs.lock.Lock()
exfs.projects = append(exfs.projects, q)
exfs.lock.Unlock()
if err = writeEXFSProjects(); err != nil {
return
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

addProject can leave memory and /etc state inconsistent on write failure.

Line 99 appends to exfs.projects before persistence. If Line 101 fails, the method returns an error but the in-memory project remains, which can break subsequent lookups/operations.

Suggested patch (rollback on persistence failure)
 	exfs.lock.Lock()
 	exfs.projects = append(exfs.projects, q)
 	exfs.lock.Unlock()
 	if err = writeEXFSProjects(); err != nil {
+		exfs.lock.Lock()
+		for i := len(exfs.projects) - 1; i >= 0; i-- {
+			if exfs.projects[i].Name == q.Name {
+				exfs.projects = append(exfs.projects[:i], exfs.projects[i+1:]...)
+				break
+			}
+		}
+		exfs.lock.Unlock()
 		return
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
exfs.lock.Lock()
exfs.projects = append(exfs.projects, q)
exfs.lock.Unlock()
if err = writeEXFSProjects(); err != nil {
return
}
exfs.lock.Lock()
exfs.projects = append(exfs.projects, q)
exfs.lock.Unlock()
if err = writeEXFSProjects(); err != nil {
exfs.lock.Lock()
for i := len(exfs.projects) - 1; i >= 0; i-- {
if exfs.projects[i].Name == q.Name {
exfs.projects = append(exfs.projects[:i], exfs.projects[i+1:]...)
break
}
}
exfs.lock.Unlock()
return
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/filesystem/quotas/exfs.go` around lines 98 - 103, addProject currently
appends q to exfs.projects while holding exfs.lock, then releases the lock and
calls writeEXFSProjects; if writeEXFSProjects fails the in-memory state remains
inconsistent with /etc. Fix by performing the persistence first or by rolling
back the in-memory change on write failure: inside addProject (the function that
manipulates exfs.projects), keep the lock while appending, call
writeEXFSProjects while still holding or immediately after appending but detect
an error and if writeEXFSProjects returns an error remove the appended entry
from exfs.projects (or restore the previous slice) before returning, ensuring
exfs.lock protects both mutation and rollback so memory and on-disk state stay
consistent.

Comment on lines +114 to +124
exfs.lock.Lock()
defer exfs.lock.Unlock()
for pos, project := range exfs.projects {
if project.Name == q.Name {
exfs.projects = append(exfs.projects[:pos], exfs.projects[pos+1:]...)
break
}
}

err = writeEXFSProjects()
return
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Nested mutex acquisition can deadlock project deletion.

Line 114 acquires exfs.lock, then Line 123 calls writeEXFSProjects, which tries to lock again on Line 140. sync.Mutex is not re-entrant, so this can block indefinitely.

Suggested patch
 func (q exfsProject) removeProject() (err error) {
 	exfs.lock.Lock()
-	defer exfs.lock.Unlock()
 	for pos, project := range exfs.projects {
 		if project.Name == q.Name {
 			exfs.projects = append(exfs.projects[:pos], exfs.projects[pos+1:]...)
 			break
 		}
 	}
+	exfs.lock.Unlock()
 
-	err = writeEXFSProjects()
-	return
+	return writeEXFSProjects()
 }

Also applies to: 139-142

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/filesystem/quotas/exfs.go` around lines 114 - 124, The deletion
routine takes exfs.lock, mutates exfs.projects, then calls writeEXFSProjects()
while still holding exfs.lock, which deadlocks because writeEXFSProjects also
tries to lock the same mutex; fix by releasing the lock before calling
writeEXFSProjects (unlock exfs.lock right after the append/remove and before
calling writeEXFSProjects), or alternatively add an internal unexported helper
(e.g., writeEXFSProjectsLocked) that assumes the lock is already held and have
writeEXFSProjects call that under its own lock—use exfs.lock, exfs.projects,
writeEXFSProjects and ensure no path attempts to re-acquire the same mutex.

reorganize handling limits being below 0

handle if the project id is outside the range
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (4)
server/filesystem/quotas/exfs.go (3)

167-180: ⚠️ Potential issue | 🟠 Major

Propagate Close errors in writeTemplate.

defer f.Close() at Line 173 drops close failures, which can hide final writeback errors for /etc files.

✅ Suggested error-safe close handling
 func writeTemplate(t *template.Template, file string, data interface{}) (err error) {
 	f, err := os.Create(file)
 	if err != nil {
 		return err
 	}

-	defer f.Close()
+	defer func() {
+		if cerr := f.Close(); err == nil && cerr != nil {
+			err = cerr
+		}
+	}()

-	err = t.Execute(f, data)
-	if err != nil {
+	if err = t.Execute(f, data); err != nil {
 		return err
 	}

 	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/filesystem/quotas/exfs.go` around lines 167 - 180, The function
writeTemplate currently defers f.Close() and drops any Close error; update it to
capture and propagate Close errors by using the named return error variable
(err) and in the deferred close check assign err = first non-nil error between
err and the result of f.Close() so that a failing file close after t.Execute
(e.g., writeback to /etc) is returned; reference the writeTemplate function, the
t.Execute call, the os.File f and f.Close() in your changes.

98-103: ⚠️ Potential issue | 🔴 Critical

Release exfs.lock before calling writeEXFSProjects to avoid self-deadlock.

Line 98 and Line 117 hold exfs.lock, then call writeEXFSProjects (Line 102 / Line 126), which locks again at Line 143. This can deadlock the goroutine.

🛠️ Minimal fix
 func (q exfsProject) addProject() (err error) {
@@
 	exfs.lock.Lock()
-	defer exfs.lock.Unlock()
 	exfs.projects = append(exfs.projects, q)
+	exfs.lock.Unlock()

 	if err = writeEXFSProjects(); err != nil {
 		log.WithError(err).Error("failed to write exfs projects")
 		return
@@
 func (q exfsProject) removeProject() (err error) {
 	exfs.lock.Lock()
-	defer exfs.lock.Unlock()
 	for pos, project := range exfs.projects {
 		if project.Name == q.Name {
 			exfs.projects = append(exfs.projects[:pos], exfs.projects[pos+1:]...)
 			break
 		}
 	}
-
-	err = writeEXFSProjects()
-	return
+	exfs.lock.Unlock()
+	return writeEXFSProjects()
 }

Also applies to: 117-127, 142-145

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/filesystem/quotas/exfs.go` around lines 98 - 103, The code currently
holds exfs.lock while calling writeEXFSProjects, which itself acquires the same
lock and can deadlock; update the callers (where exfs.lock is acquired around
appending to exfs.projects, e.g., the block that uses exfs.lock.Lock()/defer
exfs.lock.Unlock() and the similar block at lines that append/remove projects)
to release the lock before invoking writeEXFSProjects: perform the mutation
while holding exfs.lock, unlock it (remove or move the defer unlock as needed),
then call writeEXFSProjects afterwards so writeEXFSProjects can safely acquire
the lock; ensure no shared state is accessed after unlocking except via safe
copies or re-locking if required.

100-105: ⚠️ Potential issue | 🟠 Major

Rollback in-memory append if /etc persistence fails in addProject.

Line 100 mutates exfs.projects before disk write. If Line 102 fails, in-memory and /etc/projid//etc/projects diverge.

🧩 Suggested rollback on failure
 	exfs.projects = append(exfs.projects, q)
 	exfs.lock.Unlock()

 	if err = writeEXFSProjects(); err != nil {
 		log.WithError(err).Error("failed to write exfs projects")
+		exfs.lock.Lock()
+		for i := len(exfs.projects) - 1; i >= 0; i-- {
+			if exfs.projects[i].Name == q.Name {
+				exfs.projects = append(exfs.projects[:i], exfs.projects[i+1:]...)
+				break
+			}
+		}
+		exfs.lock.Unlock()
 		return
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/filesystem/quotas/exfs.go` around lines 100 - 105, The code appends q
to exfs.projects before persisting and doesn't revert if writeEXFSProjects()
fails, causing in-memory/des on-disk divergence; in addProject, capture the
prior slice length (or copy the slice), append q to exfs.projects, then if
writeEXFSProjects() returns an error remove the appended entry (restore the
slice to its previous length or replace with the saved copy) before
returning—i.e., rollback the in-memory mutation on failure to keep exfs.projects
consistent with disk.
server/filesystem/quotas/functions.go (1)

65-68: ⚠️ Potential issue | 🟡 Minor

Add data-path context to quota-support probe failures.

When fsquota.ProjectQuotasSupported(...) fails, the log currently omits the mount path, which makes diagnosis harder.

✏️ Small logging improvement
 		supported, err = fsquota.ProjectQuotasSupported(config.Get().System.Data)
 		if err != nil {
-			log.WithError(err).Error("error checking for quota support")
+			log.WithError(err).WithField("path", config.Get().System.Data).Error("error checking for quota support")
 			return
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/filesystem/quotas/functions.go` around lines 65 - 68, The log for
quota-support probe omits the data mount path; modify the error branch where
fsquota.ProjectQuotasSupported(config.Get().System.Data) returns an error to
include the data path (config.Get().System.Data) in the log message — e.g.,
change the log.WithError(err).Error call to include context about the mount path
so the error reads something like "error checking for quota support on data path
<path>" and still attaches the error via WithError.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@server/filesystem/quotas/exfs.go`:
- Around line 167-180: The function writeTemplate currently defers f.Close() and
drops any Close error; update it to capture and propagate Close errors by using
the named return error variable (err) and in the deferred close check assign err
= first non-nil error between err and the result of f.Close() so that a failing
file close after t.Execute (e.g., writeback to /etc) is returned; reference the
writeTemplate function, the t.Execute call, the os.File f and f.Close() in your
changes.
- Around line 98-103: The code currently holds exfs.lock while calling
writeEXFSProjects, which itself acquires the same lock and can deadlock; update
the callers (where exfs.lock is acquired around appending to exfs.projects,
e.g., the block that uses exfs.lock.Lock()/defer exfs.lock.Unlock() and the
similar block at lines that append/remove projects) to release the lock before
invoking writeEXFSProjects: perform the mutation while holding exfs.lock, unlock
it (remove or move the defer unlock as needed), then call writeEXFSProjects
afterwards so writeEXFSProjects can safely acquire the lock; ensure no shared
state is accessed after unlocking except via safe copies or re-locking if
required.
- Around line 100-105: The code appends q to exfs.projects before persisting and
doesn't revert if writeEXFSProjects() fails, causing in-memory/des on-disk
divergence; in addProject, capture the prior slice length (or copy the slice),
append q to exfs.projects, then if writeEXFSProjects() returns an error remove
the appended entry (restore the slice to its previous length or replace with the
saved copy) before returning—i.e., rollback the in-memory mutation on failure to
keep exfs.projects consistent with disk.

In `@server/filesystem/quotas/functions.go`:
- Around line 65-68: The log for quota-support probe omits the data mount path;
modify the error branch where
fsquota.ProjectQuotasSupported(config.Get().System.Data) returns an error to
include the data path (config.Get().System.Data) in the log message — e.g.,
change the log.WithError(err).Error call to include context about the mount path
so the error reads something like "error checking for quota support on data path
<path>" and still attaches the error via WithError.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f55b2a8 and ca09896.

📒 Files selected for processing (3)
  • server/filesystem/quotas/exfs.go
  • server/filesystem/quotas/functions.go
  • server/filesystem/quotas/syscall_xattr.go
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Analyze (go)
  • GitHub Check: Build and Test (ubuntu-22.04, 1.24.7, linux, amd64)
  • GitHub Check: Build and Test (ubuntu-22.04, 1.25.1, linux, amd64)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: parkervcp
Repo: pelican-dev/wings PR: 171
File: server/power.go:190-203
Timestamp: 2026-03-02T13:53:08.995Z
Learning: In `server/power.go`, when quotas are enabled via `config.Get().System.Quotas.Enabled`, the disk space check using `used >= s.DiskSpace()` does not need a special guard for unlimited-disk servers (DiskSpace() <= 0) because the disk quota is managed by the filesystem itself, which handles such cases appropriately.
📚 Learning: 2026-03-02T13:53:08.995Z
Learnt from: parkervcp
Repo: pelican-dev/wings PR: 171
File: server/power.go:190-203
Timestamp: 2026-03-02T13:53:08.995Z
Learning: In the server package, when quotas are enabled via config.Get().System.Quotas.Enabled, the disk space check using used >= s.DiskSpace() does not require a special guard for unlimited-disk scenarios (DiskSpace() <= 0). The filesystem handles such cases, so the existing check is sufficient. Apply this pattern to similar quota-related disk checks in the server package and ensure tests/docs reflect that unlimited-disk behavior is governed by the filesystem, not by an extra guard in code.

Applied to files:

  • server/filesystem/quotas/functions.go
  • server/filesystem/quotas/exfs.go
  • server/filesystem/quotas/syscall_xattr.go
🧬 Code graph analysis (2)
server/filesystem/quotas/functions.go (1)
config/config.go (1)
  • Get (445-453)
server/filesystem/quotas/exfs.go (2)
config/config.go (1)
  • Get (445-453)
system/system.go (1)
  • System (61-68)
🔇 Additional comments (2)
server/filesystem/quotas/syscall_xattr.go (1)

59-66: fsXAttr ABI shape and projectID guard look good.

The struct layout and range check before uint32 cast are solid in this revision.

Also applies to: 91-93

server/filesystem/quotas/functions.go (1)

110-113: Negative-limit handling here looks correct.

Logging the original value and clamping to zero avoids unsafe int64 -> uint64 wraparound before quota application.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant