fix: use cgroup memory limits for multipart upload concurrency in con… by ProbstenHias · Pull Request #447 · durch/rust-s3

ProbstenHias · 2026-02-12T19:46:57Z

…tainers

calculate_max_concurrent_chunks() uses sysinfo::System::available_memory() which reads /proc/meminfo on Linux. In containerized environments (Docker, Kubernetes), /proc/meminfo reports the host node's total memory, not the container's cgroup-enforced limit. This causes the function to massively overestimate available memory and spawn too many concurrent chunk uploads, leading to OOM kills.

Example: a pod with 512 MiB memory limit running on a 64 GiB node would see ~64 GiB available, calculate 100 concurrent 8 MiB chunks (800 MiB total), and get OOMKilled.

The fix adds cgroup-aware memory detection that runs before the sysinfo fallback:

Parse /proc/self/cgroup to resolve the process's cgroup path; handle both nested paths (e.g. /kubepods.slice/.../cri-containerd-xxx.scope) and root path "/" which is the common case with private cgroup namespaces (Docker default, Kubernetes default since 1.25)
cgroup v2: read memory.max and memory.current from /sys/fs/cgroup//; treat "max" as unlimited
cgroup v1: read memory.limit_in_bytes and memory.usage_in_bytes from /sys/fs/cgroup/memory//; treat values near u64::MAX (PAGE_COUNTER_MAX) as unlimited
Available memory = limit - current usage (saturating_sub)
Return None when not under cgroup constraints, so the caller falls back to the existing sysinfo logic unchanged
All cgroup functions gated on #[cfg(target_os = "linux")] and the async feature flags to avoid dead code warnings in sync builds or on non-Linux platforms
Add unit tests covering cgroup v2 with limit, v2 unlimited, v1 with limit, v1 unlimited, path parsing, and concurrency clamping math

This change is

Summary by CodeRabbit

Improvements
- Enhanced memory detection on Linux systems to better manage upload performance when memory resources are constrained, with improved support for containerized environments.

…tainers calculate_max_concurrent_chunks() uses sysinfo::System::available_memory() which reads /proc/meminfo on Linux. In containerized environments (Docker, Kubernetes), /proc/meminfo reports the host node's total memory, not the container's cgroup-enforced limit. This causes the function to massively overestimate available memory and spawn too many concurrent chunk uploads, leading to OOM kills. Example: a pod with 512 MiB memory limit running on a 64 GiB node would see ~64 GiB available, calculate 100 concurrent 8 MiB chunks (800 MiB total), and get OOMKilled. The fix adds cgroup-aware memory detection that runs before the sysinfo fallback: - Parse /proc/self/cgroup to resolve the process's cgroup path; handle both nested paths (e.g. /kubepods.slice/.../cri-containerd-xxx.scope) and root path "/" which is the common case with private cgroup namespaces (Docker default, Kubernetes default since 1.25) - cgroup v2: read memory.max and memory.current from /sys/fs/cgroup/<cgroup-path>/; treat "max" as unlimited - cgroup v1: read memory.limit_in_bytes and memory.usage_in_bytes from /sys/fs/cgroup/memory/<cgroup-path>/; treat values near u64::MAX (PAGE_COUNTER_MAX) as unlimited - Available memory = limit - current usage (saturating_sub) - Return None when not under cgroup constraints, so the caller falls back to the existing sysinfo logic unchanged - All cgroup functions gated on #[cfg(target_os = "linux")] and the async feature flags to avoid dead code warnings in sync builds or on non-Linux platforms - Add unit tests covering cgroup v2 with limit, v2 unlimited, v1 with limit, v1 unlimited, path parsing, and concurrency clamping math

coderabbitai · 2026-02-12T19:47:15Z

📝 Walkthrough

Walkthrough

This change enhances memory-based calculation for maximum concurrent upload chunks by adding Linux cgroup detection. The implementation introduces a two-tier memory detection system that first attempts to read cgroup v2 memory limits, falls back to cgroup v1 if unavailable, and ultimately uses sysinfo for fallback. The chunk calculation range is extended from 2..10 to 2..100, with platform-gated helper functions and comprehensive test coverage.

Changes

Cohort / File(s)	Summary
Cgroup Memory Detection System `s3/src/bucket.rs`	Added four internal helper functions (`cgroup_available_memory`, `read_cgroup_path`, `cgroup_v2_available_memory`, `cgroup_v1_available_memory`) to read and parse Linux cgroup memory limits from `/proc/self/cgroup`, `memory.max`, `memory.current`, `memory.limit_in_bytes`, and `memory.usage_in_bytes`. Modified `calculate_max_concurrent_chunks()` to use tiered memory detection (cgroup v2 → cgroup v1 → sysinfo fallback), updated clamp range to 2..100, and added default value of 3 when memory is unavailable. Included extensive Linux-only tests for cgroup path parsing, v2/v1 memory calculations, and edge cases. Updated documentation to reflect new range and platform-specific behavior.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 A cgroup memory tale I weave,
With v2, v1, and sysinfo's reprieve,
Chunks now soar from ten to one hundred bright,
Linux containers handled just right,
Upload speeds dance in the morning light! 🚀

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main change: implementing cgroup memory limit detection for multipart upload concurrency in containers, which is the core objective of the PR.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Merge Conflict Detection	✅ Passed	✅ No merge conflicts detected when merging into `master`

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Tip

Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

brownian-motion-v0 · 2026-02-12T19:47:53Z

Brownian Motion (Brass)

Recommendation: Proceed

Summary: PR addresses OOM kills by improving memory limit detection in containers.
Risk: Medium · Confidence: 80%

Highlights

Good test coverage
Clear commit history

Unknowns

No linked issue
Missing performance metrics

Next actions

Keep: cgroup memory limit detection logic
Drop: unnecessary comments in the code
Add: documentation for new functions

Reflection questions

What core assumption underpins this PR's approach?
How does this change align with the project's longer-term goals?
Could there be a simpler way to achieve the primary objective here?

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@s3/src/bucket.rs`:
- Around line 4344-4346: The test function
test_cgroup_available_memory_returns_none_when_no_cgroup inside the cgroup_tests
module is dead because the module is #[cfg(target_os = "linux")] while the test
body is under #[cfg(not(target_os = "linux"))]; either remove the unreachable
#[cfg(not(target_os = "linux"))] block or restructure the tests so they actually
run on Linux: move the assertion into the linux-configured module and call
cgroup_available_memory() (or the real function under test) and assert the
expected Some(...) or None value, or split into two explicit tests with opposite
cfg attributes if you need OS-specific behavior; update or delete the empty test
body for test_cgroup_available_memory_returns_none_when_no_cgroup accordingly.

🧹 Nitpick comments (4)

s3/src/bucket.rs (4)
3036-3066: read_cgroup_path("v2") returns Some("/") for root cgroup, but the corresponding test uses stricter logic.

test_read_cgroup_path_v2 (line 4362) checks !path.is_empty() && path != "/", but the actual v2 branch at line 3050 only checks !path.is_empty(). While this doesn't cause a test failure (since the test input isn't "/"), the duplicated-and-divergent logic is a maintenance hazard.

Consider calling the actual read_cgroup_path function from tests instead of re-implementing the parsing inline. This would catch regressions in the real code and eliminate the risk of test/implementation drift.

3008-3018: Silent failure with no observability when cgroup detection is skipped.

All cgroup helper functions silently return None on any failure (file not found, parse error, unlimited). Consider adding log::debug! or log::trace! messages when cgroup detection falls through, so operators can understand which memory source was used. This is especially valuable when diagnosing OOM issues in containers.
Example logging addition
 fn cgroup_available_memory() -> Option<u64> {
-    cgroup_v2_available_memory().or_else(cgroup_v1_available_memory)
+    let result = cgroup_v2_available_memory().or_else(cgroup_v1_available_memory);
+    if let Some(mem) = result {
+        log::debug!("cgroup memory detection: available={} bytes", mem);
+    } else {
+        log::debug!("cgroup memory detection: no cgroup limits found, falling back to sysinfo");
+    }
+    result
 }
4349-4409: Tests re-implement parsing logic inline instead of calling read_cgroup_path.

test_read_cgroup_path_v2, test_read_cgroup_path_v1, and test_read_cgroup_path_v2_root_returns_root all duplicate the parsing logic from read_cgroup_path rather than invoking it. This means the tests don't actually validate the real implementation — a bug in read_cgroup_path wouldn't be caught.

Since read_cgroup_path reads /proc/self/cgroup (which may not have the expected entries in CI), consider either:

Refactoring read_cgroup_path to accept the file content as a parameter (making it testable), or

At minimum, calling the actual function in the tests and asserting it doesn't panic, even if the exact result is environment-dependent.
Suggested refactor: extract parsing logic into a testable function
+/// Parse a cgroup path from the content of /proc/self/cgroup.
+#[cfg(all(
+    target_os = "linux",
+    any(feature = "with-tokio", feature = "with-async-std")
+))]
+fn parse_cgroup_path(content: &str, controller: &str) -> Option<String> {
+    for line in content.lines() {
+        let parts: Vec<&str> = line.splitn(3, ':').collect();
+        if parts.len() != 3 {
+            continue;
+        }
+        if controller == "v2" {
+            if parts[0] == "0" && parts[1].is_empty() {
+                let path = parts[2].trim();
+                if !path.is_empty() {
+                    return Some(path.to_string());
+                }
+            }
+        } else {
+            let controllers = parts[1];
+            if controllers.split(',').any(|c| c == controller) {
+                let path = parts[2].trim();
+                if !path.is_empty() && path != "/" {
+                    return Some(path.to_string());
+                }
+            }
+        }
+    }
+    None
+}
+
 #[cfg(all(
     target_os = "linux",
     any(feature = "with-tokio", feature = "with-async-std")
 ))]
 fn read_cgroup_path(controller: &str) -> Option<String> {
     let content = std::fs::read_to_string("/proc/self/cgroup").ok()?;
-    for line in content.lines() {
-        ...
-    }
-    None
+    parse_cgroup_path(&content, controller)
 }
Then tests can call parse_cgroup_path directly with known input.
4411-4435: Temp directory tests use fixed paths and may race under parallel execution.

test_cgroup_v2_available_memory_with_limit and test_cgroup_v1_available_memory_with_limit use hardcoded temp directory names (rust_s3_cgroup_test_v2, rust_s3_cgroup_test_v1). If cargo test runs these in parallel (default), they could interfere with each other on repeated runs. Consider using tempfile::tempdir() for unique, auto-cleaned directories, or adding unique suffixes.

Also applies to: 4446-4463

coderabbitai · 2026-02-12T19:52:29Z

s3/src/bucket.rs

+#[cfg(test)]
+#[cfg(target_os = "linux")]
+mod cgroup_tests {


⚠️ Potential issue | 🟡 Minor

test_cgroup_available_memory_returns_none_when_no_cgroup is dead code — the #[cfg(not(target_os = "linux"))] block can never compile.

The entire cgroup_tests module is gated with #[cfg(target_os = "linux")] (line 4345), so the #[cfg(not(target_os = "linux"))] block at line 4479 is unreachable. The test body on Linux is empty, making this a no-op test. Either remove it or restructure so it actually verifies something on Linux (e.g., call the real cgroup_available_memory() and assert it returns Some(…) or None without panicking).

Also applies to: 4473-4483

🤖 Prompt for AI Agents

In `@s3/src/bucket.rs` around lines 4344 - 4346, The test function test_cgroup_available_memory_returns_none_when_no_cgroup inside the cgroup_tests module is dead because the module is #[cfg(target_os = "linux")] while the test body is under #[cfg(not(target_os = "linux"))]; either remove the unreachable #[cfg(not(target_os = "linux"))] block or restructure the tests so they actually run on Linux: move the assertion into the linux-configured module and call cgroup_available_memory() (or the real function under test) and assert the expected Some(...) or None value, or split into two explicit tests with opposite cfg attributes if you need OS-specific behavior; update or delete the empty test body for test_cgroup_available_memory_returns_none_when_no_cgroup accordingly.

swermin · 2026-03-27T21:13:03Z

What’s the status of this PR? We are having issues with uploads taking too much memory and the containers are getting OOMKilled

coderabbitai bot reviewed Feb 12, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: use cgroup memory limits for multipart upload concurrency in con…#447

fix: use cgroup memory limits for multipart upload concurrency in con…#447
ProbstenHias wants to merge 1 commit intodurch:masterfrom
ProbstenHias:fix/cgroup-memory-detection

ProbstenHias commented Feb 12, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Feb 12, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

brownian-motion-v0 bot commented Feb 12, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Feb 12, 2026

Uh oh!

swermin commented Mar 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ProbstenHias commented Feb 12, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Feb 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

brownian-motion-v0 bot commented Feb 12, 2026

Brownian Motion (Brass)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 12, 2026

Choose a reason for hiding this comment

Uh oh!

swermin commented Mar 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ProbstenHias commented Feb 12, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 12, 2026 •

edited

Loading