Add S3 blob storage with cashier billing to ic-gateway#193

Open
shilingwang wants to merge 21 commits into main from shiling/blob-storage

@shilingwang shilingwang commented Apr 17, 2026

#NODE-1941

Summary

  • Adds a full blob storage API to ic-gateway, enabling upload/download of content-addressed blobs backed by a single AWS S3 bucket with per-owner billing through the cashier canister.
  • Introduces new /v1/ HTTP endpoints for blob metadata, chunk operations, and owner data management, gated behind --s3-endpoint and --cashier-canister-id CLI flags.
  • Integrates billing (budget checks, usage reporting) via a CashierConnector that caches budgets locally and flushes usage counters periodically, wired into ic-gateway's existing TaskManager and HealthManager.

New modules

  • src/s3/ — S3 client abstraction (BucketLike trait, AWSBucket impl, RamFakeBucket for dev), config
  • src/cashier/ — CashierClient (4 canister calls: whoami, pricelist, budget, usage reporting), CashierConnector (local billing cache + periodic flush)
  • src/storage/ — Shared types (blob metadata, hash tree, chunk constants), S3 key paths, IC egress certificate auth
  • src/routing/storage/ — Axum handlers + router for all /v1/ endpoints

HTTP endpoints (under /v1/)

  • HEAD /v1/blob — Blob metadata headers (size, content type)
  • GET /v1/blob — Download blob with Range header support
  • GET /v1/blob-tree — Raw blob metadata JSON
  • PUT /v1/blob-tree — Upload blob metadata (with IC egress cert auth)
  • GET /v1/chunk — Download a single chunk
  • PUT /v1/chunk — Upload a single chunk (SHA-256 verified)
  • DELETE /v1/owner — Delete all data for an owner (host-gated)
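
As a rough illustration of how a ranged GET /v1/blob could map onto chunk reads, the sketch below splits an inclusive byte range into per-chunk slices. `CHUNK_SIZE`, `ChunkSlice`, and `chunk_slices` are hypothetical names and values chosen for this example; the PR's actual chunk constant and handler code may differ.

```rust
/// Hypothetical chunk size for illustration only; the PR's real constant is not shown here.
const CHUNK_SIZE: u64 = 1024 * 1024;

/// One chunk's share of a ranged read: the chunk index and the
/// half-open byte slice [start, end) inside that chunk.
#[derive(Debug, PartialEq)]
struct ChunkSlice {
    index: u64,
    start: u64,
    end: u64,
}

/// Maps the inclusive byte range [first, last] of a blob of `blob_len` bytes
/// onto per-chunk slices. Returns None when the range cannot be satisfied.
fn chunk_slices(first: u64, last: u64, blob_len: u64) -> Option<Vec<ChunkSlice>> {
    if blob_len == 0 || first > last || first >= blob_len {
        return None;
    }
    // A last position past the end of the blob is clamped to the final byte.
    let last = last.min(blob_len - 1);
    let mut out = Vec::new();
    for index in (first / CHUNK_SIZE)..=(last / CHUNK_SIZE) {
        let chunk_begin = index * CHUNK_SIZE;
        let chunk_end = (chunk_begin + CHUNK_SIZE).min(blob_len); // exclusive
        let start = first.max(chunk_begin) - chunk_begin;
        let end = (last + 1).min(chunk_end) - chunk_begin;
        out.push(ChunkSlice { index, start, end });
    }
    Some(out)
}
```

Only the first and last chunk of a range need partial slicing; every chunk in between is returned whole.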

Design decisions

  • Single S3 bucket: One bucket configured via CLI, no multi-bucket routing. Simpler than the multi-instance model in object-storage.
  • Billing gated: Storage routes are only mounted when both --s3-endpoint and --cashier-canister-id are provided. Without either, ic-gateway serves only normal IC traffic.
  • Budget caching: Per-owner budgets cached for 30s to avoid hitting the cashier canister on every request. Usage counters flushed every 10s.
  • IC egress auth: PUT /v1/blob-tree verifies an OwnerEgressSignature certificate from the request body. Bypassable with --fake-ingress-auth for local dev.
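
The "billing gated" rule can be pictured as a small helper that yields a storage configuration only when both flags are set. This is a minimal sketch: `StorageCli`, `StorageConfig`, and `storage_config` are hypothetical names, not the PR's actual types.

```rust
/// Hypothetical stand-in for the parsed CLI options; field names are illustrative.
struct StorageCli {
    s3_endpoint: Option<String>,
    cashier_canister_id: Option<String>,
}

/// Settings needed before the storage routes can be mounted.
#[derive(Debug, PartialEq)]
struct StorageConfig {
    s3_endpoint: String,
    cashier_canister_id: String,
}

/// Returns Some only when both flags are present; with either one missing
/// the caller skips mounting the /v1/ routes entirely.
fn storage_config(cli: &StorageCli) -> Option<StorageConfig> {
    let (s3, cashier) = cli
        .s3_endpoint
        .as_ref()
        .zip(cli.cashier_canister_id.as_ref())?;
    Some(StorageConfig {
        s3_endpoint: s3.clone(),
        cashier_canister_id: cashier.clone(),
    })
}
```

A router builder could then use `if let Some(cfg) = storage_config(&cli)` to decide whether to attach the storage routes.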

@shilingwang shilingwang marked this pull request as ready for review April 17, 2026 14:53
@shilingwang shilingwang requested a review from a team as a code owner April 17, 2026 14:53
Comment thread src/routing/mod.rs Outdated
Comment thread src/routing/mod.rs Outdated
Comment thread src/cli.rs Outdated
Comment thread src/routing/storage/mod.rs Outdated
Comment thread src/routing/storage/mod.rs Outdated

@frankdavid frankdavid left a comment

Are there any plans for testing?

Comment thread src/routing/storage/auth.rs
Comment thread src/routing/storage/auth.rs Outdated
Comment thread src/routing/storage/bucket.rs Outdated
Comment thread src/routing/storage/bucket.rs Outdated
data.len()
};

yield bytes::Bytes::copy_from_slice(&data[s..e]);

If s == 0 and e == data.len(), skip copying. Alternatively, you can create a Bytes from data and then use Bytes::slice, which is O(1).

async fn consume_budget(&self, owner: &Principal, cost: i64) -> Result<(), BillingError> {
    {
        let mut budgets = self.budgets.write().await;
        if let Some(cached) = budgets.get_mut(owner) {

Maybe inverting the structure would make this easier to read:

if not in cache or too old {
    refresh cache...
}
try_debit()
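
One possible shape for the refresh-then-debit structure suggested above is sketched here in synchronous form. The 30s TTL comes from the PR description; everything else is an assumption for illustration: the `String` owner key, the `fetch` closure standing in for the cashier budget call, the `CachedBudget` layout, and the `InsufficientBudget` variant.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Budget cache lifetime, per the PR description (30s).
const BUDGET_TTL: Duration = Duration::from_secs(30);

/// Hypothetical cache entry; the PR's real struct may differ.
struct CachedBudget {
    remaining: i64,
    fetched_at: Instant,
}

#[derive(Debug, PartialEq)]
enum BillingError {
    InsufficientBudget,
}

/// Refresh-then-debit. `fetch` is a hypothetical stand-in for the cashier
/// budget call, and `now` is passed in so the logic is easy to test.
fn consume_budget(
    budgets: &mut HashMap<String, CachedBudget>,
    owner: &str,
    cost: i64,
    now: Instant,
    fetch: impl FnOnce(&str) -> i64,
) -> Result<(), BillingError> {
    // Step 1: refresh when the entry is missing or older than the TTL.
    let stale = budgets
        .get(owner)
        .map_or(true, |c| now.duration_since(c.fetched_at) >= BUDGET_TTL);
    if stale {
        let remaining = fetch(owner);
        budgets.insert(owner.to_string(), CachedBudget { remaining, fetched_at: now });
    }
    // Step 2: debit. The entry is guaranteed to exist after step 1.
    let cached = budgets.get_mut(owner).expect("entry inserted above");
    if cached.remaining < cost {
        return Err(BillingError::InsufficientBudget);
    }
    cached.remaining -= cost;
    Ok(())
}
```

This sketch overwrites the cached value on refresh, so a real implementation would still need to account for usage that was debited locally but not yet reported to the cashier.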

if divisor == 0 {
    return 0;
}
(quantity as i64 * cost) / divisor

Can it ever overflow?
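
For context on the overflow question: in Rust, an i64 multiplication that overflows panics in debug builds and wraps in release builds, and an `as i64` cast of a larger unsigned value wraps silently. A minimal overflow-aware variant is sketched below. It assumes `quantity` is a `u64`, which the snippet does not show, and `scaled_cost` is a hypothetical name.

```rust
/// Overflow-aware variant of `(quantity as i64 * cost) / divisor`.
/// Assumes `quantity` is a u64 (not visible in the reviewed snippet).
/// Widening to i128 makes the multiplication exact for all inputs; the final
/// conversion reports an out-of-range result as None instead of wrapping.
fn scaled_cost(quantity: u64, cost: i64, divisor: i64) -> Option<i64> {
    if divisor == 0 {
        return Some(0); // mirrors the early return in the reviewed code
    }
    let product = i128::from(quantity) * i128::from(cost);
    i64::try_from(product / i128::from(divisor)).ok()
}
```

For example, `scaled_cost(10, 3, 2)` yields `Some(15)`, while `scaled_cost(u64::MAX, i64::MAX, 1)` yields `None` rather than a wrapped value.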

Comment thread src/routing/storage/handler.rs Outdated
let bucket_c = state.bucket.clone();

let stream: async_stream::__private::AsyncStream<Result<bytes::Bytes, std::io::Error>, _> = async_stream::try_stream! {
    for hash in &chunk_hashes {

The chunks are downloaded sequentially. Is this on purpose?

Comment thread src/routing/storage/handler.rs Outdated
3 participants