fix security: feat(resources): harden HTTP resource ingestion against private-network SSRF by 13ernkastel · Pull Request #1133 · volcengine/OpenViking

13ernkastel · 2026-03-31T13:05:43Z

Summary

This change closes an authenticated SSRF path in the HTTP resource ingestion flow. Before this patch, /api/v1/resources accepted arbitrary remote URLs, the parser stack issued server-side HEAD and GET requests with redirects enabled, and the fetched content could then be read back through the normal content APIs. A low-privilege API caller could abuse that behavior to reach loopback, RFC 1918, link-local, or cloud metadata endpoints visible from the OpenViking host.

CVSS v3.1: 8.8 High (CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:C/C:H/I:L/A:L)

Root Cause

Remote-target validation only distinguished URL input from direct filesystem paths.
The HTML fetch path performed outbound requests without checking whether the destination resolved to a private or otherwise non-public address.
Permission-style security rejections in the parser path could be swallowed as parse failures instead of surfacing as hard API errors.
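
For reviewers, the "non-public address" distinction in the second point can be illustrated with the standard library alone. This is a minimal sketch, not the code in this PR: `is_public_address` is a hypothetical name, and treating multicast as non-public is an assumption made here for the sake of the example.

```python
import ipaddress


def is_public_address(value: str) -> bool:
    """Return True only for globally routable, non-multicast addresses.

    Hypothetical helper for illustration; not the function added by this PR.
    """
    ip = ipaddress.ip_address(value)
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so the IPv4 rules apply.
    if isinstance(ip, ipaddress.IPv6Address) and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    # is_global is False for loopback, RFC 1918, link-local, and other
    # special-purpose ranges; multicast is excluded explicitly (assumption).
    return ip.is_global and not ip.is_multicast


if __name__ == "__main__":
    for candidate in ("127.0.0.1", "10.0.0.5", "169.254.169.254", "::1", "8.8.8.8"):
        print(candidate, is_public_address(candidate))
```

The link-local check is what covers the usual cloud metadata address, 169.254.169.254.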

What Changed

Added openviking/utils/network_guard.py to extract destination hosts, resolve them, reject non-public addresses, and build per-request httpx validation hooks.
Enforced public-target validation in openviking/server/local_input_guard.py and enabled it for the HTTP /api/v1/resources route.
Added enforce_public_remote_targets in openviking/service/resource_service.py so the service injects request validation into the parser chain and preserves the enforcement flag for watch-based reprocessing.
Threaded the request validator through openviking/utils/media_processor.py into openviking/parse/parsers/html.py so URL detection, HTML fetches, downloads, redirects, and proxy inheritance are all checked consistently.
Updated openviking/utils/resource_processor.py to re-raise OpenVikingError so blocked requests terminate as structured permission failures instead of degrading into parse warnings.
Added regression coverage in tests/server/test_api_local_input_security.py for loopback HTTP targets, private git/SSH targets, and parser-level enforcement.
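
The extract, resolve, reject, and hook steps listed above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the contents of network_guard.py: `RemoteTargetDenied`, `ensure_public_target`, `make_request_hook`, and the injectable `resolve` parameter are all hypothetical names. The hook only assumes a request object with a `url` attribute, which is the shape httpx passes to request hooks.

```python
import ipaddress
import socket
from typing import Callable, Iterable
from urllib.parse import urlsplit

Resolver = Callable[[str], Iterable[str]]


class RemoteTargetDenied(PermissionError):
    """Hypothetical error type standing in for a PERMISSION_DENIED failure."""


def system_resolver(host: str) -> list:
    """Resolve a host name to its addresses using the OS resolver."""
    infos = socket.getaddrinfo(host, None, type=socket.SOCK_STREAM)
    return [info[4][0] for info in infos]


def ensure_public_target(url: str, resolve: Resolver = system_resolver) -> None:
    """Raise RemoteTargetDenied unless every resolved address is public."""
    host = urlsplit(url).hostname
    if not host:
        raise RemoteTargetDenied(f"no host in URL: {url!r}")
    try:
        addresses = list(resolve(host))
    except OSError as exc:
        raise RemoteTargetDenied(f"cannot resolve {host!r}") from exc
    if not addresses:
        raise RemoteTargetDenied(f"{host!r} resolved to no addresses")
    for address in addresses:
        # Drop any IPv6 zone identifier (fe80::1%eth0) before parsing.
        ip = ipaddress.ip_address(address.split("%", 1)[0])
        if isinstance(ip, ipaddress.IPv6Address) and ip.ipv4_mapped is not None:
            ip = ip.ipv4_mapped
        # Reject if ANY address is non-public, not just the first one.
        if not ip.is_global or ip.is_multicast:
            raise RemoteTargetDenied(f"{host!r} resolves to non-public {ip}")


def make_request_hook(resolve: Resolver = system_resolver):
    """Build a per-request hook; `request` needs only a `url` attribute."""
    def hook(request) -> None:
        ensure_public_target(str(request.url), resolve)
    return hook


def is_allowed(url: str, resolve: Resolver = system_resolver) -> bool:
    """Convenience wrapper that turns the exception into a boolean."""
    try:
        ensure_public_target(url, resolve)
    except RemoteTargetDenied:
        return False
    return True

# Possible wiring (httpx calls request hooks before a request is sent):
#   httpx.Client(event_hooks={"request": [make_request_hook()]})
```

Whether the hook also fires on every redirect hop should be confirmed against the httpx version in use. Injecting the resolver keeps the check testable without real DNS.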

Simple PoC (localhost-only, pre-patch behavior)

This is a controlled reproduction against a local test instance only. It demonstrates the SSRF primitive without targeting cloud metadata or real internal services.

Start a loopback-only HTTP server that returns a unique token.

mkdir -p /tmp/ov-ssrf-poc
printf 'SSRF_PROOF_TOKEN_9f2d1b\n' > /tmp/ov-ssrf-poc/index.html
python3 -m http.server 8765 --bind 127.0.0.1 --directory /tmp/ov-ssrf-poc

Ask OpenViking to ingest that loopback URL.

curl -X POST http://localhost:1933/api/v1/resources \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-key" \
  -d '{
    "path": "http://127.0.0.1:8765/",
    "wait": true,
    "reason": "localhost-only ssrf reproduction"
  }'

Use the returned root_uri to inspect the imported tree and then read back the stored content.

curl -X GET "http://localhost:1933/api/v1/fs/tree?uri=<root_uri>" \
  -H "X-API-Key: your-key"

curl -X GET "http://localhost:1933/api/v1/content/read?uri=<stored_uri>" \
  -H "X-API-Key: your-key"

Before this patch, the content APIs could return SSRF_PROOF_TOKEN_9f2d1b, proving that the server fetched a loopback-only resource and exposed the response through the normal ingestion APIs. After this patch, the initial POST /api/v1/resources request is rejected with PERMISSION_DENIED.

Validation

uv run --no-project --python 3.12 python -m py_compile openviking/utils/network_guard.py openviking/server/local_input_guard.py openviking/server/routers/resources.py openviking/service/resource_service.py openviking/utils/media_processor.py openviking/utils/resource_processor.py openviking/parse/parsers/html.py tests/server/test_api_local_input_security.py
Local dynamic verification against the parser path showed that the vulnerable flow could previously send HEAD and GET requests to a loopback-only HTTP server and retrieve a unique response token.
After this patch, the same loopback target is blocked at precheck, URL detection, and fetch time with PERMISSION_DENIED, and the loopback server receives no requests.
A direct pytest run was not completed in this workspace because the repository runtime environment is incomplete here (importing openviking currently requires additional dependencies such as requests and the bundled AGFS setup).
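
The "loopback server receives no requests" check can be reproduced in isolation with a counting server. The sketch below is a standalone illustration rather than the test added in this PR: `guarded_fetch` is a hypothetical helper that only handles literal-IP hosts, and the unguarded control request exists purely to show that the counter works.

```python
import ipaddress
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

hits = []  # request paths observed by the loopback server


class CountingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        hits.append(self.path)
        body = b"SSRF_PROOF_TOKEN_9f2d1b\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the output quiet


# Ignore proxy environment variables so requests go straight to loopback.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def guarded_fetch(url: str) -> bytes:
    """Hypothetical helper: validate a literal-IP host, then fetch."""
    ip = ipaddress.ip_address(urlsplit(url).hostname)
    if not ip.is_global:
        raise PermissionError("PERMISSION_DENIED: non-public target")
    with opener.open(url, timeout=5) as response:
        return response.read()


server = HTTPServer(("127.0.0.1", 0), CountingHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

try:
    # Control: an unguarded fetch reaches the server and reads the token.
    with opener.open(url, timeout=5) as response:
        control_body = response.read()
    control_hits = len(hits)
    hits.clear()

    # Guarded: the request must be refused before any socket is opened.
    try:
        guarded_fetch(url)
        blocked = False
    except PermissionError:
        blocked = True
    guarded_hits = len(hits)
finally:
    server.shutdown()
    server.server_close()
```

Asserting on the server-side counter, rather than only on the error, catches regressions where a request is sent first and rejected afterwards.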

Follow-up Hardening

Extend the same transport-level validation to non-httpx repository fetchers such as the GitHub ZIP download path and git clone, so repository ingestion gets equivalent redirect and proxy protections.
Consider connection-time IP pinning if the project wants stronger resistance to DNS rebinding between validation and connect.
If controlled intranet ingestion is a legitimate product requirement, gate it behind an explicit administrator allowlist rather than implicit access to private networks.
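
As a rough illustration of the last two items, pinning and an explicit allowlist could be combined as below. This is a sketch of the idea only, not part of this PR: `pin_validated_address` and `admin_allowlist` are hypothetical names, and the connection code that would reuse the pinned address is not shown.

```python
import ipaddress
from typing import Callable, Iterable, Sequence


def pin_validated_address(
    host: str,
    resolve: Callable[[str], Iterable[str]],
    admin_allowlist: Sequence[str] = (),
) -> str:
    """Resolve once, validate every answer, and return one address to connect to.

    Hypothetical helper. `admin_allowlist` holds CIDR strings that an
    administrator has explicitly approved despite being non-public.
    """
    networks = [ipaddress.ip_network(cidr) for cidr in admin_allowlist]
    pinned = None
    for raw in resolve(host):  # single lookup; the result is reused below
        ip = ipaddress.ip_address(raw)
        is_public = ip.is_global and not ip.is_multicast
        is_approved = any(ip in network for network in networks)
        if not (is_public or is_approved):
            raise PermissionError(f"{host!r} resolves to non-public {ip}")
        if pinned is None:
            pinned = ip
    if pinned is None:
        raise PermissionError(f"{host!r} resolved to no addresses")
    return str(pinned)


# Simulated rebinding: the first lookup is public, later lookups are loopback.
lookups = []


def rebinding_resolver(host: str) -> list:
    lookups.append(host)
    return ["93.184.216.34"] if len(lookups) == 1 else ["127.0.0.1"]


pinned_address = pin_validated_address("rebind.example", rebinding_resolver)
```

The caller would then connect to `pinned_address` directly instead of resolving the name again, while still sending the original host name for the Host header and TLS verification. That wiring depends on the HTTP client and is left out here.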

Code Walkthrough

openviking/utils/network_guard.py
Introduces the destination-host parser, DNS resolution checks, non-public address rejection, and reusable httpx request hooks.
openviking/server/local_input_guard.py and openviking/server/routers/resources.py
Keep the existing remote-input contract but now require public remote targets for server-side resource ingestion.
openviking/service/resource_service.py
Adds enforce_public_remote_targets, re-validates remote sources, injects the request validator into parser kwargs, and preserves the boolean flag for watch-task reprocessing.
openviking/utils/media_processor.py and openviking/parse/parsers/html.py
Propagate the validator into URL detection and fetch helpers so outbound HEAD and GET requests, redirects, and proxy inheritance are checked consistently.
openviking/utils/resource_processor.py
Re-raises framework security errors instead of flattening them into parse warnings.
tests/server/test_api_local_input_security.py
Adds regression tests that fail if loopback or private-network fetches become reachable again through the HTTP resource ingestion path.
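
The error-handling change in resource_processor.py comes down to the ordering of exception handlers. The sketch below shows that pattern with stand-in classes: the `OpenVikingError` defined here is a local placeholder for the real framework error, and `PermissionDenied`, `process`, and `outcome` are hypothetical names.

```python
class OpenVikingError(Exception):
    """Local stand-in for the framework error type named in this PR."""


class PermissionDenied(OpenVikingError):
    """Hypothetical subclass representing a PERMISSION_DENIED rejection."""


def process(source: str, parse) -> dict:
    """Run a parser; downgrade parse failures but never security rejections."""
    try:
        return {"status": "ok", "content": parse(source)}
    except OpenVikingError:
        # Policy decisions must reach the API layer as hard errors.
        raise
    except Exception as exc:
        # Ordinary parser trouble is still reported as a warning.
        return {"status": "warning", "warnings": [f"parse failed: {exc}"]}


def outcome(parse) -> str:
    """Summarise what a caller observes for a given parser behaviour."""
    try:
        return process("http://127.0.0.1:8765/", parse)["status"]
    except OpenVikingError as exc:
        return f"raised {type(exc).__name__}"


def broken_parser(source: str) -> str:
    raise ValueError("malformed HTML")


def blocked_parser(source: str) -> str:
    raise PermissionDenied("non-public target")
```

With a single broad `except Exception` handler, `blocked_parser` would have produced a warning result, which is the swallowing behaviour described under Root Cause.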

github-actions · 2026-03-31T13:22:03Z

Failed to generate code suggestions for PR

github-actions · 2026-03-31T13:22:36Z

Failed to generate code suggestions for PR

Harden HTTP resource ingestion against private-network SSRF

37f1767

github-project-automation bot added this to OpenViking project Mar 31, 2026

github-project-automation bot moved this to Backlog in OpenViking project Mar 31, 2026

13ernkastel marked this pull request as ready for review March 31, 2026 13:08

13ernkastel changed the title ~~Harden HTTP resource ingestion against private-network SSRF~~ security: feat(resources): harden HTTP resource ingestion against private-network SSRF Mar 31, 2026

13ernkastel added 2 commits March 31, 2026 21:28

chore(ci): retrigger checks

94e30e2

style: fix resource service import order

e4d179f

13ernkastel changed the title ~~security: feat(resources): harden HTTP resource ingestion against private-network SSRF~~ fix security: feat(resources): harden HTTP resource ingestion against private-network SSRF Apr 2, 2026

13ernkastel wants to merge 3 commits into volcengine:main from 13ernkastel:security/http-resource-ssrf-guard
