security: validate Hydra _target_ before instantiate() to prevent ACE (CWE-913) by Allen930311 · Pull Request #40 · SapienzaNLP/relik

Allen930311 · 2026-05-19T05:36:05Z

Summary

hydra.utils.instantiate() is called in multiple places with config loaded directly from HuggingFace Hub (a config.yaml fully controlled by the model author). A malicious model can set _target_ to any Python callable — e.g. os.system, builtins.exec, or torch.hub.load pointing to an attacker-controlled GitHub repo — achieving arbitrary code execution on the machine that loads the model.

Affected call sites (all reachable via Relik.from_pretrained("attacker/model")):

| File | Function | Line |
| --- | --- | --- |
| `relik/inference/utils.py` | `_instantiate_index()` | ~222 |
| `relik/inference/utils.py` | `_instantiate_retriever()` | ~42 |
| `relik/inference/utils.py` | `load_reader()` | ~371 |
| `relik/inference/annotator.py` | `Relik.from_pretrained()` | ~775 |
| `relik/retriever/indexers/base.py` | `BaseDocumentIndex.from_pretrained()` | ~549 |

Same vulnerability class as CVE-2025-23304 (NeMo) and CVE-2026-22584 (Uni2TS).

Fix

Added `_validate_hydra_target(config)` in `relik/inference/utils.py`, which rejects any `_target_` that does not start with the `relik.` prefix, and called it before every `hydra.utils.instantiate()` invocation in all affected files.

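from omegaconf import DictConfig, OmegaConf

# Allow-list of dotted-path prefixes that a _target_ may start with.
_SAFE_HYDRA_PREFIXES = ("relik.",)


def _validate_hydra_target(config: DictConfig) -> None:
    """Raise ValueError if the config's top-level _target_ is outside the allow-list."""
    # A config without _target_ yields None and passes through unchanged.
    target = OmegaConf.select(config, "_target_", default=None)
    if target is not None and not any(
        target.startswith(p) for p in _SAFE_HYDRA_PREFIXES
    ):
        raise ValueError(
            f"Unsafe Hydra _target_ '{target}': only targets within "
            f"{_SAFE_HYDRA_PREFIXES} are permitted."
        )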
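The helper as written inspects only the top-level `_target_`. Because `hydra.utils.instantiate()` by default also instantiates nested nodes that carry their own `_target_`, a nested entry may be worth checking as well. The following is a minimal sketch of a recursive check, not part of this PR. It assumes the config has first been converted to plain Python containers (for example with `OmegaConf.to_container(config, resolve=True)`); the names `iter_targets`, `validate_targets_recursive`, and the `relik.example.*` targets are hypothetical.

```python
from collections.abc import Mapping, Sequence

# Mirrors _SAFE_HYDRA_PREFIXES from the PR (assumption: same allow-list).
SAFE_PREFIXES = ("relik.",)


def iter_targets(node, path=""):
    """Yield (path, target) for every `_target_` key found anywhere in node."""
    if isinstance(node, Mapping):
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            if key == "_target_":
                yield child, value
            else:
                yield from iter_targets(value, child)
    elif isinstance(node, Sequence) and not isinstance(node, (str, bytes)):
        for i, item in enumerate(node):
            yield from iter_targets(item, f"{path}[{i}]")


def validate_targets_recursive(container):
    """Raise ValueError on the first `_target_` outside the allow-list."""
    for path, target in iter_targets(container):
        # str.startswith accepts a tuple of prefixes; non-strings are rejected.
        if not isinstance(target, str) or not target.startswith(SAFE_PREFIXES):
            raise ValueError(f"Unsafe Hydra _target_ {target!r} at '{path}'")


# Hypothetical config: the top level looks safe, the nested node does not.
nested_bad = {
    "_target_": "relik.example.Retriever",
    "index": {"_target_": "os.system", "_args_": ["id"]},
}
try:
    validate_targets_recursive(nested_bad)
    nested_rejected = False
except ValueError:
    nested_rejected = True
```

Walking plain containers keeps the sketch independent of OmegaConf node types; whether interpolations should be resolved before the walk is a design decision left open here.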

Test plan

- `Relik.from_pretrained("legit/model")` with a valid `relik.*` target continues to work
- Config with `_target_: os.system` raises `ValueError` before `hydra.utils.instantiate` is called
- Config with `_target_: torch.hub.load` raises `ValueError` (blocks the `torch.hub.load` ACE bypass)
- All existing tests pass
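One way to exercise the "raises before instantiate is called" items is to pass a recording stub in place of the real instantiate function and confirm it is never reached. This is a sketch under stated assumptions: it uses plain dicts instead of `DictConfig`, and `validate_target`, `guarded_instantiate`, `RecordingInstantiate`, and `rejected_before_call` are hypothetical stand-ins, not names from the relik code base.

```python
SAFE_PREFIXES = ("relik.",)


def validate_target(config):
    """Plain-dict stand-in for the PR's _validate_hydra_target()."""
    target = config.get("_target_")
    if target is not None and not str(target).startswith(SAFE_PREFIXES):
        raise ValueError(f"Unsafe Hydra _target_ {target!r}")


def guarded_instantiate(config, instantiate_fn):
    """Validate first, then delegate to the supplied instantiate function."""
    validate_target(config)
    return instantiate_fn(config)


class RecordingInstantiate:
    """Stub that records which targets it was asked to instantiate."""

    def __init__(self):
        self.calls = []

    def __call__(self, config):
        self.calls.append(config["_target_"])
        return object()


def rejected_before_call(target):
    """True if the guard raised and the stub was never invoked."""
    spy = RecordingInstantiate()
    try:
        guarded_instantiate({"_target_": target}, spy)
    except ValueError:
        return spy.calls == []
    return False
```

A look-alike package name such as `relik_evil.payload` is also rejected, because the allow-listed prefix includes the trailing dot.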

Security Impact

CVSS 3.1: AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H = 8.8 HIGH
A user who runs Relik.from_pretrained("attacker/model") with a malicious model executes attacker code with the privileges of the Python process.
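The 8.8 figure can be reproduced from the CVSS 3.1 base-score equations. The sketch below covers only the Scope Unchanged case used by this vector; the metric weights and the rounding rule are transcribed from memory of the FIRST specification and should be checked against it before being relied on. The function names are hypothetical.

```python
import math

# Metric weights for CVSS 3.1 (PR values are those for Scope Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, avoiding floating-point artifacts."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(vector):
    """Compute the base score for a vector such as 'AV:N/AC:L/.../S:U/...'."""
    metrics = dict(part.split(":") for part in vector.split("/"))
    if metrics.get("S") != "U":
        raise ValueError("this sketch only handles Scope Unchanged (S:U)")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score_scope_unchanged("AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H")
```

For this vector the impact sub-score is about 5.87 and the exploitability sub-score about 2.84; their sum, rounded up to one decimal, gives 8.8.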

`hydra.utils.instantiate()` is called with config loaded directly from HuggingFace Hub (config.yaml supplied by the model author). A malicious model can set `_target_` to any Python callable (e.g. `os.system`, `torch.hub.load` with an attacker-controlled hubconf.py), achieving arbitrary code execution on the loading machine.

Add `_validate_hydra_target()` in `relik/inference/utils.py` which rejects any `_target_` that does not start with the `relik.` prefix. Apply the guard before every `hydra.utils.instantiate()` call in:

- `_instantiate_retriever()` (utils.py)
- `_instantiate_index()` (utils.py)
- `load_reader()` (utils.py)
- `Relik.from_pretrained()` (annotator.py)
- `BaseDocumentIndex.from_pretrained()` (retriever/indexers/base.py)

Same vulnerability class as CVE-2025-23304 (NeMo) and CVE-2026-22584 (Uni2TS).

Allen930311 wants to merge 1 commit into SapienzaNLP:main from Allen930311:fix/hydra-instantiate-arbitrary-code-execution
