diff --git a/API.md b/API.md
index 3aa396a..0e342e5 100644
--- a/API.md
+++ b/API.md
@@ -8,7 +8,7 @@ import pyisolate as psi
 
 ### Canonical cell execution model
 
-PyIsolate supports exactly seven cell operations: `exec source`, `call dotted function`, `import module`, `post messages`, `stream logs`, `emit metrics`, and `request broker actions`.
+PyIsolate supports exactly seven cell operations: `exec`, `call`, `post`, `recv`, `log`, `metric`, and `request`.
 
 Isolation mode is explicit in the public API. Use `backend="subinterpreter"` for the execution-cell backend, `backend="process"` for one sandbox per OS process, and `backend="microvm"` for a process placed behind a microVM boundary. The cell contract is the same in every mode, but only the process and microVM modes are intended to represent hard blast-radius boundaries.
 
@@ -33,9 +33,12 @@ result = sb.recv(timeout=0.1) # 1.4142135623
 | Method | Semantics |
 |--------|-----------|
 | `exec(src)` | Run source in guest. Exceptions are posted to the outbox and must be retrieved with `recv()`. |
-| `call(func, *args, **kw)` | Import‑free RPC: call dotted `func` inside guest. |
+| `call(func, *args, **kw)` | Call dotted `func` inside guest using policy-controlled module resolution. |
 | `recv(timeout=None)` | Blocking receive from guest channel. |
 | `post(obj)` *(guest side)* | Send picklable object to supervisor. |
+| `log(level, message, **fields)` *(guest side)* | Emit a structured log event. |
+| `metric(name, value, tags=None)` *(guest side)* | Emit a metric datapoint. |
+| `request(capability, action, payload=None)` *(guest side)* | Ask the broker to perform a privileged action through an explicit capability. |
 | `enable_tracing()` | Start recording guest operations. |
 | `get_syscall_log()` | Return recorded operations. |
 | `profile()` | Snapshot of current CPU and memory usage. |
diff --git a/README.md b/README.md
index abe5b87..64e53c9 100644
--- a/README.md
+++ b/README.md
@@ -198,7 +198,7 @@ Use `pyisolate.policy.refresh("policy/.yml", token="secret")` to hot‑loa
 
 ## Canonical execution model
 
-A cell is intentionally limited to seven operations: execute source, call a dotted function, import allowed modules, post messages, stream logs, emit metrics, and request broker actions.
+A cell is intentionally limited to seven operations: `exec`, `call`, `post`, `recv`, `log`, `metric`, and `request`.
 
 The API makes the isolation choice explicit: `backend="subinterpreter"` means an execution cell, `backend="process"` means a separate OS process boundary, and `backend="microvm"` means a process behind a microVM boundary. The cell contract stays the same across modes, but the security boundary does not: sub-interpreters are not treated as a hard boundary.
 
diff --git a/docs/execution-model.md b/docs/execution-model.md
index f6dc870..6a96003 100644
--- a/docs/execution-model.md
+++ b/docs/execution-model.md
@@ -1,7 +1,10 @@
 # Execution model (canonical)
 
-A sandboxed cell has exactly **one** execution contract.
+A sandboxed cell has exactly **one** execution contract: the minimal cell ABI.
+It is versioned in `pyisolate.runtime.protocol` as `MINIMAL_CELL_ABI` and is
+frozen to seven operation names.
 
+## Minimal cell ABI v1
 The public API names the isolation backend explicitly: `backend="subinterpreter"` is the execution-cell mode, `backend="process"` is the process-boundary mode, and `backend="microvm"` is the microVM-boundary mode. These modes change the containment boundary, not the seven cell operations below.
 
 ## Allowed operations
@@ -10,22 +13,34 @@ The public API names the isolation backend explicitly: `backend="subinterpreter"
    Execute source text inside the cell runtime.
 2. **`call(dotted_function, *args, **kwargs)`**
    Invoke a fully-qualified function path (`module.func`) inside the cell.
-3. **`import module`**
-   Import only modules allowed by policy (`allowed_imports` + policy imports).
-4. **`post(message)`**
+3. **`post(message)`**
    Send a single picklable message to the supervisor channel.
-5. **`stream logs`**
-   Emit structured log events as messages on the same channel (log envelope).
-6. **`emit metrics`**
-   Emit metric datapoints as messages on the same channel (metric envelope).
-7. **`request broker actions`**
-   Ask the supervisor/broker to perform privileged actions by posting broker request envelopes.
+4. **`recv(timeout=None)`**
+   Receive the next item from the cell channel.
+5. **`log(level, message, **fields)`**
+   Emit a structured `LogEvent` on the same channel.
+6. **`metric(name, value, tags=None)`**
+   Emit a numeric `MetricEvent` on the same channel.
+7. **`request(capability, action, payload=None)`**
+   Ask the supervisor/broker to perform a privileged action through an explicit
+   broker capability. If the capability was not granted, the request is rejected.
+
+## Broker capabilities, not surface growth
+
+The ABI deliberately does not grow new first-class operations. Filesystem,
+network, subprocess, secret, clock, random, IPC, and future privileged behaviors
+must be represented as explicit broker capabilities and reached through
+`request(...)` or capability objects supplied by policy.
+
+Allowed imports remain a policy-controlled implementation detail that lets
+`call(module.func, ...)` and `exec(...)` resolve code. Importing is not a cell ABI
+operation and must not be documented or tested as a separate guest surface.
 
 ## Non-goals (intentionally refused)
 
 Anything outside the seven operations above is out of model and should be rejected.
-In particular, we do **not** add ad-hoc host RPC, shared mutable globals, direct privileged syscalls,
-or extra control planes.
+In particular, we do **not** add ad-hoc host RPC, shared mutable globals, direct
+privileged syscalls, implicit imports, or extra control planes.
 
 ## Why this stays small
 
@@ -36,4 +51,5 @@ Production safety improves when the surface area is fixed:
 - failure modes are bounded,
 - compatibility is easier to preserve.
 
-If a new feature cannot be expressed as one of the seven operations, it is not a cell feature.
+If a new feature cannot be expressed as one of the seven operations or as a
+broker capability behind `request(...)`, it is not a cell feature.
diff --git a/docs/protocol.md b/docs/protocol.md
index 5369f22..dca9491 100644
--- a/docs/protocol.md
+++ b/docs/protocol.md
@@ -8,18 +8,36 @@ as separate systems.
 
 Crossings are intentionally minimal and explicit.
 
+## Minimal cell ABI
+
+`pyisolate.runtime.protocol.MINIMAL_CELL_ABI` pins the public cell surface at
+version 1. The only cell operations are:
+
+- `exec` -> `ExecRequest(source)`
+- `call` -> `CallRequest(target, args, kwargs)`
+- `post` -> guest message send
+- `recv` -> host receive from the cell channel
+- `log` -> `LogEvent(level, message, fields)`
+- `metric` -> `MetricEvent(name, value, tags)`
+- `request` -> `BrokerRequest(capability, action, payload)`
+
+Everything else must go through broker capabilities. New filesystem, network,
+secret, subprocess, or other privileged behavior should not add new cell ABI
+verbs; it should add or refine a broker capability and use `request`.
+
 ## Plane crossings
 
 Only structured messages are allowed across the queue boundary.
-`pyisolate.runtime.protocol` defines the request vocabulary:
+`pyisolate.runtime.protocol` defines the trusted/internal request vocabulary:
 
 - `ExecRequest(source)`
 - `CallRequest(target, args, kwargs)`
-- `AttachCgroupRequest(old_path)`
-- `StopRequest()`
-- `ControlRequest(op, capability, payload)`
+- `AttachCgroupRequest(old_path)` (internal supervisor plumbing)
+- `StopRequest()` (internal lifecycle sentinel)
+- `ControlRequest(op, capability, payload)` (authenticated supervisor control)
 
-This replaces ambient tuple/string payloads with typed requests.
+This replaces ambient tuple/string payloads with typed requests while keeping the
+public cell ABI frozen.
 
 ## Capability handles
 
diff --git a/pyisolate/runtime/protocol.py b/pyisolate/runtime/protocol.py
index 8949e23..294e8b4 100644
--- a/pyisolate/runtime/protocol.py
+++ b/pyisolate/runtime/protocol.py
@@ -1,14 +1,66 @@
-"""Explicit request protocol between trusted and untrusted planes.
+"""Explicit minimal cell ABI between trusted and untrusted planes.
 
 The trusted control-plane (supervisor, broker, metrics, policy engine) communicates
-with untrusted sandbox threads via structured message types only.
+with untrusted sandbox threads via a fixed vocabulary only. The public cell ABI is
+intentionally tiny: ``exec``, ``call``, ``post``, ``recv``, ``log``, ``metric``, and
+``request``. Any operation outside this vocabulary must be expressed as a broker
+capability request instead of growing the guest surface.
 """
 
 from __future__ import annotations
 
-from dataclasses import dataclass
+from dataclasses import dataclass, field
+from enum import Enum
 from pathlib import Path
-from typing import Any
+from typing import Any, Mapping
+
+ABI_VERSION = 1
+"""Monotonic version for the frozen cell ABI."""
+
+
+class CellOp(str, Enum):
+    """The complete public operation set exposed by a cell."""
+
+    EXEC = "exec"
+    CALL = "call"
+    POST = "post"
+    RECV = "recv"
+    LOG = "log"
+    METRIC = "metric"
+    REQUEST = "request"
+
+
+CELL_ABI: tuple[CellOp, ...] = (
+    CellOp.EXEC,
+    CellOp.CALL,
+    CellOp.POST,
+    CellOp.RECV,
+    CellOp.LOG,
+    CellOp.METRIC,
+    CellOp.REQUEST,
+)
+"""Canonical ordered list of operations in the minimal cell ABI."""
+
+CELL_ABI_NAMES: tuple[str, ...] = tuple(op.value for op in CELL_ABI)
+"""String names for documentation, validation, and tests."""
+
+
+@dataclass(frozen=True)
+class CellABI:
+    """Description of the frozen guest/control protocol surface."""
+
+    version: int = ABI_VERSION
+    operations: tuple[str, ...] = CELL_ABI_NAMES
+
+    def allows(self, op: str | CellOp) -> bool:
+        """Return whether *op* is part of the frozen ABI."""
+
+        name = op.value if isinstance(op, CellOp) else op
+        return name in self.operations
+
+
+MINIMAL_CELL_ABI = CellABI()
+"""Runtime constant used by conformance checks to pin the cell surface."""
 
 
 @dataclass(frozen=True)
@@ -24,6 +76,7 @@ class ExecRequest:
     """Execute source code in the workload plane."""
 
     source: str
+    op: CellOp = CellOp.EXEC
 
 
 @dataclass(frozen=True)
@@ -33,11 +86,61 @@ class CallRequest:
     target: str
     args: tuple[Any, ...]
     kwargs: dict[str, Any]
+    op: CellOp = CellOp.CALL
+
+
+@dataclass(frozen=True)
+class RecvRequest:
+    """Receive the next message from the cell channel."""
+
+    timeout: float | None = None
+    op: CellOp = CellOp.RECV
+
+
+@dataclass(frozen=True)
+class PostEvent:
+    """Guest-to-supervisor message sent with ``post``."""
+
+    message: Any
+    op: CellOp = CellOp.POST
+
+
+@dataclass(frozen=True)
+class LogEvent:
+    """Structured guest log record emitted on the cell channel."""
+
+    level: str
+    message: str
+    fields: Mapping[str, Any] = field(default_factory=dict)
+    op: CellOp = CellOp.LOG
+
+
+@dataclass(frozen=True)
+class MetricEvent:
+    """Metric datapoint emitted on the cell channel."""
+
+    name: str
+    value: int | float
+    tags: Mapping[str, str] = field(default_factory=dict)
+    op: CellOp = CellOp.METRIC
+
+
+@dataclass(frozen=True)
+class BrokerRequest:
+    """Request for a privileged broker action through an explicit capability."""
+
+    capability: str
+    action: str
+    payload: Mapping[str, Any] = field(default_factory=dict)
+    op: CellOp = CellOp.REQUEST
 
 
 @dataclass(frozen=True)
 class AttachCgroupRequest:
-    """Control-plane request to (re)attach to a cgroup path."""
+    """Control-plane request to (re)attach to a cgroup path.
+
+    This is internal supervisor plumbing, not part of the public cell ABI.
+    """
 
     old_path: Path | None
     msg_id: int = 0
@@ -45,12 +148,19 @@ class AttachCgroupRequest:
 
 @dataclass(frozen=True)
 class StopRequest:
-    """Sentinel request indicating sandbox thread termination."""
+    """Sentinel request indicating sandbox thread termination.
+
+    This is internal supervisor plumbing, not part of the public cell ABI.
+    """
 
 
 @dataclass(frozen=True)
 class ControlRequest:
-    """Authenticated control operation crossing plane boundaries."""
+    """Authenticated control operation crossing plane boundaries.
+
+    Supervisor control requests are outside the guest ABI and must carry an
+    explicit root or policy capability.
+    """
 
     op: str
     capability: CapabilityHandle
diff --git a/pyisolate/runtime/thread.py b/pyisolate/runtime/thread.py
index d0d07e0..cc6900e 100644
--- a/pyisolate/runtime/thread.py
+++ b/pyisolate/runtime/thread.py
@@ -40,8 +40,11 @@
 )
 from .protocol import (
     AttachCgroupRequest,
+    BrokerRequest,
     CallRequest,
     ExecRequest,
+    LogEvent,
+    MetricEvent,
     StopRequest,
 )
 from ..numa import bind_current_thread
@@ -652,7 +655,11 @@ def snapshot(self) -> dict:
             "policy": self.policy,
             "cpu_ms": self.cpu_quota_ms,
             "mem_bytes": self.mem_quota_bytes,
-            "allowed_imports": sorted(self.allowed_imports),
+            "allowed_imports": (
+                sorted(self.allowed_imports)
+                if self.allowed_imports is not None
+                else None
+            ),
             "numa_node": self.numa_node,
             "capabilities": serialize_capabilities(self._capabilities),
             "wall_time_ms": self.wall_time_ms,
@@ -673,7 +680,11 @@ def reset_config(self) -> dict[str, Any]:
             "network_ops_max": self.network_ops_max,
             "output_bytes_max": self.output_bytes_max,
             "child_work_max": self.child_work_max,
-            "allowed_imports": sorted(self.allowed_imports),
+            "allowed_imports": (
+                sorted(self.allowed_imports)
+                if self.allowed_imports is not None
+                else None
+            ),
             "numa_node": self.numa_node,
             "capabilities": serialize_capabilities(self._capabilities),
         }
@@ -703,6 +714,9 @@ def _estimate_output_size(item: Any) -> int:
         return len(repr(item).encode("utf-8"))
 
     def _post(self, item: Any) -> None:
+        self._emit(item)
+
+    def _emit(self, item: Any) -> None:
         self._output_bytes += self._estimate_output_size(item)
         if (
             self.output_bytes_max is not None
@@ -711,6 +725,27 @@ def _post(self, item: Any) -> None:
             raise errors.OutputExceeded()
         self._outbox.put(item)
 
+    def _log(self, level: str, message: str, **fields: Any) -> None:
+        self._emit(LogEvent(level=level, message=message, fields=fields))
+
+    def _metric(
+        self, name: str, value: int | float, tags: Optional[dict[str, str]] = None
+    ) -> None:
+        self._emit(MetricEvent(name=name, value=value, tags=tags or {}))
+
+    def _request(
+        self, capability: str, action: str, payload: Optional[dict[str, Any]] = None
+    ) -> None:
+        if capability not in self._capabilities:
+            raise errors.PolicyError(f"capability request blocked: {capability}")
+        self._emit(
+            BrokerRequest(
+                capability=capability,
+                action=action,
+                payload=payload or {},
+            )
+        )
+
     def _check_open_files_quota(self) -> None:
         if self.open_files_max is not None and self._open_files >= self.open_files_max:
             raise errors.OpenFilesExceeded()
@@ -915,7 +950,13 @@ def run(self) -> None:
         self._cpu_time = 0.0
         self._start_time = None
 
-        local_vars = {"post": self._post, "caps": self._capabilities}
+        local_vars = {
+            "post": self._post,
+            "log": self._log,
+            "metric": self._metric,
+            "request": self._request,
+            "caps": self._capabilities,
+        }
         if self.numa_node is not None:
             bind_current_thread(self.numa_node)
 
diff --git a/tests/test_protocol_plane.py b/tests/test_protocol_plane.py
index 59be0f5..334307d 100644
--- a/tests/test_protocol_plane.py
+++ b/tests/test_protocol_plane.py
@@ -1,9 +1,35 @@
-from pyisolate.capabilities import ROOT
-from pyisolate.runtime.protocol import CallRequest, ExecRequest
+import pytest
+
+from pyisolate import errors
+from pyisolate.capabilities import ROOT, FilesystemCapability
+from pyisolate.runtime.protocol import (
+    BrokerRequest,
+    CallRequest,
+    CellOp,
+    ExecRequest,
+    LogEvent,
+    MetricEvent,
+    MINIMAL_CELL_ABI,
+)
 from pyisolate.runtime.thread import SandboxThread
 from pyisolate.supervisor import Supervisor
 
 
+def test_minimal_cell_abi_is_frozen_to_seven_operations():
+    assert MINIMAL_CELL_ABI.version == 1
+    assert MINIMAL_CELL_ABI.operations == (
+        "exec",
+        "call",
+        "post",
+        "recv",
+        "log",
+        "metric",
+        "request",
+    )
+    assert {op.value for op in CellOp} == set(MINIMAL_CELL_ABI.operations)
+    assert not MINIMAL_CELL_ABI.allows("import")
+
+
 def test_sandbox_thread_uses_structured_requests(monkeypatch):
     thread = SandboxThread(name="proto")
     captured = []
@@ -19,9 +45,56 @@ def capture(msg):
     try:
         thread.exec("x=1")
         assert isinstance(captured[0], ExecRequest)
+        assert captured[0].op is CellOp.EXEC
 
         assert thread.call("builtins.len", [1, 2, 3]) == 3
         assert isinstance(captured[1], CallRequest)
+        assert captured[1].op is CellOp.CALL
+    finally:
+        thread.stop()
+
+
+def test_guest_log_metric_and_request_are_channel_events(tmp_path):
+    cap_root = tmp_path / "allowed"
+    cap_root.mkdir()
+    thread = SandboxThread(
+        name="abi-events",
+        capabilities={"filesystem": FilesystemCapability.from_paths(cap_root)},
+    )
+    thread.start()
+    try:
+        thread.exec(
+            "log('info', 'ready', component='guest')\n"
+            "metric('jobs_total', 1, {'unit': 'count'})\n"
+            "request('filesystem', 'stat', {'path': 'allowed'})"
+        )
+
+        log_event = thread.recv(timeout=0.5)
+        metric_event = thread.recv(timeout=0.5)
+        broker_request = thread.recv(timeout=0.5)
+
+        assert log_event == LogEvent(
+            level="info", message="ready", fields={"component": "guest"}
+        )
+        assert metric_event == MetricEvent(
+            name="jobs_total", value=1, tags={"unit": "count"}
+        )
+        assert broker_request == BrokerRequest(
+            capability="filesystem", action="stat", payload={"path": "allowed"}
+        )
+    finally:
+        thread.stop()
+
+
+def test_guest_request_requires_explicit_capability():
+    thread = SandboxThread(name="abi-request-denied")
+    thread.start()
+    try:
+        thread.exec("request('filesystem', 'stat', {'path': 'x'})")
+        with pytest.raises(
+            errors.PolicyError, match="capability request blocked: filesystem"
+        ):
+            thread.recv(timeout=0.5)
     finally:
         thread.stop()
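
Reviewer note: with this change, `log`, `metric`, and `request` all arrive on the same channel as plain `post` payloads, so a host that calls `recv()` has to tell them apart. The sketch below is one possible way to do that by reading the `op` field the new event types carry. It is not part of the patch: `classify` and `drain` are hypothetical helper names, and the event classes are trimmed stand-ins that only mirror the field layout shown in `pyisolate/runtime/protocol.py`, so the snippet runs without PyIsolate installed.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Mapping, Union


class CellOp(str, Enum):
    # Stand-in for the subset of ops that can appear on the outbound channel.
    POST = "post"
    LOG = "log"
    METRIC = "metric"
    REQUEST = "request"


@dataclass(frozen=True)
class LogEvent:
    level: str
    message: str
    fields: Mapping[str, Any] = field(default_factory=dict)
    op: CellOp = CellOp.LOG


@dataclass(frozen=True)
class MetricEvent:
    name: str
    value: Union[int, float]
    tags: Mapping[str, str] = field(default_factory=dict)
    op: CellOp = CellOp.METRIC


@dataclass(frozen=True)
class BrokerRequest:
    capability: str
    action: str
    payload: Mapping[str, Any] = field(default_factory=dict)
    op: CellOp = CellOp.REQUEST


def classify(item: Any) -> CellOp:
    """Hypothetical helper: map a received channel item to its ABI op.

    Anything without a CellOp-valued ``op`` attribute is treated as a
    plain ``post`` payload.
    """
    op = getattr(item, "op", None)
    return op if isinstance(op, CellOp) else CellOp.POST


def drain(items: list) -> dict:
    """Hypothetical helper: bucket already-received items by op, in order."""
    buckets = {op: [] for op in CellOp}
    for item in items:
        buckets[classify(item)].append(item)
    return buckets


# Items shaped like the ones the new test receives, plus one plain payload.
received = [
    LogEvent("info", "ready", {"component": "guest"}),
    MetricEvent("jobs_total", 1, {"unit": "count"}),
    BrokerRequest("filesystem", "stat", {"path": "allowed"}),
    1.4142135623,
]
buckets = drain(received)
```

One caveat with this approach: a guest can `post` any picklable object, including one that carries its own `op` attribute, so a real supervisor would probably want an `isinstance` check against the protocol types rather than attribute sniffing.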