Add safety guardrails example using Sentinel AI#654

Open
MaxwellCalkin wants to merge 1 commit into anthropics:main from MaxwellCalkin:add-safety-guardrails-example
@MaxwellCalkin

Summary

Adds a new example showing how to integrate real-time safety scanning into the Claude Agent SDK using Sentinel AI, an open-source guardrails library.

The example demonstrates two approaches:

  • PreToolUse hooks — scan tool arguments for dangerous commands, data exfiltration, and prompt injection before execution
  • Tool permission callbacks — use PermissionResultAllow/PermissionResultDeny based on safety analysis, with automatic PII redaction via updated_input
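For reviewers who want to see the shape of the first approach, here is a minimal sketch of a PreToolUse-style hook. It uses only the standard library: `scan_command` and `DANGEROUS_PATTERNS` are hypothetical stand-ins for a Sentinel AI scan (the library's actual API is not reproduced here), and the hook signature and return dictionary follow the pattern I understand the SDK's existing `hooks.py` example to use, which should be checked against the SDK before relying on it.

```python
import asyncio
import re
from typing import Any, Dict, Optional

# Hypothetical stand-in for a Sentinel AI scan. These two regexes are
# illustrative only and are not the library's real rule set.
DANGEROUS_PATTERNS = [
    (re.compile(r"\brm\s+-rf\s+/(?:\s|$)"), "recursive delete of the root directory"),
    (re.compile(r"\bcurl\b[^|]*\|\s*(?:ba)?sh\b"), "remote script piped to a shell"),
]


def scan_command(command: str) -> Optional[str]:
    """Return a reason string if the command looks dangerous, else None."""
    for pattern, reason in DANGEROUS_PATTERNS:
        if pattern.search(command):
            return reason
    return None


async def pre_tool_use_guard(
    input_data: Dict[str, Any], tool_use_id: Optional[str], context: Any
) -> Dict[str, Any]:
    """Deny Bash tool calls whose command matches a dangerous pattern.

    Assumption: the hook receives the tool name and arguments under the
    "tool_name" and "tool_input" keys, and an empty dict means "no objection".
    """
    if input_data.get("tool_name") != "Bash":
        return {}
    command = input_data.get("tool_input", {}).get("command", "")
    reason = scan_command(command)
    if reason is None:
        return {}
    return {
        "hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "permissionDecision": "deny",
            "permissionDecisionReason": f"Blocked by safety scan: {reason}",
        }
    }


if __name__ == "__main__":
    blocked = asyncio.run(
        pre_tool_use_guard(
            {"tool_name": "Bash", "tool_input": {"command": "rm -rf /"}}, None, None
        )
    )
    print(blocked["hookSpecificOutput"]["permissionDecisionReason"])
```

In a real integration the function would be registered under the `PreToolUse` event in the agent options, and `scan_command` would delegate to the guardrails library instead of local regexes.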

Sentinel AI scans in about 0.05 ms and has no heavy dependencies, so it adds negligible latency to tool calls.

Why this is useful

Claude Agent SDK users building agentic applications need safety guardrails for tool calls. This example shows a production-ready pattern for:

  • Blocking dangerous shell commands (rm -rf /, curl | bash, credential access)
  • Detecting prompt injection in tool arguments
  • Automatically redacting PII before it reaches tools
  • Using both hooks and callbacks (the two SDK extension points)
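The callback side of the list above (injection detection plus PII redaction) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `AllowResult` and `DenyResult` are local stand-ins that mirror the role of the SDK's `PermissionResultAllow` and `PermissionResultDeny` so the snippet runs without the SDK installed, the `message` field name is assumed, and the two regexes are hypothetical placeholders for the Sentinel AI scanners.

```python
import asyncio
import re
from dataclasses import dataclass
from typing import Any, Dict, Optional, Union


@dataclass
class AllowResult:
    """Local stand-in for PermissionResultAllow (updated_input rewrites the arguments)."""
    updated_input: Optional[Dict[str, Any]] = None


@dataclass
class DenyResult:
    """Local stand-in for PermissionResultDeny (field name is an assumption)."""
    message: str = ""


# Hypothetical placeholder detectors, not Sentinel AI's real rules.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
INJECTION = re.compile(r"ignore (?:all )?previous instructions", re.IGNORECASE)


def redact(value: Any) -> Any:
    """Replace email addresses in string values; leave other types untouched."""
    if isinstance(value, str):
        return EMAIL.sub("[REDACTED_EMAIL]", value)
    return value


async def safety_permission_callback(
    tool_name: str, input_data: Dict[str, Any], context: Any
) -> Union[AllowResult, DenyResult]:
    """Deny on suspected prompt injection, otherwise allow with PII redacted."""
    text = " ".join(v for v in input_data.values() if isinstance(v, str))
    if INJECTION.search(text):
        return DenyResult(message=f"Possible prompt injection in {tool_name} arguments")
    redacted = {key: redact(value) for key, value in input_data.items()}
    if redacted != input_data:
        # Only pass updated_input when something actually changed.
        return AllowResult(updated_input=redacted)
    return AllowResult()


if __name__ == "__main__":
    result = asyncio.run(
        safety_permission_callback(
            "Write",
            {"file_path": "notes.txt", "content": "Contact alice@example.com"},
            None,
        )
    )
    print(result)
```

Returning the redacted arguments through `updated_input` lets the tool call proceed with sanitized data instead of failing outright, which is the behavior the description attributes to the callback approach.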

Test plan

  • Example runs with python examples/safety_guardrails.py hooks
  • Example runs with python examples/safety_guardrails.py callback
  • Follows existing example patterns (hooks.py, tool_permission_callback.py)
  • No changes to SDK source code

Generated with Claude Code

Shows two approaches for adding real-time safety scanning to the Claude Agent SDK:
1. PreToolUse hooks — scan tool arguments before execution
2. Tool permission callbacks — allow/deny based on safety analysis

Detects dangerous commands, data exfiltration, credential access, prompt
injection, and PII leaks with sub-millisecond latency using Sentinel AI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>