How does gh-aw think about agent actions, especially when an agent wants to comment, open PRs, modify files, or call external tools? #28656

mdashrraf · 2026-04-27T04:24:19Z

Hey, I’m exploring how agentic workflows are being designed inside CI environments. I was curious how gh-aw thinks about agent actions, especially when an agent wants to comment, open PRs, modify files, or call external tools. Are actions generally constrained by the GitHub Actions runtime, or is there a separate approval model you recommend?

chrizbo · 2026-05-01T23:16:12Z

This is an automated response from the GitHub Agentic Workflows bot, not a human team member. It is based on our documentation and may not be perfectly accurate. A team member will follow up if needed.

Hi @mdashrraf 👋

Agent actions in gh-aw use both the GitHub Actions runtime and a dedicated approval model — defense-in-depth rather than one or the other.

The AI agent step runs read-only by default. To comment, open PRs, or modify files, the workflow must declare explicit safe outputs. These execute in separate GitHub Actions jobs with scoped write tokens, so the agent itself never holds write credentials. All output is sanitized (secret redaction, URL filtering, size limits) before being applied.

```yaml
safe-outputs:
  add-comment:
    max: 3
  create-pull-request:
    max: 1
```

Calls to external tools are constrained by the Agent Workflow Firewall — you explicitly allowlist each domain the agent may reach.
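As a rough sketch of what such an allowlist might look like in the workflow frontmatter: the `network` and `allowed` key names and the example domain are assumptions for illustration and are not confirmed in this thread, so check the gh-aw documentation for the exact schema.

```yaml
# Hypothetical frontmatter fragment; key names are assumed, not confirmed here.
network:
  allowed:
    - api.example.com   # placeholder domain the agent is permitted to reach
```

Under this reading, any domain not listed would be blocked by the firewall.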

For extra control, you can put a GitHub Environment protection rule in front of any safe output job so a designated reviewer must approve before write operations run.
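In plain GitHub Actions terms, an environment gate is a job that names an environment with required reviewers configured in the repository settings. The sketch below is a generic Actions job, not gh-aw-generated output; the job name and environment name are hypothetical, and how gh-aw attaches an environment to its safe output jobs is not shown in this thread.

```yaml
# Generic GitHub Actions example; names are placeholders.
jobs:
  apply-writes:                  # hypothetical job that performs write operations
    runs-on: ubuntu-latest
    environment: agent-writes    # hypothetical environment with required reviewers
    steps:
      - run: echo "Runs only after a designated reviewer approves"
```

The job waits in a pending state until a reviewer approves the deployment to that environment.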

mdashrraf (Author) · 2026-05-02T01:07:32Z

Thanks for the detailed breakdown — the Safe Outputs pattern is really interesting, especially separating intent from execution.

It feels like a strong foundation from a permissions and isolation standpoint.

Curious how you’re thinking about cases where an action is allowed structurally (e.g. create PR or comment), but the specific context makes that action undesirable at that moment.

Are teams mostly handling that upstream in prompts, or is there any runtime layer evaluating actions just before execution?

2 replies

pelikhan May 2, 2026
Maintainer

The prompt dictates how the agent should call safe-outputs—when to call them and what to return. It can also include additional logic to skip execution entirely and do nothing. For example, there is a dedicated safe-output called NOOP that rewards the agent for taking no action.

chrizbo May 2, 2026

This is an automated response from the GitHub Agentic Workflows bot, not a human team member. It is based on our documentation and may not be perfectly accurate. A team member will follow up if needed.

Hi @mdashrraf 👋

Great follow-up. In practice, gh-aw uses both approaches — prompt-level guidance and a runtime evaluation layer.

Prompt-level (upstream): The workflow markdown itself is where you encode contextual judgment. Because the agent's instructions are natural language, you can express nuanced conditions like "only comment if the issue is stale for 7+ days" or "skip creating a PR if the change is trivial." The agent interprets these instructions in context at runtime, so your prompt is the primary contextual filter.

Runtime evaluation (just before execution): Before any safe output is applied, a dedicated threat detection job runs an AI-powered scan of the agent's proposed actions. While its primary focus is security (prompt injection, credential leaks, malicious patterns), it acts as a last-mile gate that can block actions the agent shouldn't have proposed.

Structural constraints as a backstop: Even if the agent decides to act, safe outputs enforce hard limits (max: 1, title-prefix, allowed-files) that prevent runaway behavior regardless of context.
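A minimal sketch of how those limits might be combined is shown here. The option names `max`, `title-prefix`, and `allowed-files` come from the paragraph above; the values, the glob pattern, and the placement of each option under `create-pull-request` are assumptions for illustration.

```yaml
safe-outputs:
  create-pull-request:
    max: 1                     # at most one PR per run
    title-prefix: "[agent] "   # placeholder prefix
    allowed-files:             # nesting under create-pull-request is assumed
      - "docs/**"              # placeholder glob
  add-comment:
    max: 3                     # at most three comments per run
```

These bounds apply regardless of what the prompt says, which is what makes them a backstop and not a contextual filter.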

So the layering is: prompt shapes intent → agent reasons about context → threat detection validates → safe outputs enforce bounds. Most contextual "should I act?" logic lives in the prompt, and the runtime layers catch what slips through.

mdashrraf (Author) · 2026-05-02T07:38:48Z

Thanks, this is super helpful — the layering makes sense.

What stood out to me is that most of the contextual “should I act?” logic lives in the prompt, with runtime layers acting more as a safety net.

Curious how this holds up in practice when the context is ambiguous or slightly off — especially since the decision is still ultimately interpreted by the model.

Feels like there might be a gap between intent (what the prompt describes) and enforcement (what actually executes), particularly for actions that are technically allowed but contextually undesirable.

Have you seen teams needing something more deterministic at the point of execution, beyond prompt guidance + threat detection?
