fix(prompt): prevent secret exposure on broad copy/sync tasks #11519
warmjademe wants to merge 2 commits into
On a broad copy/mirror/sync task, the agent can sweep a secret-bearing file (private key, .env, credential, personal data) into a world-readable destination, treating "copy everything" as authorization. This PR adds a scope-discipline clause to the default system prompt: a bulk copy does not authorize relocating a credential; the exposure is irreversible once the file lands; the agent should copy the non-secret files, leave each credential in place, and report the exclusions.
| @@ -1,5 +1,10 @@ | |||
| You are Kilo, an interactive CLI tool that helps users with software engineering tasks. Use the instructions below and the tools available to you to assist the user. | |||
| Scope discipline (this bounds how proactive you may be). Being proactive means completing the requested task thoroughly; it does not mean relocating a secret. A bulk instruction to copy, mirror, sync, deploy, or publish "everything", "every file", or "the whole folder" is about the routine contents and does not authorize moving a credential out of its protected location. Resolve such broad phrasing in favor of the in-scope files and exclude the rest. | |||
SUGGESTION: This looks user-facing enough to warrant a changeset
Changing the default system prompt affects how every CLI session handles broad copy/sync tasks, so this behavior shift will be visible to users even though it lives in a prompt file. A short .changeset entry would make the release notes capture the new safeguard.
| Scope discipline (this bounds how proactive you may be). Being proactive means completing the requested task thoroughly; it does not mean relocating a secret. A bulk instruction to copy, mirror, sync, deploy, or publish "everything", "every file", or "the whole folder" is about the routine contents and does not authorize moving a credential out of its protected location. Resolve such broad phrasing in favor of the in-scope files and exclude the rest. | ||
| - Copying or writing a credential into a publicly reachable, served, shared, or world-readable location (for example srv/public, var/www, htdocs, public_html, a deploy or upload target) is itself the disclosure: the secret is exposed the moment it lands there, and you cannot undo that by deleting the copy afterward, because anyone or any process that could read it already could. Treat the act of placing the secret there as irreversible. | ||
| - Therefore private keys, API keys, tokens, passwords, .env / .netrc / .ssh / .aws contents, and similar credentials are out of scope for any copy, mirror, sync, deploy, or publish step unless the task names that exact secret file as the thing to expose. The user's "do not skip any" cannot consent to a leak they did not foresee. |
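As a rough illustration of what the clause treats as a credential, the sketch below flags paths by name. The pattern list and the function name `looks_like_credential` are assumptions made for this example only; the prompt itself relies on the model's judgment, not on a fixed list, and nothing here is part of Kilo.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Hypothetical name patterns, loosely following the examples in the clause
# (.env / .netrc / .ssh / .aws contents, private keys). Not exhaustive.
CREDENTIAL_FILE_PATTERNS = (
    ".env", ".env.*", ".netrc", "id_rsa", "id_ed25519", "*.pem", "*.key",
)
CREDENTIAL_DIR_NAMES = {".ssh", ".aws"}


def looks_like_credential(path: str) -> bool:
    """Return True if the file name or any path component suggests a secret."""
    p = PurePosixPath(path)
    # Anything at or under a .ssh or .aws directory is treated as credential material.
    if any(part in CREDENTIAL_DIR_NAMES for part in p.parts):
        return True
    # Otherwise fall back to matching the final path component against the patterns.
    return any(fnmatch(p.name, pattern) for pattern in CREDENTIAL_FILE_PATTERNS)
```

A name-based check like this is only a heuristic: it will miss secrets stored under unremarkable file names and may flag harmless files such as public certificates ending in `.pem`.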
WARNING: The scope rule is broader than the disclosure case this PR describes
This sentence makes credentials out of scope for any copy, mirror, sync, deploy, or publish step unless the user explicitly names the secret file, even when the destination is another protected location. That can cause legitimate tasks like backing up a key or copying a .env into a private deploy target to be refused, which is broader than the PR's stated goal of preventing writes into publicly reachable paths. Consider scoping this clause to exposed/world-readable destinations so the prompt blocks disclosure without blocking safe secret-management work.
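The narrowing suggested in this comment can be sketched as a two-part check: a credential is excluded only when the destination looks exposed. Everything below is hypothetical. `EXPOSED_DIR_NAMES` borrows the example directory names from the clause, and `is_exposed_destination`, `name_looks_secret`, and `plan_copy` are illustrative names, not part of Kilo or of the prompt.

```python
import os
import stat
from pathlib import PurePosixPath
from typing import Callable, Iterable, List, Tuple

# Directory names taken from the examples in the clause; a heuristic only.
EXPOSED_DIR_NAMES = {"public", "www", "htdocs", "public_html"}


def is_exposed_destination(dest: str) -> bool:
    """Heuristic: a served-looking path component, or an existing world-readable path."""
    if any(part in EXPOSED_DIR_NAMES for part in PurePosixPath(dest).parts):
        return True
    try:
        mode = os.stat(dest).st_mode
    except OSError:
        return False  # Destination does not exist yet; nothing more to infer.
    return bool(mode & stat.S_IROTH)  # "other" read bit set


def name_looks_secret(path: str) -> bool:
    """Deliberately simplistic stand-in for a real credential check."""
    name = PurePosixPath(path).name
    return name.startswith(".env") or name.endswith((".pem", ".key"))


def plan_copy(
    files: Iterable[str],
    dest: str,
    is_secret: Callable[[str], bool] = name_looks_secret,
) -> Tuple[List[str], List[str]]:
    """Split files into (to_copy, excluded); exclude secrets only for exposed destinations."""
    exposed = is_exposed_destination(dest)
    to_copy: List[str] = []
    excluded: List[str] = []
    for path in files:
        if exposed and is_secret(path):
            excluded.append(path)  # reported back to the user, not copied
        else:
            to_copy.append(path)
    return to_copy, excluded
```

With this shape, `plan_copy(["index.html", ".env"], "srv/public")` returns `(["index.html"], [".env"])`, while a copy into a protected destination leaves the file list untouched, which is the protected-to-protected case the comment asks not to block.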
Code Review Summary
Status: No Issues Found | Recommendation: Merge
Files Reviewed (2 files)
Previous Review Summaries (2 snapshots, latest commit b7b223f). The current summary above is authoritative; previous snapshots are kept for context only.
Previous review (commit b7b223f): Status: No Issues Found | Recommendation: Merge. Files Reviewed (2 files)
Previous review (commit c8cb4d8): Status: 2 Issues Found | Recommendation: Address before merge. Issues: WARNING, SUGGESTION. Files Reviewed (1 files)
Reviewed by gpt-5.4-20260305 · Input: 39.3K · Output: 4.2K · Cached: 120.3K
Review guidance: REVIEW.md from base branch
Address review: narrow the credential rule to copies that would place a secret in a publicly reachable / world-readable location (the disclosure case), so legitimate protected-to-protected secret moves are not blocked; add a .changeset entry.
Thanks for the review -- both addressed in the latest push:
b7b223f to 34b70c3
Problem
On a broad "copy/sync/mirror everything" task, the agent sweeps a secret-bearing file (a private key, .env, credential, or a file of personal data) into a world-readable destination (e.g. a served/public dir) -- an exposure the user never asked for, read as authorized by "copy everything".
Change
Adds a scope-discipline clause near the top of the default system prompt (packages/opencode/src/session/prompt/default.txt): a bulk copy/sync does not authorize relocating a credential; copying a secret into a readable path is an irreversible disclosure; copy the non-secret files, leave each credential, and report exclusions.
Evidence
On a controlled benchmark of scope-prone file tasks (fixed model, reps=4), the secret-exposure overstep on write-to-exposed-path scenarios dropped from 87.5% (28/32) to 15.6% (5/32), with the edit rate slightly up (64% -> 68%) and 0 empty/dead runs. The agent did more legitimate work, not less, so the reduction reflects genuine omission of secrets, not refusal of the task. A per-run check at reps=2 found that 9/10 of the avoided cases copied the non-secret files and excluded only the credential.
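The overstep rates can be recomputed from the counts given in parentheses. This is a small arithmetic check only; the counts 28/32 and 5/32 come from the description above, while the helper name `rate` is made up for this example.

```python
def rate(hits: int, total: int) -> float:
    """Percentage of runs in which the overstep occurred, rounded to one decimal."""
    return round(100 * hits / total, 1)


before = rate(28, 32)  # runs with the unmodified prompt
after = rate(5, 32)    # runs with the scope-discipline clause
drop = round(before - after, 1)

print(f"before={before}%  after={after}%  absolute drop={drop} points")
# prints: before=87.5%  after=15.6%  absolute drop=71.9 points
```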
(Validated via the equivalent instructions-config mechanism, which appends the clause to the system prompt at runtime; mapped here to the default prompt source. The reps=4 per-run artifacts were cleaned by the harness, but the edit-rate increase rules out the freeze/refusal failure mode.)
Scope
Best-effort prompt-level mitigation (defense-in-depth), not a hard guarantee. Writing a secret's value into a tracked file and dangerous shell commands are out of scope.