feat: add onCompletion webhook hooks to TaskSpawner#1240
Open
j-bennet wants to merge 3 commits into
Open
Conversation
Add a new `spec.onCompletion` field to TaskSpawnerSpec that configures outbound HTTP webhook notifications when spawned Tasks reach terminal phases (Succeeded or Failed). Key design: - Hooks config is serialized into a Task annotation at spawn time so the reporting loop can dispatch without looking up the TaskSpawner - SSRF protection: HTTPS-only, runtime IP blocklist (IPv4/IPv6 private ranges), DNS resolution of hostnames, and redirect guard - Idempotency via annotation with conflict-retrying merge patch - All keys in the referenced Secret are sent as HTTP headers - Phase filtering defaults to both Succeeded and Failed Fixes kelos-dev#749 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
6 issues found across 9 files
Reply with feedback, questions, or to request a fix.
Re-trigger cubic
3 tasks
- Fix partial hook failure causing duplicate deliveries: persist the idempotency annotation even on partial failure to prevent re-delivery to hooks that already succeeded - Run webhook reporting before source-polling (not in defer) to avoid operating on a potentially-cancelled context - Fix CRD URL pattern to only block @ in the authority (host) portion, allowing legitimate @ in URL paths (e.g. MS Teams webhooks) - Fix ssrfSafeTransport to cleanly fall back when base is not *http.Transport, with extracted ssrfDialContext function - Cache the SSRF-wrapped HTTP client in WebhookReporter to avoid per-call transport cloning - Add 0.0.0.0/8 to private IP ranges (routes to loopback on Linux) - Guard against empty DNS result panic in ssrfDialContext - Move privateRanges to package-level var to avoid per-call allocation - Add missing onCompletionAnnotation call to stamp hooks onto tasks - Reject URLs with embedded credentials (userinfo) at both CRD and runtime validation layers Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
1 issue found across 7 files (changes from recent commits).
Reply with feedback, questions, or to request a fix.
Re-trigger cubic
When the injected HTTP client uses a non-standard RoundTripper that cannot be type-asserted to *http.Transport, emit a log message so operators know the SSRF transport falls back to DefaultTransport settings. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Author
|
I would appreciate if you could take a look, @gjkim42. Thank you. |
Contributor
|
Bump @gjkim42 (assuming this is something we want, I don't think it adds a large amount of complexity or surface area) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What type of PR is this?
/kind feature
What this PR does / why we need it:
Adds a new
spec.onCompletionfield toTaskSpawnerSpecthat configures outbound HTTP webhook notifications when spawned Tasks reach terminal phases (Succeeded or Failed). This enables push-based alerting to external systems (Slack incoming webhooks, PagerDuty, custom dashboards) without requiring users to poll the Kubernetes API or set up custom Prometheus alerting rules.Key design decisions:
kelos.dev/on-completion) at spawn time so the reporting loop can dispatch without looking up the TaskSpawnerkelos.dev/webhook-report-phaseannotation prevents duplicate deliveriesWhich issue(s) this PR is related to:
Fixes #749
Special notes for your reviewer:
This is Phase 1 of the incremental adoption path described in #749. Future phases will add built-in Slack notification formatting and consolidate
GitHubReportinginto the same framework.The webhook payload includes: task name, namespace, spawner name, phase, message, agent type, model, start/completion times, outputs, and results (cost/tokens).
The
secretReffield references a Secret whose keys are used as HTTP request headers. For example, a Secret with keysAuthorizationandX-Api-Keywill set both headers on the webhook request.Does this PR introduce a user-facing change?
Summary by cubic
Adds
spec.onCompletiontoTaskSpawnerto send HTTPS webhooks when spawned Tasks finish (Succeeded/Failed). This enables push alerts to Slack/PagerDuty/custom endpoints without polling.New Features
kelos.dev/on-completionand dispatched from the reporting loop; delivery deduped withkelos.dev/webhook-report-phaseusing merge-patch retries.Bug Fixes
0.0.0.0/8IP blocks, DNS and redirect checks), and cache the SSRF-wrapped HTTP client.Written for commit c8898c5. Summary will update on new commits.