@LuD1161 LuD1161 commented Jan 22, 2026

Summary

This PR adds a Security Analytics platform to ShipSec Studio that enables users to index workflow output data into OpenSearch and visualize it through dashboards. This transforms raw scan outputs into actionable intelligence for security teams.

Commits (for reviewers)

| # | Commit | Description |
|---|--------|-------------|
| 1 | feat(infra) | Nginx reverse proxy and production security setup (20 files) |
| 2 | feat(workflows) | STALE status and status inference improvements (5 files) |
| 3 | feat(analytics) | Main feature: Security Analytics platform with OpenSearch (67 files) |

Suggested review order: Start with commit 1 (infra), then 2 (small workflow change), then 3 (main feature).

Key Features

  • Analytics Sink Component: New workflow node (core.analytics.sink) that indexes output data from any upstream node to OpenSearch

    • Supports array and object inputs with automatic bulk indexing
    • Auto-detects asset correlation keys (host, domain, subdomain, url, ip, etc.)
    • Configurable index suffix and fail-on-error modes
    • Fire-and-forget by default with retry logic (3 attempts with exponential backoff)
  • OpenSearch Integration:

    • Daily index rotation pattern: security-findings-{orgId}-{YYYY.MM.DD}
    • Index template with standard metadata fields
    • Multi-tenant data isolation per organization
  • Analytics API:

    • POST /api/v1/analytics/query endpoint supporting OpenSearch DSL
    • Auto-scopes queries to organization's index pattern
    • Rate limiting: 100 requests/minute per user
  • Analytics Settings Page:

    • Tier-based retention configuration (Free: 30d, Pro: 90d, Enterprise: 365d)
    • Admin-only access controls
  • UI Integration:

    • "Dashboards" link in sidebar (opens OpenSearch Dashboards in new tab)
    • "Analytics Settings" page for retention configuration
    • "View Analytics" button on workflow detail page
  • Nginx Reverse Proxy:

    • Unified entry point at http://localhost
    • Routes: / (frontend), /api (backend), /analytics (OpenSearch Dashboards)
    • Proper CORS and proxy header configuration
  • OpenSearch Dashboards basePath:

    • Configured with /analytics base path for reverse proxy compatibility
    • Updated init scripts and health checks
  • Production Security Infrastructure:

    • TLS encryption for OpenSearch transport and HTTP layers
    • Security plugin with role-based access control
    • SaaS multitenancy with per-customer tenant isolation
    • Index patterns scoped by customer ID ({customer_id}-*)
    • Certificate generation script (just generate-certs)
    • Production deployment guide (docker/PRODUCTION.md)
  • Workflow Status Improvements:

    • New STALE status for orphaned run records (DB/Temporal mismatch)
    • Improved status inference from trace events when Temporal workflow not found
    • Documentation for all execution statuses
  • Component SDK Extensions:

    • generateFindingHash() utility for deduplication
    • Workflow context (workflowId, workflowName, organizationId) passed to components
    • Results output port added to nuclei, trufflehog, and supabase-scanner components
    • Support for optional inputs in components
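
The daily rotation pattern and per-organization scoping listed under OpenSearch Integration and Analytics API above can be sketched as follows. This is a minimal illustration, not the PR's code: the helper names are hypothetical, and formatting the date in UTC is an assumption.

```typescript
// Hypothetical helpers; the names are illustrative and not taken from the PR.

/** Formats a date as YYYY.MM.DD in UTC (assumption) so rotation is timezone-independent. */
function formatIndexDate(date: Date): string {
  const yyyy = date.getUTCFullYear().toString();
  const mm = String(date.getUTCMonth() + 1).padStart(2, "0");
  const dd = String(date.getUTCDate()).padStart(2, "0");
  return `${yyyy}.${mm}.${dd}`;
}

/** Daily index used for writes: security-findings-{orgId}-{YYYY.MM.DD}. */
function dailyIndexName(orgId: string, date: Date = new Date()): string {
  return `security-findings-${orgId}-${formatIndexDate(date)}`;
}

/** Wildcard pattern used for reads, covering all of one organization's daily indices. */
function orgIndexPattern(orgId: string): string {
  return `security-findings-${orgId}-*`;
}
```

A query endpoint that always substitutes `orgIndexPattern(orgId)` for any caller-supplied index is one way to get the auto-scoping behavior described above.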

New Commands

just dev              # Start dev with nginx reverse proxy
just prod-secure      # Start production with security & multitenancy
just generate-certs   # Generate TLS certificates for production

Files Changed

90+ files across backend, frontend, worker, component-sdk, docker, and documentation.

Test plan

  • Run npm run typecheck to verify no type errors
  • Run npm run lint to verify code quality
  • Start infrastructure: just dev or docker compose -f docker/docker-compose.infra.yml up -d
  • Run index template setup: OPENSEARCH_URL=http://localhost:9200 npm run --prefix backend setup:opensearch
  • Test Analytics API endpoint: POST /api/v1/analytics/query (requires Basic Auth: admin:admin)
  • Verify Dashboards accessible at http://localhost/analytics
  • Verify nginx routing works for all paths
  • Create workflow with Analytics Sink component and verify data indexed
  • Test production security setup with just prod-secure


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 42044b8c24


@LuD1161 LuD1161 force-pushed the eng-42/workflow-analytics-dashboards branch 12 times, most recently from 0284482 to 8c83d0b Compare January 23, 2026 02:39
@LuD1161 LuD1161 requested a review from betterclever January 23, 2026 02:44
@LuD1161 LuD1161 force-pushed the eng-42/workflow-analytics-dashboards branch 9 times, most recently from 7afae76 to bd71e89 Compare January 27, 2026 04:49
@betterclever
Contributor

Question: Scope of User Stories

Looking at tasks/prd.json, I see all user stories (US-001 through US-015) are marked with "passes": true, including:

  • US-012: Analytics settings page UI ✅
  • US-013: Retention settings API endpoints ✅

However, frontend/src/pages/AnalyticsSettingsPage.tsx still contains:

  • Mock data for current tier and retention (line 42)
  • TODO comments referencing US-013 for API integration (lines 58, 70)

Question: Are all 15 user stories expected to be completed in this PR, or is US-013 (the backend API) intentionally deferred to a later PR? If it's expected to be complete, the frontend may need to be wired up to the backend endpoints that were reportedly implemented.

LuD1161 and others added 5 commits January 29, 2026 10:08
- Replace embedded shell scripts with clean shell wrapper pattern
- Add buildAmassArgs() and buildSubfinderArgs() TypeScript functions
- Use IsolatedContainerVolume for secure file I/O in both components
- Add -silent flag to amass to prevent progress bar spam
- Add passive mode parameter to amass (default: true for quick scans)
- Add new parameters to subfinder: threads, timeout, rateLimit, etc.
- Mount provider config as file instead of base64 env var in subfinder
- Move output parsing from shell to TypeScript for both components
- Update subfinder image to v2.12.0

Signed-off-by: Aseem Shrey <[email protected]>
- Add default 15-minute timeout to prevent runaway scans
- Add configurable DNS resolvers (Cloudflare, Google, Quad9 defaults)
- Add configurable data sources, default to lightweight sources only
- Exclude wayback/commoncrawl by default (can download 1GB+ per domain)
- Disable recursive brute force by default for faster scans
- Fix -src flag to -include (correct amass v5 syntax)

These optimizations prevent system overload from excessive network I/O
while maintaining useful subdomain enumeration capabilities.

Signed-off-by: Aseem Shrey <[email protected]>
Security tools like amass and subfinder can exit non-zero when some
data sources fail or rate-limit, but still produce valid partial
results. Previously, this would throw ContainerError and lose all
output.

Changes:
- Include stdout in ContainerError details (runner.ts)
- Catch ContainerError in amass/subfinder and extract partial output
- Log warning when preserving partial results

This restores the prior behavior where partial results were returned
instead of failing the entire workflow.

Signed-off-by: Aseem Shrey <[email protected]>
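
The partial-results handling described in this commit message might look roughly like the sketch below. The `ContainerError` class here is a hypothetical stand-in for the worker's own error type, `runWithPartialResults` and `parseLines` are illustrative names, and the runner is synchronous only to keep the example short.

```typescript
// Hypothetical stand-in for the worker's ContainerError; the real class is defined in the PR.
class ContainerError extends Error {
  constructor(
    message: string,
    public readonly details: { exitCode: number; stdout?: string },
  ) {
    super(message);
    this.name = "ContainerError";
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, ContainerError.prototype);
  }
}

/** Splits raw tool output into trimmed, non-empty, de-duplicated lines. */
function parseLines(stdout: string): string[] {
  const lines = stdout
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  return Array.from(new Set(lines));
}

/**
 * Runs a tool and parses its output. If the tool exits non-zero but stdout
 * was captured, logs a warning and returns the partial results.
 */
function runWithPartialResults(
  run: () => string,
  warn: (msg: string) => void = console.warn,
): string[] {
  try {
    return parseLines(run());
  } catch (err) {
    if (err instanceof ContainerError && err.details.stdout) {
      warn(`Tool exited with code ${err.details.exitCode}; preserving partial results`);
      return parseLines(err.details.stdout);
    }
    throw err; // No usable output, so the failure propagates unchanged.
  }
}
```

Rethrowing when no stdout was captured keeps genuine failures visible instead of silently returning an empty result.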
…s-pattern

refactor(worker): migrate amass/subfinder to Dynamic Args Pattern with perf optimizations
- Add nginx reverse proxy for unified entry point at http://localhost
- Routes: / (frontend), /api (backend), /analytics (OpenSearch Dashboards)
- Configure OpenSearch Dashboards with /analytics base path
- Add production deployment with TLS and security plugin
- SaaS multitenancy with per-customer tenant isolation
- Certificate generation script (just generate-certs)
- New commands: just dev, just prod-secure

Signed-off-by: Aseem Shrey <[email protected]>
- Add STALE status for orphaned run records (DB/Temporal mismatch)
- Improve status inference from trace events when Temporal not found
- Use correct TraceEventType values for status detection
- Add amber badge color for STALE status
- Extract WorkflowNode into modular directory structure
- Document all execution statuses with transition diagram

Signed-off-by: Aseem Shrey <[email protected]>
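
As a rough illustration of the status inference described in this commit message, the sketch below derives a status from trace events when Temporal has no matching workflow. Apart from `STALE`, every status name, event type, and rule here is an assumption; the PR's actual `TraceEventType` values and inference logic may differ.

```typescript
// Everything here is hypothetical except the STALE status, which the PR introduces.
type RunStatus = "RUNNING" | "COMPLETED" | "FAILED" | "STALE";
type TraceEventType = "STARTED" | "COMPLETED" | "FAILED";

interface TraceEvent {
  type: TraceEventType;
  nodeId: string;
}

/**
 * Infers a status for a run that exists in the DB but has no Temporal workflow.
 * Rules (assumed): any failure wins; all started nodes completed means COMPLETED;
 * anything else is treated as an orphaned, STALE record.
 */
function inferStatusWithoutTemporal(events: TraceEvent[]): RunStatus {
  if (events.length === 0) return "STALE";
  if (events.some((e) => e.type === "FAILED")) return "FAILED";

  const started = new Set(events.filter((e) => e.type === "STARTED").map((e) => e.nodeId));
  const completed = new Set(events.filter((e) => e.type === "COMPLETED").map((e) => e.nodeId));
  const allDone = started.size > 0 && Array.from(started).every((id) => completed.has(id));

  return allDone ? "COMPLETED" : "STALE";
}
```

Note that this sketch never returns `RUNNING`: without a Temporal workflow nothing can still be executing, so an unfinished run is reported as `STALE`.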
…gration

Analytics Sink Component (core.analytics.sink):
- Index output data from any upstream node to OpenSearch
- Auto-detect asset correlation keys (host, domain, url, ip, etc.)
- Fire-and-forget with retry logic (3 attempts, exponential backoff)
- Configurable index suffix and fail-on-error modes

OpenSearch Integration:
- Daily index rotation: security-findings-{orgId}-{YYYY.MM.DD}
- Index template with standard metadata fields
- Multi-tenant data isolation per organization

Analytics API:
- POST /api/v1/analytics/query with OpenSearch DSL support
- Auto-scope queries to organization's index pattern
- Rate limiting: 100 req/min per user
- Protected routes require authentication
- Session cookie support for analytics route auth

UI Integration:
- Analytics Settings page with tier-based retention
- Dashboards link in sidebar (opens in new tab)
- View Analytics button uses Discover app with proper URL state
- Uses .keyword fields for exact match filtering

Component SDK Extensions:
- generateFindingHash() for deduplication
- Workflow context (workflowId, workflowName, organizationId)
- Results output port on nuclei, trufflehog, supabase-scanner
- Support for optional inputs in components

Bug fixes:
- Fix webhook URLs to include global API prefix (ENG-115)
- Add proper connectionType for list variable types
- Handle invalid_value errors for placeholder fields

Signed-off-by: Aseem Shrey <[email protected]>
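
The retry behavior named in this commit message (3 attempts, exponential backoff, errors swallowed unless fail-on-error is set) could be sketched as below. The function names, the 500 ms base delay, and the doubling factor are assumptions, not values taken from the PR.

```typescript
// Hypothetical helpers; base delay and doubling factor are assumptions.

/** Delay before the retry that follows the given zero-based attempt: base, 2*base, 4*base, ... */
function backoffDelayMs(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * 2 ** attempt;
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

/** Runs op up to maxAttempts times, sleeping with exponential backoff between attempts. */
async function withRetry<T>(
  op: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await op();
    } catch (err) {
      lastError = err;
      // No sleep after the final attempt.
      if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt, baseDelayMs));
    }
  }
  throw lastError;
}

/** Default mode logs and swallows the final error; failOnError rethrows it. */
async function indexWithPolicy(
  op: () => Promise<void>,
  failOnError = false,
): Promise<void> {
  try {
    await withRetry(op);
  } catch (err) {
    if (failOnError) throw err;
    console.warn("analytics indexing failed after retries", err);
  }
}
```

With these defaults a failing bulk request is tried three times, with waits of 500 ms and 1000 ms in between, before the workflow continues or fails depending on the mode.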
…uto-refresh

- Add dynamic inputs editor with auto-populated source tags from workflow
- Add results port to all security components for analytics output
- Fix Data Explorer URL format to preserve time filter
- Hide View Analytics button during running workflows
- Auto-refresh OpenSearch index patterns after bulk indexing
- Add OPENSEARCH_DASHBOARDS_URL env var for worker configuration

Signed-off-by: Aseem Shrey <[email protected]>
@LuD1161 LuD1161 force-pushed the eng-42/workflow-analytics-dashboards branch from 30b1504 to 5d92c8d Compare January 29, 2026 19:41
Keep analytics support (generateFindingHash, analyticsResultSchema, results)
in amass.ts and subfinder.ts from the workflow-analytics-dashboards branch.
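
For readers unfamiliar with the deduplication helper kept by this merge, here is a minimal sketch of what a finding hash can look like. It is not the SDK's `generateFindingHash()`: the function names, the choice of SHA-256, and hashing a key-sorted serialization are all assumptions.

```typescript
import { createHash } from "node:crypto";

// Sketch only: the PR's generateFindingHash() may take different inputs or use another digest.

/** Serializes a value with object keys sorted, so key order does not change the result. */
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map((item) => stableStringify(item)).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(obj[key])}`);
    return `{${entries.join(",")}}`;
  }
  // JSON.stringify returns undefined for undefined input; fall back to "null".
  return JSON.stringify(value) ?? "null";
}

/** Hypothetical finding hash: SHA-256 over the stable serialization of identifying fields. */
function findingHashSketch(fields: Record<string, unknown>): string {
  return createHash("sha256").update(stableStringify(fields)).digest("hex");
}
```

Using such a hash as the document ID during bulk indexing would make re-runs overwrite identical findings rather than duplicate them; whether the PR does this is not stated in the conversation.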
@LuD1161
Contributor Author

LuD1161 commented Jan 29, 2026

> Question: Scope of User Stories
>
> Looking at tasks/prd.json, I see all user stories (US-001 through US-015) are marked with "passes": true, including:
>
> * **US-012**: Analytics settings page UI ✅
> * **US-013**: Retention settings API endpoints ✅
>
> However, frontend/src/pages/AnalyticsSettingsPage.tsx still contains:
>
> * Mock data for current tier and retention (line 42)
> * TODO comments referencing US-013 for API integration (lines 58, 70)
>
> Question: Are all 15 user stories expected to be completed in this PR, or is US-013 (the backend API) intentionally deferred to a later PR? If it's expected to be complete, the frontend may need to be wired up to the backend endpoints that were reportedly implemented.

This was a relic of ralph, the hottest trend in AI. It has since been removed :)

…fixture

Auto-fix ESLint/Prettier formatting issues in security components
and add required allowAny metadata to test analytics fixture.

Signed-off-by: Aseem Shrey <[email protected]>
- Add 'analytics-inputs' to ComponentParameterType union
- Fix analytics-fixture to use no-parameters overload
- Add type assertions for OpenSearch indexer API responses

Signed-off-by: Aseem Shrey <[email protected]>
@LuD1161 LuD1161 force-pushed the eng-42/workflow-analytics-dashboards branch from fd4f74f to 1d9cc8d Compare January 29, 2026 20:55
- Cast defineComponent to any to bypass strict overload matching
- Add explicit type annotations to execute function parameters
- Import ExecutionContext and ExecutionPayload types for type safety
- Update subfinder test to verify analytics results generation

Signed-off-by: Aseem Shrey <[email protected]>
@LuD1161 LuD1161 force-pushed the eng-42/workflow-analytics-dashboards branch from b8d9c3a to bd98d61 Compare January 30, 2026 04:59