Add agentic fuzzer to CI as regression gate #11

@PLNech

Context

The fuzzer (scripts/fuzz-rtk.py) is currently run manually. Following the agentic fuzzing experiment during lab week, it has 139 static tests across 35 families that catch real regressions.

Proposal

Add a CI job that runs python3 scripts/fuzz-rtk.py --rounds 0 (static tests only, no LLM) on every PR. The job fails if the FAIL count exceeds a threshold. The threshold is currently 22, matching the baseline, where every failure is classified as by-design or a known limitation.
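
A minimal sketch of what the gate could look like, assuming the fuzzer prints a summary in the same shape as the baseline line in this issue (for example "95 PASS, 12 WARN, 22 FAIL, 10 SKIP"). The real output format of scripts/fuzz-rtk.py is not specified here, so the regex and the helper names parse_counts, gate_passes, and run_gate are all hypothetical.

```python
import re
import subprocess
import sys

# ASSUMPTION: the fuzzer prints counts as "<number> <STATUS>", as in the
# baseline line of this issue. Adjust the pattern to the real output.
COUNT_RE = re.compile(r"(\d+)\s+(PASS|WARN|FAIL|SKIP)\b")


def parse_counts(output: str) -> dict[str, int]:
    """Extract status counts from fuzzer output; the last match per status wins."""
    counts: dict[str, int] = {}
    for number, status in COUNT_RE.findall(output):
        counts[status] = int(number)
    return counts


def gate_passes(counts: dict[str, int], max_fail: int) -> bool:
    """Return True when the FAIL count does not exceed the threshold."""
    if "FAIL" not in counts:
        # Refuse to pass silently when no summary was found.
        raise ValueError("no FAIL count found in fuzzer output")
    return counts["FAIL"] <= max_fail


def run_gate(max_fail: int) -> int:
    """Run the static tests and return a process exit code (0 = pass)."""
    result = subprocess.run(
        [sys.executable, "scripts/fuzz-rtk.py", "--rounds", "0"],
        capture_output=True,
        text=True,
    )
    counts = parse_counts(result.stdout)
    print(f"fuzzer counts: {counts}, threshold: {max_fail}")
    return 0 if gate_passes(counts, max_fail) else 1
```

A CI step could then end with sys.exit(run_gate(22)), so that a FAIL count of 23 or more turns the job red while the current baseline of 22 still passes.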

Requirements

  • Docker available in CI (for the docker ps and docker images tests), or those families skipped
  • Python 3.10+ for the fuzzer script
  • rtk binary built from PR branch
  • Threshold stored in a config file so it can be tightened over time
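
Two of the requirements above can be sketched in a few lines: reading the threshold from a config file and detecting whether Docker is present. This is only an illustration. The JSON layout, the max_fail key, and the function names are assumptions, and how the fuzzer would actually be told to skip the Docker families is left open because the issue does not define it.

```python
import json
import shutil
from pathlib import Path


def parse_max_fail(config_text: str) -> int:
    """Read the FAIL threshold from JSON text such as '{"max_fail": 22}'.

    ASSUMPTION: the config is JSON with a "max_fail" key; the issue only
    says the threshold lives in a config file.
    """
    value = json.loads(config_text)["max_fail"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int) or value < 0:
        raise ValueError(f"max_fail must be a non-negative integer, got {value!r}")
    return value


def load_max_fail(config_path: str) -> int:
    """Load the threshold from a file (the path is chosen by the CI job)."""
    return parse_max_fail(Path(config_path).read_text(encoding="utf-8"))


def docker_available() -> bool:
    """Return True when a docker executable is on PATH.

    This only checks for the binary; it does not verify that the daemon
    is running.
    """
    return shutil.which("docker") is not None
```

Keeping the threshold in a tracked file means tightening it is an ordinary reviewed change: lowering max_fail from 22 to a smaller number once failures are fixed.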

Current baseline

  • 139 tests, 95 PASS, 12 WARN, 22 FAIL, 10 SKIP
  • Failure rate: 15.8% (22 of 139; all failures classified)
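
As a quick sanity check on the baseline numbers, the four status counts should sum to the test total, and the failure rate is FAIL divided by that total:

```python
# Baseline counts copied from this issue.
baseline = {"PASS": 95, "WARN": 12, "FAIL": 22, "SKIP": 10}

total = sum(baseline.values())
failure_rate = baseline["FAIL"] / total

print(f"{total} tests, failure rate {failure_rate:.1%}")
# prints "139 tests, failure rate 15.8%"
```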

Labels: enhancement (New feature or request)
