From 1e19b709881af1f98c3e03e9f3556dc945121e84 Mon Sep 17 00:00:00 2001
From: mason5052
Date: Tue, 9 Jun 2026 20:04:44 -0400
Subject: [PATCH] docs: add qwen long-flow troubleshooting

---
 backend/docs/flow_execution.md         |  2 ++
 examples/guides/vllm-qwen35-27b-fp8.md | 22 ++++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/backend/docs/flow_execution.md b/backend/docs/flow_execution.md
index f97691360..372bbde1f 100644
--- a/backend/docs/flow_execution.md
+++ b/backend/docs/flow_execution.md
@@ -975,6 +975,8 @@ PentAGI implements a sophisticated multi-layered agent supervision system to ens
 - `EXECUTION_MONITOR_SAME_TOOL_LIMIT` (default: 5) - Consecutive same-tool threshold
 - `EXECUTION_MONITOR_TOTAL_TOOL_LIMIT` (default: 10) - Total tool calls threshold
 
+For local or custom Qwen-style deployments where flows run for hours or repeat commands, see the [vLLM Qwen3.5-27B-FP8 troubleshooting guide](../../examples/guides/vllm-qwen35-27b-fp8.md#issue-automation-flow-runs-for-hours-or-repeats-commands) for prompt-bounding and log-review guidance.
+
 ### Enhanced Reflector Integration
 
 **Automatic Reflector on Generation Failures**:
diff --git a/examples/guides/vllm-qwen35-27b-fp8.md b/examples/guides/vllm-qwen35-27b-fp8.md
index 6d4942c2e..dff6e65a7 100644
--- a/examples/guides/vllm-qwen35-27b-fp8.md
+++ b/examples/guides/vllm-qwen35-27b-fp8.md
@@ -337,6 +337,28 @@ These benchmarks demonstrate that Qwen3.5-27B-FP8 provides excellent throughput
 
 ## Troubleshooting
 
+### Issue: Automation Flow Runs for Hours or Repeats Commands
+
+**Cause**: Broad penetration testing prompts can leave smaller local or custom models exploring too much state, especially when the task has no stopping criteria. Repeated commands may indicate model weakness, target complexity, tool-call/provider issues, or a flow that needs additional supervision.
+
+**Solution**: First check the existing PentAGI execution controls before changing runtime behavior or assuming the target is broken:
+
+- Enable execution monitoring with `EXECUTION_MONITOR_ENABLED=true` so the Adviser can review repeated or inefficient tool-call patterns.
+- Tune `EXECUTION_MONITOR_SAME_TOOL_LIMIT` and `EXECUTION_MONITOR_TOTAL_TOOL_LIMIT` if Adviser reviews happen too late or too often for your model and target.
+- Enable planning with `AGENT_PLANNING_STEP_ENABLED=true` for complex pentest flows so specialist agents receive a bounded execution plan.
+- Review hard tool-call limits with `MAX_GENERAL_AGENT_TOOL_CALLS` and `MAX_LIMITED_AGENT_TOOL_CALLS`; these remain the final guardrails for runaway executions.
+
+Also narrow the task prompt. Instead of only `Perform penetration testing on the host 192.168.136.136`, include the authorized target, scope, expected output, and stopping criteria, for example: enumerate exposed services, try likely public exploits, avoid repeating failed commands more than twice, and stop with a concise findings report if no path is found.
+
+Use PentAGI flow logs, Docker logs, and provider or vLLM logs together when diagnosing a long run:
+
+- Repeated identical shell/browser actions point toward tool-loop behavior.
+- Long gaps between actions point toward slow model generation or provider latency.
+- Frequent malformed tool calls point toward provider/tool-call parser compatibility.
+- Varied but slow exploration can simply mean the target is complex or the prompt is too broad.
+
+Qwen3.5-27B-FP8 can run useful local flows, but complex autonomous pentests may still need execution monitoring, task planning, tighter prompts, and careful log review.
+
 ### Issue: "Unknown architecture 'qwen3_5'"
 
 **Cause**: Using stable vLLM release instead of nightly.
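
The execution controls named in the new troubleshooting entry could be collected in a PentAGI environment file roughly as follows. This is a minimal sketch, not a recommended configuration: the two monitor limits use the defaults stated in `flow_execution.md`, while the `MAX_*` values are placeholders chosen purely for illustration (the patch does not state their defaults), so they are left commented out.

```bash
# Let the Adviser review repeated or inefficient tool-call patterns.
EXECUTION_MONITOR_ENABLED=true

# Defaults quoted in flow_execution.md; adjust if Adviser reviews
# happen too late or too often for your model and target.
EXECUTION_MONITOR_SAME_TOOL_LIMIT=5
EXECUTION_MONITOR_TOTAL_TOOL_LIMIT=10

# Give specialist agents a bounded execution plan on complex flows.
AGENT_PLANNING_STEP_ENABLED=true

# Hard guardrails for runaway executions. The numbers below are
# hypothetical examples, not documented defaults; check your
# deployment before uncommenting.
# MAX_GENERAL_AGENT_TOOL_CALLS=100
# MAX_LIMITED_AGENT_TOOL_CALLS=20
```

Keeping the hard limits commented out until their current values are confirmed avoids accidentally loosening the final guardrail while tuning the softer Adviser thresholds.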