Red Teaming

Running and reviewing reports

Launch red-team reports and interpret findings from agent and repository tests.

After you configure agents, repositories, and the Superagent Security GitHub App, run red-team reports from the agent detail UI and review outcomes in the dashboard.

Running a report

  1. Open Red Teaming → Agents and select an agent.
  2. Choose the scenario, repository, or report configuration your deployment supports.
  3. Start the report and wait for completion (duration depends on scenario scope).
  4. Optionally repeat against updated prompts, tools, or policies.

Reviewing results

  • Findings — Categorized failures (e.g. jailbreak success, policy bypass, unsafe tool invocation).
  • Evidence — Prompts, responses, or traces that reproduce the issue.
  • Severity — Use each finding's severity to decide which fixes to make first, before wider rollout.
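
The three fields above lend themselves to a simple triage pass. The following is a minimal sketch, assuming findings have been copied out of the dashboard by hand into plain records. The `Finding` class, its field names, and the severity labels in `SEVERITY_RANK` are hypothetical and chosen for illustration; they are not part of the product and the dashboard's actual labels may differ.

```python
from dataclasses import dataclass, field

# Hypothetical severity scale (assumption for this sketch only).
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass
class Finding:
    category: str                 # e.g. "jailbreak success", "policy bypass"
    severity: str                 # one of the hypothetical labels above
    evidence: list[str] = field(default_factory=list)  # prompts, responses, or trace references


def prioritize(findings):
    """Return findings ordered most severe first; unknown labels sort last."""
    return sorted(
        findings,
        key=lambda f: SEVERITY_RANK.get(f.severity, len(SEVERITY_RANK)),
    )


findings = [
    Finding("unsafe tool invocation", "medium", ["trace-2"]),
    Finding("jailbreak success", "critical", ["prompt-1"]),
    Finding("policy bypass", "high"),
]

for f in prioritize(findings):
    print(f.severity, f.category)
```

Because `sorted` is stable, findings with the same severity keep the order in which they were recorded, which keeps the triage list predictable between runs.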

What to do with findings

Finding type               Typical action
Jailbreak / policy bypass  Tighten system prompts, add guardrails, block tool paths
Data leak                  Redact outputs, restrict retrieval, audit logs
Unsafe tool use            Add approval steps, sandbox tools, deny lists
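
The mapping above can also be kept as a small lookup so that a set of findings turns into one remediation checklist. This is a sketch only: the dictionary keys and the function names `suggest_actions` and `remediation_checklist` are hypothetical identifiers invented for this example, and the action strings are transcribed from the table rather than taken from any product API.

```python
# Typical actions per finding type, transcribed from the table above.
# The keys are hypothetical identifiers (assumption for this sketch).
TYPICAL_ACTIONS = {
    "jailbreak_or_policy_bypass": [
        "tighten system prompts",
        "add guardrails",
        "block tool paths",
    ],
    "data_leak": ["redact outputs", "restrict retrieval", "audit logs"],
    "unsafe_tool_use": ["add approval steps", "sandbox tools", "deny lists"],
}


def suggest_actions(finding_type):
    """Return the typical actions for a finding type, or an empty list if unlisted."""
    return list(TYPICAL_ACTIONS.get(finding_type, []))


def remediation_checklist(finding_types):
    """Merge actions for several finding types, keeping first-seen order without duplicates."""
    checklist = []
    for finding_type in finding_types:
        for action in suggest_actions(finding_type):
            if action not in checklist:
                checklist.append(action)
    return checklist


print(remediation_checklist(["data_leak", "unsafe_tool_use", "data_leak"]))
```

Returning an empty list for an unlisted type, rather than raising, lets a report with unfamiliar categories still produce a partial checklist; those categories then need manual review.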

Legacy note

Older Safety tests URLs may redirect to Red Teaming → Agents in the app. Use /app/red-team/agents as the canonical entry point.

Next steps