CrewAI for Test Automation: Building Self-Healing Multi-Agent QA Systems
What if your test suite could diagnose its own failures and fix them? CrewAI enables multi-agent systems where specialized AI agents collaborate: one analyzes logs, another inspects the DOM, and a third generates the fix. Here is how to build it.
The 3-Agent Architecture
- Log Analyst Agent: Reads CI failure logs, extracts error patterns, and identifies the root-cause category (timing, selector, data, or environment)
- DOM Expert Agent: Inspects the current page DOM, finds alternative selectors, and validates element existence and state
- Reporter/Fixer Agent: Generates a fix (updated selector, added wait, data correction), creates a PR, and writes a human-readable report
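To make the Log Analyst's four categories concrete, here is a minimal sketch of a regex pre-classifier that could run before the agent sees the log. The function name `classify_failure` and the pattern lists are assumptions for illustration, not part of CrewAI or Playwright. Real logs vary, so treat the patterns as a starting point.

```python
import re

# Illustrative patterns only (assumption): tune these against your own CI logs.
CATEGORY_PATTERNS = {
    "environment": [r"ECONNREFUSED", r"net::ERR_"],
    "selector": [r"strict mode violation", r"waiting for locator"],
    "timing": [r"Timeout \d+ms exceeded", r"timed out"],
    "data": [r"Expected:", r"Received:"],
}


def classify_failure(log_text: str) -> str:
    """Return the category with the most pattern hits, or 'unknown' if none match."""
    scores = {
        category: sum(
            len(re.findall(pattern, log_text, flags=re.IGNORECASE))
            for pattern in patterns
        )
        for category, patterns in CATEGORY_PATTERNS.items()
    }
    # Ties resolve to whichever category appears first in CATEGORY_PATTERNS.
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


sample = "locator.click: Error: strict mode violation: locator('button') resolved to 2 elements"
print(classify_failure(sample))  # selector
```

A cheap pass like this can also serve as a sanity check on the agent's answer: if the regex says "environment" and the agent says "selector", that disagreement is worth surfacing in the report.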
CrewAI Setup
from crewai import Agent, Task, Crew

log_analyst = Agent(
    role="CI Failure Analyst",
    goal="Analyze test failure logs and identify root cause category",
    backstory="You are an expert at reading Playwright test logs and error traces.",
    verbose=True
)

dom_expert = Agent(
    role="DOM Inspector",
    goal="Find stable alternative selectors for broken locators",
    backstory="You understand HTML structure and Playwright locator strategies.",
    verbose=True
)

fixer = Agent(
    role="Test Fixer",
    goal="Generate code fixes for broken tests and create pull requests",
    backstory="You write clean Playwright TypeScript test code.",
    verbose=True
)

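# Tasks run in order; `context` passes earlier task output to later tasks.
analyze_task = Task(
    description="Analyze this test failure log and identify the root cause: {failure_log}",
    expected_output="Root cause category and specific failing element",
    agent=log_analyst,
)

# Without a task of its own, dom_expert would never run.
# The {dom_snapshot} input is an assumption: supply the failing page's HTML.
inspect_task = Task(
    description="Using the analysis, find stable alternative selectors for the failing element in this DOM snapshot: {dom_snapshot}",
    expected_output="Ranked list of alternative Playwright locators for the failing element",
    agent=dom_expert,
    context=[analyze_task],
)

fix_task = Task(
    description="Based on the analysis and the proposed selectors, generate a code fix for the broken test",
    expected_output="Updated test code with the fix applied",
    agent=fixer,
    context=[analyze_task, inspect_task],
)

crew = Crew(
    agents=[log_analyst, dom_expert, fixer],
    tasks=[analyze_task, inspect_task, fix_task],
    verbose=True,
)

# log_content and dom_snapshot are strings (CI log text, page HTML) that your pipeline must load first.
result = crew.kickoff(inputs={"failure_log": log_content, "dom_snapshot": dom_snapshot})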
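The DOM Expert can only propose selectors if it has page markup to work from. As a rough sketch of what "finding alternative selectors" might look like in plain Python, the helper below scans an HTML snapshot with the standard library's `html.parser` and emits Playwright locator strings. The names `suggest_locators`, `CandidateCollector`, and the attribute priority order are all assumptions for illustration. Values are not escaped, so this handles only simple attribute values.

```python
from html.parser import HTMLParser

# Preference order is an assumption: dedicated test ids first, then labels, then ids.
ATTRIBUTE_PRIORITY = ["data-testid", "aria-label", "id"]


def to_locator(name: str, value: str) -> str:
    """Render one attribute as a Playwright locator expression (no escaping)."""
    if name == "data-testid":
        return f'page.getByTestId("{value}")'
    return f"page.locator('[{name}=\"{value}\"]')"


class CandidateCollector(HTMLParser):
    """Collects (rank, locator) pairs for every element with the requested tag."""

    def __init__(self, tag: str):
        super().__init__()
        self.tag = tag
        self.candidates: list[tuple[int, str]] = []

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        attr_map = dict(attrs)
        for rank, name in enumerate(ATTRIBUTE_PRIORITY):
            value = attr_map.get(name)
            if value:
                self.candidates.append((rank, to_locator(name, value)))


def suggest_locators(html: str, tag: str) -> list[str]:
    """Return locator strings for `tag` elements, most preferred attribute first."""
    collector = CandidateCollector(tag)
    collector.feed(html)
    return [locator for _, locator in sorted(collector.candidates)]


snapshot = '<form><button id="submit-btn" data-testid="submit">Save</button></form>'
for locator in suggest_locators(snapshot, "button"):
    print(locator)
```

Output like this could be passed to the agent as extra context, or used to check that a selector the agent proposes actually exists in the snapshot.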
When Self-Healing Works (and When It Does Not)
| Failure Type | Self-Healable? | Agent Action |
|---|---|---|
| Broken CSS selector | Yes | Find alternative stable selector |
| Changed button text | Yes | Update getByText to match new label |
| Timing issue | Yes | Add an explicit waitFor or an auto-retrying assertion |
| Business logic change | No | Flag for human review |
| New feature breaking flow | No | Flag for human review |
| Environment/infra issue | Partial | Retry with environment check |
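The table above can be encoded as a small routing policy so the crew only attempts fixes it is expected to handle. This is a sketch under assumptions: the failure-type keys, the `route` function, and the decision labels are hypothetical, and the action strings paraphrase the table. Sending unknown failure types to human review is a conservative default chosen here, not something the table specifies.

```python
from enum import Enum


class Healable(Enum):
    YES = "yes"
    PARTIAL = "partial"
    NO = "no"


# Keys are hypothetical labels; verdicts follow the table, actions paraphrase it.
POLICY = {
    "broken_css_selector": (Healable.YES, "Find alternative stable selector"),
    "changed_button_text": (Healable.YES, "Update getByText to match new label"),
    "timing_issue": (Healable.YES, "Add an explicit wait or assertion"),
    "business_logic_change": (Healable.NO, "Flag for human review"),
    "new_feature_breaking_flow": (Healable.NO, "Flag for human review"),
    "environment_issue": (Healable.PARTIAL, "Retry with environment check"),
}

DECISIONS = {
    Healable.YES: "auto_fix",
    Healable.PARTIAL: "retry_then_escalate",
    Healable.NO: "human_review",
}


def route(failure_type: str) -> tuple[str, str]:
    """Map a failure type to (decision, action); unknown types go to a human."""
    healable, action = POLICY.get(
        failure_type, (Healable.NO, "Flag for human review")
    )
    return DECISIONS[healable], action


print(route("broken_css_selector"))
print(route("environment_issue"))
```

Keeping the policy in data rather than in agent prompts makes the boundary auditable: widening what counts as self-healable becomes a reviewed code change instead of a prompt tweak.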
