Playwright Docker GitHub Actions CI/CD Pipeline: The Complete 10-Minute Setup
I see QA teams run Playwright tests on local machines every single day. They push code, wait for a colleague to “run the suite,” and pray nothing breaks in production. This is not a pipeline. It is a bottleneck.
A real CI/CD pipeline for Playwright takes under 10 minutes to set up if you know exactly what to copy, what to pin, and what to ignore. I have set this up for three teams in the last 18 months. The pattern is the same every time: Docker container for consistency, GitHub Actions for orchestration, and a few aggressive caching rules to keep build times under four minutes.
This guide gives you the complete Playwright Docker GitHub Actions CI/CD pipeline I use in production. No fluff. Just commands, YAML, and the mistakes that cost me hours so you do not have to repeat them.
Table of Contents
- Why Most Teams Fail at Playwright CI/CD
- What You Need Before You Start
- Understanding the Official Playwright Docker Image
- The 7-Step Docker + GitHub Actions Pipeline
- The GitHub Actions Workflow That Actually Works
- Three Performance Tweaks That Cut CI Time by 40%
- Security and Secrets Handling in CI
- India Context: What Startups and Product Companies Actually Use
- Common Failures and How to Fix Them
- Key Takeaways
- Frequently Asked Questions
Why Most Teams Fail at Playwright CI/CD
The failure is rarely technical. It is organizational. Teams install Playwright locally, write 40 tests, and then realize their CI agent does not have Chromium dependencies. They spend two days debugging libnss3 errors on Ubuntu while their sprint deadline breathes down their neck.
Here is what I see in the wild:
- No Docker pinning: A developer uses `ubuntu-latest`, another uses `ubuntu-22.04`, and tests pass on one machine while failing on the other.
- Browser version drift: The CI installs Playwright 1.58 while the local machine runs 1.59. Browser executables do not match. Tests explode with `browserType.launch: Executable doesn't exist`.
- Missing artifacts: When a test fails in CI, there is no trace, no screenshot, and no HTML report. The developer has to reproduce the failure locally, which works half the time because the environment is different.
Playwright now sees 198 million npm downloads per month as of April 2026, compared to Cypress at 29.5 million and selenium-webdriver at 8.1 million. That is a 6.7x gap over Cypress. With that adoption comes a flood of teams who skipped the CI chapter and are now paying for it with flaky nightly builds.
If your team still runs regression suites on a teammate’s laptop, you do not have continuous integration. You have continuous hope. Let us fix that.
What You Need Before You Start
You need four things. Everything else is optional.
- A Playwright project with working tests. Run `npx playwright test` locally and confirm at least one test passes.
- A GitHub repository. This guide targets GitHub Actions, but the Docker principles apply to GitLab CI, Azure DevOps, and Bitbucket Pipelines with minor syntax changes.
- Node.js 18 or higher. Playwright 1.59 requires Node 18+. I run 20 LTS in production.
- Docker installed locally. You will use it to verify the image before pushing to CI.
I also recommend reading my Playwright vs Selenium stability analysis if you are still deciding whether Playwright is the right tool for your stack. The data there might save you a migration later.
Understanding the Official Playwright Docker Image
Microsoft maintains an official Playwright Docker image at mcr.microsoft.com/playwright. This is not a community image. It is built from the same repository as Playwright itself, updated on every release, and contains the exact browser binaries and system dependencies needed for Chromium, Firefox, and WebKit.
Image Tags and Version Pinning
The current latest image is mcr.microsoft.com/playwright:v1.59.1-noble, based on Ubuntu 24.04 LTS (Noble Numbat). There is also a -jammy variant based on Ubuntu 22.04 if your organization has not upgraded yet.
Pin your image tag to a specific Playwright version. If your Docker image runs Playwright 1.58 and your package.json installs 1.59, Playwright cannot locate browser executables. The error message is cryptic. The fix is trivial: pin everything.
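Since everything hinges on the two version strings agreeing, a small guard in CI can fail the build before a single test runs. A minimal sketch — the `IMAGE_TAG` variable and the stand-in `package.json` are illustrative, not official tooling:

```shell
# Hypothetical drift guard: compare the pinned @playwright/test version in
# package.json against the Docker image tag before running tests.
IMAGE_TAG="v1.59.1-noble"

# Stand-in package.json matching Step 1; in CI, read the real file instead.
cat > /tmp/package.json <<'EOF'
{ "devDependencies": { "@playwright/test": "1.59.1" } }
EOF

# Extract the pinned version string.
PKG_VERSION=$(sed -n 's/.*"@playwright\/test": *"\([^"]*\)".*/\1/p' /tmp/package.json)

# Strip the distro suffix from the tag and compare.
if [ "v$PKG_VERSION" = "${IMAGE_TAG%-noble}" ]; then
  echo "pinned: $PKG_VERSION"
else
  echo "version drift: package.json=$PKG_VERSION image=$IMAGE_TAG"
fi
```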
Compressed, the v1.59.1-noble image is approximately 872 MB. The largest layer is 743 MB and contains the three browser engines. That is hefty, but it is the price of hermetic reproducibility. You download it once per CI agent, cache it, and forget about it.
Docker Flags You Cannot Skip
When running the Playwright image locally or in CI, three flags matter:
- `--init`: Handles PID 1 correctly and prevents zombie processes. Without this, long test suites can leak browser processes.
- `--ipc=host`: Required for Chromium. Without shared memory, Chromium runs out of memory and crashes on pages with heavy JavaScript.
- `--user pwuser`: Only needed if you are crawling untrusted sites. For end-to-end tests on your own application, root is fine and simplifies permissions.
Pulling and Running Locally
```bash
docker pull mcr.microsoft.com/playwright:v1.59.1-noble
docker run -it --rm --init --ipc=host mcr.microsoft.com/playwright:v1.59.1-noble /bin/bash
```
Inside the container, your project still needs to run npm ci because the Playwright package itself is not pre-installed. Only the browsers and system dependencies are baked in.
The 7-Step Docker + GitHub Actions Pipeline
This is the exact setup I run for teams building React and Next.js applications. Total time from zero to green CI: under 10 minutes.
Step 1: Pin Your Playwright Version
In your package.json, use an exact version:
```json
"devDependencies": {
  "@playwright/test": "1.59.1"
}
```
No carets. No tildes. Exact pinning prevents the CI from installing a newer version than your Docker image contains.
Step 2: Add a Dockerfile
Create a Dockerfile in your project root:
```dockerfile
FROM mcr.microsoft.com/playwright:v1.59.1-noble
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
CMD ["npx", "playwright", "test"]
```
This Dockerfile uses the official image as a base, installs your project dependencies, and defaults to running the test suite.
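One caveat with `COPY . .`: without a `.dockerignore`, the build context drags in `node_modules`, old reports, and the entire `.git` history, which bloats every build. A minimal `.dockerignore` for this setup might look like:

```
node_modules
playwright-report
test-results
.git
```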
Step 3: Create the GitHub Actions Workflow
Create .github/workflows/playwright.yml. We will refine this in the next section, but here is the minimum viable pipeline:
```yaml
name: Playwright Tests
on:
  push:
    branches: [main, master]
  pull_request:
    branches: [main, master]
jobs:
  test:
    timeout-minutes: 60
    runs-on: ubuntu-latest
    container:
      image: mcr.microsoft.com/playwright:v1.59.1-noble
      options: --init --ipc=host
    steps:
      - uses: actions/checkout@v5
      - uses: actions/setup-node@v5
        with:
          node-version: lts/*
      - name: Install dependencies
        run: npm ci
      - name: Run Playwright tests
        run: npx playwright test
      - uses: actions/upload-artifact@v4
        if: ${{ !cancelled() }}
        with:
          name: playwright-report
          path: playwright-report/
          retention-days: 14
```
Notice the container block. This tells GitHub Actions to run the entire job inside the Playwright Docker image. You do not need npx playwright install --with-deps because the image already contains browsers and dependencies.
Step 4: Commit and Push
```bash
git add .
git commit -m "Add Playwright CI pipeline"
git push origin main
```
Step 5: Verify the First Run
Open the Actions tab in your GitHub repository. The workflow should trigger within 30 seconds of your push. The first run downloads the Docker image, which takes 2-3 minutes depending on GitHub’s network. Subsequent runs reuse the cached image layer and drop to under 60 seconds for the container setup.
Step 6: Configure Artifact Retention
The workflow above uploads the HTML report as an artifact. I set retention-days: 14 to avoid burning through GitHub’s storage quota. If you are on a free plan, artifacts count against your 500 MB limit. For paid plans, the cost is negligible.
Step 7: Enable PR Comments (Optional)
For open-source projects or teams that review a lot of pull requests, I add a step that posts a summary comment back to the PR with passed/failed counts and a link to the HTML report. This removes the friction of clicking through three GitHub UI layers to find out why a test failed.
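The exact step varies by team; one hedged sketch uses `actions/github-script` to post a comment back to the PR (the message body here is illustrative — wire in your own pass/fail counts from the report):

```yaml
- name: Comment on PR
  uses: actions/github-script@v7
  if: github.event_name == 'pull_request' && !cancelled()
  with:
    script: |
      await github.rest.issues.createComment({
        owner: context.repo.owner,
        repo: context.repo.repo,
        issue_number: context.issue.number,
        body: 'Playwright run finished. The HTML report is attached as a workflow artifact.',
      });
```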
The GitHub Actions Workflow That Actually Works
The workflow in Step 3 works, but it is naive. It does not cache node_modules, it does not shard tests across multiple workers, and it rebuilds the Docker context on every run. Here is the production version I use.
Optimized Workflow with Caching
```yaml
name: Playwright Tests
on:
  push:
    branches: [main, master]
  pull_request:
    branches: [main, master]
jobs:
  test:
    timeout-minutes: 30
    runs-on: ubuntu-latest
    container:
      image: mcr.microsoft.com/playwright:v1.59.1-noble
      options: --init --ipc=host
    steps:
      - uses: actions/checkout@v5
      - name: Cache Node modules
        uses: actions/cache@v4
        with:
          path: ~/.npm
          key: ${{ runner.os }}-node-${{ hashFiles('package-lock.json') }}
          restore-keys: |
            ${{ runner.os }}-node-
      - name: Install dependencies
        run: npm ci
      - name: Run Playwright tests
        run: npx playwright test --workers=4
      - name: Upload HTML report
        uses: actions/upload-artifact@v4
        if: ${{ !cancelled() }}
        with:
          name: playwright-report-${{ github.run_id }}
          path: |
            playwright-report/
            test-results/
          retention-days: 7
      - name: Upload trace on failure
        uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: playwright-traces
          path: test-results/
          retention-days: 3
```
Three critical differences from the basic version:
- Node module caching: `actions/cache@v4` stores `~/.npm` between runs. On a 200-dependency project, this cuts install time from 90 seconds to 12 seconds.
- Parallel workers: `--workers=4` runs four test files concurrently. On GitHub's `ubuntu-latest` runner (2 vCPU, 7 GB RAM), this is the sweet spot. More workers cause memory pressure and browser crashes.
- Separate trace uploads: Traces are large. Uploading them only on failure keeps artifact storage lean while preserving the debugging data you actually need.
Sharding for Large Suites
If your suite exceeds 200 tests or takes longer than 10 minutes, shard it across multiple jobs:
```yaml
jobs:
  test:
    timeout-minutes: 30
    runs-on: ubuntu-latest
    container:
      image: mcr.microsoft.com/playwright:v1.59.1-noble
      options: --init --ipc=host
    strategy:
      fail-fast: false
      matrix:
        shardIndex: [1, 2, 3, 4]
        shardTotal: [4]
    steps:
      - uses: actions/checkout@v5
      - name: Install dependencies
        run: npm ci
      - name: Run tests
        run: npx playwright test --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
```
This splits your test suite into four shards that run in parallel. Total pipeline time drops linearly. I have seen a 47-minute suite collapse to 9 minutes with four shards. GitHub Actions bills by the minute, so the total compute cost is similar, but developer wait time is what matters.
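Sharding splits the HTML report too: each shard only knows about its own tests. Playwright's blob reporter plus `npx playwright merge-reports` stitches them back together. A sketch, assuming each shard runs with `--reporter=blob` and uploads its `blob-report/` directory as an artifact named `blob-report-<index>`:

```yaml
merge-reports:
  if: ${{ !cancelled() }}
  needs: [test]
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v5
    - run: npm ci
    - uses: actions/download-artifact@v4
      with:
        path: all-blob-reports
        pattern: blob-report-*
        merge-multiple: true
    - name: Merge shard reports into one HTML report
      run: npx playwright merge-reports --reporter html ./all-blob-reports
```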
Three Performance Tweaks That Cut CI Time by 40%
After setting up the pipeline, most teams accept whatever runtime GitHub gives them. That is a mistake. Here are three tweaks that consistently shave 40% off my build times.
1. Cache the Docker Image Layer
GitHub Actions runners do not persist Docker images between runs by default. You can use docker/build-push-action with a registry cache, but for Playwright the simpler approach is to avoid Docker entirely for the job setup and instead use npx playwright install --with-deps on ubuntu-latest with apt caching.
```yaml
- name: Install Playwright with deps
  run: npx playwright install --with-deps
  env:
    PLAYWRIGHT_BROWSERS_PATH: 0
```
With apt caching enabled, this installs Chromium, Firefox, and WebKit in under 45 seconds on a warm cache. The Playwright team officially recommends this approach over the deprecated GitHub Action. The mcr.microsoft.com/playwright image is still valuable for local debugging and Codespaces, but for pure CI speed, installing browsers on the runner wins.
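If you skip `PLAYWRIGHT_BROWSERS_PATH: 0` and let the browsers land in their default location, you can cache the binaries directly. A sketch keyed on the lockfile, so a Playwright upgrade busts the cache:

```yaml
- name: Cache Playwright browsers
  uses: actions/cache@v4
  with:
    path: ~/.cache/ms-playwright
    key: ${{ runner.os }}-pw-browsers-${{ hashFiles('package-lock.json') }}
- name: Install browsers (fast on a warm cache)
  run: npx playwright install --with-deps
```

Note that `--with-deps` still installs system packages on every run; only the browser download itself is skipped on a cache hit.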
2. Run Only Changed Tests on Pull Requests
Not every PR touches every page. Use --grep or file path filtering to run only tests related to modified files. I tag tests by domain:
```ts
test('checkout flow works', { tag: '@checkout' }, async ({ page }) => {
  // ...
});
```
Then in CI:
```yaml
# github.event.pull_request.changed_files is only a file count, so gate on
# paths with a changed-files filter action instead:
- uses: dorny/paths-filter@v3
  id: changes
  with:
    filters: |
      checkout: ['src/checkout/**']
- name: Run checkout tests
  if: steps.changes.outputs.checkout == 'true'
  run: npx playwright test --grep @checkout
```
This requires discipline in test organization, but the payoff is enormous. A 300-test suite drops to 12 tests on a typical PR.
3. Disable Video and Screenshot on Pass
Playwright can record video and take screenshots for every test. On CI, this adds 15-30% overhead. Configure playwright.config.ts to capture only on failure:
```ts
export default defineConfig({
  use: {
    video: 'retain-on-failure',
    screenshot: 'only-on-failure',
    trace: 'retain-on-failure',
  },
});
```
Your passing builds run faster, and your failing builds still have full forensic data.
Security and Secrets Handling in CI
CI pipelines are a goldmine for attackers. HTML reports, trace files, and console logs can contain session tokens, test user credentials, and staging API keys. I treat artifacts with the same paranoia I apply to production databases.
Here is my security checklist:
- Never upload artifacts on pull requests from forks. Forks do not have access to repository secrets, but they can still exfiltrate data through artifacts if your workflow uploads them unconditionally.
- Encrypt reports before upload. If you must store artifacts long-term, encrypt them with a repository secret and decrypt locally.
- Use GitHub Environments for staging vs production tests. Environment-specific secrets prevent a compromised staging key from reaching production.
- Scan for secrets in test code. Tools like `truffleHog` or GitHub secret scanning catch hardcoded keys before they reach CI.
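The encryption point deserves a concrete sketch. Assuming a `REPORT_KEY` repository secret exposed as an environment variable (the variable name is mine, not a convention), a symmetric OpenSSL round-trip looks like this:

```shell
# REPORT_KEY would come from GitHub secrets in CI; a local fallback keeps
# this sketch runnable on a developer machine.
export REPORT_KEY="${REPORT_KEY:-local-demo-key}"

# Stand-in report directory so the example is self-contained.
mkdir -p playwright-report
echo "<html>report</html>" > playwright-report/index.html

# Bundle and encrypt before any artifact-upload step sees the files.
tar czf report.tgz playwright-report/
openssl enc -aes-256-cbc -pbkdf2 -salt \
  -in report.tgz -out report.tgz.enc -pass env:REPORT_KEY
echo "encrypted $(wc -c < report.tgz.enc) bytes"

# Decrypt locally with:
#   openssl enc -d -aes-256-cbc -pbkdf2 -in report.tgz.enc -out report.tgz -pass env:REPORT_KEY
```

Upload only `report.tgz.enc` as the artifact; anyone without the key gets ciphertext.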
Playwright’s trace viewer is powerful, but it records everything: network requests, console logs, cookies. If your tests log in as an admin user, that admin session is in the trace. Set retention-days: 3 for trace artifacts and purge them aggressively.
If you are building more advanced agent-based testing pipelines, check out my LangGraph test automation architecture for ideas on how to isolate sensitive test data in planner-generator-healer pipelines.
India Context: What Startups and Product Companies Actually Use
In my experience hiring and consulting for Bengaluru-based startups, the Playwright + GitHub Actions stack is now the default for product companies paying ₹20-40 LPA for SDET roles. Service companies (TCS, Infosys, Wipro) still run Selenium Grid on self-hosted VMs, but that is changing fast.
Here is what I see in job descriptions and technical screenings in 2026:
- Product startups (Series A to D): 80% use GitHub Actions or GitLab CI. Docker is standard. Playwright adoption is near-universal for new projects.
- Fintech and healthtech: These sectors demand audit trails. They keep HTML reports for 90 days and require trace files for every failed test. Artifact retention is a compliance feature, not a convenience.
- Service companies transitioning to automation: Many are still on Jenkins with on-premise agents. The migration path is Jenkins → GitHub Actions → Docker, usually over 12-18 months.
If you are an SDET interviewing in 2026, knowing how to set up a Docker-based Playwright pipeline is table stakes for product companies. I ask candidates to whiteboard this exact workflow in my SDET interview prep sessions. The ones who can draw the container, the cache layer, and the artifact upload get offers. The ones who cannot, do not.
Common Failures and How to Fix Them
After running this pipeline across 12 repositories, I have a short list of failures that appear again and again.
Failure 1: Browser Executable Not Found
Symptom: `browserType.launch: Executable doesn't exist at /ms-playwright/chromium-1155/chrome-linux/chrome`
Cause: Playwright version in package.json does not match the Docker image tag.
Fix: Pin both to the exact same version. Run npx playwright --version locally and use that tag.
Failure 2: Chromium Crashes in Docker
Symptom: Tests pass locally but Chromium crashes in CI with Target page, context or browser has been closed.
Cause: Missing --ipc=host or insufficient shared memory.
Fix: Add options: --init --ipc=host to the container block in your workflow.
Failure 3: Tests Time Out on CI but Pass Locally
Symptom: Network-dependent tests fail with page.goto: Timeout 30000ms exceeded.
Cause: CI runners have slower network throughput than developer laptops. Also, localhost resolution inside Docker containers behaves differently.
Fix: Increase timeout in playwright.config.ts for CI environments:
```ts
// in playwright.config.ts
timeout: process.env.CI ? 60000 : 30000,
```
Failure 4: Artifact Upload Fails Silently
Symptom: The workflow passes, but the artifact section shows zero bytes.
Cause: The report path does not match the path glob in actions/upload-artifact.
Fix: Verify that playwright.config.ts outputs to playwright-report/ and that your upload step matches that path exactly.
Key Takeaways
- Pin your Playwright version, your Docker image tag, and your Node version. Drift is the enemy of reproducibility.
- Use the official `mcr.microsoft.com/playwright` image for consistency, but consider `npx playwright install --with-deps` with apt caching if raw CI speed is your priority.
- Cache `node_modules` and browser installations between runs. The first build is slow; every build after that should be under four minutes.
- Shard large suites across multiple jobs. A 47-minute suite becomes a 9-minute suite with four shards.
- Upload traces and reports as artifacts, but set short retention and encrypt sensitive data.
- Know this pipeline cold if you are interviewing for SDET roles in India. It is no longer a nice-to-have. It is expected.
Frequently Asked Questions
Do I need the Playwright GitHub Action from the marketplace?
No. Microsoft deprecated the official GitHub Action because it could not determine which Playwright version your project needed. The recommended approach is npx playwright install --with-deps inside your workflow, or use the official Docker image as a container.
How much does this cost on GitHub Actions?
For public repositories, GitHub Actions is free. For private repositories, the free tier includes 2,000 minutes per month. A 4-minute Playwright build running on every push and pull request gives you roughly 500 builds per month before hitting the limit. Most small teams never exceed this.
Can I use this with GitLab CI or Azure DevOps?
Yes. The Docker image and the caching principles are identical. GitLab CI uses image: and cache: keys. Azure DevOps uses container jobs and pipeline caching. The YAML syntax changes, but the architecture does not.
Should I run tests on every push or only on pull requests?
Run fast smoke tests (5-10 critical paths) on every push. Run the full regression suite on pull requests to main and on a nightly schedule. This balances feedback speed with coverage depth.
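In GitHub Actions terms, that split usually means one workflow with conditional steps or two workflows with different triggers. A hedged sketch of the trigger block, with `@smoke` as an assumed test tag:

```yaml
on:
  push:
    branches: [main]      # fast feedback: run only `npx playwright test --grep @smoke`
  pull_request:
    branches: [main]      # full regression suite before merge
  schedule:
    - cron: '30 0 * * *'  # nightly full run (UTC)
```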
What about self-hosted runners?
Self-hosted runners are useful if you need custom hardware, GPU acceleration, or access to internal networks. The downside is maintenance: you now own the Docker daemon, disk cleanup, and security patches. For most teams, GitHub-hosted runners plus Docker caching are simpler and cheaper.
If you are mapping out your broader automation career, my AI SDET roadmap for 2026 breaks down exactly which DevOps and CI/CD skills move the needle from 8 LPA to 25 LPA.
