Contract Testing: How We Achieved Zero API Incidents in Production
A practical guide to implementing consumer-driven contract testing with Pact.js in a microservices architecture. Learn how we eliminated 100% of API-related production incidents over 12 months.
The Problem: Integration Chaos
In early 2023, our microservices ecosystem at EPAM Systems had grown to 20+ services owned by different teams, each deploying independently multiple times per day. While this autonomy was great for velocity, it came with a hidden cost: 3-4 production incidents per month due to API breaking changes.
Each incident followed a similar pattern:
- Backend team deploys a "simple" API change
- Frontend breaks in production
- Emergency rollback initiated
- Post-mortem reveals misaligned expectations between teams
The financial impact: ~$50K per incident in downtime, engineering time, and customer impact.
Traditional Integration Testing Wasn't Enough
We tried the conventional approach: end-to-end integration tests in a staging environment. The problems were immediate:
Issues with E2E Integration Tests
| Problem | Impact |
|---|---|
| Too Slow | 2+ hours to run full suite |
| Too Flaky | 15-20% failure rate due to environment issues |
| Too Late | Only caught issues in staging, not during development |
| Too Broad | Tested everything, not just API contracts |
| No Ownership | Unclear which team should maintain shared tests |
We needed a solution that:
- ✅ Runs in seconds, not hours
- ✅ Tests only the API contract, not implementation
- ✅ Gives immediate feedback during development
- ✅ Clearly defines ownership (consumers own expectations)
Enter consumer-driven contract testing.
What is Contract Testing?
Contract testing is a technique that ensures services can communicate correctly by testing the contract (API interface) between them, rather than testing the full integration.
Key Principles
Consumer-Driven: The consumer (e.g., frontend) defines what they expect from the provider (e.g., backend API).
Isolated Testing: Each service tests its side of the contract independently, with no need for the other service to be running.
Fast Feedback: Contracts are verified in CI/CD before deployment, catching issues in minutes, not hours.
How It Works (Simplified)
1. Consumer Team: Defines the contract and publishes it to the broker.
2. PactBroker: Stores the contracts under version control.
3. Provider Team: Verifies its implementation against the contracts and runs the Can-I-Deploy check.
Implementation: Pact.js + PactBroker
We chose Pact.js for contract testing and PactBroker as our central contract registry.
Step 1: Consumer Defines Contract
The frontend team writes a Pact test specifying their expectations:
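A minimal sketch of such a consumer test, assuming the `PactV3` API from Pact.js and Node 18+ (for the global `fetch`). The names `web-frontend` and `user-service`, the `/users/42` endpoint, and the response fields are illustrative assumptions, not the real contract.

```javascript
const path = require('node:path');

// Hypothetical consumer code under test: the client the frontend uses to call the provider.
async function fetchUser(baseUrl, id) {
  const res = await fetch(`${baseUrl}/users/${id}`, {
    headers: { Accept: 'application/json' },
  });
  if (!res.ok) throw new Error(`Unexpected status ${res.status}`);
  return res.json();
}

// Describes one interaction, then exercises the client against Pact's mock server.
// Pact is required inside the function so the client stays loadable without it.
async function runUserContractTest() {
  const { PactV3, MatchersV3 } = require('@pact-foundation/pact');
  const { like, string, integer } = MatchersV3;

  const provider = new PactV3({
    consumer: 'web-frontend', // assumed name
    provider: 'user-service', // assumed name
    dir: path.resolve(process.cwd(), 'pacts'), // the JSON contract is written here
  });

  provider
    .given('user exists')
    .uponReceiving('a request for an existing user')
    .withRequest({
      method: 'GET',
      path: '/users/42',
      headers: { Accept: 'application/json' },
    })
    .willRespondWith({
      status: 200,
      headers: { 'Content-Type': 'application/json' },
      // Matchers check types, not the exact example values.
      body: { id: integer(42), name: string('Ada Lovelace'), active: like(true) },
    });

  // executeTest starts the mock server, runs the callback, and writes the pact on success.
  return provider.executeTest(async (mockServer) => {
    const user = await fetchUser(mockServer.url, 42);
    if (user.id !== 42) throw new Error('client returned an unexpected user');
  });
}
```

In practice `runUserContractTest()` would be called from a test runner such as Jest or Mocha, so a failed expectation fails the build.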
Key Points:
- Consumer doesn't care about backend implementation
- Uses matchers (like(), string()) for flexible validation
- Generates a JSON contract file automatically
Step 2: Publish Contract to PactBroker
Contracts are published during CI/CD:
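A minimal sketch of that publish step as a Node script a CI job could call. It assumes the `pact-broker` CLI is on the PATH and that credentials come from the CLI's standard environment variables (for example `PACT_BROKER_TOKEN`); the `./pacts` directory and the broker URL are placeholders.

```javascript
const { execFileSync } = require('node:child_process');

// Builds the argument list for `pact-broker publish`.
function publishArgs({ pactDir, version, branch, brokerUrl }) {
  return [
    'publish', pactDir,
    '--consumer-app-version', version,
    '--branch', branch,
    '--broker-base-url', brokerUrl,
  ];
}

// Runs the CLI; a non-zero exit code throws and therefore fails the CI job.
function publishPacts(options) {
  execFileSync('pact-broker', publishArgs(options), { stdio: 'inherit' });
}
```

A job would call something like `publishPacts({ pactDir: './pacts', version, branch, brokerUrl })` after the consumer tests pass, with `version` and `branch` taken from the pipeline's own variables.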
Step 3: Provider Verifies Contract
The backend team verifies their API against consumer contracts:
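A minimal sketch of provider verification, assuming Pact.js's `Verifier` and a provider already running locally. The provider name, port, and state handler body are assumptions; the environment variable names follow GitLab CI conventions (`CI`, `CI_COMMIT_SHORT_SHA`, `CI_COMMIT_REF_NAME`).

```javascript
// Builds the Verifier options from the environment so they can be inspected separately.
function providerVerifierOptions(env = process.env) {
  return {
    provider: 'user-service', // assumed name, must match the consumer contract
    providerBaseUrl: 'http://localhost:3000', // assumed address of the running provider
    pactBrokerUrl: env.PACT_BROKER_BASE_URL,
    pactBrokerToken: env.PACT_BROKER_TOKEN, // or pactBrokerUsername / pactBrokerPassword
    providerVersion: env.CI_COMMIT_SHORT_SHA,
    providerVersionBranch: env.CI_COMMIT_REF_NAME,
    publishVerificationResult: env.CI === 'true', // publish results only from CI runs
    stateHandlers: {
      // The key must match the provider state named in the consumer contract.
      'user exists': async () => {
        // Seed the test data here (hypothetical; depends on the provider's storage).
      },
    },
  };
}

// Fetches the pacts from the broker and replays them against the provider.
async function verifyProvider() {
  const { Verifier } = require('@pact-foundation/pact');
  return new Verifier(providerVerifierOptions()).verifyProvider();
}
```

`verifyProvider()` rejects when any interaction fails, so awaiting it inside a test is enough to fail the provider's pipeline.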
Step 4: Can-I-Deploy Check
Before deploying, teams run a compatibility check:
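A minimal sketch of that check, assuming the `pact-broker` CLI is on the PATH and reads the broker URL and credentials from its standard environment variables. The pacticipant name and environment are placeholders.

```javascript
const { spawnSync } = require('node:child_process');

// Builds the argument list for `pact-broker can-i-deploy`.
function canIDeployArgs({ pacticipant, version, environment }) {
  return [
    'can-i-deploy',
    '--pacticipant', pacticipant,
    '--version', version,
    '--to-environment', environment,
  ];
}

// Returns true only when the CLI exits with status 0, meaning the broker
// reports this version as compatible with what is in the target environment.
function canIDeploy(options) {
  const result = spawnSync('pact-broker', canIDeployArgs(options), { stdio: 'inherit' });
  return result.status === 0;
}
```

For example, `canIDeploy({ pacticipant: 'user-service', version, environment: 'production' })` answers whether that build can go out without breaking a known consumer.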
Integrated into CI/CD:
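One way to wire the gate into a pipeline job is a small Node script that the job invokes; the GitLab job definition itself is not shown. This is a sketch under assumptions: `deploy` stands for whatever the team's real deployment step is, and the `pact-broker` CLI is on the PATH.

```javascript
const { spawnSync } = require('node:child_process');

// Runs one pact-broker CLI command and reports whether it exited with status 0.
function brokerOk(args) {
  return spawnSync('pact-broker', args, { stdio: 'inherit' }).status === 0;
}

// Deploys only if the compatibility check passes, then records the deployment
// so the broker knows which version is now live in the environment.
function deployWithGate({ pacticipant, version, environment, deploy, run = brokerOk }) {
  const target = ['--pacticipant', pacticipant, '--version', version];
  if (!run(['can-i-deploy', ...target, '--to-environment', environment])) {
    return 'blocked';
  }
  deploy();
  run(['record-deployment', ...target, '--environment', environment]);
  return 'deployed';
}
```

The job can exit non-zero when the result is `'blocked'`, which stops the pipeline before anything reaches production.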
Results: 12+ Months of Zero Incidents
The impact was immediate and sustained:
| Metric | Before | After | Improvement |
|---|---|---|---|
| API Incidents | 3-4/month | 0 in 12+ months | 100% |
| Deploy Confidence | ~60% | 100% | +40 points |
| Integration Test Time | 2+ hours | 15 minutes | -85% |
| Breaking Changes Caught | In staging | Before commit | Shift-left |
| Contract Coverage | 0 contracts | 100+ contracts | Full coverage |
Monthly Progress
| Month | Production Incidents | Contracts Published |
|---|---|---|
| Jan 2023 | 4 | 0 |
| Feb 2023 | 3 | 15 |
| Mar 2023 | 2 | 35 |
| Apr 2023 | 1 | 60 |
| May 2023 | 0 | 85 |
| Jun 2023+ | 0 | 100+ |
Lessons Learned
What Worked Well
1. Gradual Adoption: We didn't try to contract-test everything at once. Started with one critical API path, proved value, then expanded.
2. Clear Ownership: Consumers own contract tests. This aligns with the "you build it, you test it" philosophy.
3. Can-I-Deploy Gates: Making deployment conditional on contract compatibility was a game-changer.
4. Provider States: Properly modeling provider states (e.g., "user exists") was critical for reliable verification.
Challenges & Solutions
Challenge 1: Learning Curve
Teams struggled with Pact's concepts initially.
Solution: We created internal documentation with real examples and paired with teams during their first implementations.
Challenge 2: Dynamic Data in Responses
Some APIs returned dynamic data (timestamps, UUIDs) that broke exact matching.
Solution: Use Pact matchers extensively:
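A minimal sketch of a response body that tolerates dynamic values, assuming the `MatchersV3` helpers `uuid`, `regex`, and `decimal` from Pact.js. The order fields are hypothetical. The matchers are passed in as a parameter (defaulting to Pact's own) so the shape can be inspected without a mock server.

```javascript
// ISO 8601 UTC timestamp with milliseconds, e.g. 2023-05-01T12:00:00.000Z
const ISO_TIMESTAMP = '^\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\.\\d{3}Z$';

function orderResponseBody(M = require('@pact-foundation/pact').MatchersV3) {
  return {
    id: M.uuid('ce118b6e-d8e1-11e7-9296-cec278b6b50a'), // any UUID is accepted
    createdAt: M.regex(ISO_TIMESTAMP, '2023-05-01T12:00:00.000Z'), // any matching timestamp
    total: M.decimal(19.99), // any decimal number
    status: 'PAID', // plain values are matched exactly
  };
}
```

The example values only feed the mock server's response; during provider verification the real values just need to satisfy the matcher.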
Challenge 3: Asynchronous APIs
Event-driven services (Kafka, RabbitMQ) didn't fit HTTP-based Pact.
Solution: Use message pacts for async:
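A minimal sketch of a message pact on the consumer side, assuming Pact.js's `MessageConsumerPact` and `synchronousBodyHandler`. The service names, the event shape, and the handler are hypothetical. Note that the test exercises the handler with the message body only; the Kafka or RabbitMQ transport is not involved.

```javascript
const path = require('node:path');

// Hypothetical consumer-side handler for a "user created" event.
function handleUserCreated(event) {
  if (typeof event.userId !== 'string') {
    throw new Error('userId must be a string');
  }
  return { userId: event.userId, welcomeEmailQueued: true };
}

// Declares the expected message and checks that the handler can process it.
async function runMessageContractTest() {
  const { MessageConsumerPact, synchronousBodyHandler, Matchers } =
    require('@pact-foundation/pact');
  const { like } = Matchers;

  const messagePact = new MessageConsumerPact({
    consumer: 'notification-service', // assumed name
    provider: 'user-service', // assumed name
    dir: path.resolve(process.cwd(), 'pacts'),
  });

  return messagePact
    .expectsToReceive('a user created event')
    .withContent({ userId: like('u-123'), email: like('ada@example.com') })
    .withMetadata({ 'content-type': 'application/json' })
    .verify(synchronousBodyHandler(handleUserCreated));
}
```

The producing service then verifies the same contract by generating the message it would publish, without a broker connection on either side.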
Best Practices
1. Version Your Contracts
Use semantic versioning for consumer/provider versions:
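One possible scheme, sketched under assumptions: take the MAJOR.MINOR.PATCH version and append the short commit SHA as build metadata, so every commit publishes under a distinct version. The helper name and the seven-character SHA length are illustrative choices.

```javascript
const SEMVER_CORE = /^\d+\.\d+\.\d+$/;

// Combines a semantic version with the commit SHA, e.g. 1.4.2+abc1234.
function pacticipantVersion(semver, gitSha) {
  if (!SEMVER_CORE.test(semver)) {
    throw new Error(`Not a MAJOR.MINOR.PATCH version: ${semver}`);
  }
  if (!gitSha) throw new Error('A commit SHA is required');
  return `${semver}+${gitSha.slice(0, 7)}`;
}
```

The same string can be used as the consumer version when publishing and as the provider version when reporting verification results.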
2. Test Against Multiple Environments
Verify contracts against:
- Latest main branch (upcoming changes)
- Deployed to production (current state)
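These two targets can be expressed as consumer version selectors on the provider side. A minimal sketch, assuming the selectors are passed to Pact.js's `Verifier` through its `consumerVersionSelectors` option; the provider name and URL are placeholders.

```javascript
// Selects the pacts to verify: the latest from each consumer's main branch,
// plus whatever consumer versions are currently deployed to the given environment.
function selectorsFor(environment) {
  return [
    { mainBranch: true },
    { deployed: true, environment },
  ];
}

const multiEnvironmentOptions = {
  provider: 'user-service', // assumed name
  providerBaseUrl: 'http://localhost:3000', // assumed address
  consumerVersionSelectors: selectorsFor('production'),
};
```

The `deployed` selector depends on deployments being recorded in the broker, for example with `pact-broker record-deployment` after each release.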
3. Use Webhooks for Immediate Feedback
Configure PactBroker to trigger provider verification when contracts change:
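A minimal sketch of such a webhook, assuming it is created through the broker's `/webhooks` endpoint and fires on the `contract_requiring_verification_published` event. The trigger URL, token, and the `PACT_URL` variable name are assumptions about the receiving pipeline (here, a GitLab pipeline trigger); Node 18+ is assumed for the global `fetch`.

```javascript
// Builds the webhook definition sent to the broker.
function verificationWebhook({ provider, triggerUrl, triggerToken }) {
  return {
    description: `Verify ${provider} when a contract changes`,
    provider: { name: provider },
    events: [{ name: 'contract_requiring_verification_published' }],
    request: {
      method: 'POST',
      url: triggerUrl,
      headers: { 'Content-Type': 'application/json' },
      body: {
        token: triggerToken,
        // Single quotes keep these as literal text; the broker substitutes
        // them with real values when the webhook fires.
        ref: '${pactbroker.providerVersionBranch}',
        variables: { PACT_URL: '${pactbroker.pactUrl}' },
      },
    },
  };
}

// Registers the webhook; `authorization` is the full header value the broker expects.
async function createWebhook(brokerUrl, authorization, webhook) {
  const res = await fetch(`${brokerUrl}/webhooks`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', Authorization: authorization },
    body: JSON.stringify(webhook),
  });
  if (!res.ok) throw new Error(`Broker rejected the webhook: ${res.status}`);
  return res.json();
}
```

The triggered pipeline can then verify only the pact at `PACT_URL`, which keeps the feedback loop short for the consumer team that published the change.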
4. Don't Over-Specify
Bad (too specific):
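An illustrative over-specified response body (all fields are hypothetical). Every value is pinned exactly, including fields the consumer never reads, so any harmless provider change breaks verification.

```javascript
// Plain values in a Pact body are matched exactly during provider verification.
const overSpecifiedBody = {
  id: 42,
  name: 'Ada Lovelace',
  createdAt: '2023-05-01T12:00:00.000Z', // fails as soon as the timestamp differs
  internalFlags: ['beta', 'migrated'], // not used by the consumer at all
};
```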
Good (flexible):
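An illustrative flexible body, assuming the `integer` and `string` helpers from Pact.js's `MatchersV3`. It lists only the (hypothetical) fields the consumer actually reads and constrains their types rather than their values.

```javascript
// Matchers are injected (defaulting to Pact's own) so the shape is easy to inspect.
function flexibleUserBody(M = require('@pact-foundation/pact').MatchersV3) {
  return {
    id: M.integer(42), // any integer is accepted
    name: M.string('Ada Lovelace'), // any string is accepted
  };
}
```

Fields left out of the contract can be added, renamed, or removed by the provider without failing this consumer's verification.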
Tools & Resources
Our Stack
- Pact.js (v12+): Contract testing framework
- PactBroker: Hosted on AWS (self-managed)
- GitLab CI/CD: Automation pipeline
- Slack: Deployment notifications via webhooks
Useful Commands
Conclusion
Contract testing with Pact.js transformed our microservices architecture from fragile to resilient. Zero production incidents in 12+ months speaks for itself.
The key insight: Test the contract, not the implementation. This shift-left approach catches incompatibilities before they reach production, giving teams confidence to deploy independently without fear.
If you're struggling with microservices integration testing, contract testing isn't just a nice-to-have; it's essential.
Next Steps
Want to implement contract testing in your organization? Start here:
- Pick one API endpoint with frequent breaking changes
- Write a consumer contract for that endpoint
- Set up PactBroker (use pactflow.io for a hosted option)
- Add a can-i-deploy check to your CI/CD pipeline
- Measure impact (incidents before/after)
Once you see the value, scaling to 100+ contracts is just more of the same.
Questions? Reach out on LinkedIn or check out the Pact.js documentation.