Contract Testing: How We Achieved Zero API Incidents in Production

The Problem: Integration Chaos

In early 2023, our microservices ecosystem at EPAM Systems had grown to 20+ services owned by different teams, each deploying independently multiple times per day. While this autonomy was great for velocity, it came with a hidden cost: 3-4 production incidents per month caused by breaking API changes.

Each incident followed a similar pattern:

Backend team deploys a "simple" API change
Frontend breaks in production
Emergency rollback initiated
Post-mortem reveals misaligned expectations between teams

The cost: roughly $50K per incident in downtime, engineering time, and customer impact.

Traditional Integration Testing Wasn't Enough

We tried the conventional approach: end-to-end integration tests in a staging environment. The problems were immediate:

Issues with E2E Integration Tests

Problem	Impact
Too Slow	2+ hours to run full suite
Too Flaky	15-20% failure rate due to environment issues
Too Late	Only caught issues in staging, not during development
Too Broad	Tested everything, not just API contracts
No Ownership	Unclear which team should maintain shared tests

We needed a solution that:

✅ Runs in seconds, not hours
✅ Tests only the API contract, not implementation
✅ Gives immediate feedback during development
✅ Clearly defines ownership (consumers own expectations)

Enter consumer-driven contract testing.

What is Contract Testing?

Contract testing is a technique that ensures services can communicate correctly by testing the contract (API interface) between them, rather than testing the full integration.

Key Principles

Consumer-Driven: The consumer (e.g., frontend) defines what it expects from the provider (e.g., backend API).

Isolated Testing: Each service tests its side of the contract independently, with no need for the other service to be running.

Fast Feedback: Contracts are verified in CI/CD before deployment, catching issues in minutes, not hours.

How It Works (Simplified)

1. Consumer Team                2. PactBroker               3. Provider Team
   └─ Defines contract     ───>    └─ Stores contracts  ───>    └─ Verifies implementation
   └─ Publishes to broker         └─ Version control           └─ Can-I-Deploy check

Implementation: Pact.js + PactBroker

We chose Pact.js for contract testing and PactBroker as our central contract registry.

Step 1: Consumer Defines Contract

The frontend team writes a Pact test specifying what it expects from the backend API.

Key Points:

Consumer doesn't care about backend implementation
Uses matchers (like(), string()) for flexible validation
Generates a JSON contract file automatically

Step 2: Publish Contract to PactBroker

Contracts are published to PactBroker as part of the consumer's CI/CD pipeline.

Step 3: Provider Verifies Contract

The backend team verifies its API against the contracts its consumers have published.
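
Conceptually, verification replays each recorded request against the provider and compares the response with what the consumer expects. The sketch below shows that idea with provider states, assuming a toy in-memory provider. Every name in it (`matchesShape`, `stateHandlers`, `verify`) is hypothetical and none of it is the Pact.js verifier API.

```javascript
// Library-free sketch; all names are hypothetical, not the Pact.js verifier API.

// Type-based comparison: every field the consumer expects must exist on the
// actual response with the same type. Extra provider fields are allowed.
function matchesShape(expected, actual) {
  if (expected !== null && typeof expected === "object") {
    return actual !== null && typeof actual === "object" &&
      Object.keys(expected).every((key) => matchesShape(expected[key], actual[key]));
  }
  return typeof expected === typeof actual;
}

// Provider side: an in-memory "database" and a request handler.
const users = new Map();
function handle(request) {
  const match = /^\/users\/(\d+)$/.exec(request.path);
  const user = match && users.get(Number(match[1]));
  return user
    ? { status: 200, body: user }
    : { status: 404, body: { error: "not found" } };
}

// State handlers put the provider into the state a contract names.
const stateHandlers = {
  "user exists": () =>
    users.set(42, { id: 42, name: "Grace", email: "g@example.com", plan: "pro" }),
};

// Reset data, apply the named state, replay the request, compare the result.
function verify(interaction) {
  users.clear();
  stateHandlers[interaction.state]();
  const actual = handle(interaction.request);
  return actual.status === interaction.response.status &&
    matchesShape(interaction.response.body, actual.body);
}

const interaction = {
  state: "user exists",
  request: { method: "GET", path: "/users/42" },
  response: { status: 200, body: { id: 1, name: "Ada" } },
};
console.log(verify(interaction)); // true: same field types, extra fields ignored
```

If the provider changed `id` to a string, `verify` would return false and the build would fail before the change shipped.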

Step 4: Can-I-Deploy Check

Before deploying, teams run a compatibility check against PactBroker.

The check is integrated into CI/CD, so a version that fails it does not get deployed.
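
The logic behind such a gate can be pictured as a lookup in a matrix of verification results. The sketch below is a heavily simplified model with hypothetical data and names. The real broker tracks much more, but the rule is the same: a missing or failed verification blocks the deploy.

```javascript
// Simplified model of a compatibility matrix; all data and names are
// hypothetical, and the real broker tracks far more detail.
const verifications = [
  { consumer: "web-frontend@2.3.0", provider: "user-service@5.1.0", success: true },
  { consumer: "mobile-app@1.9.0", provider: "user-service@5.1.0", success: true },
  { consumer: "web-frontend@2.3.0", provider: "user-service@5.2.0", success: false },
  { consumer: "mobile-app@1.9.0", provider: "user-service@5.2.0", success: true },
];

// What is currently running in the target environment.
const deployed = { production: ["web-frontend@2.3.0", "mobile-app@1.9.0"] };

// A provider version may deploy only if every consumer already in the
// environment has a successful verification against that version.
function canIDeployProvider(providerVersion, environment) {
  return deployed[environment].every((consumer) =>
    verifications.some(
      (row) =>
        row.consumer === consumer &&
        row.provider === providerVersion &&
        row.success
    )
  );
}

console.log(canIDeployProvider("user-service@5.1.0", "production")); // true
console.log(canIDeployProvider("user-service@5.2.0", "production")); // false
```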

Results: 12+ Months of Zero Incidents

Incidents fell month over month as contract coverage grew, then stayed at zero:

Metric	Before	After	Improvement
API Incidents	3-4/month	0 in 12+ months	-100%
Deploy Confidence	~60%	100%	+40 points
Integration Test Time	2+ hours	15 minutes	-87% or more
Breaking Changes Caught	In staging	Before commit	Shift-left
Contract Coverage	0 contracts	100+ contracts	Full coverage

Monthly Progress

Month	Production Incidents	Contracts Published
Jan 2023	4	0
Feb 2023	3	15
Mar 2023	2	35
Apr 2023	1	60
May 2023	0	85
Jun 2023+	0	100+

Lessons Learned

What Worked Well

1. Gradual Adoption: We didn't try to contract-test everything at once. We started with one critical API path, proved the value, then expanded.

2. Clear Ownership: Consumers own contract tests. This aligns with the "you build it, you test it" philosophy.

3. Can-I-Deploy Gates: Making deployment conditional on contract compatibility was a game-changer.

4. Provider States: Properly modeling provider states (e.g., "user exists") was critical for reliable verification.

Challenges & Solutions

Challenge 1: Learning Curve

Teams struggled with Pact's concepts initially.

Solution: We created internal documentation with real examples and paired with teams during their first implementations.

Challenge 2: Dynamic Data in Responses

Some APIs returned dynamic data (timestamps, UUIDs) that broke exact matching.

Solution: Use Pact matchers extensively, so that dynamic fields are checked by type or format rather than by exact value.
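
As an illustration of format-based matching, the sketch below checks a UUID and an ISO-8601 timestamp by pattern instead of by value. The `regex` helper, `matchesFormats`, and the field names are hypothetical and only mimic the idea behind Pact's matchers.

```javascript
// Hypothetical regex-style matcher for illustration; not the Pact.js API.
const regex = (pattern, example) => ({ pattern, example });

const expectedOrder = {
  id: regex(
    /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i,
    "3f2b8c1e-5d4a-4b6f-9c7e-1a2b3c4d5e6f"
  ),
  createdAt: regex(
    /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$/,
    "2023-05-01T12:00:00Z"
  ),
};

// Each field passes when the actual value is a string matching the pattern.
function matchesFormats(expected, actual) {
  return Object.entries(expected).every(
    ([field, { pattern }]) =>
      typeof actual[field] === "string" && pattern.test(actual[field])
  );
}

// A fresh UUID and timestamp pass; malformed values do not.
const fresh = { id: "0b9f6c52-8a1d-4e3b-b7c4-6d5e4f3a2b1c", createdAt: "2024-01-15T08:30:00Z" };
const malformed = { id: "not-a-uuid", createdAt: "yesterday" };
console.log(matchesFormats(expectedOrder, fresh));     // true
console.log(matchesFormats(expectedOrder, malformed)); // false
```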

Challenge 3: Asynchronous APIs

Event-driven services (Kafka, RabbitMQ) didn't fit HTTP-based Pact.

Solution: Use message pacts, which verify the message payload itself rather than an HTTP exchange.
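
The sketch below shows the idea without any messaging infrastructure: the consumer's handler is fed the example payload directly, and the provider's message-building function is checked against the same payload shape. The event, field names, and functions are hypothetical, and this is not the Pact.js message API.

```javascript
// Hypothetical names throughout; a sketch of the idea, not the Pact.js message API.

// The contract describes one message the consumer expects to receive.
const messageContract = {
  description: "an order-created event",
  contents: { orderId: "o-1001", totalCents: 2599, currency: "EUR" },
};

// Consumer side: the handler is called with the example contents directly,
// so no Kafka or RabbitMQ broker is needed in the test.
const processed = [];
function handleOrderCreated(event) {
  if (typeof event.orderId !== "string") throw new Error("orderId missing");
  processed.push({ orderId: event.orderId, totalCents: event.totalCents });
}
handleOrderCreated(messageContract.contents);

// Provider side: the function that builds the event is checked against the
// contract by comparing field types.
function buildOrderCreated(order) {
  return {
    orderId: order.id,
    totalCents: Math.round(order.total * 100),
    currency: order.currency,
  };
}
const produced = buildOrderCreated({ id: "o-7", total: 12.5, currency: "USD" });
const providerOk = Object.keys(messageContract.contents).every(
  (key) => typeof produced[key] === typeof messageContract.contents[key]
);
console.log(processed.length, providerOk); // 1 true
```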

Best Practices

1. Version Your Contracts

Use semantic versioning for consumer and provider versions.
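
One way to do this, sketched below, is to append the short commit SHA as build metadata so that two builds of the same release never share a version. That suffix is an assumption of this example rather than something stated above, and `pactVersion` is a hypothetical helper.

```javascript
// Hypothetical helper: combines a semantic version with the commit SHA as
// build metadata, so every build publishes under a unique version.
function pactVersion(semver, gitSha) {
  if (!/^\d+\.\d+\.\d+$/.test(semver)) {
    throw new Error(`not a MAJOR.MINOR.PATCH version: ${semver}`);
  }
  return `${semver}+${gitSha.slice(0, 7)}`;
}

console.log(pactVersion("2.3.0", "9fceb02d0ae598e95dc970b74767f19372d61af8"));
// prints 2.3.0+9fceb02
```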

2. Test Against Multiple Environments

Verify contracts against:

The latest contract from the main branch (upcoming changes)
The version deployed to production (current state)
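
Pact's provider verifier expresses this choice through consumer version selectors. The sketch below models the selection by hand so the rule is visible. The data model and the `pactsToVerify` function are hypothetical and not broker code.

```javascript
// Hypothetical data model; the selection logic is a sketch, not broker code.
// The list is ordered oldest to newest.
const publishedPacts = [
  { consumer: "web-frontend", version: "2.2.0", branch: "main", environments: ["production"] },
  { consumer: "web-frontend", version: "2.3.0", branch: "main", environments: [] },
  { consumer: "web-frontend", version: "2.4.0-beta", branch: "feature/new-profile", environments: [] },
];

// Pick the newest pact on main plus whatever is deployed to production.
function pactsToVerify(pacts) {
  const onMain = pacts.filter((pact) => pact.branch === "main");
  const latestMain = onMain[onMain.length - 1];
  const inProduction = pacts.filter((pact) =>
    pact.environments.includes("production")
  );
  // The Set removes the duplicate when the latest main pact is also in production.
  return [...new Set([latestMain, ...inProduction])]
    .filter(Boolean)
    .map((pact) => pact.version);
}

console.log(pactsToVerify(publishedPacts)); // [ '2.3.0', '2.2.0' ]
```

Verifying both versions means the provider stays compatible with what is live today and with what the consumer is about to release.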

3. Use Webhooks for Immediate Feedback

Configure PactBroker to trigger provider verification whenever a contract changes.

4. Don't Over-Specify

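Bad (too specific): pinning exact values for every field, so harmless data changes fail verification.

Good (flexible): matching on types and formats, and only for the fields the consumer actually reads.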
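
The difference shows up as soon as the provider's data changes. In the sketch below, `exactly` and `sameTypes` are hypothetical helpers written for this comparison, not Pact.js functions.

```javascript
// Hypothetical helpers for illustration; not the Pact.js API.
const exactly = (expected) => (actual) =>
  JSON.stringify(actual) === JSON.stringify(expected);
const sameTypes = (expected) => (actual) =>
  Object.keys(expected).every((key) => typeof actual[key] === typeof expected[key]);

// Bad: pins every value, including fields the consumer never reads.
const overSpecified = exactly({ id: 42, name: "Ada", plan: "free", loginCount: 17 });

// Good: only the fields the consumer uses, matched by type.
const flexible = sameTypes({ id: 42, name: "Ada" });

// The provider returns a different user and has added a field since.
const response = { id: 7, name: "Grace", plan: "pro", loginCount: 3, locale: "en" };
console.log(overSpecified(response)); // false, although nothing the consumer needs changed
console.log(flexible(response));      // true
```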

Tools & Resources

Our Stack

Pact.js (v12+): Contract testing framework
PactBroker: Hosted on AWS (self-managed)
GitLab CI/CD: Automation pipeline
Slack: Deployment notifications via webhooks

Useful Commands
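
As a sketch, the two Pact Broker CLI invocations a pipeline like this typically relies on can be assembled as follows. The broker URL, pact directory, pacticipant name, and version are placeholders, and the helper functions are hypothetical.

```javascript
// Builds two Pact Broker CLI invocations as strings. The broker URL,
// pact directory, and pacticipant names are placeholders.
function publishCommand({ pactDir, version, branch, brokerUrl }) {
  return [
    "pact-broker publish", pactDir,
    "--consumer-app-version", version,
    "--branch", branch,
    "--broker-base-url", brokerUrl,
  ].join(" ");
}

function canIDeployCommand({ pacticipant, version, environment, brokerUrl }) {
  return [
    "pact-broker can-i-deploy",
    "--pacticipant", pacticipant,
    "--version", version,
    "--to-environment", environment,
    "--broker-base-url", brokerUrl,
  ].join(" ");
}

const brokerUrl = "https://pact-broker.example.com";
console.log(publishCommand({ pactDir: "./pacts", version: "2.3.0+9fceb02", branch: "main", brokerUrl }));
console.log(canIDeployCommand({ pacticipant: "web-frontend", version: "2.3.0+9fceb02", environment: "production", brokerUrl }));
```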

Conclusion

Contract testing with Pact.js transformed our microservices architecture from fragile to resilient. Zero production incidents in 12+ months speaks for itself.

The key insight: Test the contract, not the implementation. This shift-left approach catches incompatibilities before they reach production, giving teams confidence to deploy independently without fear.

If you're struggling with microservices integration testing, contract testing isn't just a nice-to-have; it's essential.

Next Steps

Want to implement contract testing in your organization? Start here:

Pick one API endpoint with frequent breaking changes
Write a consumer contract for that endpoint
Set up PactBroker (use pactflow.io for a hosted option)
Add a can-i-deploy check to your CI/CD pipeline
Measure impact (incidents before/after)

Once you see the value, scaling to 100+ contracts is just more of the same.

Questions? Reach out on LinkedIn or check out the Pact.js documentation.