DORA Metrics Explained

DORA metrics provide a research-backed framework for measuring software delivery performance. The four key metrics—deployment frequency, lead time for changes, change failure rate, and time to restore service—reveal whether teams can ship reliably and recover quickly when things go wrong.

Your deployment pipeline is fast. Your test coverage is high. Yet production incidents keep happening after releases, and recovery takes hours instead of minutes. Something is wrong, but where?

DORA metrics provide the framework for answering that question. Developed through years of research into software delivery performance, these four measurements reveal whether your engineering organization can ship reliably and recover quickly when things go wrong.

What Are DORA Metrics?

DORA (DevOps Research and Assessment) metrics emerged from research led by Dr. Nicole Forsgren, Jez Humble, and Gene Kim. Over six years of studying thousands of organizations, they identified four key metrics that predict software delivery performance and organizational outcomes.

The four metrics split into two categories:

Throughput metrics measure how quickly you deliver value:

  • Deployment Frequency
  • Lead Time for Changes

Stability metrics measure how reliably you deliver:

  • Change Failure Rate
  • Time to Restore Service (MTTR)

The research revealed something counterintuitive: high-performing teams excel at both throughput and stability simultaneously. They ship faster and break things less often. Speed and reliability are not tradeoffs—they reinforce each other.

Deployment Frequency

What it measures: How often your organization deploys code to production.

Deployment frequency indicates organizational velocity. Teams that deploy frequently have streamlined their processes, reduced batch sizes, and built confidence in their release mechanisms. Teams that deploy rarely often struggle with large, risky releases that accumulate changes over weeks or months.

How to measure it:

Count production deployments over a time period. A deployment means code reaching production—not staging, not a feature branch, but actual production systems serving users.

Deployment Frequency = Deployments per time period (day/week/month)
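
As a rough sketch, assuming you can export production deployment timestamps from your CI/CD tool, the calculation is just a count per period (Python, with illustrative timestamps):

# Minimal sketch: deployment frequency from a list of production
# deployment timestamps exported from a CI/CD system (example data).
from collections import Counter
from datetime import datetime

deployments = [
    datetime(2024, 5, 6, 14, 30),
    datetime(2024, 5, 6, 17, 5),
    datetime(2024, 5, 8, 9, 12),
]

# Group by ISO week to get deployments per week.
per_week = Counter(ts.strftime("%G-W%V") for ts in deployments)

for week, count in sorted(per_week.items()):
    print(f"{week}: {count} deployments")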

Performance benchmarks:

  • Elite: On demand (multiple deploys per day)
  • High: Between once per day and once per week
  • Medium: Between once per week and once per month
  • Low: Between once per month and once every six months

Why it matters for incident response:

Higher deployment frequency correlates with faster incident recovery. Teams that deploy multiple times daily can ship fixes within minutes of identifying problems. Teams that deploy monthly face pressure to batch fixes with feature work, delaying resolution.

Frequent deployment also means smaller changes per deployment. When something breaks, the blast radius is limited and the cause is easier to identify. A deployment containing one change is simpler to debug than one containing fifty.

Lead Time for Changes

What it measures: The time from code commit to code running in production.

Lead time reveals process efficiency. Short lead times mean streamlined pipelines, effective testing, and minimal manual gates. Long lead times indicate bottlenecks—manual approvals, lengthy test suites, deployment windows, or excessive process overhead.

How to measure it:

Track the elapsed time from the first commit to production deployment for each change, then average (or take the median) across changes to get the metric.

Lead Time = Time from commit to production deployment
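
A minimal sketch of the same calculation, assuming each change record carries its first-commit and production-deploy timestamps (example data below):

# Minimal sketch: lead time for changes, assuming each change record
# carries the timestamps of its first commit and its production deploy.
from datetime import datetime, timedelta
from statistics import median

changes = [
    (datetime(2024, 5, 6, 10, 0), datetime(2024, 5, 6, 11, 30)),  # (commit, deploy)
    (datetime(2024, 5, 7, 9, 15), datetime(2024, 5, 8, 16, 45)),
]

lead_times = [deploy - commit for commit, deploy in changes]
average = sum(lead_times, timedelta()) / len(lead_times)

print(f"average lead time: {average}")
print(f"median lead time:  {median(lead_times)}")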

Performance benchmarks:

  • Elite: Less than one hour
  • High: Between one day and one week
  • Medium: Between one week and one month
  • Low: Between one month and six months

Why it matters for incident response:

Lead time directly affects how quickly fixes reach production. When your payment API breaks at 2 AM, lead time determines whether the fix ships in 15 minutes or waits for the next deployment window.

Teams with short lead times can afford to ship small, targeted fixes. Teams with long lead times face pressure to bundle fixes together, increasing risk and delaying resolution for some issues.

Change Failure Rate

What it measures: The percentage of deployments that cause failures in production.

Change failure rate reveals deployment quality. It measures how often your release process puts broken code in front of users, requiring rollback, hotfix, or immediate intervention.

How to measure it:

Divide the number of deployments that caused production failures by the total number of deployments over the same period.

Change Failure Rate = (Failed deployments / Total deployments) × 100

What counts as a failure? Incidents requiring rollback, hotfixes deployed to fix deployment-caused issues, and deployments that degraded service requiring immediate intervention. Not all production incidents count—only those directly caused by deployments.
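
A minimal sketch of the calculation, assuming each deployment has already been classified as failure-causing or not:

# Minimal sketch: change failure rate, assuming each deployment record
# has already been classified (a rollback, hotfix, or degraded service
# traced back to that deployment counts as a failure).
deployments = [
    {"id": "d-101", "caused_failure": False},
    {"id": "d-102", "caused_failure": True},   # required a rollback
    {"id": "d-103", "caused_failure": False},
]

failed = sum(1 for d in deployments if d["caused_failure"])
change_failure_rate = failed / len(deployments) * 100

print(f"change failure rate: {change_failure_rate:.1f}%")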

Performance benchmarks:

  • Elite: 0-15%
  • High: 16-30%
  • Medium: 31-45%
  • Low: 46-60%

Why it matters for incident response:

Change failure rate connects deployment practices to incident volume. High failure rates mean more incidents, more on-call burden, and more time spent on reactive firefighting instead of proactive improvement.

Reducing change failure rate requires investment in testing, deployment strategies, and quality practices. But the payoff compounds: fewer deployment-caused incidents means more time for building features and improving reliability systematically.

When incidents do occur, tracking whether they resulted from deployments helps teams identify improvement opportunities. If 80% of your incidents trace back to recent deployments, that signals where to focus quality investments.

Time to Restore Service

What it measures: How long it takes to recover from a production failure.

Time to restore service—often called MTTR (Mean Time to Restore or Mean Time to Recovery)—measures incident response effectiveness. It captures the complete time from incident start to service restoration.

How to measure it:

Track the duration from incident detection to service restoration for each incident, then average across incidents to get the metric.

MTTR = (Sum of recovery times) / (Number of incidents)

Recovery time starts when the incident is detected (monitoring alerts fire) and ends when service is restored to normal operation. The exact definition of “restored” varies by organization—some measure when the fix deploys, others when metrics return to baseline.
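
A minimal sketch of the calculation, assuming each incident record carries detection and restoration timestamps (the field names are illustrative):

# Minimal sketch: MTTR from incident records with illustrative
# detected_at / restored_at fields.
from datetime import datetime, timedelta

incidents = [
    {"detected_at": datetime(2024, 5, 6, 2, 10), "restored_at": datetime(2024, 5, 6, 2, 55)},
    {"detected_at": datetime(2024, 5, 9, 14, 0), "restored_at": datetime(2024, 5, 9, 16, 20)},
]

recovery_times = [i["restored_at"] - i["detected_at"] for i in incidents]
mttr = sum(recovery_times, timedelta()) / len(recovery_times)

print(f"MTTR: {mttr}")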

Performance benchmarks:

  • Elite: Less than one hour
  • High: Less than one day
  • Medium: Between one day and one week
  • Low: Between one week and one month

Why it matters for incident response:

MTTR is the stability metric that incident response teams most directly control. While change failure rate depends heavily on deployment practices and testing, MTTR depends on detection speed, coordination efficiency, troubleshooting capability, and resolution execution.

Fast MTTR requires investment across the incident response lifecycle:

  • Comprehensive monitoring for rapid detection
  • Effective alerting for quick acknowledgment
  • Clear escalation paths for right-person involvement
  • Documented runbooks for systematic troubleshooting
  • Automated rollback capabilities for fast mitigation

Teams tracking MTTR can identify which phase creates bottlenecks. If detection is fast but resolution is slow, invest in runbooks and troubleshooting automation. If detection is slow, improve monitoring coverage.

For strategies on improving each phase, see Reducing Mean Time to Resolution.

How the Metrics Connect

DORA metrics reveal their full value when examined together. Individual metrics in isolation can mislead.

Velocity without stability creates chaos. Teams that deploy frequently with high change failure rates generate constant incidents. Fast deployment means nothing if every release breaks production.

Stability without velocity creates stagnation. Teams with zero change failure rate but monthly deployments are not shipping value. Perfect reliability achieved through fear of deployment is not a success story.

Elite performance requires both. The research consistently shows that high performers excel at all four metrics simultaneously. They have figured out how to ship quickly while maintaining reliability. The practices that enable fast deployment—small batches, comprehensive testing, automated pipelines—also reduce failure rates.

MTTR compensates for an imperfect change failure rate. Even elite teams experience failures. The difference is recovery speed. When something breaks, elite teams restore service within an hour. Low performers take weeks. Given that failures will occur, MTTR measures organizational resilience.

Consider this scenario: Team A deploys weekly with 10% change failure rate and 4-hour MTTR. Team B deploys daily with 15% change failure rate and 30-minute MTTR. Despite higher failure rate, Team B likely delivers better user experience because failures resolve so quickly.

Measuring DORA Metrics in Practice

Accurate measurement requires tooling that spans the software delivery lifecycle.

Deployment frequency and lead time come from your CI/CD pipeline. Tools like GitHub Actions, GitLab CI, Jenkins, or deployment platforms track when commits occur and when deployments complete. Most modern platforms provide these metrics directly or through integrations.

Change failure rate requires connecting deployment data to incident data. When an incident occurs, can you trace it back to a specific deployment? This connection often requires manual classification—someone must determine whether an incident was deployment-caused or from other factors like traffic spikes, dependency failures, or infrastructure issues.

Time to restore service comes from incident management tooling. Platforms like Upstat automatically track incident duration from creation through resolution, providing accurate MTTR data without manual logging. Built-in analytics show MTTR trends over time, breakdowns by severity, and patterns that reveal improvement opportunities.

The challenge is connecting these data sources. Deployment frequency lives in your CI/CD system. Incidents live in your incident management platform. Connecting a specific incident to a specific deployment requires either manual tagging or automated correlation through deployment markers.
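
One way to sketch that automated correlation, assuming deployment markers that record a service name and timestamp, is to attribute an incident to the most recent deployment of the same service within a lookback window. The field names and the 60-minute window below are illustrative assumptions, not a standard:

# Minimal sketch: attribute an incident to the most recent deployment
# of the same service within a lookback window.
from datetime import datetime, timedelta

LOOKBACK = timedelta(minutes=60)

def correlate(incident, deployments):
    """Return the deployment most likely to have caused the incident, or None."""
    candidates = [
        d for d in deployments
        if d["service"] == incident["service"]
        and timedelta(0) <= incident["started_at"] - d["deployed_at"] <= LOOKBACK
    ]
    return max(candidates, key=lambda d: d["deployed_at"], default=None)

deployments = [
    {"id": "d-201", "service": "payments", "deployed_at": datetime(2024, 5, 6, 2, 0)},
]
incident = {"id": "inc-17", "service": "payments", "started_at": datetime(2024, 5, 6, 2, 10)}

print(correlate(incident, deployments))  # -> the d-201 deployment record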

Improving Your DORA Metrics

Improvement starts with measurement. Establish baselines for each metric, then focus improvement efforts on the biggest bottleneck.

To improve deployment frequency:

Reduce batch sizes. Smaller deployments are easier to review, test, and deploy. Instead of accumulating changes for weekly releases, ship individual features as they complete.

Automate manual steps. Every manual gate in your deployment pipeline adds delay and variability. Automate testing, security scanning, and deployment execution.

Build deployment confidence. Fear of deployment—often from past failures—creates deployment avoidance. Improve testing and rollback capabilities until the team trusts the deployment process.

To improve lead time:

Identify bottlenecks. Where do changes wait? Manual code reviews? Slow test suites? Deployment windows? Each bottleneck offers improvement opportunity.

Parallelize where possible. Tests that run sequentially can often run in parallel. Reviews that block deployment can happen after merge with post-deployment validation.

Eliminate unnecessary gates. Every approval step adds delay. Evaluate whether each gate provides value proportional to the delay it creates.

To improve change failure rate:

Invest in automated testing. Comprehensive test suites catch problems before production. Unit tests, integration tests, and end-to-end tests each catch different failure modes.

Use progressive deployment strategies. Canary releases expose new code to limited traffic before full rollout. Blue-green deployments enable instant rollback. Feature flags decouple deployment from release.
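
As a small illustration of the last point, a feature flag lets new code ship to production dark and be released later by flipping the flag, with no redeployment. This sketch uses a plain in-memory flag store; a real setup would use a flag service or configuration system:

# Minimal sketch: a feature flag decoupling deployment from release.
# The dict stands in for a flag service or config system.
FLAGS = {"new_checkout_flow": False}  # code is deployed, but not yet released

def is_enabled(flag: str) -> bool:
    return FLAGS.get(flag, False)

def legacy_checkout(cart: list) -> str:
    return f"legacy checkout for {len(cart)} items"

def new_checkout(cart: list) -> str:
    return f"new checkout for {len(cart)} items"

def checkout(cart: list) -> str:
    # Release happens when the flag flips to True, not when the code deploys.
    return new_checkout(cart) if is_enabled("new_checkout_flow") else legacy_checkout(cart)

print(checkout(["book", "pen"]))  # -> legacy path until the flag is enabled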

Conduct post-deployment validation. Smoke tests immediately after deployment catch configuration errors and integration failures that pre-deployment testing missed.

For comprehensive deployment practices, see Continuous Deployment Best Practices.

To improve MTTR:

Reduce detection time. Comprehensive monitoring catches problems quickly. Multi-region checks identify geographic issues. Smart alerting thresholds balance sensitivity with noise.

Improve acknowledgment speed. Clear on-call schedules, effective notification routing, and appropriate escalation policies ensure alerts reach responders quickly.

Accelerate investigation. Documented runbooks provide systematic troubleshooting steps. Service catalogs reveal dependencies and ownership. Observability tools provide the data needed for diagnosis.

Enable fast recovery. Automated rollback capabilities restore service quickly. Feature flags disable problematic functionality without redeployment. Pre-planned mitigation steps reduce decision time under pressure.

Common Pitfalls

Gaming metrics instead of improving performance. Deploying empty changes to boost frequency. Classifying incidents as non-deployment-caused to lower failure rate. Closing incidents before service actually recovers to improve MTTR. Metrics lose value when teams optimize for numbers instead of outcomes.

Measuring without acting. Dashboards displaying DORA metrics accomplish nothing by themselves. Metrics drive improvement only when teams review them regularly, identify patterns, and take action based on findings.

Expecting immediate transformation. DORA metrics reflect deep organizational capabilities. Moving from monthly deployments to daily deployments requires substantial investment in testing, automation, and culture. Set realistic improvement timelines.

Comparing across incompatible contexts. A team building a payment system faces different constraints than a team building documentation. Internal tools have different reliability requirements than customer-facing services. Benchmark against yourself and similar organizations, not abstract ideals.

Ignoring the cultural dimension. DORA research consistently shows that organizational culture—psychological safety, information sharing, collaboration—predicts performance alongside technical practices. Metrics alone cannot fix cultural dysfunction.

Using DORA for Incident Response

Incident response teams benefit most from the stability metrics: change failure rate and time to restore service.

Track change failure rate to understand incident sources. When classifying incidents, note whether they resulted from deployments. Over time, patterns emerge: which services have high deployment-caused incident rates? Which deployment types (database migrations, configuration changes, new features) cause the most problems?

Use MTTR to identify response bottlenecks. Break down MTTR into phases: detection time, acknowledgment time, investigation time, resolution time. Which phase dominates? Different bottlenecks require different interventions.
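
A minimal sketch of that breakdown, assuming each incident record carries timeline timestamps (the field names are illustrative):

# Minimal sketch: split one incident's recovery time into phases using
# illustrative timeline fields. Detection time would additionally need
# the true failure start, which monitoring may not capture.
from datetime import datetime

incident = {
    "detected_at":     datetime(2024, 5, 6, 2, 10),
    "acknowledged_at": datetime(2024, 5, 6, 2, 18),
    "diagnosed_at":    datetime(2024, 5, 6, 2, 40),
    "restored_at":     datetime(2024, 5, 6, 2, 55),
}

phases = {
    "acknowledgment": incident["acknowledged_at"] - incident["detected_at"],
    "investigation":  incident["diagnosed_at"] - incident["acknowledged_at"],
    "resolution":     incident["restored_at"] - incident["diagnosed_at"],
}

for phase, duration in phases.items():
    print(f"{phase}: {duration}")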

Connect deployment practices to incident outcomes. Work with development teams to understand how deployment practices affect incident patterns. If canary deployments catch problems before full rollout, fewer incidents require response. If automated rollback works reliably, incidents resolve faster.

Learn from incidents to reduce future failures. Post-incident reviews should examine whether deployment practices contributed to the incident and how they might be improved. This feedback loop connects incident response to deployment quality improvement.

Moving Forward with DORA

DORA metrics provide a research-validated framework for understanding software delivery performance. The four metrics—deployment frequency, lead time, change failure rate, and time to restore service—reveal organizational capability in ways that vanity metrics cannot.

Start by measuring. Establish baselines for each metric. Identify which metric presents the biggest opportunity for improvement. Then invest in practices that address that specific bottleneck.

Remember that elite performance means excelling at all four metrics simultaneously. Speed and stability reinforce each other when built on solid foundations of testing, automation, and operational capability. Organizations that ship frequently and recover quickly have built systems and cultures that enable both.

The goal is not hitting specific numbers. The goal is continuous improvement toward reliable software delivery. DORA metrics show you where you stand and whether your investments are working. Use them to guide improvement, not to create stress over benchmarks.

Teams that measure thoughtfully, act on findings, and maintain focus on improvement build the delivery capabilities that distinguish high-performing engineering organizations.

Explore In Upstat

Track MTTR automatically with incident duration analysis, resolution time trends, and severity breakdowns that help teams measure and improve their Time to Restore Service metric.