What Are DORA Metrics?
Definition
DORA metrics are four key indicators developed by the DevOps Research and Assessment (DORA) team, now part of Google Cloud. They measure software delivery performance and operational reliability based on data from thousands of engineering organizations worldwide.
Your team ships code every day. Deployments go out smoothly most of the time, but when they don’t, recovery takes hours. You’re pretty sure your team is “good,” but you can’t prove it. You don’t have a consistent way to measure how well your engineering organization delivers software and responds to failures. DORA metrics give you that framework.
Since 2014, the DORA team (led by Dr. Nicole Forsgren, Jez Humble, and Gene Kim) has published annual State of DevOps reports establishing these metrics as the industry standard for measuring engineering effectiveness.
The Four DORA Metrics
1. Deployment Frequency
How often your organization deploys code to production. This measures the speed of your delivery pipeline.
- Elite: On demand (multiple deploys per day)
- High: Between once per day and once per week
- Medium: Between once per week and once per month
- Low: Less than once per month
Higher deployment frequency generally correlates with smaller batch sizes, which reduce risk. A team deploying 10 times per day is shipping small, incremental changes. A team deploying once per month is shipping large, risky releases.
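As a rough illustration, the tiers above can be written as a small classifier. This is a sketch rather than an official DORA calculation: the function name `deployment_frequency_tier` is made up for the example, a week is approximated as 7 days and a month as 30 days, and "multiple deploys per day" is read as more than one deploy per day on average.

```python
def deployment_frequency_tier(deploy_count: int, window_days: float) -> str:
    """Map an average production deployment rate onto the tiers listed above."""
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    per_day = deploy_count / window_days
    if per_day > 1:           # more than one deploy per day on average
        return "Elite"
    if per_day >= 1 / 7:      # between once per day and once per week
        return "High"
    if per_day >= 1 / 30:     # between once per week and once per month
        return "Medium"
    return "Low"              # less than once per month


# Example: 300 production deploys over a 30-day window
tier = deployment_frequency_tier(300, 30)
```

An average rate hides bursts, so a team that deploys 30 times in one day and then nothing for a month would still look "High" here. Counting per week, as described later under measurement, gives a fairer picture.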
2. Lead Time for Changes
The time between a code commit and that code running in production. This measures the efficiency of your delivery pipeline.
- Elite: Less than one day
- High: Between one day and one week
- Medium: Between one week and one month
- Low: More than one month
Long lead times often indicate bottlenecks in code review, testing, approval processes, or deployment infrastructure. Reducing lead time is as much an organizational challenge as a technical one.
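One way to see where lead time goes is to split it into stages and look for the largest one. The sketch below assumes each change carries four timestamps; the stage names (`committed`, `review_approved`, `tests_passed`, `deployed`) are hypothetical and would need to be mapped to whatever your own tooling records.

```python
from datetime import datetime

# Hypothetical stage names, in pipeline order.
STAGES = ["committed", "review_approved", "tests_passed", "deployed"]


def stage_durations(timestamps: dict) -> dict:
    """Return hours spent between each pair of consecutive stages."""
    durations = {}
    for start, end in zip(STAGES, STAGES[1:]):
        delta = timestamps[end] - timestamps[start]
        durations[f"{start} -> {end}"] = delta.total_seconds() / 3600
    return durations


def slowest_stage(timestamps: dict) -> str:
    """Name the stage transition that took the longest."""
    durations = stage_durations(timestamps)
    return max(durations, key=durations.get)


# Example change that waited two days for review
change = {
    "committed": datetime(2024, 1, 1, 9, 0),
    "review_approved": datetime(2024, 1, 3, 9, 0),
    "tests_passed": datetime(2024, 1, 3, 10, 0),
    "deployed": datetime(2024, 1, 3, 12, 0),
}
bottleneck = slowest_stage(change)
```

In this example the wait for review dominates, which points at an organizational fix (review capacity or process) rather than a faster pipeline.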
3. Change Failure Rate
The percentage of deployments that cause a failure in production (requiring a rollback, hotfix, or patch). This measures the quality of your delivery process.
- Elite: 0-15%
- High: 16-30%
- Medium: 16-30%
- Low: 16-30% (note: the High, Medium, and Low tiers share the same band; the original DORA research found that low performers have similar change failure rates but much worse recovery times)
A low change failure rate indicates effective testing, code review, and deployment practices. It’s worth noting that the DORA research found change failure rate alone doesn’t differentiate performance tiers as strongly as the other three metrics. What differentiates elite teams is their ability to recover quickly when failures do occur.
4. Mean Time to Restore Service (MTTR)
How long it takes to recover from a failure in production. This is the reliability metric, and it’s the one most relevant to operational teams. It directly maps to mean time to resolution.
- Elite: Less than one hour
- High: Less than one day
- Medium: Between one day and one week
- Low: More than one week
This is the metric where the gap between elite and low performers is most dramatic. Going by the tier thresholds above, elite teams recover at least 168x faster than low performers (under 1 hour vs. more than 1 week, and a week is 168 hours). The DORA research consistently shows that MTTR is the strongest predictor of overall delivery performance.
Why DORA Metrics Matter
They’re research-backed, not opinion-based
Unlike most engineering metrics frameworks, DORA metrics are derived from statistical analysis of survey data from tens of thousands of professionals across thousands of organizations. The methodology has been validated over a decade of research and published in the book Accelerate (Forsgren, Humble, Kim, 2018).
They measure outcomes, not activities
DORA doesn’t measure lines of code written, story points completed, or hours worked. It measures things that actually matter to the business: how fast you can deliver value (deployment frequency, lead time) and how reliably you can operate (change failure rate, MTTR).
They correlate with organizational performance
The DORA research has consistently demonstrated that teams with better software delivery performance also report better organizational outcomes: profitability, market share, and employee satisfaction. This gives engineering leaders concrete data to justify investments in developer experience, CI/CD infrastructure, and operational tooling.
They balance speed and stability
A common misconception is that moving faster means more failures. The DORA data shows the opposite: elite teams deploy more frequently AND have better reliability. Speed and stability are not tradeoffs. They’re mutually reinforcing outcomes of good engineering practices.
How to Measure DORA Metrics
Deployment Frequency
Pull deployment records from your CI/CD pipeline (GitHub Actions, GitLab CI, Jenkins, ArgoCD). Count production deployments per day/week. Exclude staging and development deployments.
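A minimal sketch of that count, assuming deployment records have already been exported from the pipeline as dictionaries. The field names `environment` and `deployed_at` are assumptions for the example, not the schema of any particular CI/CD tool.

```python
from collections import Counter
from datetime import datetime


def weekly_production_deploys(deployments) -> dict:
    """Count production deployments per ISO (year, week), ignoring other environments."""
    counts = Counter()
    for record in deployments:
        if record["environment"] != "production":
            continue  # staging and development deploys are excluded
        iso = record["deployed_at"].isocalendar()
        counts[(iso[0], iso[1])] += 1
    return dict(counts)


records = [
    {"environment": "production", "deployed_at": datetime(2024, 1, 1, 10, 0)},
    {"environment": "staging", "deployed_at": datetime(2024, 1, 2, 11, 0)},
    {"environment": "production", "deployed_at": datetime(2024, 1, 3, 15, 0)},
    {"environment": "production", "deployed_at": datetime(2024, 1, 8, 9, 0)},
]
per_week = weekly_production_deploys(records)
```

Grouping by ISO week keeps the buckets aligned to Monday-to-Sunday weeks, which avoids partial weeks at month boundaries.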
Lead Time for Changes
Measure the time between a commit merging to main and that commit reaching production. Your CI/CD platform usually has this data. For teams using feature flags, the relevant timestamp is when the flag is enabled for production traffic, not just when the code deploys.
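The calculation described above might look like the following. It is a sketch under stated assumptions: each change is a tuple of `(merged_at, deployed_at, flag_enabled_at)`, the last element is `None` when no feature flag is involved, and the median is used as the summary because a few very slow changes can distort a mean.

```python
from datetime import datetime, timedelta
from statistics import median


def lead_time(merged_at, deployed_at, flag_enabled_at=None) -> timedelta:
    """Lead time for one change; use the flag enablement time when there is one."""
    released_at = flag_enabled_at if flag_enabled_at is not None else deployed_at
    return released_at - merged_at


def median_lead_time_hours(changes) -> float:
    """Median lead time in hours across (merged_at, deployed_at, flag_enabled_at) tuples."""
    hours = [lead_time(*change).total_seconds() / 3600 for change in changes]
    return median(hours)


changes = [
    (datetime(2024, 3, 1, 10), datetime(2024, 3, 1, 12), None),
    (datetime(2024, 3, 1, 10), datetime(2024, 3, 1, 14), None),
    # Deployed after one hour, but only enabled for production traffic the next day
    (datetime(2024, 3, 1, 10), datetime(2024, 3, 1, 11), datetime(2024, 3, 2, 16)),
]
typical_hours = median_lead_time_hours(changes)
```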
Change Failure Rate
Track deployments that result in a rollback, hotfix, or incident. Divide by total deployments. This requires consistent incident tracking and the discipline to tag incidents as deployment-related when applicable.
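The division itself is simple; the care goes into counting each failed deployment once. The sketch below assumes failure events (rollbacks, hotfixes, incidents) carry a `deployment_id` field when they have been tagged as deployment-related. That field name and the event shape are hypothetical.

```python
def change_failure_rate(deployment_ids, failure_events) -> float:
    """Percentage of deployments linked to at least one failure event."""
    deployed = set(deployment_ids)
    if not deployed:
        raise ValueError("no deployments in the period")
    # A deployment with both a rollback and an incident still counts once.
    failed = {
        event["deployment_id"]
        for event in failure_events
        if event.get("deployment_id") in deployed
    }
    return 100 * len(failed) / len(deployed)


deployment_ids = [f"d{n}" for n in range(1, 21)]  # 20 deployments
failure_events = [
    {"type": "rollback", "deployment_id": "d3"},
    {"type": "incident", "deployment_id": "d3"},
    {"type": "hotfix", "deployment_id": "d7"},
    {"type": "incident", "deployment_id": None},  # not tagged as deployment-related
]
rate = change_failure_rate(deployment_ids, failure_events)
```

The untagged incident is ignored here, which is exactly why tagging discipline matters: every missed tag makes the rate look better than it is.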
Mean Time to Restore
Pull from your incident management system. Measure the time from when a production failure is detected to when service is restored. The key challenge here is consistent timestamp capture. Make sure your incident tooling records both detection time and recovery time.
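A minimal sketch of the calculation, assuming incidents are exported as dictionaries with `detected_at` and `restored_at` timestamps (hypothetical field names). Incidents missing either timestamp are skipped and counted separately, so gaps in timestamp capture stay visible instead of silently skewing the result.

```python
from datetime import datetime


def mean_time_to_restore_hours(incidents):
    """Return (mean hours from detection to restoration, number of incidents skipped)."""
    durations = []
    skipped = 0
    for incident in incidents:
        detected = incident.get("detected_at")
        restored = incident.get("restored_at")
        if detected is None or restored is None:
            skipped += 1  # still open, or a timestamp was never recorded
            continue
        durations.append((restored - detected).total_seconds() / 3600)
    if not durations:
        raise ValueError("no incidents with both timestamps")
    return sum(durations) / len(durations), skipped


incidents = [
    {"detected_at": datetime(2024, 5, 1, 9, 0), "restored_at": datetime(2024, 5, 1, 9, 30)},
    {"detected_at": datetime(2024, 5, 2, 14, 0), "restored_at": datetime(2024, 5, 2, 15, 30)},
    {"detected_at": datetime(2024, 5, 3, 8, 0), "restored_at": None},
]
mean_hours, skipped = mean_time_to_restore_hours(incidents)
```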
Several platforms automate DORA metric collection, including Sleuth, LinearB, Haystack, and the DORA team’s own Four Keys open-source project.
Common Pitfalls with DORA Metrics
Optimizing one metric at the expense of others. Increasing deployment frequency by shipping untested code will improve frequency but tank your change failure rate. DORA metrics are designed to be tracked together. Improving one at the expense of another misses the point.
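One lightweight guard against this is to compare all four metrics between two periods and flag any that moved the wrong way. The sketch below is illustrative only: the metric keys and the idea of a simple period-over-period comparison are assumptions for the example, not part of the DORA methodology.

```python
# +1 means higher is better, -1 means lower is better.
DIRECTION = {
    "deploys_per_week": +1,
    "lead_time_hours": -1,
    "change_failure_rate_pct": -1,
    "restore_hours": -1,
}


def compare_periods(previous: dict, current: dict):
    """Return (improved, regressed) metric names between two periods."""
    improved, regressed = [], []
    for name, sign in DIRECTION.items():
        change = (current[name] - previous[name]) * sign
        if change > 0:
            improved.append(name)
        elif change < 0:
            regressed.append(name)
    return improved, regressed


last_quarter = {"deploys_per_week": 10, "lead_time_hours": 24,
                "change_failure_rate_pct": 10, "restore_hours": 2}
this_quarter = {"deploys_per_week": 25, "lead_time_hours": 24,
                "change_failure_rate_pct": 22, "restore_hours": 2}
improved, regressed = compare_periods(last_quarter, this_quarter)
```

Here deployment frequency improved while change failure rate regressed, which is the pattern worth a conversation before anyone celebrates the higher deploy count.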
Gaming the numbers. Typical tactics include breaking one deploy into five “micro-deploys” to inflate frequency, classifying incidents as “expected behavior” to keep change failure rate low, and closing incidents before root cause is found to improve MTTR. If people’s performance reviews depend on DORA numbers, they will find ways to game them.
Comparing across different contexts. A startup shipping a single web app has a very different deployment profile than a bank running 500 microservices with regulatory compliance requirements. DORA benchmarks are useful for tracking your own trends over time, not for declaring your team “elite” based on a single snapshot.
Ignoring the qualitative signals. DORA metrics are quantitative, but the most important improvements are often organizational: blameless postmortem culture, investment in testing infrastructure, reducing approval bottlenecks. The numbers tell you where to look. They don’t tell you what to fix.
How AI Tools Improve DORA Metrics
AI-driven operational tools directly impact the MTTR metric and indirectly improve the others.
MTTR reduction. This is the most direct impact. AI agents that automate root cause analysis compress the diagnosis phase of incident resolution from hours to minutes. NeuBird AI reports 94% accuracy in automated root cause identification, which directly translates to faster restoration of service.
Change failure rate reduction. AI-driven analysis of deployment patterns can identify which types of changes are most likely to cause failures, helping teams focus testing and review effort where it matters most. Proactive detection of risks before they become incidents also helps.
Deployment frequency enablement. When teams trust that failures will be detected and resolved quickly (low MTTR), they’re more willing to deploy frequently. Confidence in your operational safety net removes the fear that makes teams batch up changes into risky large releases.
Key Takeaways
- DORA metrics are four research-backed indicators of software delivery performance: deployment frequency, lead time for changes, change failure rate, and mean time to restore service.
- Elite teams deploy on demand, ship changes in less than a day, maintain change failure rates of 15% or less, and restore service in under an hour.
- The metrics are designed to be tracked together. Improving one at the expense of another defeats the purpose.
- Speed and stability are not tradeoffs. The DORA research shows they’re mutually reinforcing outcomes of good practices.
- AI tools directly improve MTTR (the reliability metric) and indirectly enable improvements across all four metrics.
Related Reading
- What is MTTR (Mean Time to Resolution)? – The DORA reliability metric explained in depth.
- What is Incident Management? – The process that MTTR measures.
- What is Root Cause Analysis (RCA)? – Faster RCA directly reduces MTTR and improves DORA performance.
- DORA Research Program – The official source for DORA metrics research and annual State of DevOps reports.
- Tackling Observability Scale with Context Engineering – How modern approaches compress the MTTR metric.
- Rewriting Incident Response
- Transforming CI/CD Pipeline Log Analysis with AI
- AI SRE Evaluation
- 2026 State of AI SRE Terminology – full glossary
Frequently Asked Questions
What are the four DORA metrics?
The four DORA metrics are Deployment Frequency (how often you deploy), Lead Time for Changes (commit to production time), Change Failure Rate (percentage of deployments causing failures), and Mean Time to Restore Service (how long recovery takes).
What does DORA stand for?
DORA stands for DevOps Research and Assessment. It started as an independent research program led by Dr. Nicole Forsgren, Jez Humble, and Gene Kim, and was acquired by Google Cloud in 2018. The team publishes the annual State of DevOps Report.
What's the difference between elite and low DORA performers?
Elite teams deploy on demand (multiple times per day), have lead times under one day, change failure rates of 0-15%, and restore service in under one hour. Low performers deploy less than monthly, have lead times over a month, and take more than a week to restore service.
How do I start measuring DORA metrics?
Pull deployment data from your CI/CD platform (GitHub Actions, GitLab, Jenkins). Pull incident data from your incident management tool. Open-source projects like the DORA team’s Four Keys can automate collection. Several commercial platforms also offer DORA dashboards.
Are DORA metrics still relevant in 2026?
Yes. DORA metrics remain the industry standard for measuring software delivery and operational performance. The 2023 DORA report introduced more nuance around organizational context, but the four core metrics continue to correlate strongly with engineering effectiveness.
Can DORA metrics be gamed?
Yes, like any metric. Teams might split deployments to inflate frequency, reclassify incidents to lower change failure rate, or close tickets early to improve MTTR. Tracking the four metrics together makes gaming harder, since improving one through gaming usually degrades another.
How do DORA metrics relate to MTTR?
“Mean Time to Restore Service” is one of the four DORA metrics and the one most directly relevant to operations teams. It measures how quickly you recover from production failures and is the strongest predictor of overall delivery performance in DORA’s research.
How do I track DORA metrics in GitHub?
Pull deployment data from GitHub Actions workflows. Use the GitHub API to capture pull request merge times for lead time. Combine this with your incident data for MTTR and change failure rate. Several tools (LinearB, Sleuth, Faros AI, the open-source Four Keys project) automate DORA metric collection from GitHub.
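As a sketch of the pull request step: pull request objects returned by the GitHub REST API include a `merged_at` timestamp, which is null for pull requests closed without merging. The example below parses an already-fetched list response and makes no network calls; the sample payload is invented for illustration and trimmed to the fields used.

```python
import json
from datetime import datetime


def merged_times(pulls_json: str) -> dict:
    """Map pull request number to merge time, skipping unmerged pull requests."""
    merged = {}
    for pr in json.loads(pulls_json):
        if pr.get("merged_at"):  # null when the PR was closed without merging
            # Timestamps look like 2024-01-02T10:30:00Z; normalize the trailing Z
            timestamp = pr["merged_at"].replace("Z", "+00:00")
            merged[pr["number"]] = datetime.fromisoformat(timestamp)
    return merged


# Invented sample payload, reduced to the fields this sketch reads
sample = """[
  {"number": 101, "merged_at": "2024-01-02T10:30:00Z"},
  {"number": 102, "merged_at": null}
]"""
merge_times = merged_times(sample)
```

These merge times become the starting timestamps for lead time; the end timestamps come from your production deployment records.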
Are DORA metrics part of Agile?
DORA metrics are not part of Agile or Scrum frameworks specifically, but they’re commonly used by Agile teams. They focus on engineering performance outcomes rather than process metrics like velocity or story points. Agile teams often adopt DORA as a complement to traditional Agile metrics.
What is the DORA report?
The DORA State of DevOps Report is an annual research publication based on survey data from thousands of engineering professionals. It analyzes the relationship between software delivery performance (measured by the four DORA metrics) and organizational outcomes. The report has been published annually since 2014 and is widely cited in the DevOps community.
Are DORA metrics a vanity metric?
No. Unlike vanity metrics that look good but don’t drive decisions, DORA metrics are designed to inform engineering investment. They correlate with business outcomes (profitability, market share, employee satisfaction) and provide actionable signals about where to improve. Used together, they’re hard to game without actually improving performance.