What makes an incident major vs minor?

Major incidents have significant business impact affecting many customers or revenue, require immediate response, and demand coordination across multiple teams. Minor incidents have limited scope affecting few users, can wait for business hours, and typically require only one person to resolve. The distinction guides resource allocation and response urgency.

How many severity levels should you have?

Most organizations use 3-5 severity levels. Too few levels (like just major/minor) lack nuance for prioritization. Too many levels create confusion about distinctions between adjacent severities. A common framework uses SEV-1 (critical outage), SEV-2 (significant degradation), SEV-3 (minor issues), and SEV-4 (cosmetic bugs).

Who decides if an incident is major or minor?

The on-call engineer or incident reporter makes the initial classification using predefined criteria. If classification is unclear or the situation escalates, the incident commander can reclassify. Clear criteria in documentation help responders classify consistently without requiring management approval during incidents.

Can incidents change from minor to major?

Yes. Incidents often escalate as impact becomes clear or problems worsen. A minor database slowdown becomes major when it causes complete API failure. Teams should reassess severity as situations evolve and adjust the response accordingly. This is called severity escalation.

Major vs Minor Incident Classification Guide

When an alert fires at 3 AM, your on-call engineer needs to make a critical decision: Is this a major incident requiring immediate escalation, or a minor issue that can wait until morning? Getting this classification wrong can mean wasted resources on false alarms or delayed response to critical problems.

Incident classification isn’t just about labeling problems. It’s about creating a shared language that helps your team prioritize effectively, allocate resources appropriately, and respond with the right level of urgency.

Why Incident Classification Matters

Without clear classification criteria, every team member applies their own judgment. What one engineer considers critical, another might see as routine. This inconsistency leads to alert fatigue, miscommunication, and wasted effort.

A well-defined classification system provides several benefits:

Faster decision-making: Engineers can quickly assess severity without consulting multiple people or second-guessing their judgment.

Appropriate resource allocation: Major incidents get the full team’s attention, while minor issues are handled efficiently without over-escalation.

Clear communication: When you declare a “SEV-1 incident,” everyone understands the implications and their expected role.

Better metrics and learning: Consistent classification enables meaningful analysis of incident patterns and improvement opportunities.

Defining Major Incidents

Major incidents share common characteristics that distinguish them from routine issues:

Business impact: Services that directly affect customers or revenue are degraded or unavailable. This includes complete outages, significant performance degradation, or data integrity issues affecting multiple users.

Scope and scale: The problem affects a substantial portion of your user base or critical business operations. A bug affecting one user is typically minor; the same bug affecting thousands requires major incident response.

Urgency: The issue requires immediate attention and cannot wait for normal business hours. Major incidents often trigger paging and after-hours or weekend response.

Resolution complexity: Major incidents typically require coordination across multiple teams, external communication, and executive visibility.

Common examples include complete application outages, payment processing failures, data loss events, security breaches, and critical third-party service failures affecting your core functionality.

Defining Minor Incidents

Minor incidents are real problems that need fixing, but they don’t meet the threshold for major incident response:

Limited impact: Issues affect a small subset of users, non-critical features, or internal tools. The business continues operating normally.

Degraded but functional: The service works but with reduced performance or missing non-essential features. Users experience inconvenience rather than complete inability to work.

Deferrable: The problem can be addressed during business hours without significant business risk.

Single team resolution: One team can handle the investigation and fix without extensive coordination.

Examples include minor UI bugs, performance issues affecting a small user segment, non-critical integration failures, and isolated edge case problems.

Building Your Severity Level Framework

Most organizations use a three- to five-level severity system. Here’s a practical five-level framework:

Severity 1 (Critical): Complete outage or critical functionality unavailable. Major customer impact. Revenue at risk. Immediate response required regardless of time.

Severity 2 (High): Significant degradation affecting many users. Core features impacted but workarounds exist. Response required within hours.

Severity 3 (Medium): Noticeable issues affecting some users or non-critical features. Response within one business day.

Severity 4 (Low): Minor issues with minimal user impact. Resolution within a week.

Severity 5 (Trivial): Cosmetic issues, minor bugs, or enhancement requests. Addressed as resources allow.

The key is defining clear, measurable criteria for each level. Avoid subjective terms like “important” or “serious” without concrete definitions.
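One way to keep these levels unambiguous is to write them down as data rather than prose. The following is a minimal sketch, not a prescribed implementation: the names `Severity`, `TARGET`, `target_for`, and `more_severe` are illustrative assumptions, and the target strings are copied from the level descriptions above.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Five-level scale; a lower number means a more severe incident."""
    SEV1 = 1  # Critical
    SEV2 = 2  # High
    SEV3 = 3  # Medium
    SEV4 = 4  # Low
    SEV5 = 5  # Trivial


# Targets taken from the level descriptions in this article.
TARGET = {
    Severity.SEV1: "immediate response, regardless of time",
    Severity.SEV2: "response within hours",
    Severity.SEV3: "response within one business day",
    Severity.SEV4: "resolution within a week",
    Severity.SEV5: "addressed as resources allow",
}


def target_for(severity: Severity) -> str:
    """Look up the expected response or resolution target for a level."""
    return TARGET[severity]


def more_severe(a: Severity, b: Severity) -> Severity:
    """Return whichever level is more severe (the lower number)."""
    return min(a, b)
```

Using an integer-backed enum means levels can be compared directly, so "SEV-2 or worse" becomes `severity <= Severity.SEV2`.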

Key Classification Factors

When assessing an incident, consider these dimensions:

Impact scope: How many users or systems are affected? Is it organization-wide, team-specific, or isolated to individuals?

Business criticality: Does this affect revenue, customer trust, regulatory compliance, or core business operations?

Urgency: How quickly must this be resolved? What happens if we wait until tomorrow?

Workarounds: Can users accomplish their goals through alternative methods, or are they completely blocked?

Trend and trajectory: Is the problem stable, improving, or getting worse? A small issue that’s spreading rapidly may warrant higher classification.
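The trend factor is the easiest of these to forget, so it can help to make it an explicit step. A minimal sketch, assuming severity is an integer from 1 (most severe) to 5 and that a worsening trend raises the classification by exactly one level. Both the function name `adjust_for_trend` and the one-level rule are assumptions for illustration; your own criteria may call for a bigger jump.

```python
VALID_TRENDS = ("stable", "improving", "worsening")


def adjust_for_trend(severity: int, trend: str) -> int:
    """Raise severity by one level when the problem is getting worse.

    severity: 1 (most severe) through 5 (least severe).
    trend: one of "stable", "improving", or "worsening".
    """
    if not 1 <= severity <= 5:
        raise ValueError(f"severity must be 1-5, got {severity}")
    if trend not in VALID_TRENDS:
        raise ValueError(f"unknown trend: {trend!r}")
    if trend == "worsening" and severity > 1:
        # Assumption: bump exactly one level; SEV-1 cannot go higher.
        return severity - 1
    # Stable or improving incidents keep their current level here;
    # downgrading is left to a deliberate human decision.
    return severity
```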

Making Classification Decisions

Start with a default classification based on initial information, but be ready to adjust. Incidents often begin as minor issues that escalate, or as major incidents that turn out to be less severe than initially thought.

Create decision trees or flowcharts that help on-call engineers quickly assess severity. For example:

“Is the primary application unavailable?” → Yes → SEV-1

“Can users complete core workflows?” → No → SEV-1 or SEV-2

“Are workarounds available?” → Yes → Consider SEV-3

Empower engineers to make classification decisions and adjust as new information emerges. Better to escalate quickly and de-escalate later than to underestimate severity.
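The three questions above can be sketched as a small function. This is one possible reading, not a definitive rule set: the function name `classify` and its parameters are hypothetical, the tie-break between SEV-1 and SEV-2 (a workaround exists) is an assumption drawn from the Severity 2 definition, and the final branch leans toward the higher severity, following the advice to escalate first and de-escalate later.

```python
def classify(app_unavailable: bool,
             core_workflows_ok: bool,
             workaround_available: bool) -> int:
    """Return an initial severity (1 = most severe) from three yes/no answers."""
    # "Is the primary application unavailable?" -> Yes -> SEV-1
    if app_unavailable:
        return 1

    # "Can users complete core workflows?" -> No -> SEV-1 or SEV-2
    if not core_workflows_ok:
        # Assumption: a usable workaround separates SEV-2 from SEV-1.
        return 2 if workaround_available else 1

    # "Are workarounds available?" -> Yes -> consider SEV-3
    if workaround_available:
        return 3

    # Assumption: the tree does not cover this case, so err on the
    # side of the higher severity and let a human downgrade it later.
    return 2


# Example: core workflows are blocked, but users have a workaround.
initial = classify(app_unavailable=False,
                   core_workflows_ok=False,
                   workaround_available=True)
```

The result is only a starting point. Levels 4 and 5 need finer criteria than these three questions provide, so they are left to the responder's judgment.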

Common Classification Mistakes

Over-classification: Treating every issue as critical leads to alert fatigue and burnout. Save major incident response for truly major problems.

Under-classification: Minimizing genuine problems because they’re inconvenient or “not our fault” delays appropriate response.

Static classification: Failing to adjust severity as circumstances change. An incident that starts minor can escalate; major incidents can be downgraded once contained.

Ignoring business context: A technical issue that seems minor might have major business implications during a critical event or for a key customer.

Using UpStat for Incident Classification

UpStat helps teams implement consistent incident classification through a five-level severity system. When creating incidents, you assign severity levels from 1 (highest priority) through 5 (lowest priority) that drive notification routing and escalation policies.

Each incident’s severity automatically determines alert routing based on your configured escalation rules. Level 1 incidents can trigger immediate paging and escalation chains, while level 5 incidents route through standard channels. This structured approach ensures consistent response patterns across your team.
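To make the idea of severity-driven routing concrete, here is a tool-independent sketch in plain Python. It is not UpStat's configuration format or API. The action names and the `route` function are hypothetical, and the handling of levels 2 through 4 is a placeholder assumption, since only levels 1 and 5 are described above.

```python
from typing import Dict, List

# Hypothetical routing table: severity level -> notification actions.
ROUTING: Dict[int, List[str]] = {
    1: ["page_on_call", "start_escalation_chain"],
    5: ["standard_channel"],
}

# Placeholder assumption for levels 2-4, which a real setup would
# configure explicitly according to its own escalation rules.
DEFAULT_ACTIONS: List[str] = ["standard_channel"]


def route(severity: int) -> List[str]:
    """Return the notification actions for a severity from 1 to 5."""
    if not 1 <= severity <= 5:
        raise ValueError(f"severity must be 1-5, got {severity}")
    # Return a copy so callers cannot mutate the shared table.
    return list(ROUTING.get(severity, DEFAULT_ACTIONS))
```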

UpStat also supports custom status workflows and labels, allowing you to capture additional context beyond basic severity classification. This flexibility lets you adapt the system to your team’s specific needs while maintaining consistency across your incident response process.

Building a Classification Culture

Technology alone doesn’t solve classification challenges. Build a culture where:

Classification is revisited: Encourage teams to adjust severity as situations evolve. Make it easy to escalate or de-escalate.

Learning is valued: Review classification decisions during post-incident reviews. Did we classify correctly? What would we change?

Criteria are updated: As your business evolves, your classification criteria should too. Quarterly reviews ensure your framework stays relevant.

Context is shared: Document why specific incidents received their classification. This helps future responders make better decisions.
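Two of these habits, revisiting classification and documenting the reasoning, can be supported by the incident record itself. A minimal sketch, assuming a hypothetical `Incident` class that refuses a severity change unless a reason is given and keeps every change in a history list:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Tuple


@dataclass
class Incident:
    title: str
    severity: int  # 1 (most severe) through 5 (least severe)
    # Each entry: (timestamp, old severity, new severity, reason)
    history: List[Tuple[datetime, int, int, str]] = field(default_factory=list)

    def reclassify(self, new_severity: int, reason: str) -> None:
        """Change severity and record why, for later review."""
        if not 1 <= new_severity <= 5:
            raise ValueError(f"severity must be 1-5, got {new_severity}")
        if not reason.strip():
            raise ValueError("a reason is required when reclassifying")
        self.history.append(
            (datetime.now(timezone.utc), self.severity, new_severity, reason)
        )
        self.severity = new_severity


# Example: a slowdown escalates once the API starts failing.
incident = Incident(title="Database slowdown", severity=3)
incident.reclassify(1, "API requests are now failing completely")
```

During a post-incident review, `incident.history` shows when the severity changed and what the responder knew at the time.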

Conclusion

Effective incident classification isn’t about perfection. It’s about creating a consistent, practical framework that helps your team make fast, appropriate decisions under pressure.

Start with clear definitions for major and minor incidents. Build a severity level system with objective criteria. Empower engineers to make classification decisions and adjust as needed. Review and refine your approach based on actual incident data.

The goal is simple: when something goes wrong, your team should spend their time fixing the problem, not debating how serious it is.
