Alert Performance Report

The Alert Performance report helps you optimize your monitoring by analyzing alert effectiveness. Identify noisy alerts, improve detection times, and ensure critical issues are caught.

Report Sections

Overall Metrics

Key alert statistics for the period:

Total Alerts - All alerts generated
Average Detection Time - Time from issue start to alert
Alert Rate - Average number of alerts per day
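
As a rough illustration of how these three figures relate, the sketch below computes them from a list of alert records. The record shape (pairs of issue start time and alert time) and the function name overall_metrics are assumptions made for this example, not the report's actual implementation.

```python
from datetime import datetime, timedelta
from statistics import mean


def overall_metrics(alerts, period_days):
    """Summarize alerts for a reporting period.

    alerts: list of (issue_start, alert_time) datetime pairs (assumed shape).
    period_days: length of the reporting period in days.
    """
    total = len(alerts)
    # Detection time is the gap between the issue starting and the alert firing.
    detection_seconds = [
        (fired - started).total_seconds() for started, fired in alerts
    ]
    return {
        "total_alerts": total,
        "avg_detection_seconds": mean(detection_seconds) if detection_seconds else None,
        "alerts_per_day": total / period_days,
    }


start = datetime(2024, 1, 1, 12, 0)
sample = [
    (start, start + timedelta(minutes=2)),
    (start, start + timedelta(minutes=4)),
]
print(overall_metrics(sample, period_days=30))
```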

Alert Volume Trends

A line chart showing daily alert counts over the last 30 days. Use this to:

Identify alert storms
Track improvements from tuning
Spot patterns in alert frequency
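
If you want to flag alert storms programmatically rather than by eye, one simple approach is to compare each day against the median daily count. This is a minimal sketch; the function name find_alert_storms and the factor of 3 are illustrative assumptions, not values used by the report.

```python
from statistics import median


def find_alert_storms(daily_counts, factor=3.0):
    """Return the indexes of days whose alert count exceeds factor x the median.

    daily_counts: list of alert counts, one per day (assumed input shape).
    """
    if not daily_counts:
        return []
    baseline = median(daily_counts)
    # Guard against a zero baseline so quiet periods do not flag every alert.
    threshold = max(baseline * factor, 1)
    return [day for day, count in enumerate(daily_counts) if count > threshold]


counts = [4, 5, 6, 5, 40, 4, 5]
print(find_alert_storms(counts))
```

The median is used instead of the mean so that a single storm does not inflate the baseline it is compared against.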

Monitor Performance

A table showing metrics for each monitor:

Monitor Name - The monitor generating alerts
Alert Count - Total alerts from this monitor
Failure Rate - Percentage of checks that failed
Avg Detection Time - How quickly issues are detected

Sort by any column to find:

The noisiest monitors
Monitors with high failure rates
Monitors with slow detection times
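
The columns above can be modeled as follows. This is a sketch under assumed field names (checks_total and checks_failed are hypothetical); it shows Failure Rate as the percentage of checks that failed and how sorting surfaces the monitors of interest.

```python
from dataclasses import dataclass


@dataclass
class MonitorStats:
    name: str
    alert_count: int
    checks_total: int   # hypothetical field: checks run in the period
    checks_failed: int  # hypothetical field: checks that failed

    @property
    def failure_rate(self):
        """Percentage of checks that failed (0.0 if no checks ran)."""
        if self.checks_total == 0:
            return 0.0
        return 100.0 * self.checks_failed / self.checks_total


stats = [
    MonitorStats("api-health", alert_count=12, checks_total=1000, checks_failed=50),
    MonitorStats("db-replica", alert_count=3, checks_total=200, checks_failed=30),
]

# Sort descending by the column of interest, as you would in the table.
noisiest = sorted(stats, key=lambda m: m.alert_count, reverse=True)
least_reliable = sorted(stats, key=lambda m: m.failure_rate, reverse=True)
print(noisiest[0].name, least_reliable[0].name)
```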

Alert Distribution

A visual breakdown of alerts by:

Monitor type
Time of day
Severity level

Using the Report

Identifying Noisy Alerts

Look for monitors with:

High alert counts but few corresponding real incidents
Frequent flapping (up/down cycles)
Alerts during known maintenance

These are candidates for tuning or removal.
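
Flapping can be quantified by counting how often a monitor changes state. The sketch below assumes you have an ordered list of check states for a monitor; the function names and the limit of 4 transitions are illustrative assumptions.

```python
def count_transitions(states):
    """Count how many times consecutive check states differ."""
    return sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)


def is_flapping(states, max_transitions=4):
    """Treat a monitor as flapping if it changes state more than max_transitions times."""
    return count_transitions(states) > max_transitions


unstable = ["up", "down", "up", "down", "up", "down"]
steady = ["up", "up", "down", "down", "up"]
print(is_flapping(unstable), is_flapping(steady))
```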

Improving Detection Time

Fast detection minimizes the impact of an incident. To improve it:

Review monitors with slow detection times
Consider more frequent check intervals
Adjust thresholds for earlier warning

Exporting Data

Click Export to download a CSV with:

Per-monitor statistics
Daily alert volumes
Detection time details
Complete metrics for analysis
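
Once exported, the CSV can be analyzed with a short script. The sketch below lists monitors whose average detection time exceeds a limit. The column names monitor_name and avg_detection_seconds are hypothetical; check the header row of your actual export and adjust them accordingly.

```python
import csv
import io


def slow_monitors(csv_text, max_seconds=300.0):
    """Return names of monitors whose average detection time exceeds max_seconds.

    Column names are assumptions for illustration, not the real export schema.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["monitor_name"]
        for row in reader
        if float(row["avg_detection_seconds"]) > max_seconds
    ]


# Inline sample standing in for a downloaded export file.
sample_csv = """monitor_name,alert_count,avg_detection_seconds
api-health,12,95
db-replica,3,640
"""
print(slow_monitors(sample_csv))
```

To run this against a real file, read its contents with open(path, newline="") and pass the text to the function.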

Alert Optimization

Reducing Alert Fatigue

Common causes and solutions:

Flapping services - Add retry logic or increase thresholds
Known issues - Use maintenance windows
Low-priority alerts - Adjust severity or disable
Duplicate alerts - Consolidate similar monitors
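
To estimate how much noise maintenance windows would remove, you can filter historical alerts against planned windows. This is a minimal sketch; the function name and the (start, end) tuple representation of a window are assumptions for the example.

```python
from datetime import datetime


def outside_maintenance(alert_times, windows):
    """Keep only alerts that did not fire inside any maintenance window.

    windows: list of (start, end) datetime pairs; end is exclusive.
    """
    return [
        fired
        for fired in alert_times
        if not any(start <= fired < end for start, end in windows)
    ]


windows = [(datetime(2024, 1, 1, 2, 0), datetime(2024, 1, 1, 4, 0))]
alert_times = [
    datetime(2024, 1, 1, 1, 30),  # before the window
    datetime(2024, 1, 1, 3, 0),   # inside the window
    datetime(2024, 1, 1, 4, 0),   # exactly at the end, so outside
]
remaining = outside_maintenance(alert_times, windows)
print(len(alert_times) - len(remaining), "alert(s) would have been suppressed")
```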

Threshold Tuning

Use the data to:

Set appropriate failure thresholds
Adjust check frequencies
Configure proper retry counts
Balance sensitivity vs noise
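
One way to reason about the sensitivity versus noise trade-off is to replay past check results against different retry counts. The sketch below assumes a simple rule where an alert fires once a check has failed a given number of times in a row; your monitoring system's actual retry semantics may differ.

```python
def count_alerts(results, failures_required):
    """Count alerts fired when failures_required consecutive checks fail.

    results: ordered list of booleans, True meaning the check passed.
    One alert fires per failure streak, at the moment the streak reaches
    failures_required.
    """
    alerts = 0
    streak = 0
    for passed in results:
        if passed:
            streak = 0
        else:
            streak += 1
            if streak == failures_required:
                alerts += 1
    return alerts


history = [True, False, True, False, True, False, False, False, True]
for required in (1, 2, 3):
    print(required, count_alerts(history, required))
```

In this sample, requiring more consecutive failures filters out the two single-check blips while still catching the sustained outage, at the cost of detecting it later.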

Best Practices

Regular Review

Weekly review of top alerting monitors
Monthly trend analysis
Quarterly threshold adjustments
Document changes and impacts

Team Collaboration

Share findings with service owners
Get feedback on alert usefulness
Coordinate tuning efforts
Track improvement metrics

Interpreting Results

Good Performance Indicators

Stable or decreasing alert volume
Quick detection times (under 5 minutes)
Low false positive rate
Alerts correlate with real incidents

Areas for Improvement

Rising alert volume without a matching rise in incidents
Slow detection times for critical services
High volume from specific monitors
Alerts that the team routinely ignores

Taking Action

Immediate Steps

Disable or tune the noisiest monitors
Adjust thresholds on flapping services
Add maintenance windows for known issues
Review and update alert routing

Long-term Improvements

Implement better monitoring strategies
Use composite alerts for complex scenarios
Add business hours logic where appropriate
Schedule regular alert effectiveness reviews
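
As an example of business hours logic, the sketch below decides whether a timestamp falls within working hours. The Monday to Friday, 09:00 to 17:00 schedule and the function name are assumptions; the timestamp is treated as already being in the team's local time zone.

```python
from datetime import datetime


def is_business_hours(moment, start_hour=9, end_hour=17):
    """Return True for Monday to Friday between start_hour and end_hour (exclusive)."""
    is_weekday = moment.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and start_hour <= moment.hour < end_hour


print(is_business_hours(datetime(2024, 1, 1, 10, 0)))  # a Monday morning
print(is_business_hours(datetime(2024, 1, 6, 10, 0)))  # a Saturday morning
```

A check like this could be used to route low-priority alerts to a queue outside working hours while still paging for critical ones.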