Alert Fatigue Research: 25+ DevOps and SRE Citations (2026)
25+ primary citations. The only page on the internet that aggregates Google SRE, DORA, incident.io, Honeycomb, The Joint Commission, ECRI, AHRQ, and Gloria Mark in one place.
Updated May 2026.
01. Site Reliability and Alert Fatigue
Chapter 6 'Monitoring Distributed Systems' defines the philosophy: every alert must be actionable, urgent, and require human judgment. The related '2 pages per 12-hour on-call shift' target is set out in Chapter 11, 'Being On-Call', which caps on-call load at two incidents per shift. The canonical reference for alert philosophy in software operations.
Chapter 5 'Alerting on SLOs' defines multi-window, multi-burn-rate alerting. Provides the burn-rate formula and threshold recommendations (14.4x for the 1-hour window, 6x for the 6-hour window). The technical specification for SLO-based alerting.
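As a rough illustration of the burn-rate idea described above, the sketch below computes a burn rate as the observed error ratio divided by the error budget (1 minus the SLO target) and pages only when both a long and a short window exceed the same threshold. The function and class names are hypothetical, the 30-day budget period is an assumption, and the thresholds and window pairs (14.4 over 1 hour with a 5-minute short window, 6 over 6 hours with a 30-minute short window) follow the workbook's published table; treat it as a minimal sketch, not a production alerting rule.

```python
from dataclasses import dataclass
from typing import Mapping


def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Burn rate = observed error ratio / error budget (1 - SLO target)."""
    budget = 1.0 - slo_target
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_ratio / budget


def budget_consumed(rate: float, window_hours: float,
                    period_hours: float = 30 * 24) -> float:
    """Fraction of the period's error budget spent if `rate` holds for the window.

    The 30-day period is an assumption; use your own SLO period.
    """
    return rate * window_hours / period_hours


@dataclass(frozen=True)
class BurnRule:
    long_window: str   # key of the long-window error ratio, e.g. "1h"
    short_window: str  # key of the short-window error ratio, e.g. "5m"
    threshold: float   # burn rate that both windows must reach


# Paging rules: (long window, short window, burn-rate threshold).
RULES = (
    BurnRule("1h", "5m", 14.4),
    BurnRule("6h", "30m", 6.0),
)


def should_page(error_ratios: Mapping[str, float], slo_target: float) -> bool:
    """Page when, for any rule, both windows burn at or above the threshold."""
    for rule in RULES:
        long_rate = burn_rate(error_ratios[rule.long_window], slo_target)
        short_rate = burn_rate(error_ratios[rule.short_window], slo_target)
        if long_rate >= rule.threshold and short_rate >= rule.threshold:
            return True
    return False
```

Under these assumptions, a burn rate of 14.4 sustained for 1 hour spends about 2% of a 30-day budget, and a burn rate of 6 sustained for 6 hours spends about 5%. Requiring the short window to agree stops the alert from firing long after the errors have stopped.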
Annual research correlating software delivery practices with organisational performance. MTTR benchmarks by tier: Elite < 1 hour, High < 1 day, Medium 1 day to 1 week, Low > 1 week. Links alert discipline to MTTR and incident management maturity.
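The tier boundaries quoted above can be turned into a small lookup for benchmarking a team's own MTTR. This is a minimal sketch: the function name is hypothetical, and the handling of the exact boundaries (exactly 1 hour counted as High, exactly 1 week counted as Medium) is an assumption, since the quoted ranges do not say which side they fall on.

```python
from datetime import timedelta


def mttr_tier(mttr: timedelta) -> str:
    """Map a mean time to restore onto the tiers quoted above.

    Boundary handling is an assumption: exactly 1 hour is treated as High,
    exactly 1 week as Medium.
    """
    if mttr < timedelta(0):
        raise ValueError("mttr must be non-negative")
    if mttr < timedelta(hours=1):
        return "Elite"
    if mttr < timedelta(days=1):
        return "High"
    if mttr <= timedelta(weeks=1):
        return "Medium"
    return "Low"
```

For example, `mttr_tier(timedelta(hours=5))` falls in the High tier under these assumptions.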
Primary survey data from 500+ on-call engineers. Key findings: 41% have considered leaving due to alert load, 62% report weekly sleep disruption from night pages, and the median is 42 pages per engineer per week. Most-cited 2024 primary source in the space.
Annual SRE practitioner survey. Reports an industry-median false-positive alert rate of 60-80%, on-call burnout statistics, and tooling adoption patterns. Used for the Alert Fatigue Index false-positive benchmark.
Industry survey from PagerDuty. Covers MTTA benchmarks (8-15 minutes median), after-hours page prevalence, and alert volume statistics by company size. Note: vendor-published, treat with appropriate scepticism.
02. Observability and SLO Research
Annual survey on observability adoption. Covers SLO adoption rates, correlation between observability maturity and MTTR. Charity Majors and the Honeycomb team are the primary advocates for symptom-based alerting in the industry.
Series of blog posts and conference talks arguing for SLO-based, symptom-first alerting over threshold-based cause monitoring. The most widely read practitioner writing on the subject. Not peer-reviewed but highly influential.
Annual survey on monitoring and observability tool usage. Covers alert volume, tooling consolidation patterns, and correlation feature adoption. Used for market baseline data.
Analysis of SLO adoption patterns across enterprise organisations. Reports adoption rates, common pitfalls, and MTTR impact of SLO adoption. Vendor-published; cross-referenced with DORA data.
03. Security Alert Fatigue (SOC)
Annual report documenting data breach costs and contributing factors. SOC alert fatigue cited as a contributing factor to extended breach dwell times. Mean dwell time: 207 days.
Survey of SOC analysts on alert volume, false-positive rates, and burnout. Reports a 55% false-positive rate for SOC alerts, similar to DevOps. Includes analyst attrition data.
Annual threat intelligence report. Documents alert-to-incident investigation gap times. Contextually relevant to SOC alert fatigue and missed true positives.
04. Healthcare Alarm Fatigue
Critical safety alert documenting the link between alarm fatigue and patient harm. Cites studies showing 85-99% false-positive ICU alarm rates. Foundation document for NPSG.06.01.01.
Regulatory mandate requiring accredited US hospitals to establish alarm management programmes. Renewed annually. The only regulatory framework analogous to what DevOps teams must adopt voluntarily.
Alarm hazards have appeared in ECRI's annual top 10 health technology hazards list for multiple consecutive years. ECRI is the leading independent healthcare technology safety research organisation.
Comprehensive summary of the evidence base for alarm fatigue in clinical settings. Free, publicly available. Cites 72-99% false-positive rates across different ICU unit types.
Seminal literature review establishing the evidence base across multiple studies. Documents 86-99% false-alarm rate range in ICU environments. Frequently cited in subsequent healthcare research.
05. Knowledge-Worker Notification Fatigue
Original 2004 research documenting 23-minute refocus time after an interruption. 2023 work shows this figure has compressed somewhat for digital workers. Core reference for all notification-cost calculators.
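A notification-cost calculator built on this figure reduces to simple arithmetic: interruptions multiplied by refocus time. The sketch below is illustrative only. The function name is hypothetical, and it assumes every interruption costs the full 23 minutes and that refocus periods never overlap, so the result is better read as an upper bound than as a measurement.

```python
REFOCUS_MINUTES = 23.0  # refocus time per interruption, as cited above


def weekly_refocus_hours(interruptions_per_week: int,
                         refocus_minutes: float = REFOCUS_MINUTES) -> float:
    """Upper-bound estimate of hours per week lost to refocusing.

    Assumes each interruption costs the full refocus time and that
    refocus periods do not overlap.
    """
    if interruptions_per_week < 0 or refocus_minutes < 0:
        raise ValueError("inputs must be non-negative")
    return interruptions_per_week * refocus_minutes / 60.0
```

With a hypothetical 40 interruptions per week, the estimate comes to roughly 15.3 hours. Passing a smaller `refocus_minutes` is one way to model the shorter refocus times reported in later work.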
Analysis of 185 million working hours. Workers check email or messaging apps every 6 minutes on average. 28% of the working day is spent on communication apps.
Annual survey of knowledge workers. Reports that 60% of time is spent on 'work about work' (meetings, emails, notifications) rather than actual work.
Annual survey of 31,000 workers. 57% report being interrupted by notifications constantly. 68% report lack of uninterrupted focus time as their top challenge.
Survey of 10,000+ knowledge workers. Reports an average of 32 @mentions per day in Slack. Average expected response time: under 10 minutes.
Stanford research on media multitaskers. Heavy multitaskers are more susceptible to distraction from irrelevant environmental stimuli. Supports the cognitive cost argument for notification reduction.
HBR analysis of the cost of interruptions to knowledge workers. Representative of the mainstream business case for notification management.