Abstract
We audit 550,145 fire-department dispatch records from Allegheny County (2015-2025) sourced from the Western Pennsylvania Regional Data Center, of which 367,444 are fire-specific and 182,701 are EMS / other categories. The original dashboard is now archived; the reproducible Python rig (experiments/run.py) re-derives every number cited in the paper.
Three findings dominate. (1) Fire alarm activations are the largest workload category at 205,398 incidents, or 37.3% of all calls and 55.9% of fire-specific calls; commercial properties account for 64% of those alarms (131,536 incidents), implying an upper bound on annual cost at roughly $1,000 per response. (2) Seasonal demand is bifurcated: structure fires peak in Winter (heating-driven), while outdoor fires peak in Spring (vegetation-driven, before summer rain). (3) Geographic load is concentrated: a single city (Pittsburgh) and a small set of inner-ring municipalities account for the majority of dispatch volume, suggesting that prevention and staffing investments should be city-targeted rather than uniformly distributed.
The paper is wired to the same pre-aggregated JSONs that powered the dashboard. Regenerating those JSONs from the archived source and re-running the rig will refresh every table and figure here. A consistency check inside the rig enforces that paper claims and source totals agree, and fails loudly otherwise.
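The internal arithmetic behind the headline figures can be checked directly. The sketch below is not the rig's actual consistency check (its implementation is not shown here); it only illustrates the idea using the figures cited above, with hypothetical names such as `check_headline` and `pct`.

```python
# Headline figures exactly as cited in the abstract.
TOTAL_RECORDS = 550_145
FIRE_SPECIFIC = 367_444
EMS_OTHER = 182_701
FIRE_ALARMS = 205_398
COMMERCIAL_ALARMS = 131_536


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


def check_headline() -> dict:
    """Fail loudly (ValueError) if the cited figures disagree with each other."""
    # The two categories must partition the full record count.
    if FIRE_SPECIFIC + EMS_OTHER != TOTAL_RECORDS:
        raise ValueError("fire-specific + EMS/other does not equal the total")

    derived = {
        "alarms_of_all_calls": pct(FIRE_ALARMS, TOTAL_RECORDS),
        "alarms_of_fire_calls": pct(FIRE_ALARMS, FIRE_SPECIFIC),
        "commercial_of_alarms": pct(COMMERCIAL_ALARMS, FIRE_ALARMS),
    }
    # Percentages as stated in the text.
    claimed = {
        "alarms_of_all_calls": 37.3,
        "alarms_of_fire_calls": 55.9,
        "commercial_of_alarms": 64.0,
    }
    for key, value in claimed.items():
        if derived[key] != value:
            raise ValueError(f"{key}: claimed {value}, derived {derived[key]}")
    return derived


if __name__ == "__main__":
    print(check_headline())
```

Raising an exception rather than logging a warning mirrors the "fails loudly" behaviour described above: a stale number stops the build instead of reaching the PDF.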
Introduction
Background and Motivation
Fire departments in Allegheny County (Pittsburgh and ~129 surrounding municipalities) field tens of thousands of emergency dispatches per year across structure fires, outdoor fires, alarm activations, EMS assists, vehicle incidents, hazmat, and rescue. The Western Pennsylvania Regional Data Center (WPRDC) publishes the full dispatch log with timestamp, municipality, and priority, providing a rare longitudinal view of public-safety demand. We use that log to do one specific thing well: describe what the workload actually looks like, and back every claim with a script anyone can re-run.
Research Objectives
This study pursues four objectives:
- Describe the dispatch mix. What fraction of calls are fire alarms vs. structure fires vs. outdoor fires vs. EMS? How does that split change year over year?
- Quantify the false alarm burden. How many alarm activations are commercial vs. residential, and what is a defensible upper bound on annual cost?
- Map seasonal demand. When do structure fires peak? When do outdoor fires peak? Are the two correlated or anti-correlated?
- Identify spatial concentration. Which municipalities account for the bulk of dispatches, and how concentrated is the top of the distribution?
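One simple way to express the concentration question in the last objective is the share of all dispatches handled by the k busiest municipalities. The sketch below assumes per-municipality counts are available as a name-to-count mapping; the function name `top_k_share` and the example counts are hypothetical and are not values from the dataset.

```python
from typing import Mapping


def top_k_share(counts: Mapping[str, int], k: int) -> float:
    """Fraction of all dispatches attributable to the k largest municipalities."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no dispatches to summarise")
    # Sort counts from largest to smallest and keep the first k.
    top = sorted(counts.values(), reverse=True)[:k]
    return sum(top) / total


# Hypothetical counts for illustration only; NOT figures from the paper.
example_counts = {"A": 500, "B": 200, "C": 150, "D": 100, "E": 50}

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, top_k_share(example_counts, k))
```

Reporting the share for a few values of k gives a compact picture of how steep the top of the distribution is without committing to a single summary statistic.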
Analytical Approach
The pipeline is a three-stage chain — designed so the dashboard and the paper consume the same inputs:
- Stage 1 (data prep, now archived with the retired fire-safety source): Convert WPRDC CSV → Parquet via convert-to-parquet.py; produce pre-aggregated JSONs (stats.json, by-year.json, by-month.json, by-priority.json, by-city.json, false-alarms.json) via precompute-aggregations.ts.
- Stage 2 (rig, in the fire-safety publication bundle): Load the JSONs, re-derive headline statistics, run consistency checks against stats.json (the oracle), and emit data/fire_safety_results.json for the paper plus experiments/results/headline.json for the publication card.
- Stage 3 (paper): Every table and figure in the rendered PDF is generated by an embedded Python cell that reads data/fire_safety_results.json.
That structure — dashboard-as-paper — is the contribution as much as the findings: the website you click on and the PDF you cite are the same artifact, validated against the same oracle.
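A minimal sketch of the Stage 2 pattern (load aggregates, re-derive a total, compare against the oracle, emit results) is given below. The file names stats.json, by-year.json, and data/fire_safety_results.json come from the pipeline description; the JSON schemas (a `total_records` key in stats.json, and a list of `{"year", "count"}` rows in by-year.json) and the function name `run_rig` are assumptions made for illustration and may differ from the actual rig in experiments/run.py.

```python
import json
from pathlib import Path


def load_json(path: Path):
    """Read one pre-aggregated JSON file."""
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


def run_rig(agg_dir: Path, out_path: Path) -> dict:
    """Re-derive the record total, check it against the oracle, write results."""
    stats = load_json(agg_dir / "stats.json")      # the oracle
    by_year = load_json(agg_dir / "by-year.json")

    # ASSUMED schema: by-year.json is a list of {"year": int, "count": int}.
    derived_total = sum(row["count"] for row in by_year)

    # ASSUMED schema: stats.json exposes the grand total as "total_records".
    if derived_total != stats["total_records"]:
        raise ValueError(
            f"by-year total {derived_total} != oracle {stats['total_records']}"
        )

    results = {"total_records": derived_total, "by_year": by_year}
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps(results, indent=2), encoding="utf-8")
    return results


if __name__ == "__main__":
    # Directory holding the Stage 1 outputs is a hypothetical location.
    run_rig(Path("aggregations"), Path("data/fire_safety_results.json"))
```

Because the output file is written only after the check passes, the paper can never be rendered from aggregates that disagree with the oracle.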