Abstract
We audit 550,145 fire-department dispatch records from Allegheny County (2015-2025) sourced from the Western Pennsylvania Regional Data Center, of which 367,444 are fire-specific and 182,701 are EMS / other categories. The data feeds two consumers: an interactive dashboard for exploration, and a reproducible Python rig (experiments/run.py) that re-derives every number cited in the paper.
Three findings dominate. (1) Fire alarm activations are the largest workload category at 205,398 incidents: 37.3% of all calls and 55.9% of fire-specific calls. Commercial properties account for 64% of those alarms (131,536 incidents), implying a bounded annual cost when each response is priced at roughly $1,000. (2) Seasonal demand is bifurcated: structure fires peak in winter (heating-driven), while outdoor fires peak in spring (vegetation-driven, before summer rain). (3) Geographic load is concentrated: a single city (Pittsburgh) and a small set of inner-ring municipalities account for the majority of dispatch volume, suggesting that prevention and staffing investments should be city-targeted rather than uniformly distributed.
The paper is wired to the live data: regenerating the dashboard's pre-aggregated JSONs and re-running the rig will refresh every table and figure here. A consistency check inside the rig enforces that paper claims and dashboard claims agree on totals, and fails loudly otherwise.
Introduction
Background and Motivation
Fire departments in Allegheny County (Pittsburgh and ~129 surrounding municipalities) field tens of thousands of emergency dispatches per year across structure fires, outdoor fires, alarm activations, EMS assists, vehicle incidents, hazmat, and rescue. The Western Pennsylvania Regional Data Center publishes the full dispatch log with timestamp, municipality, and priority, providing a rare longitudinal view of public-safety demand. We use that log to do one specific thing well: describe what the workload actually looks like, and back every claim with a script anyone can re-run.
Research Objectives
This study pursues four objectives:
- Describe the dispatch mix. What fraction of calls are fire alarms vs. structure fires vs. outdoor fires vs. EMS? How does that split change year over year?
- Quantify the false alarm burden. How many alarm activations are commercial vs. residential, and what is a defensible upper bound on annual cost?
- Map seasonal demand. When do structure fires peak? When do outdoor fires peak? Are the two correlated or anti-correlated?
- Identify spatial concentration. Which municipalities account for the bulk of dispatches, and how concentrated is the top of the distribution?
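One simple way to make the spatial-concentration question concrete is a top-k share: the fraction of all dispatches handled by the k busiest municipalities, together with the running share down the ranking. The sketch below is illustrative only. The function names are hypothetical, the input is assumed to be a plain mapping from municipality to dispatch count, and the example counts are placeholders rather than values from the dataset.

```python
from typing import Dict, List, Tuple


def top_k_share(counts: Dict[str, int], k: int) -> float:
    """Fraction of all dispatches handled by the k busiest municipalities."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no dispatches to summarize")
    top = sorted(counts.values(), reverse=True)[:k]
    return sum(top) / total


def cumulative_shares(counts: Dict[str, int]) -> List[Tuple[str, float]]:
    """Municipalities ranked by volume, each paired with its running share."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no dispatches to summarize")
    running = 0
    ranked = []
    for name, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += n
        ranked.append((name, running / total))
    return ranked


# Placeholder counts for illustration only; NOT figures from the dataset.
example = {"A": 600, "B": 200, "C": 100, "D": 60, "E": 40}

if __name__ == "__main__":
    print(top_k_share(example, 1))
    for name, share in cumulative_shares(example):
        print(f"{name}: {share:.1%}")
```

A top-k share is easy to read but depends on the choice of k; reporting the full cumulative curve alongside it avoids having to defend a single cutoff.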
Analytical Approach
The pipeline is a three-stage chain, designed so that the live dashboard and the paper consume the same inputs:
- Stage 1 (data prep, in yev/apps/fire-safety/): Convert WPRDC CSV → Parquet via convert-to-parquet.py; produce pre-aggregated JSONs (stats.json, by-year.json, by-month.json, by-priority.json, by-city.json, false-alarms.json) via precompute-aggregations.ts.
- Stage 2 (rig, in genass/publications/quarto/fire_safety_dashboard/experiments/run.py): Load the JSONs, re-derive headline statistics, run consistency checks against stats.json (the oracle), and emit data/fire_safety_results.json for the paper plus experiments/results/headline.json for the publication card.
- Stage 3 (paper): Every table and figure in the rendered PDF is generated by an embedded Python cell that reads data/fire_safety_results.json.
That structure — dashboard-as-paper — is the contribution as much as the findings: the website you click on and the PDF you cite are the same artifact, validated against the same oracle.
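The Stage 2 consistency check can be pictured with a minimal sketch. It is not the contents of experiments/run.py: the key names (total_incidents, fire_specific, and so on) and the helper functions are hypothetical, and the real schema of stats.json is not shown in this paper. The counts are the ones quoted in the abstract, so the sketch also re-derives the headline percentages from them.

```python
# Counts quoted in the abstract of this paper.
TOTAL = 550_145
FIRE_SPECIFIC = 367_444
EMS_OTHER = 182_701
FIRE_ALARMS = 205_398
COMMERCIAL_ALARMS = 131_536


def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


def check_consistency(oracle: dict, derived: dict) -> None:
    """Fail loudly, listing every key whose derived value differs from the oracle."""
    mismatches = [
        f"{key}: oracle={expected!r}, derived={derived.get(key)!r}"
        for key, expected in oracle.items()
        if derived.get(key) != expected
    ]
    if mismatches:
        raise AssertionError("consistency check failed:\n" + "\n".join(mismatches))


# Stand-in for stats.json; the key names are assumptions, not the real schema.
oracle = {
    "total_incidents": TOTAL,
    "fire_specific": FIRE_SPECIFIC,
    "ems_other": EMS_OTHER,
    "fire_alarms": FIRE_ALARMS,
}

# Stand-in for values the rig would re-derive from the other pre-aggregated JSONs.
derived = {
    "total_incidents": FIRE_SPECIFIC + EMS_OTHER,
    "fire_specific": FIRE_SPECIFIC,
    "ems_other": EMS_OTHER,
    "fire_alarms": FIRE_ALARMS,
}

if __name__ == "__main__":
    check_consistency(oracle, derived)
    print("alarms as % of all calls:", pct(FIRE_ALARMS, TOTAL))
    print("alarms as % of fire-specific calls:", pct(FIRE_ALARMS, FIRE_SPECIFIC))
    print("commercial as % of alarms:", pct(COMMERCIAL_ALARMS, FIRE_ALARMS))
```

Collecting every mismatch before raising, rather than stopping at the first, makes a failed run easier to diagnose when several aggregates drift at once.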