AirHistory

Our Methodology

AirHistory tracks air quality trends across 1,000+ US cities using federal EPA monitoring data. We believe transparent methodology builds trust — here is exactly how we collect, score, and present every number on this site.

Data Sources

Our primary data source is the EPA Air Quality System (AQS), the authoritative federal database of ambient air quality measurements collected by state, local, and tribal monitoring agencies. Specifically, we use the Annual AQI by County CSV files published at aqs.epa.gov/aqsweb/airdata/, covering 10 years of data from 2014 through 2023.

The AQS dataset includes daily Air Quality Index (AQI) readings, pollutant breakdowns (PM2.5, ozone, CO, SO2, NO2), and the number of days at each AQI category (Good, Moderate, Unhealthy for Sensitive Groups, Unhealthy, Very Unhealthy, Hazardous).

How We Calculate the Air Quality Grade

Every city receives a proprietary Air Quality Grade on a 0-100 scale, mapped to letter grades A through F. The score is a weighted composite of four factors:

  • Average AQI (5-year mean): 40% weight. A lower average AQI indicates cleaner air. We normalize against the national distribution so that a city at the national median scores approximately 50.
  • Trend direction: 30% weight. We fit a linear regression to the 10-year AQI history. Cities with improving air quality receive higher scores; worsening trends are penalized.
  • Days unhealthy per year: 20% weight. The average number of days per year rated "Unhealthy for Sensitive Groups" or worse. Fewer unhealthy days produce a higher score.
  • Worst contaminant severity: 10% weight. The peak pollutant concentration recorded in the most recent year, normalized against EPA standards.
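
The trend factor rests on the slope of a straight line fitted to yearly AQI values. As a rough sketch only, an ordinary least-squares slope could be computed as follows. The function name `aqi_trend_slope` and the sample values are illustrative assumptions, and the step that converts a slope into a 0-100 component score is not described on this page, so it is left out.

```python
def aqi_trend_slope(yearly_aqi: dict[int, float]) -> float:
    """Return the least-squares slope of AQI versus year (AQI points per year).

    Hypothetical helper for illustration; not the site's actual code.
    """
    years = sorted(yearly_aqi)
    n = len(years)
    if n < 2:
        raise ValueError("need at least two years to fit a trend")

    mean_year = sum(years) / n
    mean_aqi = sum(yearly_aqi[y] for y in years) / n

    # slope = covariance(year, AQI) / variance(year)
    sxy = sum((y - mean_year) * (yearly_aqi[y] - mean_aqi) for y in years)
    sxx = sum((y - mean_year) ** 2 for y in years)
    return sxy / sxx


# Made-up example: AQI falls by one point each year from 2014 to 2023.
history = {year: 60.0 - (year - 2014) for year in range(2014, 2024)}
slope = aqi_trend_slope(history)
print(f"slope: {slope:.2f} AQI points per year")
```

A negative slope means the AQI is falling over time, which corresponds to improving air quality; a positive slope corresponds to a worsening trend.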

Letter grades map as follows: A (80-100), B (65-79), C (50-64), D (35-49), F (0-34).
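
To make the weighting and the letter bands concrete, here is a minimal sketch. It assumes each of the four factors has already been converted to a 0-100 component score (those conversions are not shown), and it treats each letter band as a lower bound on the unrounded score, which is an assumption about how fractional scores such as 79.5 are handled. The names `composite_score` and `letter_grade` are hypothetical.

```python
# Weights as listed above; they sum to 1.0.
WEIGHTS = {
    "average_aqi": 0.40,
    "trend": 0.30,
    "unhealthy_days": 0.20,
    "worst_contaminant": 0.10,
}

# Lower bound of each letter band, highest first.
GRADE_BANDS = [(80, "A"), (65, "B"), (50, "C"), (35, "D"), (0, "F")]


def composite_score(components: dict[str, float]) -> float:
    """Weighted sum of four 0-100 component scores (assumed pre-normalized)."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


def letter_grade(score: float) -> str:
    """Map a 0-100 score to a letter using the bands A through F."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for lower_bound, letter in GRADE_BANDS:
        if score >= lower_bound:
            return letter
    return "F"


# Made-up component scores for a hypothetical city.
example = {
    "average_aqi": 70.0,
    "trend": 60.0,
    "unhealthy_days": 80.0,
    "worst_contaminant": 50.0,
}
score = composite_score(example)
print(f"{score:.1f} -> {letter_grade(score)}")
```

With these made-up inputs the weighted sum is 28 + 18 + 16 + 5 = 67, which falls in the B band.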

Data Collection Process

We download annual AQI CSV files from the EPA AQS bulk data portal, parse county-level measurements, and map them to cities using Census place-to-county crosswalks. We exclude incomplete monitoring stations (those with fewer than 200 valid measurement days in a year). Where multiple monitoring stations exist within a single city, we normalize their readings to per-capita exposure estimates.
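
The completeness rule can be illustrated with a small filter. This is a sketch under stated assumptions: the `StationYear` record and its field names are hypothetical and do not reflect the layout of the EPA files, and the station IDs and day counts are made up. Only the 200-day threshold comes from the description above.

```python
from typing import NamedTuple

MIN_VALID_DAYS = 200  # stations with fewer valid days in a year are dropped


class StationYear(NamedTuple):
    """One monitoring station's coverage for one year (hypothetical schema)."""
    station_id: str
    year: int
    valid_days: int


def drop_incomplete(records: list[StationYear],
                    min_valid_days: int = MIN_VALID_DAYS) -> list[StationYear]:
    """Keep only station-years with at least `min_valid_days` valid days."""
    return [r for r in records if r.valid_days >= min_valid_days]


# Made-up sample records.
records = [
    StationYear("station-a", 2023, 351),
    StationYear("station-b", 2023, 199),  # below the threshold, removed
    StationYear("station-c", 2023, 200),  # exactly at the threshold, kept
]
kept = drop_incomplete(records)
print([r.station_id for r in kept])
```

Because the rule removes stations with fewer than 200 valid days, a station with exactly 200 days is kept in this sketch.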

Update Frequency

The EPA publishes annual AQI summary files each spring for the prior calendar year. We update our dataset within two weeks of each EPA release, typically in March or April. Historical data is stable and does not change retroactively.

Known Limitations

  • Not all cities have dedicated EPA monitoring stations. For cities without direct monitors, we use the nearest county-level data, which may not perfectly reflect hyperlocal conditions.
  • Wildfire smoke events can cause dramatic single-year spikes that disproportionately affect trend calculations. We do not currently exclude wildfire anomalies.
  • Indoor air quality is not measured — AQS data reflects outdoor ambient conditions only.
  • The Air Quality Grade is our own composite metric, not an official EPA designation.

How to Cite This Data

If you use data from AirHistory, please cite:

AirHistory. "[City Name] Air Quality Data." airhistory.org, 2026. Accessed [date].

Underlying data is sourced from the U.S. Environmental Protection Agency Air Quality System and is in the public domain.