Visualizing the Intersection of Senior Populations (65+ and 85+) and PM2.5 Air Pollution

Project Report: Visualizing the Intersection of Senior Populations (65+ and 85+) and PM2.5 Air Pollution

1. Project Overview

This project investigates the spatial correlation between vulnerable demographic groups (specifically residents aged 65+ and 85+) and exposure to PM2.5 particulate matter in Salt Lake County and surrounding counties.

Study Area

Proof-of-Concept Approach

This report presents a demonstration of geospatial analysis capabilities using two independent air quality data sources:

  1. EPA AQS Analysis (January 1-30, 2025): Regulatory-grade monitoring data from 10 official EPA stations
  2. PurpleAir Analysis (January 1-25, 2026): Community sensor network data from 254 validated sensors

Important Note: I did this analysis as a proof of concept to see how the data can be used to visualize the correlation between demographic distribution and air quality. The two datasets cover different time periods: the EPA data is from 2025 and the PurpleAir data is from 2026, because EPA data for 2026 was not available.

Key Findings Preview


2. Data Acquisition Strategy (The “Iterative” Process)

I explored three distinct datasets.

A. Demographic Data (The “People” Layer)

Source and Acquisition

Data Processing Pipeline

  1. Age Group Aggregation:
    • 65+ Population: Extracted 12 columns:
      • Male: B01001_020E (65-66), B01001_021E (67-69), B01001_022E (70-74), B01001_023E (75-79), B01001_024E (80-84), B01001_025E (85+)
      • Female: B01001_044E (65-66), B01001_045E (67-69), B01001_046E (70-74), B01001_047E (75-79), B01001_048E (80-84), B01001_049E (85+)
    • 85+ Population: Extracted 2 columns:
      • Male: B01001_025E (85+)
      • Female: B01001_049E (85+)
    • Calculated total populations for both 65+ and 85+ per census tract
    • Converted GEO_ID format (removed “1400000US” prefix) to match shapefile GEOID
  2. Geographic Integration:
    • Joined demographic data with tl_2025_49_tract.shp (Census Tract Shapefiles)
    • Filtered to Salt Lake County (FIPS 49035) and Davis County (FIPS 49011)
    • Reprojected to Web Mercator (EPSG:3857) for web mapping compatibility
  3. Data Refinement:
    • Filtered out uninhabited tracts: Removed tracts with zero 65+ population (e.g., Great Salt Lake, Airport, industrial zones)
    • Outlier removal: Dropped large, sparsely-populated tracts (>95th percentile area with <10 residents aged 65+) that would visually dominate the map
    • Final dataset: Census tracts with valid demographic data for both age groups
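
The aggregation and refinement steps above can be sketched roughly as follows. This is a minimal illustration in plain Python on dictionaries, not the project's actual code. The function names (`summarize_tract`, `in_study_area`, `refine`), the `area_m2` key, and the choice to compute the 95th-percentile area after removing zero-population tracts are all assumptions made for the example.

```python
# Sketch of the age aggregation and tract refinement (assumed helper names).
MALE_65_PLUS = [f"B01001_{n:03d}E" for n in range(20, 26)]    # 020E..025E
FEMALE_65_PLUS = [f"B01001_{n:03d}E" for n in range(44, 50)]  # 044E..049E
COLS_65_PLUS = MALE_65_PLUS + FEMALE_65_PLUS                  # 12 columns
COLS_85_PLUS = ["B01001_025E", "B01001_049E"]                 # 2 columns
GEO_PREFIX = "1400000US"
STUDY_COUNTIES = {"49035", "49011"}  # Salt Lake County, Davis County


def summarize_tract(row):
    """Return the shapefile-style GEOID and senior totals for one ACS row."""
    geo_id = row["GEO_ID"]
    if geo_id.startswith(GEO_PREFIX):
        geo_id = geo_id[len(GEO_PREFIX):]
    return {
        "GEOID": geo_id,
        "pop_65_plus": sum(int(row[c]) for c in COLS_65_PLUS),
        "pop_85_plus": sum(int(row[c]) for c in COLS_85_PLUS),
    }


def in_study_area(geoid):
    """A tract GEOID starts with the 5-digit state+county FIPS code."""
    return geoid[:5] in STUDY_COUNTIES


def percentile(values, q):
    """Linearly interpolated percentile (q in 0..100) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def refine(tracts, area_key="area_m2", min_seniors=10, area_q=95):
    """Drop tracts with no 65+ residents, then large sparsely populated ones."""
    inhabited = [t for t in tracts if t["pop_65_plus"] > 0]
    if not inhabited:
        return []
    cutoff = percentile([t[area_key] for t in inhabited], area_q)
    return [
        t for t in inhabited
        if not (t[area_key] > cutoff and t["pop_65_plus"] < min_seniors)
    ]
```

With tabular tooling the same logic would typically be a column-wise sum followed by two boolean filters; the dictionary form is used here only to keep the example dependency-free.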

Final Statistics


B. Air Quality Data (The “Environment” Layer)

This proof-of-concept demonstrates analysis capabilities with two independent air quality data sources. Each source has distinct characteristics and is analyzed separately.

B.1. EPA AQS Analysis (January 1-30, 2025)

Data Source:

Data Characteristics:

Monitoring Stations:

  1. Bountiful Viewmont (Site 4) - Davis County
  2. Copper View (Site 2005) - Salt Lake County
  3. Hawthorne (Site 3006) - Salt Lake County
  4. ROSE PARK (Site 3010) - Salt Lake County
  5. Herriman #3 (Site 3013) - Salt Lake County
  6. Lake Park (Site 3014) - Salt Lake County
  7. Utah Technical Center (Site 3015) - Salt Lake County
  8. Inland Port (Site 3016) - Salt Lake County
  9. Red Butte (Site 3018) - Salt Lake County
  10. Near Road (Site 4002) - Salt Lake County

Data Quality:

Advantages:

Limitations:

B.2. PurpleAir Analysis (January 1-25, 2026)

Data Source:

Data Characteristics:

Quality Assurance:

Advantages:

Limitations:

B.3. TRAX Mobile Sensor Data (Not Available)

Data Source:

Potential Value:

Access Status:


3. Technical Methodology

3.A. EPA Data Analysis Methodology (January 1-30, 2025)

Coordinate System Standardization

Data Processing

Visualization Approach

3.B. PurpleAir Data Analysis Methodology (January 1-25, 2026)

Coordinate System Standardization

Data Correction

Correction Formula (University of Utah Winter Inversion):

Corrected PM2.5 = (0.778 × Raw_CF1) + 2.65
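
As a small illustration, the formula above can be written as a one-line function. The function and constant names are hypothetical; only the slope (0.778) and intercept (2.65) come from the report, and the input is assumed to be a raw CF=1 PM2.5 reading in µg/m³.

```python
# Linear correction from the report; names are illustrative only.
CORRECTION_SLOPE = 0.778
CORRECTION_INTERCEPT = 2.65


def correct_pm25(raw_cf1):
    """Return corrected PM2.5 (µg/m³) for a raw PurpleAir CF=1 reading."""
    return CORRECTION_SLOPE * raw_cf1 + CORRECTION_INTERCEPT


# Example: a raw reading of 20 µg/m³ is corrected to about 18.21 µg/m³.
example = round(correct_pm25(20.0), 2)
```

Because the slope is below 1, the correction lowers any raw reading above roughly 11.9 µg/m³ and raises readings below that point.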

Rationale:

Application:

Quality Assurance

Outlier Detection:

Visualization Range:

Spatial Interpolation

Method: Inverse Distance Weighting (IDW) interpolation

Process:

  1. Input Data: Discrete PurpleAir sensor points with corrected PM2.5 values
  2. Grid Creation: 200×200 interpolation grid covering study area bounds
  3. Interpolation Method: Linear interpolation to create continuous PM2.5 surface
  4. Output: Continuous PM2.5 surface as 2D array for heatmap visualization
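
The process above can be sketched as follows. This is an illustration of inverse distance weighting only: the report names IDW as the method while step 3 mentions linear interpolation, and the sketch does not resolve which was actually used. The function names, the distance power of 2, and the coordinate layout are assumptions, and plain Python is used so the example runs without numerical libraries.

```python
import math


def idw_value(x, y, sensors, power=2.0):
    """IDW estimate at (x, y) from sensors given as (sx, sy, pm25) tuples.

    Coordinates are assumed to be in a projected system (for example metres).
    """
    num = 0.0
    den = 0.0
    for sx, sy, pm25 in sensors:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return pm25  # the query point sits exactly on a sensor
        w = 1.0 / d ** power
        num += w * pm25
        den += w
    if den == 0.0:
        raise ValueError("at least one sensor is required")
    return num / den


def idw_grid(sensors, bounds, nx=200, ny=200, power=2.0):
    """Evaluate IDW on an nx-by-ny grid over bounds=(xmin, ymin, xmax, ymax).

    Returns a list of ny rows, each holding nx interpolated values.
    """
    if nx < 2 or ny < 2:
        raise ValueError("nx and ny must be at least 2")
    xmin, ymin, xmax, ymax = bounds
    xs = [xmin + (xmax - xmin) * i / (nx - 1) for i in range(nx)]
    ys = [ymin + (ymax - ymin) * j / (ny - 1) for j in range(ny)]
    return [[idw_value(x, y, sensors, power) for x in xs] for y in ys]
```

A 200×200 grid over 254 sensors is about ten million distance evaluations, so a vectorized implementation would be the practical choice at full size; the nested loops here are for readability.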

Spatial Clipping:

Data Aggregation

Visualization Approach


4. Current Deliverables

4.A. EPA Analysis Deliverables (January 1-30, 2025)

Static Visualizations

Files:

Layout:

Purpose: Focus on official regulatory data with demographic context, using point-based visualization without interpolation assumptions. Separate maps allow comparison between 65+ and 85+ population distributions.

Data Analysis Plots

File: EPADataAnalysis/epa_data_analysis.png

Content:

Interactive Web Maps

Files:

Features:

4.B. PurpleAir Analysis Deliverables (January 1-25, 2026)

Static Visualizations

Files:

Layout (for each age group):

  1. Left Panel: Choropleth map of Population Density
    • Colormap: Blues (light = low, dark = high)
    • Clearly shows concentration of seniors on East Bench and specific valley neighborhoods
    • Legend: Population percentage per census tract
  2. Right Panel: PM2.5 Heatmap
    • Interpolated surface from 254 PurpleAir sensors
    • Colormap: OrRd (light orange = low, dark red = high)
    • Sensor locations shown as black dots for transparency
    • Spatial interpolation enables continuous surface mapping

Purpose: Demonstrate high-density sensor network capability for detailed neighborhood-level air quality visualization. Separate maps allow comparison between 65+ and 85+ population distributions.

Data Analysis Plots

File: PurpleAirDataAnalysis/purpleair_data_analysis.png

Content:

Interactive Web Maps

Files:

Features:

5. Key Findings and Insights

5.1. Demographic Distribution

5.A. EPA Analysis Findings (January 1-30, 2025)

Air Quality Patterns

Air Quality Status Distribution

Site-Level Observations

Preliminary Observations

5.B. PurpleAir Analysis Findings (January 1-25, 2026)

Air Quality Patterns

Air Quality Status Distribution

Spatial Variation


7. Technical Specifications

Software and Libraries

Python Environment:

Data Formats:

8. Future Roadmap

Short-Term Enhancements

  1. Acquire TRAX Mobile Data:
    • Contact University of Utah Atmospheric Sciences Department for alternative access to TRAX mobile data
    • The original data portal (https://atmos.utah.edu/air_quality/trax/) is no longer accessible
    • Goal: Obtain Red Line, Green Line, and Blue Line sensor data if alternative access becomes available
  2. TRAX Visualization (if data becomes available):
    • Plot GPS path line showing PM2.5 variation
    • Explicitly visualize elevation gradient (University of Utah bench → I-15 valley floor)
    • Create animated time-series showing pollution accumulation during inversion events

9. Reproducibility and Data Availability

Reproducibility

Data Availability


10. Acknowledgments


11. Descriptive Statistics

This section provides comprehensive descriptive statistics for all datasets used in this analysis. Statistics were calculated from the processed data files and verified against execution logs.

Important Note: EPA and PurpleAir statistics are from different time periods and cannot be directly compared. Each analysis is independent.

11.1. EPA Data Descriptive Statistics (January 1-30, 2025)

Analysis Period Statistics

Overall Dataset Statistics (Full Year 2025)

Site-Level Statistics (Analysis Period: January 1-30, 2025)

| Site Number | Site Name | Count | Mean (µg/m³) | Min (µg/m³) | Max (µg/m³) | Std Dev (µg/m³) |
|-------------|-----------|-------|--------------|-------------|-------------|-----------------|
| 4 | Bountiful Viewmont | 478 | 6.90 | 1.0 | 22.70 | 5.68 |
| 2005 | Copper View | 510 | 8.43 | 2.1 | 25.35 | 5.87 |
| 3006 | Hawthorne | 772 | 7.65 | 1.4 | 23.10 | 5.79 |
| 3010 | ROSE PARK | 750 | 8.01 | 1.6 | 26.70 | 6.42 |
| 3013 | Herriman #3 | 540 | 5.37 | 0.9 | 14.45 | 3.79 |
| 3014 | Lake Park | 270 | 6.41 | 1.5 | 20.18 | 4.84 |
| 3015 | Utah Technical Center | 510 | 7.88 | 1.7 | 25.90 | 6.13 |
| 3016 | Inland Port | 270 | 6.28 | 0.4 | 22.64 | 5.25 |
| 3018 | Red Butte | 270 | 4.22 | 0.3 | 10.24 | 3.14 |
| 4002 | Near Road | 510 | 9.63 | 2.4 | 30.24 | 6.65 |

Note: Red Butte (Site 3018) shows the lowest mean PM2.5 (4.22 µg/m³), consistent with its elevated bench location. Near Road (Site 4002) shows the highest mean (9.63 µg/m³), reflecting proximity to traffic emissions.

11.2. PurpleAir Data Descriptive Statistics (January 1-25, 2026)

Sensor Network Summary

PM2.5 Distribution Statistics (Corrected Values)

AQI Category Breakdown

Spatial Distribution

Observations:

11.3. Census Data Descriptive Statistics

Population 65+ Summary

Population 65+ Tract-Level Distribution

Population 65+ Distribution by County

Salt Lake County:

Davis County:

Population 85+ Summary

Population 85+ Tract-Level Distribution

Population 85+ Distribution by County

Salt Lake County:

Davis County:

11.4. Data Quality Metrics

EPA Data Quality (January 1-30, 2025)

PurpleAir Data Quality (January 1-25, 2026)

Census Data Quality

11.5. Analysis Plot Outputs

The following statistical analysis plots were generated as part of this analysis:

  1. EPA Data Analysis Plots (EPADataAnalysis/epa_data_analysis.png)
    • Panel 1: PM2.5 Distribution Histogram (all data vs. inversion period)
    • Panel 2: Box Plot by Monitoring Site
    • Panel 3: Time Series (Daily Average PM2.5 Over Time)
    • Panel 4: Summary Statistics Table
  2. PurpleAir Data Analysis Plots (PurpleAirDataAnalysis/purpleair_data_analysis.png)
    • Panel 1: PM2.5 Distribution Histogram (corrected values)
    • Panel 2: Box Plot by AQI Category
    • Panel 3: Spatial Distribution (Latitude vs. Longitude colored by PM2.5)
    • Panel 4: Summary Statistics Table

These plots provide comprehensive visual summaries of the data distributions, temporal patterns, and spatial characteristics, complementing the numerical statistics presented above.


12. References and Methodology Notes

Correction Formula Source

EPA AQI Categories (PM2.5)

Census Data Notes