Long Term Climatologies Featuring nClimGrid-Daily Data

Author
Affiliation

Joshua Brinks

ISciences, LLC

Published

June 15, 2025

Overview

In this lesson, we will explore long-term climate patterns and detect climate change signals through innovative visualization techniques and time series analysis using operational climate monitoring data. You will learn to access and analyze NOAA’s nClimGrid-Daily dataset, create climate circles that reveal annual seasonal patterns, apply time series decomposition to separate climate signals across multiple timescales, and connect these analytical methods to real-world climate monitoring applications. This integrated approach demonstrates how operational climate datasets enable both scientific research and practical applications for understanding climate variability and change in rapidly evolving urban environments.

Programming Environment

This lesson uses the R programming environment and requires specialized packages for accessing cloud-based climate datasets and performing advanced time series analysis. The analysis leverages Apache Arrow for efficient cloud-native data access and modern R visualization packages for creating publication-quality climate graphics.

Learning Objectives

After completing this lesson, you should be able to:

  • Access and process operational climate data from NOAA’s cloud-native nClimGrid-Daily dataset, extracting daily meteorological observations and creating time series for climate analysis
  • Create climate circles using polar coordinate transformations to visualize annual patterns, seasonal cycles, and climate variability in an intuitive circular format that highlights yearly patterns
  • Apply STL decomposition to separate climate time series into trend, seasonal, and remainder components, enabling detection of long-term climate change signals and natural variability
  • Calculate climate normals using World Meteorological Organization standards to establish baseline reference periods and compute anomalies for climate change detection
  • Interpret complex time series patterns including seasonal cycles, long-term trends, and extreme events within the context of urban climate characteristics and regional climate change
  • Connect analytical methods to operational tools by understanding the methodology behind climate monitoring platforms and translating manual analysis techniques to automated operational systems
  • Apply professional climate analysis workflows including standardized temporal aggregation, quality control procedures, and statistical methods used by climate monitoring agencies
Important: Climate Data and Urban Planning Context

The San Antonio-New Braunfels metropolitan area has experienced dramatic population growth and urbanization over the past several decades, making it an ideal case study for understanding how urban development intersects with climate patterns. The region’s location in south-central Texas places it at the intersection of multiple climate influences, creating complex seasonal patterns that urban planners and water resource managers must understand for effective decision-making.

While this lesson provides valuable experience with climate analysis techniques and operational datasets, remember that the patterns we examine represent real climate conditions affecting over 2.5 million residents in the greater San Antonio region. Urban heat island effects, changing precipitation patterns, and extreme weather events have direct implications for public health, water resource management, and infrastructure planning across this rapidly growing metropolitan area.

When presenting your analysis or discussing these methods with others, approach the results with appropriate consideration for the communities and ecosystems affected by climate variability and change. These analytical skills provide powerful tools for supporting climate adaptation and resilience planning efforts.

Introduction

Climate change is altering the American Southwest at an unprecedented pace, and Texas is at the forefront of rapidly evolving atmospheric patterns that intersect with urban growth to create new challenges for water resources, agriculture, and public health. Across the southwest United States, temperature increases from 1902-2017 have reached 0.6 to 3.1°C per century, with minimum temperatures rising even faster at rates of 0.1 to 8.0°C per century (Djaman et al. 2020). The most recent generation of climate models projects even more dramatic changes ahead, with temperatures potentially increasing by up to 6°C by 2100 under high emission scenarios, particularly across the northern portions of the region (Almazroui et al. 2021).

For rapidly growing metropolitan areas like San Antonio-New Braunfels, these climatic shifts present several challenges. The region’s location near the 100th meridian—the historic transition zone between humid and arid climates—makes it especially vulnerable to shifts in precipitation patterns and temperature extremes (Nielsen-Gammon et al. 2020). Urban areas amplify these effects through heat island formation, with recent studies in Houston documenting urban-rural temperature differences exceeding 8°C during extreme heat events (Statkewicz, Talbot, and Rappenglueck 2021). Meanwhile, population projections indicate that Texas will grow from 29.5 million residents in 2020 to 51 million by 2070, placing unprecedented demands on water resources and infrastructure systems that were designed using historical climate assumptions (Nielsen-Gammon et al. 2020).

The precipitation story across Texas reveals an equally complex picture of intensifying extremes and shifting patterns. While some areas have experienced long-term increases in annual precipitation averaging 8.5% per century, these increases come mostly through more intense rainfall events rather than gentle, sustained precipitation (Nielsen-Gammon et al. 2020). Houston’s precipitation analysis from 1989-2018 demonstrates this pattern clearly: extreme rainfall events (exceeding 100mm per day) have increased significantly while moderate rainfall has decreased, altering the region’s hydrological cycle (Statkewicz, Talbot, and Rappenglueck 2021). Similar patterns emerge in climate projections for the Clear Creek watershed, where models consistently project increased precipitation intensity and variability even as dry periods lengthen significantly (Li et al. 2019).

Rapid changes in both climate means and extremes have broad implications for agricultural systems across the region. Climate impact studies in the Texas High Plains demonstrate that all major crops—including irrigated grain corn, grain sorghum, and winter wheat—face significant reductions in yields and irrigation efficiency under projected climate conditions (Chen et al. 2021). By the end of the 21st century under severe emission scenarios, models project 63% reductions in irrigation water availability for corn and 80% for sorghum, forcing fundamental changes in agricultural practices and crop selection (Chen et al. 2021). The speed of these changes creates difficult challenges for agricultural communities and water resource managers who must adapt to conditions unlike anything in the historical record.

Perhaps most concerning is the growing recognition that future droughts in Texas will be “unlike past droughts” in fundamental ways (Nielsen-Gammon et al. 2020). Paleoclimate reconstructions reveal that the region has experienced severe megadroughts lasting decades during the past millennium, but climate projections suggest that conditions during the latter half of the 21st century will be drier than even the most arid centuries of the last 1,000 years. This represents a shift from stationary to non-stationary climate conditions that challenges the foundation of how water resource planning and climate risk assessment have traditionally been conducted.

Climate in the News

San Antonio Among Cities Most Affected by Urban Heat Islands, New Analysis Shows

A comprehensive new analysis by Climate Central reveals that San Antonio ranks among the U.S. cities most severely impacted by urban heat islands, with 88% of residents—over 1.2 million people—experiencing at least 8°F of additional heat due to the urban built environment. The study, which analyzed urban heat patterns across 65 major U.S. cities representing 50 million Americans, found that San Antonio’s urban development creates dangerous temperature amplification beyond the regional climate warming we explored in this lesson.

The analysis shows that when rural areas near San Antonio reach 96°F, neighborhoods within the city can experience temperatures exceeding 104°F due to heat-absorbing infrastructure like concrete, asphalt, and buildings with low vegetation cover. This urban heat island effect operates independently of—and compounds—the long-term climate warming trends we detected through our STL decomposition analysis of San Antonio’s temperature record.

Particularly concerning is the connection between urban heat exposure and social vulnerability. The study confirms that historically redlined neighborhoods, which were systematically denied investment due to racially discriminatory housing policies, now experience significantly higher temperatures than affluent areas within the same cities. This finding demonstrates how the climate monitoring and analysis techniques we’ve learned in this lesson become essential tools for understanding environmental justice issues and supporting equitable climate adaptation planning.

The research underscores why the analytical methods we’ve explored—from accessing operational climate datasets to calculating temperature trends and creating climate circles—are increasingly vital for urban planning, public health preparedness, and climate resilience strategies in rapidly growing metropolitan areas like San Antonio.

Source: Climate Central, Urban Heat Hot Spots Analysis, 2024; Photo: Dustin Phillips, Flickr

The Critical Need for Operational Climate Analysis

Understanding these rapid changes requires sophisticated analytical approaches that can separate long-term climate trends from natural variability, detect shifts in seasonal patterns, and identify early warning signals of changing climate regimes. Traditional climate analysis methods often struggle with the complex temporal patterns that characterize modern climate change—gradual warming trends superimposed on multi-decadal oscillations, shifting seasonal cycles, and intensifying extreme events that occur within broader patterns of change.

This complexity has driven the development of advanced time series analysis techniques and innovative visualization methods that can reveal the multiple timescales at which climate operates. Climate circles provide an intuitive way to visualize annual patterns and detect shifts in seasonal timing or intensity that might be missed in traditional time series plots. Time series decomposition methods like STL (Seasonal and Trend decomposition using Loess) enable researchers to separate the overlapping signals of long-term trends, seasonal cycles, and interannual variability that all contribute to observed climate patterns.

The San Antonio-New Braunfels metropolitan area provides an ideal case study for demonstrating these techniques because it exemplifies the intersection of rapid urban growth with changing climate conditions. Located at the transition between humid and arid climates, the region experiences complex seasonal patterns influenced by Gulf of Mexico moisture, continental air masses, and topographic effects from the nearby Balcones Escarpment (Zhao, Gao, and Cuo 2016). The area’s dramatic population growth and ongoing urbanization create additional complexity through urban heat island effects and changing land surface properties that interact with regional climate patterns.

From Research Methods to Operational Applications

The analytical methods demonstrated in this lesson form the foundation of operational climate monitoring systems used by agencies like NOAA, NCEI, and NIDIS: climate circles serve as standard tools in drought monitoring, while time series decomposition methods enable detection of climate change signals and natural variability. However, the gap between sophisticated research methods and practical applications remains substantial. Many techniques require specialized expertise and resources not readily available to the communities who most need climate information, which has driven the development of user-friendly platforms that democratize access while maintaining scientific rigor. By learning these methodological foundations, you develop both the technical skills for advanced climate research and the conceptual understanding needed to use operational platforms effectively, preparing you to contribute to climate services: the translation of climate science into actionable information for adaptation and resilience planning in our rapidly changing world.

Understanding nClimGrid-Daily - From Gridded Data to Operational Products

Understanding modern climate patterns requires bridging the gap between data collected at scattered weather stations and national gridded datasets depicting conditions everywhere, not just at weather stations. Weather stations provide precise, quality-controlled observations, but they represent point measurements in a continuous climate field that varies with elevation, distance from water bodies, urban development, and countless other factors. A farmer managing crops 50 kilometers from the nearest weather station needs climate information specific to their location, not a distant proxy. Urban planners designing heat mitigation strategies need to understand how temperature patterns vary across neighborhoods with different development densities and vegetation cover.

NOAA’s nClimGrid-Daily dataset addresses this challenge by converting point-based weather station observations into spatially complete climate grids that provide daily temperature and precipitation estimates for every 5-kilometer square across the continental United States. Covering 1951 to the present, the dataset has become a key climate product for operational monitoring systems used by NOAA, NCEI, and NIDIS, supporting applications from drought monitoring and agricultural planning to climate change research and urban adaptation strategies.

From Stations to Grids: The Challenge of Spatial Climate Mapping

Creating spatially complete climate datasets from scattered station observations is one of the most complex challenges in applied climatology. The United States’ roughly 10,000 weather stations are distributed unevenly across the landscape, with dense coverage in populated areas but sparse coverage in mountains, deserts, and remote regions where climate gradients can be steep and complex. Simple distance-weighted interpolation often fails to capture this complexity: near San Antonio’s Balcones Escarpment, for example, elevation changes of 200-300 meters create significant temperature gradients that stations 50 kilometers apart might miss entirely, overlooking the transitional climate zones where metropolitan development occurs.

nClimGrid-Daily employs climatologically aided interpolation (CAI) methods that incorporate digital elevation models, coastal proximity datasets, and long-term climate patterns to capture both broad-scale atmospheric circulation patterns and local-scale geographic variations. The system also addresses temporal challenges through comprehensive bias correction and homogenization procedures that detect station failures, relocations, or observing practice changes, ensuring that observed climate changes reflect actual atmospheric processes rather than artifacts of the observing network.

Data Review

NOAA nClimGrid-Daily Technical Specifications:

  • Spatial Resolution: 5 km grid spacing (0.0417 degrees)
  • Temporal Resolution: Daily observations
  • Coverage: Continental United States (CONUS)
  • Data Source: NOAA/National Centers for Environmental Information (NCEI)
  • Historical Archive: January 1, 1951 to near real-time
  • File Format: NetCDF format on AWS Open Data
  • Coordinate System: Geographic latitude/longitude (WGS84)
  • Grid Dimensions: 1,385 x 596 cells covering CONUS
  • Update Frequency: Daily operational system (2-3 day lag)
  • Quality Control: Multi-stage validation and bias correction

Primary Variables:

  • Maximum temperature (tmax) - degrees Celsius
  • Minimum temperature (tmin) - degrees Celsius
  • Average temperature (tavg) - degrees Celsius
  • Precipitation (prcp) - millimeters

Gridding Methodology: Uses climatologically aided interpolation (CAI) combining weather station observations with digital elevation models, proximity datasets, and long-term climate patterns. Includes comprehensive quality control and bias correction procedures.

Access: Freely available through AWS Open Data program without authentication requirements. Data organized as one NetCDF file per month following the ncdd-YYYY-MM-grd-scaled.nc naming convention.

Citation: NOAA/NCEI. Daily Gridded Climate Data (nClimGrid-Daily). Available at: https://www.ncei.noaa.gov/data/nclimgrid-daily/

Dataset Specifications and Technical Architecture

nClimGrid-Daily provides four primary climate variables at 5-kilometer spatial resolution:

  • Maximum temperature (tmax): Daily maximum air temperature in degrees Celsius
  • Minimum temperature (tmin): Daily minimum air temperature in degrees Celsius
  • Mean temperature (tavg): Daily average air temperature in degrees Celsius
  • Precipitation (prcp): Daily total precipitation in millimeters

The gridded dataset is distributed on a regular latitude/longitude (WGS84) grid covering the continental United States, with a spatial resolution of approximately 0.0417 degrees (roughly 4.6 kilometers at mid-latitudes). This resolution strikes a balance between capturing local climate features and maintaining computational efficiency for large-scale analysis. The temporal coverage spans from January 1, 1951, to near real-time (typically within 2-3 days of present), providing over 70 years of consistent daily climate observations.

While all four climate variables are available and the analytical methods demonstrated in this lesson apply equally to temperature and precipitation data, we will focus our analysis exclusively on maximum temperature (tmax) patterns. This focus enables deeper exploration of climate change signals in urban environments, where maximum temperature trends are particularly relevant for understanding heat island effects and extreme heat exposure. The same climate circle and STL decomposition techniques work effectively for precipitation analysis, though precipitation data often benefits from log transformation (lambda = 0) in STL decomposition due to its highly skewed distribution.
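To make the precipitation caveat concrete, the sketch below shows where the lambda argument fits in an MSTL call. It uses a simulated, strictly positive daily series rather than nClimGrid data, and the object names (sim_prcp, prcp_ts) are illustrative only.

# Minimal sketch: log transform via Box-Cox lambda = 0 inside MSTL.
# Simulated data only; real daily precipitation contains zeros, so in practice
# you would add a small offset or aggregate to monthly totals first.
library(forecast)

set.seed(42)
doy <- rep(1:365, times = 8)  # eight synthetic "years"
sim_prcp <- rlnorm(length(doy), meanlog = 0.5 + 0.3 * sin(2 * pi * doy / 365), sdlog = 1)

prcp_ts <- msts(sim_prcp, seasonal.periods = 365.25)
prcp_decomp <- mstl(prcp_ts, lambda = 0)  # lambda = 0 applies a log transform
plot(prcp_decomp)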

Knowledge Check

NOAA’s nClimGrid-Daily dataset addresses which fundamental challenge in climate analysis?

  a) Converting daily weather observations to hourly data
  b) Bridging the gap between scattered weather station observations and the need for spatially complete climate information
  c) Reducing the file size of climate datasets for faster downloads
  d) Converting temperature measurements from Fahrenheit to Celsius

Answer: b) Bridging the gap between scattered weather station observations and the need for spatially complete climate information

Explanation: nClimGrid-Daily transforms point-based weather station observations into spatially complete climate grids that provide daily estimates for every 5-kilometer square across the continental United States, addressing the fundamental gap between where we measure climate (scattered stations) and where we need climate information (every location across the landscape).

Required R Packages

This lesson uses the following R packages for accessing and analyzing long-term climate patterns:

  • arrow: Cloud-native data access for querying partitioned parquet datastores hosted on AWS S3 public buckets
  • dplyr: Essential data manipulation functions including filtering, grouping, and summarizing operations for climate data processing
  • ggplot2: Advanced data visualization and plotting for creating climate circles, time series plots, and publication-quality graphics
  • lubridate: Date and time manipulation for handling temporal climate data, extracting day-of-year values, and creating time series objects
  • terra: Modern raster data processing and analysis for handling NOAA’s nClimGrid-Daily NetCDF files and performing spatial operations
  • sf: Simple Features for spatial data handling, coordinate system transformations, and working with Census Bureau geographic boundaries
  • exactextractr: Precise area-weighted extraction of raster statistics using polygon boundaries for converting gridded climate data to boundary-based summaries
  • tidycensus: Access US Census Bureau data including Metropolitan Statistical Area boundaries and demographic information for defining study regions
  • forecast: Time series analysis and forecasting including MSTL decomposition for separating climate signals across multiple timescales
  • plotly: Interactive data visualization for creating interactive climate circles with hover tooltips and dynamic exploration capabilities
  • data.table: High-performance data manipulation for efficient processing of large climate datasets and lookup table operations
  • dotenv: Secure management of API keys and credentials through environment variables for accessing Census and climate data services
  • tidyr: Data reshaping for converting between wide and long formats, essential for STL decomposition visualization and multi-panel plotting
  • nclimgrid: Specialized package for accessing pre-compiled climate boundary summaries and lookup tables (installation from GitLab)
Installation Notes

Most packages are available on CRAN and can be installed using install.packages(). Key exceptions:

nclimgrid package: Install from GitLab using:

# devtools::install_gitlab("isciences/nclimgrid")

API Requirements:

  • Census API key: Required for tidycensus - obtain free from https://api.census.gov/data/key_signup.html
  • No authentication needed: NOAA nClimGrid-Daily data access through AWS Open Data program

Cloud Data Access: The lesson uses Sys.setenv(AWS_NO_SIGN_REQUEST = "YES") for accessing public NOAA datasets without AWS credentials.
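For convenience, the CRAN installations can be combined into a single call; the sketch below simply lists the packages named above and repeats the GitLab step for completeness.

# One-time setup: install the CRAN packages used in this lesson
install.packages(c(
  "arrow", "dplyr", "ggplot2", "lubridate", "terra", "sf",
  "exactextractr", "tidycensus", "forecast", "plotly",
  "data.table", "dotenv", "tidyr"
))

# The nclimgrid package is installed from GitLab (requires devtools or remotes)
# devtools::install_gitlab("isciences/nclimgrid")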

Cloud-Native Data Access

NOAA hosts nClimGrid-Daily through the AWS Open Data program, making the complete dataset freely accessible without authentication requirements. The data is organized in NetCDF format with monthly files containing all days for each month, following the naming convention ncdd-YYYY-MM-grd-scaled.nc. This cloud-native architecture enables efficient access to specific time periods without downloading massive datasets, supporting both operational monitoring applications and research workflows.

Single Month Retrieval Demonstration

Let’s start by examining the nClimGrid-Daily data structure, using San Antonio’s hottest recorded day: September 5, 2000.

Even though nClimGrid-Daily is openly accessible on Amazon’s S3 cloud storage, we still need to enable anonymous (no-sign-request) access, either manually or with an .env file.

# Set AWS configuration for public bucket access (required for NOAA data)
Sys.setenv(AWS_NO_SIGN_REQUEST = "YES")

# Load environmental variables if you have them
# Discussed in more detail later
dotenv::load_dot_env()

Accessing the Data

Accessing cloud-hosted geospatial data presents several technical challenges that vary significantly across platforms and system configurations. The ideal approach uses GDAL’s Virtual File System (VSI) capabilities, which allow direct streaming access to cloud-stored files without downloading them locally. On Linux systems with properly configured GDAL installations, paths like /vsis3/bucket-name/file.nc enable seamless access to S3-hosted datasets as if they were local files.

However, this approach requires specific GDAL drivers and network configurations that can be problematic on Windows systems. The NetCDF driver needs to be compiled with specific options, HDF5 libraries must be properly configured, and network security settings can interfere with cloud streaming. Additionally, different versions of GDAL, terra, and underlying system libraries can create compatibility issues that are difficult to diagnose and resolve.

Alternative approaches include using HTTP URLs directly (e.g., https://bucket.s3.amazonaws.com/file.nc), which works across platforms but can be slower and less reliable for large files. A direct HDF5/NetCDF driver approach can also work well for operational applications but requires careful handling of subdatasets and manual application of the spatial reference.

For broad usability in this educational context, we’ll download the nClimGrid-Daily file locally and then read it using standard file access methods. This approach ensures the lesson works consistently across all platforms and system configurations, avoiding the technical complexity of cloud streaming that can vary significantly between different computing environments. We’ll test out the dataset by looking at the hottest day recorded for San Antonio (September 5, 2000).

# Define the target date (also used as the peak flash drought date in other lessons)
sa_hottest_day <- lubridate::as_date("2000-09-05")  # San Antonio's hottest recorded day
year_month <- format(sa_hottest_day, "%Y%m")
year <- format(sa_hottest_day, "%Y")

# Create a local filename
local_filename <- paste0("ncdd-", year_month, "-grd-scaled.nc")
# Now the S3 url
aws_url <- file.path(
  "https://noaa-nclimgrid-daily-pds.s3.amazonaws.com/v1-0-0/grids",
  year,
  paste0("ncdd-", year_month, "-grd-scaled.nc"))

# Download the file locally for reliable access
download.file(aws_url, destfile = local_filename, mode = "wb")

This single file contains the complete September 2000 nClimGrid-Daily dataset: all four climate variables (tmax, tmin, tavg, prcp) for each day of the month, demonstrating NOAA’s efficient monthly packaging approach for operational climate data distribution.

# Load the downloaded file using standard terra methods
monthly_data <- terra::rast(local_filename)

# First, let's examine the structure of the monthly file
monthly_data
class       : SpatRaster 
dimensions  : 596, 1385, 120  (nrow, ncol, nlyr)
resolution  : 0.04166666, 0.04166667  (x, y)
extent      : -124.7083, -67, 24.54167, 49.375  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (CRS84) (OGC:CRS84) 
sources     : ncdd-200009-grd-scaled.nc:tmax  (30 layers) 
              ncdd-200009-grd-scaled.nc:tmin  (30 layers) 
              ncdd-200009-grd-scaled.nc:prcp  (30 layers) 
              ncdd-200009-grd-scaled.nc:tavg  (30 layers) 
varnames    : tmax (Temperature, daily maximum) 
              tmin (Temperature, daily minimum) 
              prcp (Precipitation, daily total) 
              ...
names       :         tmax_1,         tmax_2,         tmax_3,         tmax_4,         tmax_5,         tmax_6, ... 
unit        : degree_Celsius, degree_Celsius, degree_Celsius, degree_Celsius, degree_Celsius, degree_Celsius, ... 
time (days) : 2000-09-01 to 2000-09-30 

The SpatRaster object reveals the comprehensive structure of nClimGrid-Daily: 596 × 1,385 grid cells covering the continental United States at the nominal 5 km (0.0417°) grid spacing, with 120 total layers representing four variables across 30 days. The coordinate reference system confirms WGS84 geographic coordinates, and the temporal metadata shows proper date attribution from September 1-30, 2000.

# Display the layer naming structure
names(monthly_data)[1:10]  # Show first 10 layer names to understand structure
 [1] "tmax_1"  "tmax_2"  "tmax_3"  "tmax_4"  "tmax_5"  "tmax_6"  "tmax_7" 
 [8] "tmax_8"  "tmax_9"  "tmax_10"
# The monthly file contains 120 layers for September (30 days × 4 variables)
# Layer organization: 
# - Layers 1-30: tmax (days 1-30)
# - Layers 31-60: tmin (days 1-30)  
# - Layers 61-90: prcp (days 1-30)
# - Layers 91-120: tavg (days 1-30)

The layer naming convention uses a simple “variable_day” format where tmax_1 represents maximum temperature for September 1st, tmax_2 for September 2nd, and so forth through the month. This organization is a predictable indexing method where September 5th maximum temperature is always layer 5, making programmatic access straightforward for operational climate monitoring systems.
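Because the naming is predictable, layers can be pulled either by constructed name or by numeric index. A quick sketch using the monthly_data object from above (target_day is an illustrative variable name):

# Select September 5th maximum temperature in two equivalent ways
target_day <- 5
tmax_by_name  <- monthly_data[[paste0("tmax_", target_day)]]  # by layer name ("tmax_5")
tmax_by_index <- monthly_data[[target_day]]                   # by position; tmax occupies layers 1-30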

# Check the coordinate reference system and spatial extent
terra::crs(monthly_data)
[1] "GEOGCRS[\"WGS 84 (CRS84)\",\n    DATUM[\"World Geodetic System 1984\",\n        ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n            LENGTHUNIT[\"metre\",1]]],\n    PRIMEM[\"Greenwich\",0,\n        ANGLEUNIT[\"degree\",0.0174532925199433]],\n    CS[ellipsoidal,2],\n        AXIS[\"geodetic longitude (Lon)\",east,\n            ORDER[1],\n            ANGLEUNIT[\"degree\",0.0174532925199433]],\n        AXIS[\"geodetic latitude (Lat)\",north,\n            ORDER[2],\n            ANGLEUNIT[\"degree\",0.0174532925199433]],\n    USAGE[\n        SCOPE[\"unknown\"],\n        AREA[\"World\"],\n        BBOX[-90,-180,90,180]],\n    ID[\"OGC\",\"CRS84\"]]"
terra::ext(monthly_data)
SpatExtent : -124.708333332415, -67.0000025440503, 24.5416666655981, 49.3750012726343 (xmin, xmax, ymin, ymax)

The coordinate reference system confirms WGS84 geographic coordinates (longitude/latitude in decimal degrees), which is the standard for global climate datasets. The spatial extent spans from approximately 125°W to 67°W longitude and 24.5°N to 49.4°N latitude, covering the continental United States with precise boundaries that align with NOAA’s operational climate monitoring grid.

# Now extract the specific day (September 5, 2000 = layer 5 of tmax)
sep_05_tmax_layer <- 5  # September 5th tmax is the 5th layer
single_day <- monthly_data[[sep_05_tmax_layer]]

# Examine the extracted day's data structure
single_day
class       : SpatRaster 
dimensions  : 596, 1385, 1  (nrow, ncol, nlyr)
resolution  : 0.04166666, 0.04166667  (x, y)
extent      : -124.7083, -67, 24.54167, 49.375  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (CRS84) (OGC:CRS84) 
source      : ncdd-200009-grd-scaled.nc:tmax 
varname     : tmax (Temperature, daily maximum) 
name        :         tmax_5 
unit        : degree_Celsius 
time (days) : 2000-09-05 

The extracted single day maintains the same spatial structure (596 × 1,385 grid cells) but now contains only one layer representing maximum temperature for September 5, 2000. The temporal metadata correctly identifies this record-setting date, and the layer name tmax_5 confirms we’ve successfully isolated the specific day and variable needed for our analysis of extreme temperature conditions during this critical period.

# Create a quick visualization of the complete CONUS extent
# Using your established nClimGrid temperature color palette
temp_colors <- c("#2c7bb6", "#abd9e9", "#ffffbf", "#fdae61", "#d7191c")

# Plot the full continental United States temperature data
terra::plot(single_day, 
           main = "Maximum Temperature - September 5, 2000",
           col = temp_colors)

Yikes! Pretty, pretty hot. Texas and Louisiana are on absolute fire in this snapshot.

Spatial Extraction for San Antonio MSA

To demonstrate how gridded climate data gets transformed into operational products, we’ll extract climate data for the San Antonio-New Braunfels Metropolitan Statistical Area (MSA). This process mirrors how operational climate monitoring systems aggregate gridded data to geographic boundaries relevant for decision-making.

Census API Configuration

Before accessing MSA boundaries from the U.S. Census Bureau, we will configure API access. The tidycensus package doesn’t strictly require a Census API key, but using one gives you much higher request limits and reduces the potential for errors and warnings. It’s also generally useful to have a Census key; it works across a variety of data services offered by the U.S. Census Bureau.

# Census API key configuration
# Option 1: Load from .env file (recommended for security)
# CENSUS_API_KEY=your_api_key_here

# Option 2: Set directly in R session (not recommended for shared code)
# Sys.setenv(CENSUS_API_KEY = "your_api_key_here")

# The API key should be loaded from your .env file
# You can verify it's loaded (without exposing the value)
cat("Census API key loaded:", !is.na(Sys.getenv("CENSUS_API_KEY", unset = NA)))
Census API key loaded: TRUE

Obtaining a Census API Key:

  1. Visit https://api.census.gov/data/key_signup.html
  2. Complete the simple registration form with your organization information
  3. Check your email for the API key (usually arrives within minutes)
  4. Store the key securely in your .env file as CENSUS_API_KEY=your_key_here

MSA Boundary Acquisition

Now we’ll obtain the proper MSA boundary from the U.S. Census Bureau using standardized geographic definitions:

# Get San Antonio-New Braunfels MSA boundary from Census Bureau
san_antonio_msa <- tidycensus::get_acs(
  geography = "metropolitan statistical area/micropolitan statistical area",
  variables = "B01001_001",  # Total population (just to get geometry)
  year = 2020,
  geometry = TRUE
) |>
  dplyr::filter(grepl("San Antonio", NAME)) |>
  dplyr::select(msa_name = NAME, geometry)

# Examine the MSA information
san_antonio_msa

This approach uses the official Census Bureau Metropolitan Statistical Area definitions, ensuring our analysis aligns with how federal agencies and urban planners define metropolitan regions. The San Antonio-New Braunfels MSA encompasses multiple counties and represents the functionally integrated urban area around San Antonio.

Spatial Cropping and Projection Alignment

Now we’ll crop the climate data to our MSA boundary. Many functions will reproject data on the fly, but it’s good practice to transform everything to a common projection yourself so you control which object serves as the reference.

# Transform MSA boundary to match nClimGrid projection
san_antonio_msa_proj <- sf::st_transform(san_antonio_msa, terra::crs(single_day))

Now we’ll create an extent with a little buffer around San Antonio to use for cropping. This way any visuals or maps we produce won’t butt right up against the borders of San Antonio proper.

# Create bounding box with buffer for cropping
msa_bbox <- sf::st_bbox(san_antonio_msa_proj)
msa_extent <- terra::ext(msa_bbox[1] - 0.2, msa_bbox[3] + 0.2,  # Add ~20km buffer
                        msa_bbox[2] - 0.2, msa_bbox[4] + 0.2)

# Crop the daily data to our study region  
san_antonio_daily <- terra::crop(single_day, msa_extent)

We can create a map of the San Antonio metropolitan area during the peak flash drought to visualize the results.

# Create nClimGrid-style color palette for temperature
temp_colors <- c("#2c7bb6", "#abd9e9", "#ffffbf", "#fdae61", "#d7191c")

# Visualization with proper color scaling for extreme heat context
terra::plot(san_antonio_daily, 
           main = "Max Temp - San Antonio MSA (September 5, 2000)",
           col = temp_colors)

# Add MSA boundary for geographic context
plot(sf::st_geometry(san_antonio_msa_proj), add = TRUE, border = "black", lwd = 2)

This regional view reveals the intense temperature conditions across the San Antonio metropolitan area on its hottest recorded day. Temperature values exceeded 42°C (108°F) across much of the urban area, demonstrating the extreme heat stress that drives rapid soil moisture depletion and agricultural impacts throughout the region.

We can extract the minimum and maximum of the daily maximum temperature in the San Antonio area by grabbing all the cell values with the terra::values function.

# Get data range for color scaling
data_values <- terra::values(san_antonio_daily, na.rm = TRUE)
data_range <- range(data_values, na.rm = TRUE)

# Display the regional temperature statistics
cat("San Antonio region temperature on September 5, 2000:\n",
    "Minimum:", round(data_range[1], 1), "°C\n",
    "Maximum:", round(data_range[2], 1), "°C\n")
San Antonio region temperature on September 5, 2000:
 Minimum: 39.4 °C
 Maximum: 43.8 °C

However, this only tells us the values of the single coldest and hottest cells, and those cells come from the entire cropped extent rather than being restricted to values inside the MSA boundary. A more precise approach is to use a zonal statistics tool like exactextractr. Zonal statistics tools apply a function to a raster layer using only the cells within a boundary. You could apply any function, but common ones include the mean, minimum, and maximum of all values in a boundary. As the previous chunk shows, these calculations are not too difficult at small scales, but imagine we had thousands of census metropolitan areas across hundreds of days; tools like exactextractr let you scale your analysis quickly and accurately.

Let’s do a simple example of the mean.

sa_mean <-
  exactextractr::exact_extract(san_antonio_daily, san_antonio_msa_proj, 'mean')

sa_mean
[1] 42.38875

This produces a very simple output, since there is only one boundary, one variable, and one day in the raster. Using a monthly file with all four variables illustrates the power and speed of exact_extract.

sa_mean <-
  exactextractr::exact_extract(monthly_data, 
                               san_antonio_msa_proj, 
                               'mean', 
                               force_df = TRUE, 
                               append_cols = "msa_name",
                               stack_apply = TRUE)

names(sa_mean)
  [1] "msa_name"     "mean.tmax_1"  "mean.tmax_2"  "mean.tmax_3"  "mean.tmax_4" 
  [6] "mean.tmax_5"  "mean.tmax_6"  "mean.tmax_7"  "mean.tmax_8"  "mean.tmax_9" 
 [11] "mean.tmax_10" "mean.tmax_11" "mean.tmax_12" "mean.tmax_13" "mean.tmax_14"
 [16] "mean.tmax_15" "mean.tmax_16" "mean.tmax_17" "mean.tmax_18" "mean.tmax_19"
 [21] "mean.tmax_20" "mean.tmax_21" "mean.tmax_22" "mean.tmax_23" "mean.tmax_24"
 [26] "mean.tmax_25" "mean.tmax_26" "mean.tmax_27" "mean.tmax_28" "mean.tmax_29"
 [31] "mean.tmax_30" "mean.tmin_1"  "mean.tmin_2"  "mean.tmin_3"  "mean.tmin_4" 
 [36] "mean.tmin_5"  "mean.tmin_6"  "mean.tmin_7"  "mean.tmin_8"  "mean.tmin_9" 
 [41] "mean.tmin_10" "mean.tmin_11" "mean.tmin_12" "mean.tmin_13" "mean.tmin_14"
 [46] "mean.tmin_15" "mean.tmin_16" "mean.tmin_17" "mean.tmin_18" "mean.tmin_19"
 [51] "mean.tmin_20" "mean.tmin_21" "mean.tmin_22" "mean.tmin_23" "mean.tmin_24"
 [56] "mean.tmin_25" "mean.tmin_26" "mean.tmin_27" "mean.tmin_28" "mean.tmin_29"
 [61] "mean.tmin_30" "mean.prcp_1"  "mean.prcp_2"  "mean.prcp_3"  "mean.prcp_4" 
 [66] "mean.prcp_5"  "mean.prcp_6"  "mean.prcp_7"  "mean.prcp_8"  "mean.prcp_9" 
 [71] "mean.prcp_10" "mean.prcp_11" "mean.prcp_12" "mean.prcp_13" "mean.prcp_14"
 [76] "mean.prcp_15" "mean.prcp_16" "mean.prcp_17" "mean.prcp_18" "mean.prcp_19"
 [81] "mean.prcp_20" "mean.prcp_21" "mean.prcp_22" "mean.prcp_23" "mean.prcp_24"
 [86] "mean.prcp_25" "mean.prcp_26" "mean.prcp_27" "mean.prcp_28" "mean.prcp_29"
 [91] "mean.prcp_30" "mean.tavg_1"  "mean.tavg_2"  "mean.tavg_3"  "mean.tavg_4" 
 [96] "mean.tavg_5"  "mean.tavg_6"  "mean.tavg_7"  "mean.tavg_8"  "mean.tavg_9" 
[101] "mean.tavg_10" "mean.tavg_11" "mean.tavg_12" "mean.tavg_13" "mean.tavg_14"
[106] "mean.tavg_15" "mean.tavg_16" "mean.tavg_17" "mean.tavg_18" "mean.tavg_19"
[111] "mean.tavg_20" "mean.tavg_21" "mean.tavg_22" "mean.tavg_23" "mean.tavg_24"
[116] "mean.tavg_25" "mean.tavg_26" "mean.tavg_27" "mean.tavg_28" "mean.tavg_29"
[121] "mean.tavg_30"

Inspecting the output names, we can see that exactextractr has quickly produced mean values for every day and every variable in the month, with columns named FUNCTION.VARIABLE_DAY_OF_MONTH (for example, mean.tmax_5).
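For plotting or joining with other tables, this wide output is usually reshaped into a long format. A sketch using the sa_mean object above; the column-name parsing and the hard-coded September 2000 dates are specific to this example.

# Reshape the wide exact_extract output into a tidy long table
sa_mean_long <- sa_mean |>
  tidyr::pivot_longer(
    cols = -msa_name,
    names_to = c("stat", "climate_var", "day_of_month"),
    names_pattern = "([a-z]+)\\.([a-z]+)_([0-9]+)",
    values_to = "value"
  ) |>
  dplyr::mutate(
    day_of_month = as.integer(day_of_month),
    date = as.Date(paste0("2000-09-", sprintf("%02d", day_of_month)))  # example month only
  )

head(sa_mean_long)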

Live Data Access - Pre-compiled Climate Datastore

The single-month demonstration above illustrates the fundamental workflow for accessing NOAA’s nClimGrid-Daily data and extracting climate information for specific geographic boundaries. However, operational climate monitoring and long-term climate analysis require processing decades of daily data across multiple variables and geographic regions. Downloading and processing 70+ years of monthly files for every analysis would be computationally intensive and impractical for most applications.

Consider the analytical challenges we face: a comprehensive climate analysis for the San Antonio MSA from 1951-present requires processing over 26,000 daily observations across four climate variables, totaling more than 100,000 individual data points. Extending this analysis to multiple metropolitan areas or comparing urban heat island effects across different cities multiplies the data volume exponentially. Research workflows that manually download, process, and aggregate this data for each analysis quickly become unmanageable and prohibitively time-consuming.

The Pre-compiled Datastore Solution

To address these scalability challenges, ISciences has developed a pre-compiled climate datastore as part of the nclimgrid R package, funded in part by the Earth Science Information Partners (ESIP) Lab program. This datastore transforms the raw nClimGrid-Daily grids into boundary-based climate summaries for multiple geographic boundary types across the United States. The datastore performs the exactextractr workflow we demonstrated above at scale, computing daily climate statistics for:

  • States, counties, and census tracts - Multi-level administrative boundaries
  • Congressional districts - Political boundaries for policy analysis
  • Census Metropolitan Statistical Areas (MSAs) - Urban regions like San Antonio-New Braunfels
  • NIDIS Drought Early Warning System (DEWS) regions - Climate monitoring boundaries
  • Watershed boundaries (HUC6) - Major hydrological units

This comprehensive geographic coverage enables climate analysis across scales from local census tracts to regional watersheds, supporting diverse applications from urban planning to agricultural monitoring.

Understanding Partitioned Datastores and Unique Identifiers

Before accessing the datastore, it’s essential to understand why partitioned cloud datastores require unique identifier systems rather than using common place names directly. This design choice reflects fundamental requirements of distributed data architectures optimized for analytical performance.

The Partitioning Challenge: Modern cloud datastores use partitioning strategies to organize massive datasets into smaller, manageable chunks that can be queried efficiently. Each partition typically contains data for a specific geographic region, time period, or category. For this architecture to work effectively, every geographic entity must have a unique, stable identifier that can serve as a partition key.

Why Place Names Don’t Work: Common place names create several problems for partitioned systems:

  • Ambiguity: Multiple “Springfield” cities exist across different states
  • Changing Names: Political boundaries occasionally change names over time
  • Encoding Issues: Special characters and spaces complicate file naming and URL construction
  • Case Sensitivity: “San Antonio” vs “san antonio” could be treated as different entities
  • Standardization: Different data sources may use slightly different naming conventions

The Unique ID Solution: The datastore uses standardized entity IDs like ms_448 (Metropolitan Statistical Area #448) that provide:

  • Global Uniqueness: No two entities can share the same ID across the entire system
  • Stability: IDs remain constant even if place names or boundaries change
  • Efficiency: Simple alphanumeric strings optimize partitioning and query performance
  • Consistency: Identical IDs work across different datasets and time periods

Geographic Boundary Discovery Using the Lookup System

Rather than querying the massive datastore to discover available boundaries (which would be slow and inefficient), the nclimgrid package includes a comprehensive lookup system that provides instant access to all available geographic entities. Now let’s explore the boundary types available in the system:

# Display all available boundary types and their coverage
boundary_summary <- data.table::rbindlist(lapply(names(nclimgrid::boundary_lookup), function(bt) {
  lookup_table <- nclimgrid::boundary_lookup[[bt]]
  data.table::data.table(
    boundary_type = bt,
    total_entities = nrow(lookup_table),
    sample_entities = paste(head(lookup_table$name, 3), collapse = ", ")
  )
}))

boundary_summary

The boundary_lookup object contains pre-indexed data.table objects for each boundary type, enabling lightning-fast searches across nearly 100,000 geographic entities without any network calls or datastore queries. You can pick any of the types to check out the structure.

nclimgrid::boundary_lookup$wbd_huc6

Each USGS watershed row shows the entity_id for datastore lookups, the states the watershed overlaps, the name, and the boundary_type identifier; in this case wbd_huc6 (USGS Watershed Boundary Dataset HUC 6).

Let’s use the lookup system to find the San Antonio metropolitan area:

# Search for San Antonio in the Census MSA lookup table
san_antonio_lookup <- nclimgrid::boundary_lookup$census_msas[
  grepl("San Antonio", name, ignore.case = TRUE)
]

san_antonio_lookup

Perfect! The lookup confirms that San Antonio-New Braunfels MSA has entity ID ms_448.

nclimgrid also has cross-reference tables to help you find hierarchical relationships, such as which counties, watersheds, or congressional districts belong to which states, along with their unique identifiers in the datastore.

nclimgrid::state_county_lookup

Connecting to the Public Datastore

With our entity ID confirmed through the lookup system, we can now connect efficiently to the datastore, which is hosted in a public S3 cloud storage “bucket”.

# Connect to the public nClimGrid daily summaries datastore
ds <- arrow::open_dataset("s3://isci-nclimgrid/nclim-svi-daily/partitioned/", 
                         hive_style = TRUE)

# The datastore connection is now ready for efficient queries
# No data is downloaded until explicitly requested through collect()
ds
FileSystemDataset with 4456 Parquet files
6 columns
primary_id: string
date: string
value: double
weight_type: string
boundary_type: string
climate_var: string

See $metadata for additional Schema metadata

This shows some limited information, including the number of columns, their names, and their types. The Arrow dataset provides a lazy connection to the partitioned datastore. Now we can query it directly using our known entity ID (ms_448).

Efficient Data Retrieval for San Antonio

Arrow datasets support dplyr-style syntax for easy data pulls in R. This call will take a moment (it’s pulling data for every day from 1951-2024!).

# Query complete climate record for San Antonio MSA using confirmed entity ID
san_antonio_data <- ds |>
  dplyr::filter(
    primary_id == "ms_448",
    climate_var %in% c("tmax", "prcp"),  # Note: we retrieve both but focus on tmax
    weight_type == "standard",
    boundary_type == "census_msas"
  ) |>
  dplyr::collect()

# Examine the structure of the retrieved data
head(san_antonio_data)

For two climate variables, this is more than 54,000 records!

dim(san_antonio_data)
[1] 54058     6
range(san_antonio_data$date)
[1] "1951-01-01" "2024-12-31"

Although preliminary data is updated every 3 days in NOAA’s AWS S3 bucket, bias-corrected and scaled data is released monthly. The datastore is only updated with daily summaries once the processed, scaled data is released.

Exploratory Analysis - From Daily Data to Climate Patterns

With our San Antonio climate data, we can explore the structure, quality, and patterns within this 70+ year record. This exploratory phase serves multiple critical purposes: understanding data characteristics that will inform our climate circles and time series decomposition analysis, validating data quality and completeness, and demonstrating advanced capabilities of the datastore including Social Vulnerability Index (SVI) integration for climate justice applications.

Data Structure and Temporal Coverage Assessment

Everything looks correct on the surface, but we can run some quick checks to verify everything is in order.

# Examine data structure and temporal coverage
data_structure <- san_antonio_data |>
  dplyr::mutate(date = as.Date(date)) |>
  dplyr::summarise(
    total_records = dplyr::n(),
    variables = paste(unique(climate_var), collapse = ", "),
    date_range = paste(min(date), "to", max(date)),
    total_years = as.numeric(difftime(max(date), min(date), units = "days")) / 365.25,
    records_per_variable = dplyr::n() / length(unique(climate_var)),
    .groups = "drop"
  )

data_structure

We have an equal number of records per variable, 74 years of data, and min/max dates as you would expect. We can also check for missingness.

# Comprehensive data quality assessment
quality_assessment <- san_antonio_data |>
  dplyr::mutate(date = as.Date(date)) |>
  dplyr::group_by(climate_var) |>
  dplyr::summarise(
    total_observations = dplyr::n(),
    missing_values = sum(is.na(value)),
    completeness_percent = round(100 * (1 - missing_values / total_observations), 2),
    min_value = min(value, na.rm = TRUE),
    max_value = max(value, na.rm = TRUE),
    mean_value = round(mean(value, na.rm = TRUE), 2),
    .groups = "drop"
  )

quality_assessment

This all looks good.

Annual Time Series Overview

Let’s create annual summaries to examine long-term patterns and trends:

# Calculate annual climate summaries
annual_climate <- san_antonio_data |>
  dplyr::mutate(
    date = as.Date(date),
    year = lubridate::year(date)
  ) |>
  dplyr::group_by(climate_var, year) |>
  dplyr::summarise(
    annual_mean = mean(value, na.rm = TRUE),
    annual_max = max(value, na.rm = TRUE),
    annual_min = min(value, na.rm = TRUE),
    .groups = "drop"
  ) |>
  # Focus on complete years only
  dplyr::filter(year >= 1951, year <= 2024)

# Display recent trends
recent_climate <- annual_climate |>
  dplyr::filter(year >= 2020) |>
  dplyr::arrange(climate_var, year)

recent_climate

These values look reasonable. Precipitation and maximum temperature have held steady over the past few years, although it looks like 2023 had a few cold days!

Long-term Climate Trend Visualization

We can perform a simple analysis to check trends with annual means.

# Create comprehensive time series visualization
climate_evolution_plot <- 
  ggplot2::ggplot(annual_climate, ggplot2::aes(x = year, y = annual_mean)) +
  ggplot2::geom_line(alpha = 0.7, color = "#1F77B4", size = 0.8) +
  ggplot2::geom_smooth(method = "lm", se = TRUE, color = "#A50026", size = 1.2) +
  ggplot2::facet_wrap(~ climate_var, scales = "free_y", 
                      labeller = ggplot2::labeller(climate_var = c(
                        "tmax" = "Maximum Temperature (°C)", 
                        "prcp" = "Daily Precipitation (mm)"))) +
  ggplot2::labs(
    title = "Annual Climate Means (Precip/TMAX)",
    subtitle = "For San Antonio-New Braunfels MSA (1951-2024)",
    x = "Year",
    y = "Climate Value",
    caption = "Source: NOAA nClimGrid-Daily via ISciences datastore"
  ) +
  ggplot2::theme_minimal() +
  ggplot2::theme(
    strip.text = ggplot2::element_text(size = 12, face = "bold"),
    panel.grid.minor.x = ggplot2::element_blank(),
    plot.title = ggplot2::element_text(size = 14, face = "bold")
  )

climate_evolution_plot

This simple analysis shows slight increases in precipitation and a more pronounced increase in maximum temperature.
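To attach a rough number to that visual impression, one option is a simple linear fit of annual mean maximum temperature against year (a quick sketch using the annual_climate table built above; the STL decomposition later in the lesson estimates trends more carefully):

# Linear trend in annual mean tmax, expressed per decade
tmax_annual <- dplyr::filter(annual_climate, climate_var == "tmax")
tmax_trend <- lm(annual_mean ~ year, data = tmax_annual)

cat("Estimated tmax trend:",
    round(coef(tmax_trend)[["year"]] * 10, 2), "°C per decade\n")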

Social Vulnerability Index (SVI) Integration

The climate analysis we’ve conducted so far treats geographic areas as uniform entities, computing area-weighted averages that assume climate impacts are distributed equally across all residents within a region. However, decades of environmental justice research have documented that climate risks—including extreme heat, air pollution, and flood exposure—often disproportionately affect communities with limited resources to adapt or relocate.

This raises a critical question for climate monitoring: do socially vulnerable populations within metropolitan areas like San Antonio experience different climate conditions than more affluent communities? Urban development patterns, infrastructure quality, and access to green space can create significant microclimatic variations within cities. Historically redlined neighborhoods often have less tree cover, more heat-absorbing pavement, and older housing stock with limited air conditioning—factors that can amplify heat exposure during extreme temperature events.

One of the unique capabilities of the ISciences datastore is integration with the gridded Social Vulnerability Index (SVI) distributed by NASA’s SEDAC for climate justice analysis. Rather than simply averaging climate conditions across geographic area, we can weight our analysis according to social vulnerability, revealing whether climate risks are equitably distributed or concentrated among populations with the least capacity to adapt.

# Query both standard and SVI-weighted climate data for comparison
svi_comparison_data <- ds |>
  dplyr::filter(
    primary_id == "ms_448",
    climate_var %in% c("tmax", "prcp"),
    boundary_type == "census_msas"
  ) |>
  dplyr::collect()

SVI grids have been issued by SEDAC for the years 2000, 2010, 2014, 2016, 2018, and 2020. The year in the weight_type identifies which grid was used for the weighted calculation. The nClimGrid climate data is weighted with whichever grid is closest in time; e.g., 2013 climate data is weighted using svi_weighted_2014.

unique(svi_comparison_data$weight_type)
[1] "standard"          "svi_weighted_2000" "svi_weighted_2010"
[4] "svi_weighted_2014" "svi_weighted_2016" "svi_weighted_2018"
[7] "svi_weighted_2020"
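The nearest-grid rule can be illustrated in a few lines. This is an illustrative sketch only; the vintage years come from the output above, and the datastore's exact tie-breaking for equidistant years is not documented here.

# Match a climate year to the closest available SVI grid vintage
svi_years <- c(2000, 2010, 2014, 2016, 2018, 2020)

nearest_svi_vintage <- function(climate_year) {
  svi_years[which.min(abs(svi_years - climate_year))]
}

nearest_svi_vintage(2013)  # returns 2014, matching the example above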

For simplicity we can convert all SVI designations to simply “svi”. Let’s also filter for only data from 2000-2024; SVI weighting for data before the year 2000 wouldn’t be very relevant.

svi_comparison_data <- svi_comparison_data |>
  dplyr::mutate(date = as.Date(date)) |>
  dplyr::filter(lubridate::year(date) >= 2000) |>
  dplyr::mutate(weight_type = ifelse(weight_type == "standard", "standard", "svi"))

svi_comparison_data
# Calculate recent climate statistics by weighting method
svi_comparison_summary <- svi_comparison_data |>
  dplyr::group_by(climate_var, weight_type) |>
  dplyr::summarise(
    mean_value = mean(value, na.rm = TRUE),
    max_value = max(value, na.rm = TRUE),
    min_value = min(value, na.rm = TRUE),
    .groups = "drop"
  ) |>
  dplyr::mutate(
    weight_label = dplyr::case_when(
      weight_type == "standard" ~ "Standard (Area-Weighted)",
      weight_type == "svi" ~ "SVI-Weighted (Vulnerability-Weighted)",
      TRUE ~ weight_type
    )
  )

svi_comparison_summary

The overall values for 2000-2024 are very similar. This suggests that vulnerable and less vulnerable populations have similar climate exposure (at least in terms of maximum temperature and daily precipitation). We can take a finer look by examining annual trends, similar to our prior analysis.

# Create annual time series by weighting method
annual_svi_comparison <- svi_comparison_data |>
  dplyr::mutate(year = lubridate::year(date)) |>
  dplyr::group_by(climate_var, weight_type, year) |>
  dplyr::summarise(annual_mean = mean(value, na.rm = TRUE), .groups = "drop") |>
  dplyr::mutate(
    weight_label = dplyr::case_when(
      weight_type == "standard" ~ "Standard (Area-Weighted)",
      weight_type == "svi" ~ "SVI-Weighted (Vulnerability-Weighted)",
      TRUE ~ weight_type
    )
  )

# Visualize annual time series by weighting method
ggplot2::ggplot(annual_svi_comparison, 
                ggplot2::aes(x = year, y = annual_mean, color = weight_label, linetype = weight_label)) +
  ggplot2::geom_line(size = 1.2, alpha = 0.8) +
  ggplot2::facet_wrap(~ climate_var, scales = "free_y",
                      labeller = ggplot2::labeller(climate_var = c(
                        "tmax" = "Maximum Temperature (°C)", 
                        "prcp" = "Daily Precipitation (mm)"))) +
  ggplot2::scale_color_manual(values = c("Standard (Area-Weighted)" = "steelblue", 
                                        "SVI-Weighted (Vulnerability-Weighted)" = "coral"),
                             name = "Weighting Method:") +
  ggplot2::scale_linetype_manual(values = c("Standard (Area-Weighted)" = "solid", 
                                           "SVI-Weighted (Vulnerability-Weighted)" = "dashed"),
                                name = "Weighting Method:") +
  ggplot2::labs(
    title = "Annual Climate Trends by Weighting Method",
    subtitle = "San Antonio-New Braunfels MSA (2000-Present)",
    x = "Year",
    y = "Annual Mean Value",
    caption = "Source: NOAA nClimGrid-Daily and NASA SVI via ISciences datastore"
  ) +
  ggplot2::theme_minimal() +
  ggplot2::theme(
    strip.text = ggplot2::element_text(size = 12, face = "bold"),
    legend.position = "bottom",
    panel.grid.minor.x = ggplot2::element_blank()
  )

This tells the same story: the standard and SVI-weighted lines sit practically on top of each other, indicating that area-weighted and vulnerability-weighted aggregations produce nearly identical long-term climatologies for this metro area. There are, however, more sophisticated methods we can apply.
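
To put a number on “practically on top of each other,” a short follow-up sketch (using the annual_svi_comparison table we just built) computes the year-by-year difference between the SVI-weighted and standard annual means:

# Year-by-year difference between SVI-weighted and standard annual means
svi_vs_standard <- annual_svi_comparison |>
  dplyr::select(climate_var, weight_type, year, annual_mean) |>
  tidyr::pivot_wider(names_from = weight_type, values_from = annual_mean) |>
  dplyr::mutate(difference = svi - standard) |>
  dplyr::group_by(climate_var) |>
  dplyr::summarise(
    mean_diff = mean(difference, na.rm = TRUE),
    max_abs_diff = max(abs(difference), na.rm = TRUE),
    .groups = "drop"
  )

svi_vs_standard

Differences close to zero for both variables would confirm the visual impression from the plot.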

Knowledge Check

What does the Social Vulnerability Index (SVI) integration in the nClimGrid datastore enable researchers to analyze?

  1. How climate conditions vary when weighted by social vulnerability rather than simple geographic area
  2. The accuracy of weather station measurements
  3. Changes in precipitation patterns over time
  4. The efficiency of cloud data storage systems

Answer: a) How climate conditions vary when weighted by social vulnerability rather than simple geographic area

Explanation: The SVI integration allows researchers to examine climate exposure through an environmental justice lens by weighting climate data according to social vulnerability rather than simple area-based averaging. This reveals whether vulnerable populations experience different climate conditions and helps inform equitable climate adaptation planning.

Advanced Climatology Analysis

Our exploratory analysis revealed San Antonio’s basic climate patterns—gradual warming temperatures and evolving precipitation trends. While annual time series provide useful insights, they only scratch the surface of what climate analysis can show us about how climate systems work across different time periods.

Climate operates on multiple timescales at once, from daily weather variability to multi-decade oscillations and century-long anthropogenic trends. Regular time series analysis often has trouble separating these overlapping signals, making it hard to tell real climate change from natural ups and downs or spot shifts in seasonal timing.

This section covers two advanced techniques that climate monitoring systems use around the world. Climate circles turn calendar time into polar coordinates, making it easy to spot seasonal patterns and catch changes in seasonal timing or intensity. Time series decomposition using STL methods carefully separates climate signals across different timescales, distinguishing long-term trends from cyclical patterns while showing how seasonal patterns themselves change over time.

These methods are the same ones used by NOAA’s NCEI, NIDIS, and climate research groups globally. Learning to build these analyses step-by-step develops both technical skills for climate research and the background knowledge needed to understand operational climate monitoring platforms.

The Challenge of Non-Stationary Climate Baselines

Regular climate analysis assumes stationarity—that statistical properties stay constant over time, letting us calculate meaningful long-term averages and use historical patterns to predict future conditions. This assumption drives the World Meteorological Organization’s 30-year climate normals, which provide reference baselines for deciding whether current conditions are “normal” or “unusual.” But climate change breaks this assumption by adding systematic trends that make historical baselines less relevant for understanding today’s climate.

The problem gets worse for rapidly growing cities like San Antonio, where global climate change and local urban heat island effects both create changing conditions. The 1991-2020 climate normals cover a period when atmospheric CO₂ rose from 355 to 415 ppm, global temperatures increased about 0.5°C, and San Antonio’s urban area expanded dramatically.

Even with these problems, operational climate monitoring still uses traditional methods and fixed baseline periods because they work in practice. Climate circles and STL decomposition with 30-year reference periods are still standard tools for weather services, agricultural programs, and water managers. While better approaches exist—moving baselines, detrended analysis, climate change-adjusted normals—traditional methods remain popular because people understand them and institutions are set up to use them.

The analyses shown here recognize these limitations while teaching the methods needed to understand operational climate products. Learning traditional baseline approaches builds skills for working with existing climate systems while understanding what they can and can’t tell us in a changing climate.

Constructing Climate Circles

Climate circles represent one of the most intuitive ways to visualize annual climate patterns, transforming the familiar linear progression of calendar months into a circular representation that naturally reflects the cyclical nature of Earth’s seasonal systems. Unlike traditional time series plots that display climate data along a linear x-axis, climate circles use polar coordinates where each day of the year corresponds to a specific angle around the circle, creating visualizations that immediately reveal seasonal patterns, timing shifts, and long-term changes in ways that linear plots often obscure.

The power of climate circles lies in their ability to display multiple dimensions of climate information simultaneously. The angular position around the circle represents the day of year (seasonal timing), the radial distance from center represents the climate variable magnitude, and color coding can show additional information like long-term averages or extreme values. This multi-dimensional approach makes climate circles particularly valuable for detecting changes in seasonal timing—such as earlier spring warming or extended summer heat—that might indicate shifting climate regimes.

Climate circles work by converting calendar dates to angles around a circle, where each day of the year gets a specific position and the distance from center represents temperature values.

Data Science Review

The mathematical foundation involves converting calendar dates to polar coordinates using day-of-year as the angular component. January 1st sits at roughly 0 degrees (pointing due north), July 1st (approximately day 182) at roughly 180 degrees (pointing due south), and the progression continues clockwise around the circle.

Conversion Formula: θ = DOY × (360° / 365)

Coordinate System:

  • Angular position (θ): Day of year × (360°/365) maps calendar days to degrees
  • Radial distance (r): Climate variable magnitude (temperature values)
  • Seasonal mapping: Winter (top), Spring (right), Summer (bottom), Fall (left)

This creates a natural mapping where the familiar seasonal progression becomes a clockwise journey around the circle.

Data Preparation for Climate Circles

Before creating our climate circle, we need to prepare the San Antonio temperature data by calculating daily climatological statistics. This process involves computing the minimum, maximum, and mean values for each day of the year across all years in our dataset, creating a representative “average year” that captures typical seasonal patterns while smoothing out year-to-year variability.

# Filter to maximum temperature data only for this analysis
san_antonio_tmax <- san_antonio_data |>
  dplyr::filter(climate_var == "tmax") |>
  dplyr::mutate(date = as.Date(date))

# Calculate day of year for each observation
san_antonio_tmax <- san_antonio_tmax |>
  dplyr::mutate(
    doy = lubridate::yday(date),
    year = lubridate::year(date)
  )

# Calculate daily climatological statistics by day of year
daily_climate <- san_antonio_tmax |>
  dplyr::group_by(doy) |>
  dplyr::summarise(
    min_temp = min(value, na.rm = TRUE),
    max_temp = max(value, na.rm = TRUE),
    mean_temp = mean(value, na.rm = TRUE),
    .groups = "drop"
  )

# Examine the structure of our daily climatology
head(daily_climate, 10)

The resulting dataset contains one row for each day of the year, with columns representing the climatological minimum, maximum, and mean temperature observed on that calendar day across the entire 70+ year record. This approach creates a smooth representation of the “typical” annual cycle while preserving information about the range of variability (the difference between min and max) experienced on each day.
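
One small caveat worth noting: lubridate::yday() returns 366 for December 31 in leap years, so a multi-decade daily record will typically also produce a sparsely populated DOY 366 group. If that extra row appears in your daily_climate table, one simple option (a sketch, not the only possible approach) is to drop it so the circle spans exactly 365 days:

# Optional: drop the sparse leap-year DOY 366 row so the climatology spans exactly 365 days
daily_climate <- daily_climate |>
  dplyr::filter(doy <= 365)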

Creating the Polar Coordinate Framework

Now we’ll transform our day-of-year values into polar coordinates and prepare the data structure needed for our climate circle visualization.

# Add polar coordinate calculations
daily_climate <- daily_climate |>
  dplyr::mutate(
    # Convert day of year to degrees (clockwise, starting at top)
    theta_degrees = doy * (360 / 365),
    # Convert to radians for mathematical functions
    theta_radians = theta_degrees * (pi / 180),
    # Calculate range for bar height in the circle
    temp_range = max_temp - min_temp
  )

# Verify the polar coordinate conversion
cat("Polar coordinate examples:\n",
    "Jan 1 (DOY 1):", daily_climate$theta_degrees[1], "degrees\n",
    "Jul 1 (DOY 182):", daily_climate$theta_degrees[182], "degrees\n",
    "Dec 31 (DOY 365):", daily_climate$theta_degrees[365], "degrees\n")
Polar coordinate examples:
 Jan 1 (DOY 1): 0.9863014 degrees
 Jul 1 (DOY 182): 179.5068 degrees
 Dec 31 (DOY 365): 360 degrees

Month Positioning for Labels

To properly label our climate circle with month names, we need to calculate the angular position where each month begins. This requires accounting for the varying lengths of months and their cumulative day positions throughout the year.

# Create month reference information
month_info <- data.table::data.table(
  month = 1:12,
  month_name = c("Jan", "Feb", "Mar", "Apr", "May", "Jun", 
                 "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"),
  days_in_month = c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
)

# Calculate cumulative days for month start positions
month_info[, cum_days := cumsum(c(0, days_in_month[1:11]))]

# Calculate angular position for month labels (start of each month)
month_info[, theta_degrees := (cum_days + 1) * (360 / 365)]

# Display month positioning
month_info[, .(month_name, cum_days, theta_degrees)]

Building the Climate Circle Visualization

Now we’ll construct our climate circle using ggplot2’s polar coordinate system. The approach involves creating a standard plot and then transforming it to polar coordinates, where the x-axis becomes the angular position and the y-axis becomes the radial distance.

# Create base plot with day of year on x-axis and temperature on y-axis
climate_circle_plot <- ggplot2::ggplot(daily_climate) +
  # Create bars representing temperature range (min to max)
  ggplot2::geom_col(
    ggplot2::aes(x = doy, y = temp_range, fill = mean_temp),
    width = 1,  # Full width bars for continuous appearance
    position = ggplot2::position_nudge(y = daily_climate$min_temp)  # Start bars at minimum temperature
  ) +
  # Transform to polar coordinates
  ggplot2::coord_polar(theta = "x", start = 0) +
  # Color scale for mean temperature
  ggplot2::scale_fill_gradientn(
    colors = c("#313695", "#4575B4", "#74ADD1", "#FDAE61", "#F46D43", "#A50026"),
    name = "Mean Temp\n(°C)",
    guide = ggplot2::guide_colorbar(barwidth = 1, barheight = 8)
  ) +
  # Set x-axis (angular) properties
  ggplot2::scale_x_continuous(
    limits = c(1, 365),
    breaks = month_info$cum_days + 1,  # Month start positions
    labels = month_info$month_name,    # Month names
    expand = c(0, 0)
  ) +
  # Set y-axis (radial) properties
  ggplot2::scale_y_continuous(
    limits = c(min(daily_climate$min_temp) * 0.95, 
               max(daily_climate$max_temp) * 1.05),
    expand = c(0, 0)
  ) +
  # Styling and labels
  ggplot2::labs(
    title = "San Antonio Climate Circle - Maximum Temperature",
    subtitle = "Daily min/max/mean climatology (1951-2024)",
    caption = "Source: NOAA nClimGrid-Daily via ISciences datastore"
  ) +
  ggplot2::theme_minimal() +
  ggplot2::theme(
    axis.text.y = ggplot2::element_blank(),    # Hide radial axis labels
    axis.title = ggplot2::element_blank(),     # Hide axis titles
    panel.grid.major.x = ggplot2::element_line(color = "gray80", size = 0.5),
    panel.grid.minor = ggplot2::element_blank(),
    panel.grid.major.y = ggplot2::element_line(color = "gray90", size = 0.3),
    plot.title = ggplot2::element_text(hjust = 0.5, size = 14, face = "bold"),
    plot.subtitle = ggplot2::element_text(hjust = 0.5, size = 12),
    legend.position = "right"
  )

# Display the climate circle
climate_circle_plot

Interpreting Climate Circle Patterns

The resulting climate circle for San Antonio reveals several important characteristics of the region’s temperature regime that would be difficult to see as clearly in traditional time series plots:

Clear Seasonal Progression: The smooth color transition from deep blue (winter) through orange (spring/fall) to deep red (summer) shows San Antonio’s distinct seasonal cycle, with the coldest temperatures (around 5-10°C) in December-February and peak heat (above 35°C) in July-August. This visual progression immediately communicates the timing and intensity of seasonal temperature changes.

Extended Summer Heat: The thick band of deep red from June through September reveals the extended hot season that characterizes South Texas, with maximum temperatures consistently above 30°C for about four months. This prolonged heat period has important implications for energy demand, agricultural planning, and public health preparedness.

Gradual Seasonal Transitions: Unlike northern climates that might show abrupt seasonal shifts, San Antonio exhibits smooth color gradients indicating gradual seasonal transitions typical of subtropical climates. The absence of sharp color boundaries suggests that temperature changes occur gradually rather than through sudden jumps between seasons.

Winter Temperature Variability: The “spiky” appearance, especially visible in winter months (Dec-Feb), shows greater day-to-day temperature variability during cooler months when the region experiences more variable continental air masses. This increased winter variability reflects San Antonio’s location at the boundary between stable subtropical conditions and more variable continental weather patterns.

Summer Temperature Consistency: Summer months (Jun-Aug) appear more uniform and less spiky, reflecting the dominance of stable subtropical high pressure systems that create consistent hot, dry conditions. This consistency is important for understanding why summer heat stress can be so persistent in the region.

Asymmetric Heating and Cooling: The fall cooling period (Sep-Nov) appears more gradual than spring warming (Mar-May), a common pattern in continental locations where thermal inertia affects seasonal transitions. This asymmetry has practical implications for seasonal planning and climate adaptation strategies.

Creating an Interactive Climate Circle

While the static ggplot2 version provides an excellent overview of seasonal patterns, an interactive version using plotly enables detailed exploration of daily climate statistics. Interactive climate circles allow users to hover over specific days to see exact temperature values, dates, and statistical information—the same functionality used in operational climate monitoring platforms.

# Prepare data with enhanced formatting for interactive tooltips
daily_climate_interactive <- daily_climate |>
  dplyr::mutate(
    # Format date for display (using 2020 as reference year for leap year handling)
    formatted_date = format(as.Date(doy - 1, origin = "2020-01-01"), "%b %d"),
    # Create detailed tooltip text
    tooltip_text = paste0(
      "Date: ", formatted_date, "<br>",
      "Day of Year: ", doy, "<br>",
      "Mean: ", round(mean_temp, 1), "°C<br>",
      "Min: ", round(min_temp, 1), "°C<br>",
      "Max: ", round(max_temp, 1), "°C<br>",
      "Range: ", round(temp_range, 1), "°C"
    )
  )

# Create interactive climate circle using plotly
interactive_climate_circle <- plotly::plot_ly() |>
  # Add temperature range bars (from min to max)
  plotly::add_trace(
    data = daily_climate_interactive,
    type = "barpolar",
    r = ~temp_range,
    theta = ~theta_degrees,
    base = ~min_temp,
    width = 1,
    marker = list(
      color = ~mean_temp,
      colorscale = list(
        c(0, "#313695"),    # Deep blue for coldest
        c(0.2, "#4575B4"),  # Medium blue
        c(0.4, "#74ADD1"),  # Light blue
        c(0.6, "#FDAE61"),  # Light orange
        c(0.8, "#F46D43"),  # Medium orange/red
        c(1, "#A50026")     # Deep red for hottest
      ),
      showscale = TRUE,
      colorbar = list(
        title = "Mean Temp (°C)",
        thickness = 15,
        len = 0.7
      )
    ),
    text = ~tooltip_text,
    hoverinfo = "text",
    showlegend = FALSE
  ) |>
  # Configure polar layout
  plotly::layout(
    title = list(
      text = "Interactive San Antonio Climate Circle<br><span style='font-size:14px'>Daily Temperature Climatology (1951-2024)</span>",
      font = list(size = 16),
      x = 0.5
    ),
    polar = list(
      radialaxis = list(
        visible = FALSE,
        range = c(min(daily_climate_interactive$min_temp) * 0.95, 
                  max(daily_climate_interactive$max_temp) * 1.05)
      ),
      angularaxis = list(
        visible = FALSE,
        direction = "clockwise",
        period = 360
      )
    ),
    showlegend = FALSE,
    margin = list(t = 80, b = 50, l = 50, r = 80)
  )

# Add month labels using scatterpolar (not annotations)
interactive_climate_circle <- interactive_climate_circle |>
  plotly::add_trace(
    type = "scatterpolar",
    r = rep(max(daily_climate_interactive$max_temp) * 1.25, nrow(month_info)),  # Position outside the circle
    theta = month_info$theta_degrees,  # Use the theta degrees directly
    text = month_info$month_name,
    mode = "text",
    textfont = list(
      size = 14,
      color = "#000000",
      family = "Arial"
    ),
    showlegend = FALSE,
    hoverinfo = "none"
  )

# Display the interactive climate circle
interactive_climate_circle

The interactive version provides several key advantages: hovering over any day reveals exact temperature values, dates, and ranges without needing to interpolate from color scales, while the format mirrors professional climate monitoring platforms used for operational decision-making. The combination of visual pattern recognition from static climate circles and detailed data access from interactive versions provides a comprehensive approach that serves both educational and operational needs.
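
If you want to share the interactive circle outside of an R session, one option (assuming the htmlwidgets package is installed; the file name below is arbitrary) is to export it as a self-contained HTML file:

# Export the interactive climate circle as a standalone HTML file
htmlwidgets::saveWidget(
  interactive_climate_circle,
  file = "san_antonio_climate_circle.html",
  selfcontained = TRUE
)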

While climate circles excel at revealing seasonal patterns and long-term changes in their timing or magnitude, they represent only one dimension of advanced climate analysis.

To fully understand how climate systems operate, we need analytical methods that can systematically separate overlapping signals occurring on different timescales. Consider the complexity embedded in our San Antonio temperature record: daily weather variability creates short-term noise around seasonal patterns, the annual cycle creates predictable oscillations, multi-year climate oscillations like El Niño create medium-term variations, and long-term climate change creates systematic trends that span decades. Climate circles effectively display the seasonal component, but distinguishing genuine climate change signals from natural variability requires mathematical decomposition techniques that can separate these overlapping temporal patterns.

This is where time series decomposition becomes essential for climate analysis. While climate circles provide intuitive visualization of seasonal patterns, STL (Seasonal and Trend decomposition using Loess) decomposition provides the analytical rigor needed to quantify climate change signals, detect shifts in seasonal timing or intensity, and identify unusual events that deviate from typical patterns. Together, these approaches provide complementary insights: climate circles for pattern recognition and communication, STL decomposition for rigorous signal separation and trend detection.

Knowledge Check

In a climate circle, what does the radial distance from the center represent?

  1. The day of the year
  2. The magnitude of the climate variable
  3. The month of the year
  4. The number of years in the dataset

Answer: b) The magnitude of the climate variable

Explanation: In polar coordinates, the angular position (degrees around the circle) represents time (day of year), while the radial distance from center represents the magnitude of the climate variable being plotted.

Time Series Decomposition with STL

Time series decomposition represents one of the most powerful analytical techniques in climate science, enabling researchers to systematically separate the multiple timescales at which climate operates. While climate circles reveal seasonal patterns effectively, they cannot distinguish between genuine long-term climate change trends and multi-year natural variability, nor can they detect subtle shifts in seasonal timing or intensity that may indicate changing climate regimes. STL (Seasonal and Trend decomposition using Loess) decomposition addresses these limitations by mathematically separating climate time series into distinct components that operate on different timescales.

The STL approach breaks any climate time series into three fundamental components: trend (long-term changes that may indicate climate change), seasonal (repeating annual cycles), and remainder (short-term variability and anomalies). This separation enables climate scientists to answer critical questions: Is the warming we observe a genuine long-term trend or part of a natural multi-decadal cycle? Are seasonal patterns shifting earlier or becoming more intense? How unusual are recent extreme events compared to historical variability?

Understanding STL Components

Each component of STL decomposition provides unique insights into climate system behavior.

Trend Component: Represents long-term changes in climate variables that persist across multiple years or decades.

  • Positive trends may indicate anthropogenic warming
  • Negative trends could suggest cooling influences from aerosols or natural variability
  • Filters out seasonal cycles and short-term noise to reveal underlying climate change signals

Seasonal Component: Captures the repeating annual cycle that characterizes most climate variables.

  • Unlike simple monthly averages, can vary from year to year
  • Allows detection of changes in seasonal timing (phenological shifts) or intensity
  • Spring warming occurring earlier or summer heat becoming more intense appear as systematic changes

Remainder Component: Contains everything not captured by trend and seasonal components.

  • Includes weather variability, extreme events, and irregular climate anomalies
  • Large values often correspond to significant climate events like droughts or heat waves
  • Represents periods that deviate from typical seasonal patterns

Data Preparation for STL Analysis

STL decomposition requires regular time series data with consistent temporal spacing. We’ll start by converting our daily San Antonio temperature data to monthly averages, which provides the temporal resolution needed to capture both seasonal cycles and long-term trends while reducing the computational complexity of processing 70+ years of daily observations.

# Filter to maximum temperature data and convert to monthly time series
san_antonio_monthly <- san_antonio_data |>
  dplyr::filter(climate_var == "tmax") |>
  dplyr::mutate(
    date = as.Date(date),
    year = lubridate::year(date),
    month = lubridate::month(date)
  ) |>
  # Calculate monthly means
  dplyr::group_by(year, month) |>
  dplyr::summarise(
    monthly_temp = mean(value, na.rm = TRUE),
    .groups = "drop"
  ) |>
  # Create date column for the first of each month
  dplyr::mutate(date = as.Date(paste(year, month, "01", sep = "-"))) |>
  dplyr::arrange(date)

# Examine the monthly time series structure
head(san_antonio_monthly, 12)

The aggregation process reduces our dataset from over 27,000 daily observations to approximately 900 monthly values while preserving the essential temporal structure needed for decomposition analysis. The resulting monthly averages smooth out day-to-day weather variability while maintaining the seasonal cycles and long-term trends that STL decomposition is designed to analyze.

Converting to R Time Series Object

STL decomposition requires a specialized time series object that includes information about the temporal structure and seasonal frequency. R’s ts() function creates this structure with explicit frequency specification—12 for monthly data to capture the annual cycle.

# Convert to time series object for STL analysis
ts_start_year <- lubridate::year(min(san_antonio_monthly$date))
ts_start_month <- lubridate::month(min(san_antonio_monthly$date))

san_antonio_ts <- ts(
  san_antonio_monthly$monthly_temp,
  frequency = 12,  # 12 months per year for seasonal cycle
  start = c(ts_start_year, ts_start_month)
)

# Examine the time series properties
cat("Time series structure:\n",
    "Start:", paste(start(san_antonio_ts), collapse = "-"), "\n",
    "End:", paste(end(san_antonio_ts), collapse = "-"), "\n",
    "Frequency:", frequency(san_antonio_ts), "\n",
    "Length:", length(san_antonio_ts), "observations\n")
Time series structure:
 Start: 1951-1 
 End: 2024-12 
 Frequency: 12 
 Length: 888 observations

The ts object encodes the temporal structure explicitly, enabling STL algorithms to properly identify seasonal patterns and align decomposition components with calendar months. The frequency parameter tells the algorithm that seasonal patterns repeat every 12 observations (months), which is essential for accurate seasonal component extraction.

Implementing STL Decomposition

Now we’ll apply the STL decomposition using the forecast::mstl() function, which provides enhanced capabilities compared to base R’s stl() function. The MSTL (Multiple Seasonal Trend decomposition using Loess) approach handles complex seasonal patterns and provides better parameter control for climate applications.

# Apply MSTL decomposition with climate-optimized parameters
mstl_result <- forecast::mstl(
  san_antonio_ts,
  s.window = 13,      # Seasonal window: moderate flexibility for year-to-year variation
  t.window = 121,     # Trend window: ~10 years for long-term climate trends
  robust = TRUE,      # Handle extreme weather events and outliers
  lambda = 1          # No transformation for temperature data
)

# Examine the decomposition structure
head(mstl_result)
              Data    Trend  Seasonal12  Remainder
Jan 1951  1.009350 13.33685 -14.3425780  1.0150784
Feb 1951  1.265010 13.32454 -12.2437192 -0.8158062
Mar 1951  5.410703 13.31222  -9.0423440  0.1408261
Apr 1951 11.256094 13.29991  -0.7686269 -2.2751854
May 1951 20.218304 13.28759   4.9538178  0.9768939
Jun 1951 24.275014 13.27528  11.0009326 -1.0011966

Looking at these MSTL results, we can see how the decomposition separates San Antonio’s temperature patterns across different timescales. The Data column shows the original monthly temperatures (ranging from about 1°C in January to 24°C in June), while the Trend component reveals a relatively stable baseline around 13.3°C that represents the long-term climate signal after removing seasonal cycles. The Seasonal12 component captures the expected annual temperature cycle, with negative values during winter months (January shows -14.3°C, representing how much cooler winter is than the annual average) and positive values building toward summer (June shows +11°C above the trend). The Remainder component contains the leftover variability not explained by trend or seasonal patterns, with small values (mostly within ±2°C) indicating that the trend and seasonal components successfully capture most of the temperature variation, leaving only minor weather-related fluctuations and anomalies.
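
A quick sanity check on the mstl_result object confirms the defining property of an additive decomposition: the trend, seasonal, and remainder columns should sum back to the original data. A minimal sketch:

# Verify that the components reconstruct the original series (additive decomposition)
reconstructed <- mstl_result[, "Trend"] + mstl_result[, "Seasonal12"] + mstl_result[, "Remainder"]
all.equal(as.numeric(mstl_result[, "Data"]), as.numeric(reconstructed))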

Understanding STL Parameters

The choice of STL parameters significantly affects the decomposition results and should align with the timescales relevant for climate analysis:

STL Parameter Guide

s.window (Seasonal Window):

  • Small values (7-11): Allow rapid changes in seasonal patterns
  • Medium values (13-15): Moderate flexibility, typical for climate
  • Large values (>20): Force stable seasonal patterns
  • “periodic”: Identical seasonal cycle every year

t.window (Trend Window):

  • Small values (24-60): Capture short-term climate variations
  • Medium values (121-241): Focus on decadal climate trends
  • Large values (>300): Emphasize long-term climate change

robust = TRUE: Essential for climate data to handle extreme events without distorting overall patterns.

lambda transformation:

  • lambda = 1: No transformation (temperature)
  • lambda = 0: Log transformation (precipitation)

s.window = 13: Allows moderate year-to-year variation in seasonal patterns while maintaining stability. This setting enables detection of gradual changes in seasonal timing or intensity that may indicate climate regime shifts.

t.window = 121: Approximately 10 years of monthly data, appropriate for separating decadal climate trends from shorter-term variability. This window length captures genuine climate change signals while filtering out multi-year natural oscillations.

robust = TRUE: Essential for climate data because extreme weather events can distort decomposition results. The robust option reduces the influence of outliers while preserving information about unusual climate conditions.

lambda = 1: No data transformation for temperature variables. Precipitation data might use lambda = 0 (log transformation) to handle its highly skewed distribution.
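
To get a feel for how much the seasonal window matters, the short comparison below (a sketch using base R’s stats::stl(), which accepts the same s.window, t.window, and robust arguments) contrasts a fully periodic seasonal cycle with the flexible s.window = 13 setting. A more flexible seasonal component absorbs more of the variation, which should show up as a slightly smaller remainder:

# Compare a fixed ("periodic") seasonal cycle with the flexible s.window = 13 setting
stl_periodic <- stats::stl(san_antonio_ts, s.window = "periodic", robust = TRUE)
stl_flexible <- stats::stl(san_antonio_ts, s.window = 13, t.window = 121, robust = TRUE)

# Remainder variability under each setting
cat("Remainder SD (periodic seasonal):",
    round(sd(stl_periodic$time.series[, "remainder"]), 2), "°C\n",
    "Remainder SD (s.window = 13):",
    round(sd(stl_flexible$time.series[, "remainder"]), 2), "°C\n")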

Extracting and Organizing Decomposition Components

# Extract decomposition components into a data frame for analysis
decomp_components <- data.frame(
  date = san_antonio_monthly$date,  # Use the original date column from our monthly data
  original = as.numeric(san_antonio_ts),
  trend = as.numeric(mstl_result[, "Trend"]),
  seasonal = as.numeric(mstl_result[, "Seasonal12"]),
  remainder = as.numeric(mstl_result[, "Remainder"])
) |>
  dplyr::mutate(
    year = lubridate::year(date),
    month = lubridate::month(date),
    # Add season labels for analysis
    season = dplyr::case_when(
      month %in% c(12, 1, 2) ~ "Winter",
      month %in% c(3, 4, 5) ~ "Spring", 
      month %in% c(6, 7, 8) ~ "Summer",
      month %in% c(9, 10, 11) ~ "Fall"
    ),
    # Convert season to ordered factor
    season = factor(season, levels = c("Winter", "Spring", "Summer", "Fall"))
  )

# Display summary of decomposition components
head(decomp_components)

The organized data frame shows how STL decomposition transforms our monthly temperature time series into analytically useful components, with each row representing one month’s decomposition into trend, seasonal, and remainder values. Notice how the seasonal component follows the expected annual pattern (negative values for winter months like January’s -14.3°C, transitioning to positive values for summer months like June’s +11.0°C), while the trend component shows a relatively stable baseline around 13.3°C that represents San Antonio’s long-term climate after removing seasonal effects, and the remainder captures small month-to-month variations that aren’t explained by the underlying trend or seasonal patterns.
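
The season factor we just created also lets us revisit the climate circle’s observation that winters are more variable than summers. A short summary sketch over decomp_components compares remainder variability by season:

# Compare month-to-month variability (remainder) across seasons
decomp_components |>
  dplyr::group_by(season) |>
  dplyr::summarise(
    remainder_sd = sd(remainder, na.rm = TRUE),
    largest_positive = max(remainder, na.rm = TRUE),
    largest_negative = min(remainder, na.rm = TRUE),
    .groups = "drop"
  )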

Visualizing the Decomposition

The most effective way to understand STL results is through a comprehensive visualization that shows all components simultaneously, revealing how each contributes to the overall climate signal. We’ll start by reshaping the data into long form.

# Create comprehensive STL visualization
# Reshape data for faceted plotting
decomp_long <- decomp_components |>
  tidyr::pivot_longer(
    cols = c("original", "trend", "seasonal", "remainder"),
    names_to = "component",
    values_to = "temperature"
  ) |>
  dplyr::mutate(
    component = factor(component, 
                      levels = c("original", "trend", "seasonal", "remainder"),
                      labels = c("Original Data", "Trend", "Seasonal", "Remainder"))
  )

Now we’ll create the multi-panel plot.

# Create the decomposition plot
stl_plot <- ggplot2::ggplot(decomp_long, ggplot2::aes(x = date, y = temperature)) +
  ggplot2::geom_line(color = "#1f77b4", alpha = 0.8, size = 0.6) +
  ggplot2::facet_wrap(~ component, scales = "free_y", ncol = 1) +
  ggplot2::labs(
    title = "STL Decomposition - San Antonio Maximum Temperature",
    subtitle = "Separating long-term trends, seasonal cycles, and climate variability",
    x = "Year",
    y = "Temperature (°C)",
    caption = "Source: NOAA nClimGrid-Daily via ISciences datastore"
  ) +
  ggplot2::theme_minimal() +
  ggplot2::theme(
    strip.text = ggplot2::element_text(size = 12, face = "bold"),
    strip.background = ggplot2::element_rect(fill = "gray95", color = "gray80"),
    panel.spacing = ggplot2::unit(0.8, "lines"),
    plot.title = ggplot2::element_text(size = 14, face = "bold"),
    panel.grid.minor = ggplot2::element_blank()
  )

# Display the decomposition plot
stl_plot

Interpreting STL Results

The STL decomposition of San Antonio’s maximum temperature reveals distinct patterns that would be impossible to detect in the original time series alone. Each panel tells a different part of the climate story, providing insights into how temperature patterns have evolved over the past 70+ years.

Original Data Panel: The top panel displays the complete monthly temperature record with its characteristic sawtooth pattern of seasonal cycles. While you can see the regular summer-winter oscillations and perhaps sense that recent years appear warmer, the overlapping signals make it difficult to quantify specific trends or detect changes in seasonal patterns. This complexity demonstrates why decomposition is essential for rigorous climate analysis.

Trend Panel: The second panel reveals San Antonio’s underlying climate change signal after removing seasonal variability. The trend shows a clear warming pattern that began in the late 1970s and has accelerated since the 1990s, with temperatures rising approximately 1.5°C from the coolest period (around 1980) to present. Importantly, this warming isn’t linear—the trend shows periods of relatively stable temperatures (1950s-1970s), followed by rapid warming (1980s-1990s), and continued gradual increase through recent decades. This pattern aligns with documented regional climate change in the southwestern United States.

Seasonal Panel: The third panel displays how the annual temperature cycle has remained remarkably consistent over time. The regular oscillation between winter lows (around -12°C below the trend) and summer highs (+12°C above the trend) shows that while absolute temperatures have increased, the seasonal amplitude has stayed relatively stable. This suggests that warming has affected all seasons proportionally rather than dramatically altering the seasonal cycle itself—an important finding for understanding regional climate change impacts.

Remainder Panel: The bottom panel captures temperature variability not explained by trend or seasonal patterns, representing weather variability and climate anomalies. Most values fall within ±5°C of zero, indicating that the trend and seasonal components successfully capture the majority of temperature variation. However, several notable spikes appear throughout the record—large positive values likely correspond to unusually hot periods (heat waves or drought years), while negative spikes represent unusually cool periods. The relatively consistent magnitude of remainder variability across decades suggests that while mean temperatures have increased, the range of year-to-year variability has remained fairly stable.

Key Climate Insights: This decomposition provides quantitative evidence that San Antonio has experienced significant climate change, with most warming occurring since 1980. The preservation of seasonal patterns indicates that this warming represents a systematic shift in the entire temperature distribution rather than changes to seasonal timing or intensity. The stable remainder variability suggests that while the climate baseline has shifted, the magnitude of natural temperature fluctuations around that baseline has not dramatically changed—though extreme high temperatures now occur from a higher baseline, making them more severe in absolute terms.
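
To follow up on the spikes in the remainder panel, a quick listing (a sketch over decomp_components) pulls out the months with the largest departures from the trend and seasonal expectation; these are natural candidates to cross-reference against documented Texas heat waves and cold outbreaks:

# Months with the largest positive remainder values (unusually warm for the season)
decomp_components |>
  dplyr::arrange(dplyr::desc(remainder)) |>
  dplyr::select(date, original, trend, seasonal, remainder) |>
  head(5)

# Months with the largest negative remainder values (unusually cool for the season)
decomp_components |>
  dplyr::arrange(remainder) |>
  dplyr::select(date, original, trend, seasonal, remainder) |>
  head(5)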

Calculating Climate Normals and Anomalies

One of the most fundamental applications of STL decomposition is establishing standardized climate baselines that enable consistent comparison of current conditions against historical norms. The World Meteorological Organization (WMO) defines climate normals as 30-year averages of meteorological variables, updated every decade to track long-term climate change while maintaining sufficient temporal stability for operational applications. The current standard reference period is 1991-2020, which replaced the previous 1981-2010 normals in 2021.

Climate normals provide the essential baseline for detecting and quantifying climate anomalies—departures from typical conditions that may indicate unusual weather patterns, emerging climate trends, or extreme events. While our STL trend component reveals long-term climate change patterns, climate normals enable us to quantify exactly how current conditions compare to recent historical experience using internationally standardized methods.

# Calculate 1991-2020 climate normals following WMO standards
# These represent the official baseline for current climate monitoring
climate_normals_1991_2020 <- san_antonio_data |>
  dplyr::filter(climate_var == "tmax") |>
  dplyr::mutate(
    date = as.Date(date),
    year = lubridate::year(date),
    month = lubridate::month(date)
  ) |>
  # Filter to the current WMO standard normal period
  dplyr::filter(year >= 1991 & year <= 2020) |>
  # Calculate monthly normals (30-year averages)
  dplyr::group_by(month) |>
  dplyr::summarise(
    normal_temp = mean(value, na.rm = TRUE),
    .groups = "drop"
  ) |>
  dplyr::mutate(
    month_name = month.abb[month]
  )

# Display the calculated normals
climate_normals_1991_2020

These climate normals reveal San Antonio’s characteristic seasonal temperature cycle, with January averaging just 0.04°C and July reaching 26.9°C. The nearly 27°C seasonal range reflects the region’s continental subtropical climate, with hot summers and surprisingly cool winters that can occasionally reach near-freezing conditions. This seasonal amplitude demonstrates why San Antonio experiences such dramatic temperature swings between winter cold fronts and summer heat domes.

Now we can calculate temperature anomalies for recent years to quantify how current conditions compare to these established normals:

# Calculate recent temperature anomalies relative to 1991-2020 normals
recent_anomalies <- san_antonio_data |>
  dplyr::filter(climate_var == "tmax") |>
  dplyr::mutate(
    date = as.Date(date),
    year = lubridate::year(date),
    month = lubridate::month(date)
  ) |>
  # Focus on recent years to assess current climate conditions
  dplyr::filter(year >= 2020) |>
  # Join with climate normals to calculate anomalies
  dplyr::left_join(climate_normals_1991_2020, by = "month") |>
  dplyr::mutate(
    temperature_anomaly = value - normal_temp
  ) |>
  # Calculate annual average anomalies
  dplyr::group_by(year) |>
  dplyr::summarise(
    annual_anomaly = mean(temperature_anomaly, na.rm = TRUE),
    .groups = "drop"
  )

recent_anomalies

These anomaly calculations reveal that recent years in San Antonio have consistently experienced above-normal temperatures, with 2024 showing the largest positive anomaly of +1.6°C above the 1991-2020 baseline. This pattern of sustained positive anomalies exceeding 1°C in most years provides quantitative evidence that current climate conditions substantially exceed even the most recent 30-year normal period—a clear indicator of accelerating climate change impacts on urban areas.

The climate normals approach provides the standardized methodology used by NOAA, NCEI, and weather services worldwide for operational climate monitoring. By understanding how to calculate and interpret these baselines manually, you develop the conceptual foundation needed to effectively use operational climate products that rely on identical methodological approaches.

WMO Climate Normal Standards

Current Standard Period: 1991-2020 (updated from 1981-2010 in 2021)

Update Frequency: Every 10 years with overlapping 30-year periods

Calculation Method: Simple arithmetic mean of daily values aggregated to monthly normals

Global Coordination: WMO ensures consistent baseline periods across all national weather services

Operational Use: Forms the basis for temperature and precipitation anomaly products used in weather forecasting, agricultural planning, and climate monitoring systems worldwide

Future Updates: The next standard period (2001-2030) will be adopted in 2031, continuing the systematic tracking of climate change through evolving baseline periods
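
Because normals roll forward every decade, it is instructive to quantify how much the baseline itself shifts between successive standard periods. The sketch below reuses san_antonio_data and the same dplyr workflow; the normal_for_period() helper and the 1981-2010 comparison period are introduced here purely for illustration:

# Helper: monthly tmax normals for an arbitrary reference period
normal_for_period <- function(start_year, end_year) {
  san_antonio_data |>
    dplyr::filter(climate_var == "tmax") |>
    dplyr::mutate(
      date = as.Date(date),
      year = lubridate::year(date),
      month = lubridate::month(date)
    ) |>
    dplyr::filter(year >= start_year & year <= end_year) |>
    dplyr::group_by(month) |>
    dplyr::summarise(normal_temp = mean(value, na.rm = TRUE), .groups = "drop")
}

# How much did each monthly normal shift between the 1981-2010 and 1991-2020 periods?
baseline_shift <- dplyr::left_join(
  normal_for_period(1981, 2010),
  normal_for_period(1991, 2020),
  by = "month",
  suffix = c("_1981_2010", "_1991_2020")
) |>
  dplyr::mutate(shift = normal_temp_1991_2020 - normal_temp_1981_2010)

baseline_shift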

Climate Change Signal Detection

One of the most powerful applications of STL decomposition is distinguishing anthropogenic climate change signals from natural variability. The trend component provides a “signal” of systematic climate change, while the remainder component represents “noise” from natural variability and extreme events. We can put together a nice figure illustrating current trends with the climate normals for San Antonio.

# Calculate climate change signal strength
trend_range <- max(decomp_components$trend, na.rm = TRUE) - 
               min(decomp_components$trend, na.rm = TRUE)
remainder_sd <- sd(decomp_components$remainder, na.rm = TRUE)

signal_to_noise <- trend_range / (2 * remainder_sd)

cat("Climate Change Signal Analysis:\n",
    "Trend range:", round(trend_range, 2), "°C\n",
    "Remainder variability (2σ):", round(2 * remainder_sd, 2), "°C\n",
    "Signal-to-noise ratio:", round(signal_to_noise, 2), "\n")
Climate Change Signal Analysis:
 Trend range: 2.61 °C
 Remainder variability (2σ): 3.66 °C
 Signal-to-noise ratio: 0.71

if (signal_to_noise > 1) {
  cat("Result: Climate change signal is detectable above natural variability\n")
} else {
  cat("Result: Climate change signal is comparable to natural variability\n")
}
Result: Climate change signal is comparable to natural variability

To visualize the strength of this climate change signal, we can overlay the extracted trend with confidence intervals that represent the magnitude of natural variability. We start by calculating the annual averages.

# Create trend visualization with annual aggregation for smooth trend line
# Calculate annual averages of the STL trend component
annual_trend <- decomp_components |>
  dplyr::group_by(year) |>
  dplyr::summarise(
    annual_trend = mean(trend, na.rm = TRUE),
    .groups = "drop"
  )

Now we can get the baseline reference information to place on the figure.

# Calculate baseline average from 1991-2020 climate normal period
baseline_avg <- annual_trend |>
  dplyr::filter(year >= 1991 & year <= 2020) |>
  dplyr::pull(annual_trend) |>
  mean()

Next, determine the linear trend to overlay on the figure. This is a simple linear approximation, but it is sufficient for our purposes.

# Calculate linear trend of the annual STL trend values
trend_model <- lm(annual_trend ~ year, data = annual_trend)
annual_trend$linear_trend <- predict(trend_model)

# Calculate confidence intervals for the linear trend
trend_pred <- predict(trend_model, se.fit = TRUE)
annual_trend$trend_upper <- trend_pred$fit + 1.96 * trend_pred$se.fit
annual_trend$trend_lower <- trend_pred$fit - 1.96 * trend_pred$se.fit

Now we’re ready to put it all together into a figure.

# Create the trend plot
trend_plot <- ggplot2::ggplot(annual_trend, ggplot2::aes(x = year)) +
  # Shade the 1991-2020 baseline normal period
  ggplot2::annotate(
    "rect",
    xmin = 1991, xmax = 2020,
    ymin = -Inf, ymax = Inf,
    fill = "gray90", alpha = 0.5
  ) +
  # Add confidence interval around linear trend
  ggplot2::geom_ribbon(
    ggplot2::aes(ymin = trend_lower, ymax = trend_upper),
    fill = "#d62728", alpha = 0.3
  ) +
  # Add annual STL trend line (blue) - now smooth
  ggplot2::geom_line(
    ggplot2::aes(y = annual_trend),
    color = "#1f77b4", size = 1.2
  ) +
  # Add linear trend line (red)
  ggplot2::geom_line(
    ggplot2::aes(y = linear_trend),
    color = "#d62728", size = 1.2
  ) +
  # Add baseline reference line
  ggplot2::geom_hline(
    yintercept = baseline_avg,
    color = "#d62728", size = 1, linetype = "dashed"
  ) +
  ggplot2::scale_y_continuous(
    name = "Temperature (°C)"
  ) +
  ggplot2::scale_x_continuous(
    name = "Year",
    breaks = seq(1950, 2020, 10)
  ) +
  ggplot2::labs(
    title = "Long-term Temperature Trend (TMAX): San Antonio-New Braunfels, TX Metro Area",
    caption = "Blue line: STL Trend | Red line: Linear Trend | Red dashed: 1991-2020 Baseline | Gray: Baseline Period"
  ) +
  ggplot2::theme_minimal() +
  ggplot2::theme(
    plot.title = ggplot2::element_text(size = 14, face = "bold"),
    panel.grid.minor = ggplot2::element_blank(),
    legend.position = "none"
  )

# Display the trend plot
trend_plot

Here we see the relationship between the systematic climate change signal (blue STL trend) and the linear warming trajectory (red line with confidence intervals). The blue trend line reveals the complex, non-linear nature of San Antonio’s temperature evolution—starting from relatively warm conditions in the 1950s, cooling through the 1960s-1970s (likely influenced by aerosol pollution), reaching minimum temperatures around 1980, then warming steadily from the 1980s onward with accelerated warming since 2000. The linear trend (red line) shows an overall warming rate of approximately 0.18°C per decade across the full record, but the STL trend reveals that this warming has been highly variable, with periods of cooling followed by rapid warming phases. Most significantly, recent temperatures (2010-2024) have risen well above the 1991-2020 baseline (red dashed line), indicating that current climate conditions now exceed even the most recent 30-year normal period used for operational climate monitoring—a clear signal that climate change is outpacing the traditional baseline periods used to define “normal” conditions.
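
The per-decade warming rate cited above comes straight from the slope of the linear model fit earlier; a one-line extraction from trend_model (a minimal sketch) makes the calculation explicit:

# The linear model's slope is in °C per year; multiply by 10 to express it per decade
warming_per_decade <- unname(coef(trend_model)["year"]) * 10
round(warming_per_decade, 2)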

The STL decomposition methodology demonstrated here provides the analytical foundation used by climate research institutions and operational monitoring agencies worldwide for detecting and attributing climate change signals. By learning to implement and interpret these techniques manually, you develop both the technical skills needed for advanced climate analysis and the conceptual framework for understanding how climate change detection and attribution studies separate anthropogenic signals from natural climate variability.

Knowledge Check

The STL decomposition of San Antonio’s temperature record shows that recent temperatures have risen above the 1991-2020 baseline. What does this pattern indicate about climate change detection?

  1. Natural variability is masking any climate change signal
  2. The warming trend is too small to be statistically significant
  3. Climate change is occurring faster than the 30-year periods used to define “normal” conditions
  4. The STL method is not appropriate for temperature data

Answer: c) Climate change is occurring faster than the 30-year periods used to define “normal” conditions

Explanation: When current temperatures exceed even the most recent 30-year climate normal (1991-2020), it demonstrates that climate change is happening rapidly enough that traditional baseline periods become outdated quickly. This highlights the challenge of defining “normal” conditions in a non-stationary climate.

From Analysis to Applications - The nClimGrid Platform

Our step-by-step exploration of climate circles and time series decomposition has revealed the powerful analytical methods that operational climate monitoring systems use to track climate patterns and detect change signals. We manually calculated day-of-year polar coordinates, applied STL decomposition with carefully tuned parameters, and interpreted trend components to understand San Antonio’s changing climate. While these hands-on techniques provide essential understanding of how climate analysis works, implementing them for routine climate monitoring across thousands of geographic boundaries and decades of data would be computationally intensive and time-consuming for most applications.

This is where the transition from research methods to operational tools becomes crucial. The analytical workflows we’ve demonstrated represent the methodological foundation behind professional climate monitoring platforms, but operational systems need to make these sophisticated techniques accessible to broader audiences—urban planners who need seasonal climate summaries for infrastructure design, agricultural extension agents tracking growing season changes, water resource managers assessing drought risk, and community organizations planning climate adaptation strategies.

Operational Climate Monitoring Systems

NOAA/NCEI Climate Products:

  • Monthly climate summaries use STL-style decomposition to separate trends from seasonal patterns
  • Temperature anomaly products apply the same climate normal calculations we demonstrated
  • Climate monitoring dashboards use identical gridded data access and spatial aggregation methods

NIDIS Drought Monitoring:

  • Drought early warning systems rely on climate circle-style analysis for seasonal pattern detection
  • Regional assessments use time series decomposition to distinguish drought from natural variability
  • Drought.gov visualizations implement the same analytical workflows we explored manually

Connection to Practice: Understanding these manual methods enables effective evaluation and application of operational climate products for local decision-making and adaptation planning.

The nClimGrid Package and Platform

The methodological skills you’ve developed in this lesson form the foundation for effectively using the nclimgrid R package and web platform (nclimgrid.isciences.com), developed by ISciences with funding support from the Earth Science Information Partners (ESIP) Lab program. This comprehensive platform transforms the manual analytical processes we’ve demonstrated into automated, scalable tools that democratize access to advanced climate analysis capabilities while maintaining scientific rigor.

The platform follows open science principles throughout its development lifecycle—from open source code and reproducible workflows to public data access and educational integration. By making these sophisticated analytical methods available through both programmatic tools (the R package) and user-friendly web interfaces, the nClimGrid platform bridges the gap between cutting-edge climate science and practical applications across multiple user communities.

From 70,000+ Lines of Code to Three Simple Clicks

To appreciate the value of operational climate platforms, consider the analytical complexity behind the visualizations we’ve created in this lesson. Our climate circle required calculating day-of-year transformations, polar coordinate conversions, color scaling, and seasonal labeling across 70+ years of daily data—roughly 150 lines of R code for a single geographic location. The STL decomposition involved monthly aggregation, time series object creation, parameter tuning, component extraction, and multi-panel visualization—another 200+ lines of code. Scaling these analyses to multiple variables, locations, and time periods would require thousands of lines of code and substantial computational resources.

The nClimGrid web platform automates all of this complexity behind intuitive interfaces that enable sophisticated climate analysis with just a few clicks. Users can generate publication-quality climate circles for any of 100,000+ geographic boundaries, apply STL decomposition with optimized parameters, and export results in multiple formats—all without writing a single line of code.

Interactive Climate Analysis Platform

The nClimGrid web application provides three main interfaces that transform the manual methods we’ve learned into accessible operational tools:

Climate Data Viewer

nClimGrid Viewer Interface

The Viewer tab provides interactive access to NOAA’s complete nClimGrid-Daily archive through dynamic maps and boundary selection tools. Users can explore daily climate conditions across the continental United States, overlay multiple administrative boundaries, and instantly access the same high-resolution gridded data we worked with manually. The interface handles all the complexity of cloud data access, spatial cropping, and statistical aggregation we demonstrated in our exactextractr workflows, presenting results through interactive visualizations that update in real-time as users modify their selections.

This interface exemplifies how operational climate platforms make sophisticated geospatial analysis accessible to non-technical audiences while maintaining the scientific accuracy of the underlying methods. Urban planners can quickly assess temperature patterns across their metropolitan area, watershed managers can examine precipitation trends across drainage boundaries, and agricultural extension agents can monitor growing season conditions—all using the same NOAA datasets and analytical methods we explored manually, but through interfaces designed for operational decision-making.

Advanced Analysis Tools

nClimGrid Analysis Interface

The Analysis tab automates the climate circles and time series decomposition methods we developed step-by-step in this lesson. Users can generate climate circles for any available geographic boundary with automatic polar coordinate transformations, seasonal labeling, and publication-quality styling. The STL decomposition interface applies the same parameter optimization and component interpretation we performed manually, but scales across multiple variables and geographic regions simultaneously.

This automation demonstrates the transition from research methods to operational applications—the mathematical foundations and analytical insights remain identical to our manual implementations, but the platform handles the computational complexity and provides standardized visualizations that support consistent interpretation across different users and applications. Climate researchers can quickly generate decomposition analyses for comparative studies, while water resource managers can track long-term trends in precipitation patterns for infrastructure planning.

Data Export and Integration

nClimGrid CSV Generator

The CSV Generator provides structured data export capabilities that enable integration with existing analytical workflows and decision-support systems. Users can specify custom time periods, variable combinations, and geographic boundaries to generate climate datasets tailored to their specific applications. This capability transforms the manual data processing workflows we demonstrated—connecting to cloud datastores, filtering large datasets, performing spatial aggregations—into automated pipelines that deliver analysis-ready climate data.

This export functionality recognizes that operational climate monitoring often requires integration with existing tools and workflows rather than complete replacement. Agricultural consultants can generate seasonal climate summaries for incorporation into crop planning models, urban planners can extract temperature data for building energy simulations, and emergency managers can create historical climate datasets for risk assessment models.

Scaling Climate Analysis: From Single Location to Continental Coverage

Our manual analysis focused on San Antonio’s climate patterns across a 70+ year record—a substantial analytical undertaking that revealed important insights about urban climate change in south-central Texas. The nClimGrid platform scales this same analytical capability across 100,000+ geographic boundaries covering the entire continental United States, enabling comparative studies and regional assessments that would be impossible through manual analysis alone.

This scaling capability transforms climate analysis from case study exploration to systematic monitoring. Rather than examining individual metropolitan areas in isolation, researchers can compare climate change patterns across the entire U.S. urban hierarchy. Instead of manual calculations for single watersheds, water resource managers can assess climate trends across complete river basin systems. Agricultural extension networks can monitor growing season changes across entire agricultural regions rather than relying on scattered point observations.

The platform maintains the same analytical rigor and methodological foundations we explored manually while providing the computational infrastructure needed for continental-scale climate monitoring. Every climate circle uses the same polar coordinate mathematics we implemented, every time series decomposition applies the same STL algorithms with the same parameter optimization, and every climate normal calculation follows the same WMO standards—but at a scale that enables systematic understanding of climate patterns across diverse geographic and climatic regions.
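
For the normals piece specifically, a minimal sketch of a WMO-style baseline and anomaly calculation is shown below. It assumes a monthly series `monthly_clim` with `date` and `tavg` columns (placeholder names); the 1991-2020 window reflects the current WMO standard normal period, though the baseline used elsewhere in the lesson may differ.

```r
library(dplyr)
library(lubridate)

# Monthly normals over the 1991-2020 base period
normals <- monthly_clim |>
  filter(year(date) >= 1991, year(date) <= 2020) |>
  group_by(month = month(date)) |>
  summarise(normal_tavg = mean(tavg, na.rm = TRUE), .groups = "drop")

# Anomalies relative to the monthly normals
anomalies <- monthly_clim |>
  mutate(month = month(date)) |>
  left_join(normals, by = "month") |>
  mutate(anomaly = tavg - normal_tavg)
```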

Connecting Methods to Monitoring: NOAA and NIDIS Integration

The analytical methods we’ve explored in this lesson—climate circles, time series decomposition, and climate normal calculations—form the foundation of operational climate monitoring systems used by NOAA’s National Centers for Environmental Information (NCEI) and the National Integrated Drought Information System (NIDIS). Understanding these methodological foundations provides the conceptual framework needed to effectively interpret and apply products from these operational systems.

The nClimGrid platform serves as a bridge between these methodological foundations and operational applications, providing tools that implement the same analytical approaches used by national climate monitoring agencies while making them accessible to local and regional decision-makers who need climate information for practical applications.

Knowledge Check

The transition from manual climate analysis methods to operational platforms like the nClimGrid platform primarily addresses which challenge?

  1. Making sophisticated climate analysis techniques accessible to broader audiences while maintaining scientific rigor
  2. Reducing the cost of climate data storage
  3. Improving the accuracy of weather forecasts
  4. Converting all climate data to the same file format

Answer: 1) Making sophisticated climate analysis techniques accessible to broader audiences while maintaining scientific rigor

Explanation: While manual implementation provides essential understanding of climate analysis methodology, operational platforms democratize access to sophisticated techniques like climate circles and STL decomposition, enabling urban planners, agricultural extension agents, water resource managers, and community organizations to perform advanced climate analysis without requiring specialized programming skills.

Key Learning Points

Congratulations! In this lesson you:

  • Accessed and processed operational climate data from NOAA’s nClimGrid-Daily dataset using cloud-native approaches, demonstrating how to efficiently retrieve and work with large-scale gridded climate datasets
  • Created climate circles using polar coordinate transformations to visualize annual climate patterns, revealing seasonal cycles and long-term changes in timing and intensity that linear time series plots often obscure
  • Applied STL decomposition to separate climate signals across multiple timescales, systematically distinguishing long-term climate change trends from seasonal patterns and natural variability
  • Calculated climate normals using WMO standards and computed anomalies for climate change detection, implementing the same baseline methodologies used by operational climate monitoring agencies
  • Interpreted complex time series patterns including trend components that revealed San Antonio’s 1.5°C warming since 1980, seasonal components that remained remarkably stable over time, and remainder patterns that captured extreme events and natural variability
  • Mastered advanced data management techniques including cloud datastore connections, entity ID systems, and efficient spatial aggregation using exactextractr for boundary-based climate analysis
  • Connected analytical methods to operational applications by understanding the methodological foundations behind NOAA, NCEI, and NIDIS climate monitoring products and learning to evaluate their relevance for local applications
  • Developed professional climate analysis workflows from data access and quality control through visualization and interpretation, using the same technical approaches employed by climate research institutions worldwide
  • Applied open science principles by working with freely accessible datasets, reproducible analytical methods, and openly available tools that democratize access to sophisticated climate analysis capabilities
