# Define the six study states most impacted by the 2022 flash drought
study_states <- c("Kansas", "Missouri", "Oklahoma", "Arkansas", "Texas", "Louisiana")
The 2022 Southern Plains Flash Drought: A Multi-Indicator Exploration
Overview
In this lesson, we will explore the 2022 Southern Plains flash drought through a comprehensive multi-indicator analysis that integrates meteorological data, soil moisture monitoring, evapotranspiration calculations, and agricultural outcomes. You will learn to access and analyze operational drought monitoring products from NASA and NOAA, calculate evapotranspiration from meteorological variables, examine relationships between atmospheric drivers and soil moisture depletion, and connect these physical indicators to real-world agricultural consequences. This integrated approach demonstrates how multiple Earth system components interact during extreme events and provides practical experience with the datasets and methods used by operational drought monitoring agencies.
This lesson uses the R programming environment and requires several specialized packages for accessing NASA, NOAA, and USDA datasets. Make sure you have the required packages installed and API credentials configured before beginning the analysis.
Learning Objectives
After completing this lesson, you should be able to:
- Access and process operational drought monitoring data from NASA SPORT-LIS, extracting soil moisture percentiles and creating time series analyses for drought assessment
- Retrieve and analyze high-resolution meteorological data from NOAA URMA, including temperature, humidity, and wind variables that drive flash drought development
- Calculate reference evapotranspiration using the Hargreaves-Samani method to quantify atmospheric water demand during drought conditions
- Process agricultural survey data from USDA NASS to examine real-world drought impacts on pasture and rangeland conditions
- Create integrated visualizations that combine multiple datasets to reveal relationships between meteorological drivers, soil moisture depletion, and agricultural outcomes
- Perform temporal analysis to identify flash drought onset, intensification, and recovery phases using multi-indicator approaches
- Interpret drought monitoring products in the context of operational forecasting and early warning systems used by agricultural and water management communities
- Analyze spatial patterns of flash drought development across state and regional scales to understand geographic vulnerability
- Connect atmospheric processes to agricultural impacts through lagged correlation analysis and temporal pattern recognition
- Apply professional data management practices including API authentication, environmental variable management, and reproducible workflow development
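One of the objectives above, calculating reference evapotranspiration with the Hargreaves-Samani method, comes down to a single temperature-based equation. As a preview, here is a minimal sketch; the function name and the example input values are ours, not part of the lesson's required code:

```r
# Hargreaves-Samani reference evapotranspiration (ET0, mm/day).
# ra_mm is extraterrestrial radiation expressed as mm/day of
# evaporation equivalent; temperatures are in degrees Celsius.
hargreaves_samani_et0 <- function(tmax, tmin, ra_mm) {
  tmean <- (tmax + tmin) / 2
  0.0023 * ra_mm * (tmean + 17.8) * sqrt(tmax - tmin)
}

# Illustrative values for a hot mid-July day on the Southern Plains:
hargreaves_samani_et0(tmax = 38, tmin = 24, ra_mm = 16)
# about 6.7 mm/day of atmospheric water demand
```

Values this high, sustained for weeks, are exactly the kind of atmospheric demand signal we will look for during the 2022 event.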
The 2022 Southern Plains flash drought was a real event that caused significant hardship for farming communities, rural economies, and countless families across the affected region. Agricultural producers faced devastating crop losses, livestock stress, and financial difficulties that extended well beyond the drought period itself. Some operations were forced to liquidate herds, abandon planted acres, or exit farming entirely due to the rapid and severe nature of the drought’s impacts.
While this lesson provides valuable learning opportunities to understand flash drought processes and connect environmental science to real-world applications, please approach this analysis with appropriate consideration for the human dimensions of these events. The data we examine represents actual agricultural stress, economic losses, and community challenges experienced by real people and places.
When discussing your findings or presenting this work to others, remember that behind every data point are farming families, rural communities, and regional economies that were significantly affected by these extreme conditions. Use this knowledge responsibly and with respect for those who experienced these impacts firsthand.
Introduction
Flash drought has emerged as a critical research priority within the atmospheric sciences community due to its rapid onset and disproportionate impacts on unprepared stakeholders (Christian, Hobbins, et al. 2024). In the two decades since the term was coined, the scientific community has forged a new paradigm that recognizes flash droughts have distinct dynamics and impacts compared to conventional droughts, moving beyond definitional debates to focus on the triad of rapid onset, drought development, and associated impacts (Christian, Hobbins, et al. 2024).
Global analysis has identified the central United States as one of the most active flash drought hotspots worldwide, alongside Brazil, the Sahel, and India (Christian et al. 2021). Significantly, six of fifteen major study regions experienced statistically significant increases in flash drought frequency during 1980–2015, with climate projections indicating dramatic increases in North America from 32% in 2015 to 49% in 2100 under extreme emissions scenarios (Christian et al. 2021, 2023). This makes understanding current events like the 2022 Southern Plains flash drought critically important for developing future adaptive capacity.
The 2022 Southern Plains flash drought exemplified the characteristics that make these events so challenging. Unlike conventional droughts that develop gradually, this event was characterized by unusually rapid intensification over days to weeks, catching stakeholders with insufficient time to implement mitigation strategies (Otkin et al. 2022; Christian, Grace, et al. 2024). The U.S. Drought Monitor reported “one-category-per-week degradation for the last four weeks in a row” as of July 28, 2022—a rate described as “lightning speed” affecting an “unusually large area.” This rare event actually consisted of two consecutive flash droughts interrupted by brief recovery, spanning summer and early fall across eastern Oklahoma, Arkansas, and southern Missouri.
Europe Faces Rising Drought Concerns as May 2025 Becomes World’s Second Hottest on Record
May 2025 was the world’s second warmest May ever recorded, with an average global temperature of 15.79°C—0.53°C above the 1991-2020 average and 1.4°C above pre-industrial levels. The record heat has brought severe drought conditions to much of northwestern Europe, with more than half of Europe and the Mediterranean basin facing some form of drought during May 11-20—the highest level recorded for that time period since monitoring began in 2012.
The drought impacts are already severe: northern European farmers report crop failures due to the driest spring in over a century, Germany experienced its driest March on record, and the Rhine River in Cologne dropped to 1.54 meters—about half its usual level. The European Central Bank warns that water scarcity now puts nearly 15% of the eurozone’s economic output at risk.
Northwestern Europe experienced its lowest precipitation and soil moisture levels since at least 1979, while persistent dry conditions have led to the lowest spring river flows across Europe since records began in 1992. The situation demonstrates how rapidly extreme drought conditions can develop and spread across entire continents, highlighting the increasing global challenge of flash drought events in a warming climate.
Source: Euronews, June 11, 2025
The 2022 event was driven by severe precipitation deficits amplified by extreme heat and large-scale atmospheric teleconnections. Texas recorded its warmest and fourth driest April to July period on record, with July 2022 marking the hottest July ever recorded—the first time the state-averaged July mean daily maximum temperature surpassed 100°F (“Assessing the U.S. Climate in 2022” 2023). These conditions, exacerbated by persistent La Niña and negative Pacific Decadal Oscillation, created a “perfect storm” that severely suppressed precipitation and enhanced temperatures. The resulting impacts were catastrophic: Texas alone suffered over $6.4 billion in agricultural losses, with the U.S. cow herd reaching its smallest size since 1961 (Powell 2023; “Cow-Calf Corner | March 24, 2024 - Oklahoma State University” 2025).
The National Integrated Drought Information System (NIDIS), established by Congress in 2006, coordinates drought monitoring, forecasting, and early warning across the United States through federal, state, and local partnerships. Recent advances emphasize five critical pillars for flash drought early warning: 1) physically based identification frameworks, 2) comprehensive monitoring capabilities, 3) improved prediction across time scales, 4) impact assessments, and 5) decision-making guidance (Otkin et al. 2022). However, the 2022 event highlighted prediction challenges, with only 20% and 16% of ensemble forecasts correctly predicting rapid drought development for the two consecutive events (Christian, Grace, et al. 2024).
Open science and freely accessible data portals have become essential for understanding and mitigating drought impacts. NASA’s SPORT-LIS provides daily soil moisture percentiles, NOAA’s URMA delivers high-resolution meteorological analyses, and USDA’s NASS offers comprehensive agricultural survey data (White et al. 2022; National Oceanic and Atmospheric Administration 2025; U.S. Department of Agriculture, National Agricultural Statistics Service 2025). This democratizes access to critical environmental information, enabling communities to develop locally-relevant analyses and early warning systems. Critical research gaps remain in understanding atmospheric and oceanic drivers, developing practitioner-tailored detection indices, and improving subseasonal-to-seasonal prediction capabilities (Christian, Hobbins, et al. 2024).
This lesson provides hands-on experience with the operational datasets and analytical methods used to monitor, understand, and predict flash drought events like the 2022 Southern Plains case study. Through systematic analysis of NASA soil moisture data, NOAA meteorological observations, and USDA agricultural surveys, you will learn to integrate multiple Earth system datasets to reveal the connections between atmospheric processes and real-world agricultural impacts. By the end of this lesson, you will have developed practical skills in accessing, processing, and interpreting the same data sources used by operational drought monitoring agencies and research institutions worldwide.
What is the primary characteristic that distinguishes flash droughts from conventional droughts?
- a) Flash droughts occur only during summer months
- b) Flash droughts develop through unusually rapid intensification over days to weeks
- c) Flash droughts are caused exclusively by high temperatures
- d) Flash droughts only affect agricultural areas
Answer: b) Flash droughts develop through unusually rapid intensification over days to weeks
Explanation: While conventional droughts develop gradually over months or years, flash droughts are characterized by their rapid intensification, often catching stakeholders with insufficient time to implement mitigation strategies.
Required R Packages
This lesson uses the following R packages for accessing and analyzing flash drought data:
- tidycensus: Access US Census Bureau data including state boundaries and demographic information for defining study regions
- sf: Simple Features for spatial data handling, operations, and coordinate reference system management
- terra: Modern raster data processing and analysis, successor to the raster package with improved performance
- earthdatalogin: NASA Earthdata authentication for accessing SPORT-LIS soil moisture data from NASA servers
- rnassqs: Access USDA National Agricultural Statistics Service (NASS) data through their QuickStats API
- dplyr: Essential data manipulation functions including filtering, grouping, and summarizing operations
- ggplot2: Advanced data visualization and plotting with publication-quality graphics
- lubridate: Date and time manipulation for handling temporal data and creating time series
- RColorBrewer: Color palettes for data visualization, including official drought monitoring colors
- dotenv: Secure management of API keys and credentials through environment variables
- tidyterra: Integration between terra raster objects and ggplot2 for modern raster visualization
- exactextractr: Precise area-weighted extraction of raster statistics using polygon boundaries
- knitr: Dynamic report generation and formatted table creation for presenting analysis results
- kableExtra: Enhanced table formatting with styling options for HTML and PDF outputs
- tidyr: Data reshaping for converting between wide and long formats, essential for time series analysis
Most of these packages are available on CRAN and can be installed using install.packages(). The earthdatalogin package provides streamlined access to NASA data services, while tidyterra offers modern integration between raster analysis and the tidyverse ecosystem. For API access, you’ll need to register for credentials with NASA Earthdata and USDA NASS as described in the lesson.
Study Region Definition
For our analysis, we’ll focus on six south-central states that bore the brunt of the 2022 flash drought: Kansas, Missouri, Oklahoma, Arkansas, Texas, and Louisiana. These states form a natural laboratory for studying flash drought development, as they experienced the full spectrum of impacts—from the initial rapid intensification in Texas and Oklahoma to the broader regional expansion that eventually affected agricultural operations across the entire area.
We’ll start by downloading state boundary data from the US Census Bureau, which provides standardized geographic boundaries that align with how drought monitoring agencies organize their assessments. To ensure our satellite and meteorological data captures the complete picture, we’ll also create a slightly larger analysis area by adding a buffer around our study states. This buffered region helps us avoid edge effects when processing gridded datasets and provides important spatial context for understanding how atmospheric patterns extended beyond state lines during the drought.
Now we’ll download the actual state boundary geometries from the US Census Bureau using the tidycensus package.
# Get state boundaries using tidycensus
states_sf <- tidycensus::get_acs(
  geography = "state",
  variables = "B01001_001", # Total population variable (just to get boundaries)
  year = 2020,
  geometry = TRUE
) |>
  dplyr::filter(NAME %in% study_states) |>
  dplyr::select(state_name = NAME, geometry)
Now we can take a peek at the structure. It’s pretty straightforward with sf objects: typically one row per boundary, a name column, possibly other attributes, and then the geometry field that holds all the spatial information.
# Examine the structure of our spatial data
states_sf
Next we need to create a bounding box with a buffer around our study states. This buffered extent will provide better spatial context when we crop our gridded datasets.
# Create study region bounding box with 10% buffer for spatial context
study_bbox <- sf::st_bbox(states_sf)
lon_range <- study_bbox[3] - study_bbox[1]
lat_range <- study_bbox[4] - study_bbox[2]

# Add 10% buffer to each direction
buffered_bbox <- c(
  xmin = study_bbox[1] - (lon_range * 0.1),
  ymin = study_bbox[2] - (lat_range * 0.1),
  xmax = study_bbox[3] + (lon_range * 0.1),
  ymax = study_bbox[4] + (lat_range * 0.1)
)

buffered_bbox
xmin.xmin ymin.ymin xmax.xmax ymax.ymax
-108.42851 24.35975 -87.03415 42.09127
Finally, we’ll convert this bounding box into a terra extent object that we can use for cropping our raster datasets.
# Create analysis extent for cropping gridded datasets
study_extent <- terra::ext(buffered_bbox[1], buffered_bbox[3],
                           buffered_bbox[2], buffered_bbox[4])
study_extent
SpatExtent : -108.4285089, -87.0341541, 24.3597507, 42.0912663 (xmin, xmax, ymin, ymax)
These coordinates define a rectangular boundary around our six study states, with the numbers representing the westernmost longitude (-108.4°), southernmost latitude (24.4°), easternmost longitude (-87.0°), and northernmost latitude (42.1°). The 10% buffer we added extends this boundary beyond the actual state borders to ensure our satellite and weather data captures the complete atmospheric patterns affecting the region, avoiding edge effects when we crop the gridded datasets.
Let’s create a quick verification plot to visualize our study region and the buffered analysis extent. First we need to make a sf polygon using the extent we just established so we can view it on the map with the state boundaries.
# Create extent rectangle for visualization
extent_poly <- sf::st_polygon(list(matrix(c(
  buffered_bbox[1], buffered_bbox[2], # bottom-left
  buffered_bbox[3], buffered_bbox[2], # bottom-right
  buffered_bbox[3], buffered_bbox[4], # top-right
  buffered_bbox[1], buffered_bbox[4], # top-left
  buffered_bbox[1], buffered_bbox[2]  # close polygon
), ncol = 2, byrow = TRUE))) |>
  sf::st_sfc(crs = sf::st_crs(states_sf))
Now put it all together.
# Plot study region with buffered extent
ggplot2::ggplot() +
  ggplot2::geom_sf(data = extent_poly, fill = "lightblue", alpha = 0.3,
                   color = "blue", linetype = "dashed", size = 1) +
  ggplot2::geom_sf(data = states_sf, fill = "lightgreen", alpha = 0.6,
                   color = "darkgreen", size = 0.8) +
  ggplot2::geom_sf_text(data = states_sf, ggplot2::aes(label = state_name),
                        size = 3, fontface = "bold") +
  ggplot2::labs(title = "2022 Flash Drought Study Region",
                caption = "Blue dashed line shows analysis extent for gridded data cropping",
                x = "",
                y = "") +
  ggplot2::theme_minimal()
Excellent, all of our states of interest are present and in the expected coordinate space. We can move on.
Timespan Definition
Understanding flash drought requires capturing its defining characteristic: the dramatic speed at which conditions can deteriorate. The 2022 Southern Plains event unfolded like a slow-motion disaster that was actually anything but slow—what typically takes months or years happened in a matter of weeks. Our analysis will span the most critical period from June through October 2022, when the region experienced not just one flash drought, but essentially two back-to-back events separated by brief moments of relief.
# Define analysis period covering both flash drought events
analysis_start <- lubridate::as_date("2022-06-01")
analysis_end <- lubridate::as_date("2022-10-31")

# Key dates from drought monitoring records
first_intensification <- lubridate::as_date("2022-07-19") # 93% of Southern Plains in D1+ drought
peak_severity <- lubridate::as_date("2022-07-22")         # Peak of first flash drought event
partial_recovery <- lubridate::as_date("2022-08-30")      # Brief improvement period
peak_conus_extent <- lubridate::as_date("2022-10-25")     # 63% of CONUS in drought (Monthly Climate Reports | Drought Report | Annual 2022)

analysis_period <- seq(analysis_start, analysis_end, by = "day")
This timeframe tells the complete story: the initial development in June and July that caught many agricultural producers off guard, a short-lived recovery period in August that offered false hope, and then a second intensification through fall that ultimately made 2022 a record-breaking drought year. By focusing on these critical months, we can track how rapidly soil moisture plummeted, atmospheric demand skyrocketed, and agricultural conditions deteriorated—providing a real-time view of flash drought as it actually unfolds.
# Count the days in our analysis period
total_days <- length(analysis_period)
cat("Analysis period:", as.character(analysis_start), "to", as.character(analysis_end), "\n",
    "Total days in analysis:", total_days, "\n")
Analysis period: 2022-06-01 to 2022-10-31
Total days in analysis: 153
Data Access and Authentication Setup
Working with operational drought monitoring datasets requires access to several data services that use API keys and authentication systems. These credentials provide secure access to data while allowing providers to track usage and maintain service quality.
API keys and passwords should never be written directly into your code or shared in scripts. Environmental variables provide a secure way to store sensitive information separately from your analysis code. This approach allows you to share your code publicly while keeping your credentials private. This is a fundamental best practice in data science and software development. When credentials are hardcoded in scripts, they can accidentally be shared through version control systems like Git, exposed in screenshots, or accessed by unauthorized users. Environmental variables create a secure barrier between your analysis logic and your authentication credentials, enabling reproducible research while maintaining security standards.
# Load dotenv package for reading .env files
library(dotenv)
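Before relying on these environment variables downstream, it can help to fail fast when one is missing rather than letting a data download fail later with a cryptic error. Here is a small hypothetical helper illustrating the pattern; the function name is ours and it is not part of the lesson's required code:

```r
# Read a required environment variable, stopping with a clear
# message if it has not been set (e.g., the .env file is missing)
get_credential <- function(name) {
  value <- Sys.getenv(name, unset = NA)
  if (is.na(value) || !nzchar(value)) {
    stop("Missing environment variable: ", name, call. = FALSE)
  }
  value
}

# Example (succeeds only after Sys.setenv() or load_dot_env() has run):
# username <- get_credential("EARTHDATA_USERNAME")
```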
Setting Up Credentials
There are two main approaches for managing environmental variables in R:
Option 1: Using Sys.setenv() directly in R
# Set credentials manually (not recommended for shared code)
Sys.setenv(EARTHDATA_USERNAME = "your_username")
Sys.setenv(EARTHDATA_PASSWORD = "your_password")
Sys.setenv(USDA_NASS_API = "your_api_key")
Sys.setenv(AWS_NO_SIGN_REQUEST = "YES")
Option 2: Using a .env file (recommended)
Create a file named .env in your project directory with your credentials in the following format:
# NASA Earthdata credentials for SPORT-LIS soil moisture data
EARTHDATA_USERNAME=your_actual_username
EARTHDATA_PASSWORD=your_actual_password
# USDA NASS API key for agricultural survey data
USDA_NASS_API=your_actual_api_key_here
# AWS configuration for public NOAA data access
AWS_NO_SIGN_REQUEST=YES
Important Notes:
- Replace the example values with your actual credentials
- Do not use quotes around the values in .env files
- Do not include spaces around the equals sign
- Never commit your .env file to version control (add it to .gitignore)
- Never share your .env file as it contains sensitive information
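The .gitignore step from the notes above can be automated from R itself. A small sketch (run from the project directory; the variable names are ours):

```r
# Append .env to .gitignore so credentials never reach version control,
# without duplicating the entry if it is already present
gitignore_lines <- if (file.exists(".gitignore")) readLines(".gitignore") else character()
if (!".env" %in% gitignore_lines) {
  writeLines(c(gitignore_lines, ".env"), ".gitignore")
}
```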
Then load the credentials using the dotenv package:
# Load credentials from .env file
dotenv::load_dot_env()

# Verify credentials are loaded (without exposing values)
cat("EARTHDATA_USERNAME loaded:", !is.na(Sys.getenv("EARTHDATA_USERNAME", unset = NA)),
    "\nUSDA_NASS_API loaded:", !is.na(Sys.getenv("USDA_NASS_API", unset = NA)),
    "\nAWS_NO_SIGN_REQUEST loaded:", !is.na(Sys.getenv("AWS_NO_SIGN_REQUEST", unset = NA)))
EARTHDATA_USERNAME loaded: TRUE
USDA_NASS_API loaded: TRUE
AWS_NO_SIGN_REQUEST loaded: TRUE
Data Services We’ll Access
- NASA Earthdata: SPORT-LIS soil moisture percentiles
  - Requires: EARTHDATA_USERNAME and EARTHDATA_PASSWORD
- USDA NASS: Agricultural survey data including pasture conditions
  - Requires: USDA_NASS_API key
- NOAA AWS: URMA meteorological analysis data
  - Uses: AWS_NO_SIGN_REQUEST=YES for public access (no authentication required)
The next sections will demonstrate how to use these credentials to access each dataset securely.
SPORT-LIS Soil Moisture Data
Soil moisture represents the foundation of agricultural drought—when soils dry out rapidly, crops stress, pastures fail, and rural economies suffer. But measuring soil moisture across vast agricultural regions presents enormous challenges. Traditional ground-based sensors provide precise point measurements but leave huge gaps between monitoring sites, while satellite observations often lack the spatial detail or temporal frequency needed to track the rapid changes characteristic of flash drought.
SPORT-LIS (Short-term Prediction Research and Transition Land Information System) bridges this gap by combining satellite observations, weather data, and sophisticated land surface models to produce daily soil moisture estimates at 3-kilometer resolution across the entire continental United States (White et al. 2022). The system generates percentiles for numerous hydrological variables including soil moisture at multiple depths, evapotranspiration, runoff, and snow water equivalent. Rather than providing raw soil moisture values, SPORT-LIS delivers percentiles that rank current conditions against decades of historical data—a 10th percentile reading means current soil moisture is drier than 90% of all historical conditions for that location and time of year.
For our flash drought analysis, we’ll focus specifically on the 0-1 meter soil moisture percentiles, which integrate surface and root zone conditions most relevant to agricultural impacts. This percentile approach makes SPORT-LIS invaluable for drought monitoring because it automatically accounts for seasonal patterns and regional differences, allowing us to quickly identify when and where conditions are becoming unusually dry.
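Because SPORT-LIS delivers percentiles, translating a value into a drought severity class is a simple threshold lookup. The U.S. Drought Monitor commonly associates percentile ranges with its D0-D4 categories (roughly 21-30 = D0, 11-20 = D1, 6-10 = D2, 3-5 = D3, 0-2 = D4). A sketch of that lookup; the function name is ours and the thresholds are the commonly cited guidance, not an official SPORT-LIS mapping:

```r
# Classify soil moisture percentiles into USDM-style drought categories
classify_drought <- function(percentile) {
  cut(percentile,
      breaks = c(0, 2, 5, 10, 20, 30, 100),
      labels = c("D4 (exceptional)", "D3 (extreme)", "D2 (severe)",
                 "D1 (moderate)", "D0 (abnormally dry)", "None"),
      include.lowest = TRUE)
}

classify_drought(c(1, 7, 15, 55))
# D4 (exceptional), D2 (severe), D1 (moderate), None
```

We will apply this kind of categorical thinking when interpreting the maps and time series below.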
NASA Earthdata Authentication Setup
NASA Earthdata requires a .netrc file for automated authentication. This file stores your credentials in a format that allows secure, programmatic access to NASA datasets.
# Set up NASA Earthdata authentication using .netrc file
earthdatalogin::edl_netrc(
  username = Sys.getenv("EARTHDATA_USERNAME"),
  password = Sys.getenv("EARTHDATA_PASSWORD")
)
The .netrc file is created automatically in your home directory and stores your credentials securely for future data access. This only needs to be run once per system setup.
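For reference, a NASA Earthdata entry in a .netrc file typically follows the standard format below (shown with placeholder values, never your real credentials in shared files):

```
machine urs.earthdata.nasa.gov
login your_username
password your_password
```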
Single Date Example: Peak Flash Drought Conditions
Let’s start by examining soil moisture conditions on July 19, 2022, when 93% of the Southern Plains region was experiencing drought conditions (Southern Plains Drought Status Update July 22, 2022).
# Construct URL for SPORT-LIS soil moisture percentiles on peak drought date
peak_date <- "20220719"
sport_url <- paste0("https://data.ghrc.earthdata.nasa.gov/ghrcw-public/stage/sportlis__1/percentiles/grid/vsm_percentile_",
                    peak_date, ".grb2")

# Load soil moisture percentile data
soil_moisture_peak <- terra::rast(sport_url)
soil_moisture_peak
class : SpatRaster
dimensions : 929, 1929, 4 (nrow, ncol, nlyr)
resolution : 0.03, 0.03 (x, y)
extent : -124.94, -67.07, 25.06, 52.93 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat Coordinate System imported from GRIB file
source : vsm_percentile_20220719.grb2
names : 0-0.1[m~38) [-], 0-0.4[m~38) [-], 0-1[m] ~38) [-], 0-2[m] ~38) [-]
unit : -, -, -, -
time : 2022-07-19 UTC
This SpatRaster object contains soil moisture percentile data for the entire continental United States on July 19, 2022, organized into four layers representing different soil depths: 0-10 cm (surface), 0-40 cm (shallow root zone), 0-100 cm (deep root zone), and 0-200 cm (total soil column). The spatial resolution is 0.03 degrees (roughly 3 kilometers), and the 929-row by 1,929-column grid spans CONUS. Percentile values in each layer rank current soil moisture against 20+ years of historical data for that specific location and date.
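The 0.03-degree grid spacing translates to roughly 3 km, but the east-west size of a cell shrinks with latitude. A quick back-of-the-envelope check, using the standard spherical approximation of about 111.32 km per degree of latitude (the variable names here are ours):

```r
# Approximate cell size in km for a 0.03-degree grid
deg <- 0.03
km_per_deg_lat <- 111.32            # roughly constant with latitude
ns_km <- deg * km_per_deg_lat       # north-south cell size
ew_km <- ns_km * cos(33 * pi / 180) # east-west cell size near 33 degrees N

round(c(north_south = ns_km, east_west = ew_km), 2)
# ~3.34 km north-south, ~2.80 km east-west
```

This is why the "3 km" figure quoted for SPORT-LIS is a nominal resolution rather than an exact, uniform cell size.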
Let’s crop the raster to narrow the scope down to our 6 states of interest.
# Crop to our buffered study extent
soil_moisture_cropped <- terra::crop(soil_moisture_peak, study_extent)

# Check layer names and depths
names(soil_moisture_cropped)
[1] "0-0.1[m] DBLL=Depth below land surface; (prodType 2, cat 0, subcat 238) [-]"
[2] "0-0.4[m] DBLL=Depth below land surface; (prodType 2, cat 0, subcat 238) [-]"
[3] "0-1[m] DBLL=Depth below land surface; (prodType 2, cat 0, subcat 238) [-]"
[4] "0-2[m] DBLL=Depth below land surface; (prodType 2, cat 0, subcat 238) [-]"
Now we’ll create a visualization to see the spatial pattern of drought conditions across our study region.
# Create a quick visualization of surface soil moisture (0-10cm layer)
terra::plot(soil_moisture_cropped[[1]],
            main = "Surface Soil Moisture Percentiles - July 19, 2022",
            col = RColorBrewer::brewer.pal(11, "RdYlBu"))

# Add state boundaries for context
plot(sf::st_geometry(states_sf), add = TRUE, border = "black", lwd = 1.5)
Pretty, pretty, good! Looking at the SPORT-LIS data from July 19, 2022, you can see exactly why this flash drought was such a big deal. All the deep red across Texas, Oklahoma, and Kansas represents soil moisture percentiles below the 10th percentile, meaning the soil was drier than it had been 90% of the time historically for mid-July. The darkest red zones in central and western Texas are showing near-record or actual record-breaking dryness.
What really stands out is how sharp the boundaries are - you’ve got intense drought conditions right next to relatively normal areas (the blue regions). That’s classic flash drought behavior - it doesn’t gradually spread across the landscape like a typical drought. Instead, it develops rapidly in specific areas while leaving neighboring regions relatively unaffected.
Here’s what makes this particularly significant: this extreme dryness developed in just 4-6 weeks. Back in early June, much of this red area was showing normal or even above-normal soil moisture. The speed of that transformation - from adequate moisture to record-breaking drought in about a month - is exactly what makes flash drought so challenging for farmers and ranchers who suddenly find themselves dealing with severe conditions with little warning.
SPORT-LIS Technical Specifications:
- Spatial Resolution: 3 km grid spacing
- Temporal Resolution: Daily analysis
- Coverage: Continental United States (CONUS)
- Data Source: NASA GSFC/SPoRT Program
- Update Frequency: Daily operational system
- File Format: GRIB2 format
- Coordinate System: Geographic (latitude/longitude)
- Historical Archive: Available from 2017-present
- Model Framework: NASA Land Information System (LIS)
Key Variables Available:
- Soil moisture percentiles (0-10cm, 10-40cm, 40-100cm, 0-100cm)
- Evapotranspiration percentiles
- Runoff percentiles
- Snow water equivalent percentiles
- Streamflow percentiles
- Groundwater percentiles
Percentile Calculation: Based on 20+ year climatology (2000-present) using day-of-year ranking against historical distribution. Percentiles range from 0-100, where values below 20th percentile indicate dry conditions and below 10th percentile indicate severe drought conditions.
Model Physics: Integrates satellite observations (soil moisture, precipitation, vegetation) with meteorological forcing data through the Noah-MP land surface model to produce physically consistent hydrological estimates.
Citation: White, K. D., LeRoy, A., Fuell, K. K., Hain, C. & Case, J. L. NASA’s Integrated Multi-Satellite Retrievals (IMERG) and NASA SPoRT Land Information System Products for Drought Analysis. in vol. 102 446 (2022).
Systematic Time Series Analysis
Now we’ll load soil moisture data for our complete analysis period using a combination of weekly sampling plus key drought event dates. We’ll focus on the 0-1m soil layer, which integrates surface and root zone conditions and is most relevant for agricultural impacts.
# Create weekly date sequence for our analysis period
weekly_dates <- seq(analysis_start, analysis_end, by = "week")

# Add key drought event dates from literature
key_dates <- c(
  lubridate::as_date("2022-07-19"), # 93% of Southern Plains in D1+ drought
  lubridate::as_date("2022-07-22"), # Peak of first flash drought event
  lubridate::as_date("2022-08-30"), # Partial recovery period
  lubridate::as_date("2022-10-25")  # Peak CONUS drought extent
)

# Combine weekly and key dates, remove duplicates, and sort
all_dates <- sort(unique(c(weekly_dates, key_dates, analysis_end)))
date_strings <- format(all_dates, "%Y%m%d")

# Display our sampling strategy
paste("Total analysis dates:", length(all_dates))
[1] "Total analysis dates: 27"
Now we’ll construct the URLs for accessing all our analysis dates from the SPORT-LIS archive.
# Create URLs for all our analysis dates
sport_urls <- sapply(date_strings, function(date_string) {
  paste0("https://data.ghrc.earthdata.nasa.gov/ghrcw-public/stage/sportlis__1/percentiles/grid/vsm_percentile_",
         date_string, ".grb2")
})

paste("Total URLs to process:", length(sport_urls))
[1] "Total URLs to process: 27"
Let’s test our approach by loading and processing the first date to establish our workflow before processing all dates.
# Load and process the first date to establish our approach
first_date <- terra::rast(sport_urls[1])
first_date_cropped <- terra::crop(first_date, study_extent)

# Extract only the 0-1m soil layer (layer 3)
first_date_0to1m <- first_date_cropped[[3]]

first_date_0to1m
class : SpatRaster
dimensions : 568, 714, 1 (nrow, ncol, nlyr)
resolution : 0.03, 0.03 (x, y)
extent : -108.44, -87.02, 25.06, 42.1 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat Coordinate System imported from GRIB file
source(s) : memory
varname : vsm_percentile_20220601
name : 0-1[m] DBLL=Depth below land s~Type 2, cat 0, subcat 238) [-]
min value : 0.4329004
max value : 99.9953995
time : 2022-06-01 UTC
Now we’ll process all dates systematically. This may take several minutes as we’re downloading and processing multiple datasets from NASA servers.
# Initialize list to store processed rasters
soil_rasters <- list()
days <- names(sport_urls)

# Load and process each date
for (day in days) {
  # Load full raster
  temp_raster <- terra::rast(sport_urls[day])
  # Crop to study region
  temp_raster <- terra::crop(temp_raster, study_extent)
  # Extract 0-1m layer
  temp_raster <- temp_raster[[3]]
  # Store in list
  soil_rasters[[day]] <- temp_raster
  # Print progress (cat() prints inside a loop; paste() alone does not)
  cat("-", day, "\n")
}

# Combine into single multi-layer raster
soil_timeseries <- terra::rast(soil_rasters)
soil_timeseries
class : SpatRaster
dimensions : 568, 714, 27 (nrow, ncol, nlyr)
resolution : 0.03, 0.03 (x, y)
extent : -108.44, -87.02, 25.06, 42.1 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat Coordinate System imported from GRIB file
source(s) : memory
varnames : vsm_percentile_20220601
vsm_percentile_20220608
vsm_percentile_20220615
...
names : 20220601, 20220608, 20220615, 20220622, 20220629, 20220706, ...
min values : 0.4329004, 0.4329004, 0.4329004, 0.4329004, 0.4329004, 0.4329004, ...
max values : 99.9953995, 99.9953995, 99.9953995, 99.9953995, 99.9953995, 99.9953995, ...
time : 2022-06-01 to 2022-10-31 UTC
Quick Preview of Flash Drought Progression
Let’s start with a simple overview of our complete time series with terra::panel()
to see the overall patterns of soil moisture change.
This time series provides an excellent overview of how the 2022 flash drought evolved across our study period. You can clearly see the characteristic flash drought pattern - rapid shifts from higher percentiles (blues and greens) to lower percentiles (yellows and reds) during the intensification periods. Notice how quickly the region transitions from relatively normal conditions in early June to widespread drought stress by mid-July, followed by some recovery in August, and then a return to drought conditions through fall. The speed of these transitions - particularly the June-July deterioration - is what distinguishes flash drought from conventional drought events that develop much more gradually over seasons or years.
Creating Publication-Quality Maps
Now we’ll create a more polished visualization using ggplot2 with the official NIDIS/drought.gov color palette. First, let’s define the drought color scheme used in all official U.S. Drought Monitor products.
# Official NIDIS/USDM drought color palette (hex codes from drought.gov)
# These colors represent drought intensity from exceptional (dark red) to abnormally dry (yellow)
drought_colors <- c(
  "#730000", # Exceptional drought (D4) - darkest red
  "#E60000", # Extreme drought (D3) - bright red
  "#FFAA00", # Severe drought (D2) - orange
  "#FCD37F", # Moderate drought (D1) - light orange
  "#FFFF00"  # Abnormally dry (D0) - yellow
)

# Reverse order for soil moisture percentiles (low percentile = drought = red)
# Since our data shows percentiles (0-100), we want low values (drought) as red
soil_moisture_colors <- rev(c("#FFFF00", "#FCD37F", "#FFAA00", "#E60000", "#730000"))

# Add additional colors for higher percentiles (adequate to abundant moisture)
soil_moisture_palette <- c(
  soil_moisture_colors, # 0-50th percentiles (drought conditions)
  "#87CEEB", # Light blue for above normal
  "#4169E1", # Medium blue for well above normal
  "#0000FF"  # Dark blue for exceptional moisture
)
For the ggplot2 visualization, we’ll select the key drought event dates we defined earlier that represent important phases of the flash drought evolution.
# Use the key drought event dates we defined earlier
key_event_dates <- c(analysis_start, first_intensification, peak_severity,
                     partial_recovery, peak_conus_extent, analysis_end)
key_date_strings <- format(key_event_dates, "%Y%m%d")

# Find which layers in our time series correspond to these dates
layer_names <- names(soil_timeseries)
key_layer_indices <- match(key_date_strings, layer_names)

# Remove any dates that don't have corresponding data
available_indices <- key_layer_indices[!is.na(key_layer_indices)]
available_dates <- key_event_dates[!is.na(key_layer_indices)]

for (i in seq_along(available_dates)) {
  cat("\n", as.character(available_dates[i]), "- Layer", available_indices[i])
}

# Extract the key date layers
key_soil_data <- soil_timeseries[[available_indices]]

# Convert raster to data frame for ggplot2
soil_df <- tidyterra::as_tibble(key_soil_data, xy = TRUE)

# Add proper date labels for the layers
date_labels <- format(available_dates, "%B %d, %Y")
names(soil_df)[3:ncol(soil_df)] <- date_labels

# Reshape for ggplot2 faceting
soil_df_long <- soil_df |>
  tidyr::pivot_longer(cols = -c(x, y), names_to = "date_event",
                      values_to = "soil_moisture_percentile")

# Convert date_event to factor with chronological order to fix panel ordering
soil_df_long$date_event <- factor(soil_df_long$date_event, levels = date_labels)

soil_df_long
Now we’ll create the ggplot2 visualization with state boundaries and the official drought color palette.
# Create publication-quality map with state boundaries
ggplot2::ggplot() +
  ggplot2::geom_raster(data = soil_df_long,
                       ggplot2::aes(x = x, y = y, fill = soil_moisture_percentile)) +
  ggplot2::geom_sf(data = states_sf, fill = NA, color = "black", size = 0.8) +
  ggplot2::scale_fill_gradientn(
    colors = soil_moisture_palette,
    name = "Soil Moisture\nPercentile",
    limits = c(0, 100),
    breaks = c(0, 25, 50, 75, 100),
    labels = c("0th\n(Severe Drought)", "25th", "50th\n(Normal)", "75th", "100th\n(Abundant)")
  ) +
  ggplot2::facet_wrap(~ date_event, ncol = 3) +
  ggplot2::labs(
    title = "2022 Flash Drought Evolution - Key Event Dates",
    subtitle = "Soil Moisture Percentiles (0-1m depth) with Official NIDIS Color Scheme",
    caption = "Source: NASA SPORT-LIS | Colors match drought.gov standards"
  ) +
  ggplot2::theme_minimal() +
  ggplot2::theme(
    axis.text = ggplot2::element_blank(),
    axis.ticks = ggplot2::element_blank(),
    axis.title = ggplot2::element_blank(),
    panel.grid = ggplot2::element_blank()
  )
This ggplot2 approach provides much better control over the visualization, allowing us to add state boundaries cleanly and use the official drought monitoring color scheme that matches drought.gov products. Focusing on the key dates book-ended by June 1 and October 31 shows mediocre-to-poor conditions in early summer that rapidly intensified into extreme drought by mid-July, eased briefly in August, and then intensified again through September and October.
State-Level Soil Moisture Analysis
Now we’ll extract state-level soil moisture averages to understand how drought conditions varied across our study region. This analysis will help us see which states were most severely affected and how quickly conditions changed.
Before extracting data, we need to ensure our state boundaries and raster data are in the same coordinate reference system. The SPORT-LIS data uses a geographic coordinate system, so we’ll reproject our state boundaries to match.
# Reproject state boundaries to match raster CRS
states_sf_reproj <- sf::st_transform(states_sf, terra::crs(soil_timeseries))
Now we’ll extract the average soil moisture percentile for each state across all time periods using exactextractr, which provides more accurate area-weighted extraction than simple overlay methods.
# Extract mean soil moisture percentiles for each state across all dates
state_soil_moisture <- exactextractr::exact_extract(
  soil_timeseries,
  states_sf_reproj,
  'mean',
  force_df = TRUE,
  full_colnames = TRUE,
  append_cols = "state_name",
  progress = FALSE
)
# Examine the structure
head(state_soil_moisture)
The exactextractr::exact_extract() function provides several advantages:

- Area-weighted extraction: accounts for partial pixel overlap with state boundaries
- Complete data frame output: force_df = TRUE returns a clean data frame
- Descriptive column names: full_colnames = TRUE creates informative column headers
- State identification: append_cols = "state_name" adds state names to the output
The output shows the average soil moisture percentile for each state across our analysis period, with each column representing a specific date (formatted as YYYYMMDD) and the values indicating how dry or wet conditions were relative to historical norms. For example, Arkansas fell from the 53rd percentile on June 1st (near-normal moisture) to the 15th percentile by July 19th (approaching severe drought) in roughly seven weeks. This rapid deterioration is the hallmark of flash drought development.
Now we’ll create a summary table showing soil moisture conditions for our 6 key drought event dates.
# Use the same key dates we defined for the maps plus analysis boundaries
key_event_dates_expanded <- c(analysis_start, first_intensification, peak_severity,
                              partial_recovery, peak_conus_extent, analysis_end)
key_date_strings_expanded <- format(key_event_dates_expanded, "%Y%m%d")

# Find which columns correspond to these dates (they have "mean." prefix)
soil_moisture_columns <- names(state_soil_moisture)
date_columns <- soil_moisture_columns[grepl("^mean\\.", soil_moisture_columns)]

# Extract just the date part from column names
column_dates <- gsub("^mean\\.", "", date_columns)

# Find which columns match our key dates
key_column_matches <- match(key_date_strings_expanded, column_dates)
available_key_columns <- paste0("mean.", key_date_strings_expanded[!is.na(key_column_matches)])
available_key_dates <- key_event_dates_expanded[!is.na(key_column_matches)]

# Create subset for key dates
key_dates_data <- state_soil_moisture[, c("state_name", available_key_columns)]

# Rename columns with readable date labels
date_labels_expanded <- format(available_key_dates, "%b %d")
names(key_dates_data)[2:ncol(key_dates_data)] <- date_labels_expanded

key_dates_data
Let’s create a formatted HTML table showing how soil moisture percentiles changed across key dates for each state.
# Create formatted table for key dates
key_dates_data |>
  knitr::kable(
    digits = 1,
    caption = "State-Average Soil Moisture Percentiles (0-1m) During Key 2022 Flash Drought Dates",
    col.names = c("State", date_labels_expanded)
  ) |>
  kableExtra::kable_styling(bootstrap_options = c("striped", "hover", "condensed")) |>
  kableExtra::add_header_above(c(" " = 1, "Key Drought Event Dates" = length(date_labels_expanded))) |>
  kableExtra::column_spec(1, bold = TRUE)
| State | Jun 01 | Jul 19 | Jul 22 | Aug 30 | Oct 25 | Oct 31 |
|---|---|---|---|---|---|---|
| Arkansas | 53.1 | 14.9 | 14.3 | 52.0 | 12.0 | 36.8 |
| Missouri | 52.8 | 14.2 | 11.3 | 27.6 | 4.4 | 16.9 |
| Oklahoma | 32.6 | 12.1 | 13.7 | 38.4 | 16.2 | 27.6 |
| Kansas | 38.0 | 19.8 | 16.9 | 15.0 | 9.5 | 8.0 |
| Texas | 21.8 | 19.2 | 19.3 | 64.8 | 38.4 | 46.9 |
| Louisiana | 47.9 | 27.7 | 24.9 | 81.5 | 22.3 | 32.7 |
These are alarming numbers. Arkansas and Missouri fall from above the 50th percentile on June 1st to the mid-teens by mid-July and hover there until the late-August recovery period.
Now we’ll prepare the full time series data for visualization by converting the exactextractr output directly to long format.
# Convert exactextractr output to long format for ggplot2
# Extract date information from column names and convert the data to long format
state_timeseries_long <- state_soil_moisture |>
  tidyr::pivot_longer(
    cols = dplyr::starts_with("mean."),
    names_to = "date_column",
    values_to = "soil_moisture_percentile"
  ) |>
  dplyr::mutate(
    # Extract date from column name and convert to Date object
    date = as.Date(gsub("^mean\\.", "", date_column), format = "%Y%m%d")
  ) |>
  dplyr::select(state = state_name, date, soil_moisture_percentile)

head(state_timeseries_long)
Finally, we’ll create a time series plot showing how soil moisture percentiles changed in each state throughout the 2022 flash drought period.
# Create time series plot of state-level soil moisture
ggplot2::ggplot(state_timeseries_long,
                ggplot2::aes(x = date, y = soil_moisture_percentile, color = state)) +
  ggplot2::geom_line(linewidth = 1.2, alpha = 0.8) +
  ggplot2::geom_point(size = 1.5, alpha = 0.7) +
  ggplot2::geom_hline(yintercept = c(20, 10, 5, 2), linetype = "dashed", alpha = 0.5, color = "gray") +
  ggplot2::scale_y_continuous(
    name = "Soil Moisture Percentile",
    breaks = c(0, 5, 10, 20, 30, 50, 75, 100),
    limits = c(0, 100)
  ) +
  ggplot2::scale_x_date(
    name = "Date",
    date_breaks = "2 weeks",
    date_labels = "%b %d"
  ) +
  ggplot2::scale_color_brewer(
    name = "State",
    type = "qual",
    palette = "Set2"
  ) +
  ggplot2::labs(
    title = "State-Level Soil Moisture Percentiles During 2022 Flash Drought",
    subtitle = "Dashed lines approximate NIDIS thresholds (D0: 20th, D1: 10th, D2: 5th, D3: 2nd percentile)",
    caption = "Source: NASA SPORT-LIS soil moisture percentiles (0-1m depth)"
  ) +
  ggplot2::theme_minimal() +
  ggplot2::theme(
    axis.text.x = ggplot2::element_text(angle = 45, hjust = 1),
    legend.position = "bottom",
    panel.grid.minor = ggplot2::element_blank()
  )
This state-level analysis reveals how the flash drought progressed differently across our study region. Kansas shows the most sustained impacts, sliding below the 20th percentile in July and continuing to deteriorate through the fall, finishing October below the 10th percentile (moderate-to-severe drought). Missouri follows a similar downward trajectory, while Texas and Oklahoma, despite severe mid-summer conditions, rebounded sharply after late-August rains.
What’s particularly interesting is the timing differences - you can see that some states began recovering in August while others continued to deteriorate. Louisiana and Arkansas show more variability and generally less severe impacts, likely due to their geographic position and different climate patterns. The dashed reference lines help put these percentiles in context - when a state drops below the 20th percentile line, agricultural stress typically becomes noticeable, and conditions below the 10th percentile often trigger drought emergency responses. This visualization clearly shows why the 2022 event was so challenging for agricultural communities, with multiple states experiencing prolonged periods well below these critical thresholds.
When SPORT-LIS shows soil moisture at the 5th percentile, this means:
a) Soil moisture is 5% of its maximum possible value
b) Current soil moisture is drier than 95% of all historical conditions for that location and time of year
c) There is a 5% chance of drought conditions
d) Soil moisture has decreased by 5% from the previous day
Answer: b) Current soil moisture is drier than 95% of all historical conditions for that location and time of year
Explanation: SPORT-LIS percentiles rank current conditions against decades of historical data. A 5th percentile reading means current soil moisture is drier than 95% of all historical conditions, indicating severe drought stress.
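The percentile logic is easy to demonstrate in base R with an empirical cumulative distribution function. The "historical" values below are made up for illustration; they stand in for the multi-decade climatology SPORT-LIS uses for each grid cell and calendar date.

```r
# Thirty synthetic "historical" soil moisture values (percent, illustrative only)
historical_vsm <- 10:39

# Empirical cumulative distribution built from the historical record
vsm_ecdf <- ecdf(historical_vsm)

# Percentile of a dry current observation: the share of historical
# values at or below it
current_vsm <- 11
percentile <- vsm_ecdf(current_vsm) * 100
percentile  # about 6.7: current conditions are drier than ~93% of the record
```

The same ranking is applied independently at every grid cell, which is why a single percentile map can be compared fairly across wet and dry climates.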
URMA Meteorological Data
While soil moisture tells us what happened during the 2022 flash drought, understanding why it happened requires examining the atmospheric conditions that drove such rapid soil drying. Think of it this way: soil moisture is the victim, but the atmosphere is the culprit. To build a complete picture of flash drought development, we need to investigate the meteorological “crime scene”—the combination of temperature, humidity, wind, and radiation that created the perfect storm for rapid moisture loss.
The challenge is that weather observations from individual stations can miss important spatial patterns, especially in a region as large and climatically diverse as the Southern Plains. A single weather station might show hot, dry conditions, but what about the areas 50 kilometers away? This is where NOAA’s Unrestricted Mesoscale Analysis (URMA) becomes invaluable—it’s essentially a high-resolution “weather reanalysis” that fills in the gaps between observation stations.
How URMA Creates a Complete Picture
URMA combines observations from thousands of surface weather stations, automated weather stations, and other observing platforms across the continental United States, then uses sophisticated atmospheric models to create continuous, gridded meteorological fields at 2.5-kilometer resolution (National Oceanic and Atmospheric Administration 2025). Think of it as taking scattered puzzle pieces (individual weather observations) and filling in the missing pieces to create a complete picture of atmospheric conditions every hour of every day.
This matters enormously for flash drought research because the atmospheric drivers often vary significantly over short distances. A weather front might bring relief to one county while leaving the next county baking under clear skies. URMA captures these spatial patterns with the resolution needed to understand how meteorological conditions varied across our six-state study region during the critical summer of 2022.
The Atmospheric Drivers of Flash Drought
Flash droughts develop when the atmosphere becomes exceptionally “thirsty”—when conditions align to rapidly pull moisture from soils and plants faster than it can be replenished. This atmospheric demand for water involves five key meteorological variables working together:
Temperature acts as the primary driver by increasing the atmosphere’s capacity to hold water vapor. During the 2022 Southern Plains flash drought, Texas recorded its hottest July on record with statewide daily maximum temperatures exceeding 100°F—extreme heat that dramatically increased evaporation rates from soils and transpiration from stressed vegetation.
Relative humidity creates the moisture gradient that drives water loss. When the air is dry (low humidity), it creates a steep gradient between the moist soil or plant surfaces and the thirsty atmosphere above, accelerating water movement from the ground into the air.
Wind speed acts as nature’s hair dryer, constantly moving humid air away from the surface and replacing it with drier air. This maintains the moisture gradient and prevents the air near the ground from becoming saturated, which would slow the drying process.
Solar radiation provides the energy that powers evaporation and transpiration. Clear skies during drought periods—which characterized much of summer 2022 across the Southern Plains—result in intense solar heating that drives rapid water loss.
Precipitation deficits complete the perfect storm. While rain doesn’t directly affect evapotranspiration rates, its absence means that water lost through atmospheric demand cannot be replenished, leading to the rapid soil moisture decline that defines flash drought.
When these variables align—hot temperatures, low humidity, strong winds, clear skies, and no rain—they create conditions where potential evapotranspiration far exceeds water availability. This is exactly what happened across the Southern Plains in summer 2022, and URMA data allows us to track how these conditions developed and intensified with unprecedented spatial and temporal detail.
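The combined pull of temperature and humidity can be summarized as vapor pressure deficit (VPD), the gap between how much water vapor the air could hold and how much it actually holds. This sketch uses the FAO-56 (Tetens-type) saturation vapor pressure approximation, a standard formula that is not part of this lesson's datasets; the example temperatures and humidities are illustrative.

```r
# Saturation vapor pressure (kPa) from air temperature (deg C),
# FAO-56 / Tetens approximation
sat_vapor_pressure <- function(temp_c) {
  0.6108 * exp(17.27 * temp_c / (temp_c + 237.3))
}

# Vapor pressure deficit (kPa): atmospheric "thirst" for water
vpd <- function(temp_c, rh_pct) {
  sat_vapor_pressure(temp_c) * (1 - rh_pct / 100)
}

# A mild day versus a flash-drought afternoon (illustrative values)
vpd(25, 60)  # ~1.3 kPa: moderate atmospheric demand
vpd(40, 15)  # ~6.3 kPa: extreme demand that rapidly dries soils and plants
```

Because saturation vapor pressure grows roughly exponentially with temperature, a hot and dry afternoon produces several times the atmospheric demand of a mild one, which is why heat waves accelerate soil drying so dramatically.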
URMA Data Access
URMA data is freely available through NOAA’s AWS cloud infrastructure. While no authentication is required, we do need to set the AWS_NO_SIGN_REQUEST=YES
environment variable to access the public data bucket without AWS credentials.
The data files are organized by date and variable, with each file containing a single meteorological parameter for one hour. For our analysis, we’ll focus on daily aggregations to match the temporal resolution of our soil moisture data while capturing the key atmospheric drivers of flash drought.
Programmatic Data Access
First, let’s verify our AWS configuration is set up correctly for accessing the public NOAA data.
# Verify AWS configuration for public data access
paste("AWS_NO_SIGN_REQUEST setting:", Sys.getenv("AWS_NO_SIGN_REQUEST"))
[1] "AWS_NO_SIGN_REQUEST setting: YES"
# If not set, configure it now
if (Sys.getenv("AWS_NO_SIGN_REQUEST") == "") {
  Sys.setenv(AWS_NO_SIGN_REQUEST = "YES")
  paste("Configured AWS_NO_SIGN_REQUEST = YES")
}
URMA data is stored in the NOAA AWS bucket with a specific directory structure. We’ll need to construct URLs for the meteorological variables most relevant to flash drought analysis.
# Define URMA data access parameters
urma_base_url <- "https://noaa-urma-pds.s3.amazonaws.com"

# Key meteorological variables for flash drought analysis
urma_variables <- c(
  "TMP_2maboveground",   # 2-meter temperature (K)
  "RH_2maboveground",    # 2-meter relative humidity (%)
  "WIND_10maboveground", # 10-meter wind speed (m/s)
  "DSWRF_surface",       # Downward shortwave radiation flux (W/m²)
  "APCP_surface"         # Accumulated precipitation (kg/m²)
)
Now we’ll select a representative date to test our data access approach before processing the full time series.
# Use the same peak drought date from our SPORT-LIS analysis
test_date <- lubridate::as_date("2022-07-19")
test_date_string <- format(test_date, "%Y%m%d")
URMA files follow a specific naming convention on the NOAA AWS bucket. The key insight is that all meteorological variables are contained in a single file per hour, making data access much more efficient than initially anticipated.
# Define strategic hours for capturing diurnal cycle (UTC)
# These hours are selected to capture key points in the daily meteorological cycle
# important for evapotranspiration processes:
strategic_hours <- c("06", "12", "18", "00") # UTC hours
# Why these specific hours?
# 06Z (1am Central): Near-minimum temperature period, maximum relative humidity
# Represents baseline atmospheric demand conditions
# 12Z (7am Central): Morning transition period, rising temperature and solar radiation
# Captures the onset of daily ET processes
# 18Z (1pm Central): Near-maximum temperature, minimum humidity, peak solar radiation
# Represents peak atmospheric demand and maximum ET conditions
# 00Z (7pm Central): Evening transition, declining temperature and solar radiation
# Captures the wind-down of daily ET processes
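The UTC-to-Central mapping in those comments is easy to verify in base R (daylight saving time is in effect in July, so 18Z corresponds to 1pm CDT):

```r
# 18Z on our peak-drought test date, converted to US Central time (base R)
utc_time <- as.POSIXct("2022-07-19 18:00:00", tz = "UTC")
format(utc_time, tz = "America/Chicago", usetz = TRUE)
# "2022-07-19 13:00:00 CDT": early afternoon, near peak atmospheric demand
```

Keeping all file requests in UTC and converting only for interpretation avoids off-by-one-day errors when 00Z files actually describe the previous local evening.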
The naming conventions for many NASA and NOAA datasets can be overwhelming. Thankfully, they are systematic, so once you've established the patterns and directory hierarchies you can write simple functions to construct a file path for any given day.
# URMA file naming convention: urma2p5.YYYYMMDD/urma2p5.tHHz.2dvaranl_ndfd.grb2_wexp
# All meteorological variables are included in each file

# Function to construct URMA URLs
construct_urma_url <- function(date_string, hour) {
  filename <- paste0("urma2p5.t", hour, "z.2dvaranl_ndfd.grb2_wexp")
  url <- paste(urma_base_url, paste0("urma2p5.", date_string), filename, sep = "/")
  return(url)
}

# Test URL construction for different hours
test_urls <- sapply(strategic_hours, function(hour) {
  construct_urma_url(test_date_string, hour)
})
Let’s test accessing a single URMA file to examine the available variables and data structure.
# Load required packages for raster data
library(terra)
# Test loading a single URMA file (18Z - typically peak heating period)
test_url_18z <- construct_urma_url(test_date_string, "18")

paste("Testing data access from:", test_url_18z)
[1] "Testing data access from: https://noaa-urma-pds.s3.amazonaws.com/urma2p5.20220719/urma2p5.t18z.2dvaranl_ndfd.grb2_wexp"

# Load the URMA raster - this contains all meteorological variables
urma_test <- terra::rast(test_url_18z)
urma_test
class : SpatRaster
dimensions : 1597, 2345, 14 (nrow, ncol, nlyr)
resolution : 2539.703, 2539.703 (x, y)
extent : -3272417, 2683186, -265067.4, 3790838 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=lcc +lat_0=25 +lon_0=-95 +lat_1=25 +lat_2=25 +x_0=0 +y_0=0 +R=6371200 +units=m +no_defs
source : urma2p5.t18z.2dvaranl_ndfd.grb2_wexp
varname : urma2p5.t18z.2dvaranl_ndfd
names : SFC=G~[gpm], SFC=G~ [Pa], 2[m] ~e [C], 2[m] ~e [C], 10[m]~[m/s], 10[m]~[m/s], ...
unit : gpm, Pa, C, C, m/s, m/s, ...
time : 2022-07-19 18:00:00 UTC
Taking a quick look, we can see the URMA test layer contains 14 meteorological variables analyzed for 1pm Central Time (18Z UTC) on July 19, 2022, at a higher spatial resolution (2.5 km) than our soil moisture data. Notice the coordinate system uses a Lambert Conformal Conic projection with units in meters rather than the geographic coordinates (degrees) we saw with SPORT-LIS; this is a common projection for high-resolution weather analysis that minimizes distortion across the continental United States.
Let’s take a closer look at the available variables.
names(urma_test)
[1] "SFC=Ground or water surface; Geopotential height [gpm]"
[2] "SFC=Ground or water surface; Pressure [Pa]"
[3] "2[m] HTGL=Specified height level above ground; Temperature [C]"
[4] "2[m] HTGL=Specified height level above ground; Dew point temperature [C]"
[5] "10[m] HTGL=Specified height level above ground; u-component of wind [m/s]"
[6] "10[m] HTGL=Specified height level above ground; v-component of wind [m/s]"
[7] "2[m] HTGL=Specified height level above ground; Specific humidity [kg/kg]"
[8] "10[m] HTGL=Specified height level above ground; Wind direction (from which blowing) [deg true]"
[9] "10[m] HTGL=Specified height level above ground; Wind speed [m/s]"
[10] "10[m] HTGL=Specified height level above ground; Wind speed (gust) [m/s]"
[11] "SFC=Ground or water surface; Visibility [m]"
[12] "CEIL=Cloud ceiling; Ceiling [m]"
[13] "1[-] SFC=Ground or water surface; Significant height of combined wind waves and swell [m]"
[14] "EATM=Entire atmosphere (considered as a single layer); Total cloud cover [%]"
The layer names are a bit wordy, but they tell us all of the atmospheric data available in each URMA file, including key flash drought drivers like 2-meter temperature and dew point (layers 3-4), 10-meter wind speed (layer 9), and surface pressure (layer 2). These standardized meteorological variables, measured at specific heights above ground, provide all the components needed to calculate evapotranspiration and understand the atmospheric conditions that can rapidly deplete soil moisture.
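URMA also carries the u (eastward) and v (northward) wind components in layers 5-6, so if a speed or direction layer were ever unavailable you could derive both yourself. A small sketch with illustrative component values (the helper function below is ours, not part of URMA):

```r
# Recover wind speed and meteorological direction from u/v components
# u: eastward component (m/s), v: northward component (m/s)
wind_from_uv <- function(u, v) {
  speed <- sqrt(u^2 + v^2)
  # Meteorological convention: direction the wind blows FROM,
  # in degrees clockwise from north
  direction <- (atan2(-u, -v) * 180 / pi) %% 360
  list(speed = speed, direction = direction)
}

# A southerly wind (blowing from the south): u = 0, v = +5 m/s
wind_from_uv(0, 5)  # speed 5 m/s, direction 180 degrees
```

The negated arguments to atan2() convert from the mathematical "blowing toward" vector to the meteorological "blowing from" convention used in layer 8.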
Now let’s examine the available variables and identify the layers we need for ET calculations.
# Identify key variables for ET calculation (layer indices in the URMA file)
et_variables <- list(
  temperature = 3, # 2m temperature [C]
  dewpoint = 4,    # 2m dew point temperature [C]
  humidity = 7,    # 2m specific humidity [kg/kg]
  wind_speed = 9,  # 10m wind speed [m/s]
  pressure = 2     # Surface pressure [Pa]
)
Let’s crop the data to our study region and examine the meteorological conditions during peak drought. First, we need to handle the coordinate reference system differences.
# The URMA data uses Lambert Conformal Conic projection in meters
# Our study_extent is in geographic coordinates (degrees)
# We need to reproject our extent to match the URMA CRS
# Create a polygon from our study extent and reproject it
extent_poly <- terra::as.polygons(terra::ext(study_extent), crs = "EPSG:4326")
extent_poly_reproj <- terra::project(extent_poly, terra::crs(urma_test))

# Create new extent in URMA projection
study_extent_urma <- terra::ext(extent_poly_reproj)
Now we can properly crop the URMA data to our study region.
# Crop to our reprojected study extent
urma_cropped <- terra::crop(urma_test, study_extent_urma)

urma_cropped
class : SpatRaster
dimensions : 801, 852, 14 (nrow, ncol, nlyr)
resolution : 2539.703, 2539.703 (x, y)
extent : -1357481, 806345.9, -46652.9, 1987649 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=lcc +lat_0=25 +lon_0=-95 +lat_1=25 +lat_2=25 +x_0=0 +y_0=0 +R=6371200 +units=m +no_defs
source(s) : memory
varname : urma2p5.t18z.2dvaranl_ndfd
names : SFC=G~[gpm], SFC=G~ [Pa], 2[m] ~e [C], 2[m] ~e [C], 10[m]~[m/s], 10[m]~[m/s], ...
min values : -7, 61994, 8.649988, -10.69001, -8.44, -11.93, ...
max values : 4226, 102200, 43.159998, 28.04998, 13.41, 9.98, ...
time : 2022-07-19 18:00:00 UTC
This properly reduced the size. Now let’s extract the key meteorological variables for analysis.
# Extract key variables for visualization
temp_18z <- urma_cropped[[et_variables$temperature]] # Already in Celsius
dewpoint_18z <- urma_cropped[[et_variables$dewpoint]] # Already in Celsius
wind_18z <- urma_cropped[[et_variables$wind_speed]] # m/s
pressure_18z <- urma_cropped[[et_variables$pressure]] # Pa

# Calculate relative humidity from temperature and dew point (Magnus formula)
# RH = 100 * exp((17.625 * Td) / (243.04 + Td)) / exp((17.625 * T) / (243.04 + T))
rh_18z <- 100 * exp((17.625 * dewpoint_18z) / (243.04 + dewpoint_18z)) /
  exp((17.625 * temp_18z) / (243.04 + temp_18z))

cat("Meteorological conditions on", test_date_string, "at 18Z:\n",
    "Temperature - Min:", round(terra::global(temp_18z, "min", na.rm = TRUE)[[1]], 1), "°C",
    " Max:", round(terra::global(temp_18z, "max", na.rm = TRUE)[[1]], 1), "°C\n",
    "Wind Speed - Min:", round(terra::global(wind_18z, "min", na.rm = TRUE)[[1]], 1), "m/s",
    " Max:", round(terra::global(wind_18z, "max", na.rm = TRUE)[[1]], 1), "m/s\n",
    "Relative Humidity - Min:", round(terra::global(rh_18z, "min", na.rm = TRUE)[[1]], 1), "%",
    " Max:", round(terra::global(rh_18z, "max", na.rm = TRUE)[[1]], 1), "%\n")
Meteorological conditions on 20220719 at 18Z:
Temperature - Min: 8.6 °C Max: 43.2 °C
Wind Speed - Min: 0 m/s Max: 14.4 m/s
Relative Humidity - Min: 9.3 % Max: 100 %
For visualization, we’ll need to reproject our state boundaries to match the URMA coordinate system.
# Reproject state boundaries to match URMA CRS for proper overlay
<- sf::st_transform(states_sf, terra::crs(urma_test)) states_sf_urma
Create a quick visualization showing the extreme conditions during peak flash drought.
# Create a 2x2 panel plot of key meteorological variables
par(mfrow = c(2, 2))

# Temperature (reverse RdYlBu so warm colors indicate high temperatures)
terra::plot(temp_18z, main = "Temperature (°C) - 18Z",
            col = rev(RColorBrewer::brewer.pal(9, "RdYlBu")))
plot(sf::st_geometry(states_sf_urma), add = TRUE, border = "black", lwd = 0.8)

# Wind Speed
terra::plot(wind_18z, main = "Wind Speed (m/s) - 18Z",
            col = RColorBrewer::brewer.pal(9, "BuPu"))
plot(sf::st_geometry(states_sf_urma), add = TRUE, border = "black", lwd = 0.8)

# Relative Humidity (low humidity shown in warm colors)
terra::plot(rh_18z, main = "Relative Humidity (%) - 18Z",
            col = RColorBrewer::brewer.pal(9, "RdYlBu"))
plot(sf::st_geometry(states_sf_urma), add = TRUE, border = "black", lwd = 0.8)

# Pressure (convert Pa to hPa for readability)
pressure_hpa <- pressure_18z / 100
terra::plot(pressure_hpa, main = "Surface Pressure (hPa) - 18Z",
            col = RColorBrewer::brewer.pal(9, "Spectral"))
plot(sf::st_geometry(states_sf_urma), add = TRUE, border = "black", lwd = 0.8)
This single URMA file contains all the meteorological variables needed for comprehensive ET calculations, including temperature, humidity, wind, and pressure: the key atmospheric drivers of flash drought conditions. The test data look reasonable and align with our study area, so we can move on to systematically acquiring all the dates in our timeline.
URMA Technical Specifications:
- Spatial Resolution: 2.5 km grid spacing
- Temporal Resolution: Hourly analysis
- Coverage: Continental United States (CONUS)
- Data Source: NOAA/National Weather Service
- Update Frequency: Real-time operational system
- Variables: 40+ meteorological parameters including temperature, humidity, wind, pressure, precipitation
- File Format: GRIB2 format on AWS cloud storage
- Coordinate System: Lambert Conformal Conic projection
- Historical Archive: Available from 2006-present
- Quality Control: Multi-stage validation using observation networks
Key Variables for Flash Drought Analysis:
- 2-meter temperature (°C)
- 2-meter dew point temperature (°C)
- 10-meter wind speed (m/s)
- Surface pressure (Pa)
- Accumulated precipitation (mm)
- Downward shortwave radiation (W/m²)
URMA uses optimal interpolation techniques to blend observations from ~20,000 surface stations with first-guess fields from numerical weather models, creating a spatially complete analysis that’s updated every hour.
Citation: National Oceanic and Atmospheric Administration. NOAA Real-Time Mesoscale Analysis (RTMA) / Unrestricted Mesoscale Analysis (URMA). (2025).
Systematic URMA Time Series Processing
Now that we’ve confirmed our ability to access and process individual URMA files, we need to scale up to process the complete time series. This involves downloading multiple hourly files for each day and aggregating them into daily meteorological summaries that align with our soil moisture analysis.
The systematic processing approach will: 1. Use the same dates as our SPORT-LIS analysis for temporal alignment 2. Download 4 strategic hourly files per day (06Z, 12Z, 18Z, 00Z) 3. Calculate daily aggregates (Tmax, Tmin, Tmean, etc.) from the hourly data 4. Create daily raster time series for each meteorological variable
# Use the same date sequence as our soil moisture analysis for temporal alignment
urma_dates <- all_dates  # This matches our SPORT-LIS date sequence
urma_date_strings <- format(urma_dates, "%Y%m%d")

# Create a systematic processing plan
total_files_needed <- length(urma_dates) * length(strategic_hours)

cat("URMA processing plan:\n",
    "Dates to process:", length(urma_dates), "\n",
    "Hours per date:", length(strategic_hours), "\n",
    "Total URMA files to download:", total_files_needed, "\n")
URMA processing plan:
Dates to process: 27
Hours per date: 4
Total URMA files to download: 108
We’ll organize our processing using a nested approach where each date contains data for all strategic hours. This structure makes it easy to calculate daily aggregates while maintaining access to individual hourly data if needed.
# Create comprehensive URL list for all dates and hours
urma_url_plan <- expand.grid(
  date = urma_date_strings,
  hour = strategic_hours,
  stringsAsFactors = FALSE
) |>
  dplyr::mutate(
    url = mapply(construct_urma_url, date, hour),
    date_obj = as.Date(date, format = "%Y%m%d")
  )

# Display the processing plan structure
head(urma_url_plan)
Now we’ll implement the systematic processing loop with progress tracking and error handling. This process downloads each URMA file, crops it to our study region, and extracts the key meteorological variables. Downloading all the required files takes approximately 15 minutes on an above-average home PC with a good internet connection.
# Initialize storage for processed URMA data
urma_daily_data <- list()
processing_log <- list()

# Process each date
for(i in 1:length(urma_dates)) {
  current_date <- urma_dates[i]
  current_date_string <- urma_date_strings[i]

  # Initialize storage for this date
  daily_hourly_data <- list()

  # Process each strategic hour for this date
  for(j in 1:length(strategic_hours)) {
    current_hour <- strategic_hours[j]
    current_url <- construct_urma_url(current_date_string, current_hour)

    tryCatch({
      # Load and process hourly URMA data
      urma_hourly <- terra::rast(current_url)
      urma_hourly_cropped <- terra::crop(urma_hourly, study_extent_urma)

      # Extract key meteorological variables
      hourly_vars <- list(
        temperature = urma_hourly_cropped[[et_variables$temperature]],
        dewpoint = urma_hourly_cropped[[et_variables$dewpoint]],
        wind_speed = urma_hourly_cropped[[et_variables$wind_speed]],
        pressure = urma_hourly_cropped[[et_variables$pressure]]
      )

      # Calculate relative humidity for this hour (Magnus approximation)
      hourly_vars$rel_humidity <- 100 *
        exp((17.625 * hourly_vars$dewpoint) / (243.04 + hourly_vars$dewpoint)) /
        exp((17.625 * hourly_vars$temperature) / (243.04 + hourly_vars$temperature))

      # Store hourly data
      daily_hourly_data[[paste0("hour_", current_hour)]] <- hourly_vars

      cat(" ", current_hour, "Z✓")
    }, error = function(e) {
      cat(" ", current_hour, "Z✗")
      # <<- so the log entry persists outside the error handler's scope
      processing_log[[paste(current_date_string, current_hour, sep = "_")]] <<-
        paste("Error:", e$message)
    })
  }

  # Calculate daily aggregates from hourly data if we have at least 3 hours
  if(length(daily_hourly_data) >= 3) {
    # Extract hourly rasters for each variable
    temp_rasters <- lapply(daily_hourly_data, function(x) x$temperature)
    rh_rasters <- lapply(daily_hourly_data, function(x) x$rel_humidity)
    wind_rasters <- lapply(daily_hourly_data, function(x) x$wind_speed)
    pressure_rasters <- lapply(daily_hourly_data, function(x) x$pressure)

    # Convert lists to multi-layer rasters for efficient processing
    temp_stack <- terra::rast(temp_rasters)
    rh_stack <- terra::rast(rh_rasters)
    wind_stack <- terra::rast(wind_rasters)
    pressure_stack <- terra::rast(pressure_rasters)

    # Calculate daily aggregates using terra::app for raster math
    daily_aggregates <- list(
      tmax = terra::app(temp_stack, max, na.rm = TRUE),
      tmin = terra::app(temp_stack, min, na.rm = TRUE),
      tmean = terra::app(temp_stack, mean, na.rm = TRUE),
      rh_mean = terra::app(rh_stack, mean, na.rm = TRUE),
      wind_mean = terra::app(wind_stack, mean, na.rm = TRUE),
      pressure_mean = terra::app(pressure_stack, mean, na.rm = TRUE)
    )

    # Store daily aggregates
    urma_daily_data[[current_date_string]] <- daily_aggregates
  } else {
    processing_log[[paste(current_date_string, "daily", sep = "_")]] <- "Insufficient hourly data"
  }
}
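The relative-humidity step inside the loop uses a Magnus-type approximation: RH = 100 × e(Tdew)/es(Tair). A quick scalar sanity check, with made-up temperatures outside the raster workflow, confirms the formula behaves as expected:

```r
# Scalar version of the Magnus-based RH formula used in the loop (illustrative only)
magnus_rh <- function(temp_c, dewpoint_c) {
  100 * exp((17.625 * dewpoint_c) / (243.04 + dewpoint_c)) /
    exp((17.625 * temp_c) / (243.04 + temp_c))
}

magnus_rh(30, 30)  # saturated air (dew point equals temperature): exactly 100%
magnus_rh(35, 5)   # hot afternoon with a very low dew point: roughly 15-16%
```

When the dew point equals the air temperature the ratio collapses to 1 by construction, which is a useful check that the raster version was transcribed correctly.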
Creating Daily URMA Time Series
With our individual daily aggregates calculated, we need to combine them into time series raster stacks that match our soil moisture data structure. This will create separate raster time series for each meteorological variable.
# Extract successful processing dates
successful_dates <- names(urma_daily_data)
successful_date_objects <- as.Date(successful_dates, format = "%Y%m%d")

cat("Creating daily time series rasters...\n",
    "Dates with complete URMA data:", length(successful_dates), "\n")
Creating daily time series rasters...
Dates with complete URMA data: 27
# Initialize lists to store time series for each variable
tmax_list <- list()
tmin_list <- list()
tmean_list <- list()
rh_list <- list()
wind_list <- list()
pressure_list <- list()

# Extract each variable across all dates
for(date_string in successful_dates) {
  daily_data <- urma_daily_data[[date_string]]

  tmax_list[[date_string]] <- daily_data$tmax
  tmin_list[[date_string]] <- daily_data$tmin
  tmean_list[[date_string]] <- daily_data$tmean
  rh_list[[date_string]] <- daily_data$rh_mean
  wind_list[[date_string]] <- daily_data$wind_mean
  pressure_list[[date_string]] <- daily_data$pressure_mean
}

# Create raster time series stacks
urma_tmax_series <- terra::rast(tmax_list)
urma_tmin_series <- terra::rast(tmin_list)
urma_tmean_series <- terra::rast(tmean_list)
urma_rh_series <- terra::rast(rh_list)
urma_wind_series <- terra::rast(wind_list)
urma_pressure_series <- terra::rast(pressure_list)

# Verify our time series structure
cat("URMA time series dimensions:\n",
    "Tmax series:", paste(dim(urma_tmax_series), collapse = " x "), "\n",
    "Tmin series:", paste(dim(urma_tmin_series), collapse = " x "), "\n",
    "Tmean series:", paste(dim(urma_tmean_series), collapse = " x "), "\n",
    "Relative humidity series:", paste(dim(urma_rh_series), collapse = " x "), "\n",
    "Wind speed series:", paste(dim(urma_wind_series), collapse = " x "), "\n",
    "Surface pressure series:", paste(dim(urma_pressure_series), collapse = " x "), "\n")
URMA time series dimensions:
Tmax series: 801 x 852 x 27
Tmin series: 801 x 852 x 27
Tmean series: 801 x 852 x 27
Relative humidity series: 801 x 852 x 27
Wind speed series: 801 x 852 x 27
Surface pressure series: 801 x 852 x 27
The dimensions of each series confirm that everything is cropped to the same spatial extent (801 x 852) and that every series contains all 27 days of the time series. We are good to proceed.
Let’s create visualizations showing how key meteorological variables changed during our analysis period using ggplot2 facets for publication-quality plots with state boundary context.
# Select the same key dates we used for soil moisture analysis
key_urma_indices <- match(format(available_dates, "%Y%m%d"), successful_dates)
available_urma_indices <- key_urma_indices[!is.na(key_urma_indices)]

if(length(available_urma_indices) > 0) {
  # Extract key dates for visualization
  tmax_key <- urma_tmax_series[[available_urma_indices]]
  rh_key <- urma_rh_series[[available_urma_indices]]

  # Set names for better panel labels
  names(tmax_key) <- format(available_dates, "%b %d")
  names(rh_key) <- format(available_dates, "%b %d")
}
First, let’s examine how maximum temperatures evolved during the flash drought development:
# Plot maximum temperature progression using ggplot2 facets
if(length(available_urma_indices) > 0) {
  # Convert temperature raster to data frame for ggplot2
  tmax_df <- tidyterra::as_tibble(tmax_key, xy = TRUE)

  # Add proper date labels for the layers
  date_labels <- format(available_dates, "%B %d, %Y")
  names(tmax_df)[3:ncol(tmax_df)] <- date_labels

  # Reshape for ggplot2 faceting
  tmax_df_long <- tmax_df |>
    tidyr::pivot_longer(cols = -c(x, y), names_to = "date_event", values_to = "temperature")

  # Convert date_event to factor with chronological order
  tmax_df_long$date_event <- factor(tmax_df_long$date_event, levels = date_labels)

  # Create ggplot2 visualization
  ggplot2::ggplot() +
    ggplot2::geom_raster(data = tmax_df_long,
                         ggplot2::aes(x = x, y = y, fill = temperature)) +
    ggplot2::geom_sf(data = states_sf_urma, fill = NA, color = "black", size = 0.8) +
    ggplot2::scale_fill_gradientn(
      colors = rev(RColorBrewer::brewer.pal(11, "RdYlBu")),
      name = "Temperature\n(°C)",
      limits = c(0, 50)
    ) +
    ggplot2::facet_wrap(~ date_event, ncol = 3) +
    ggplot2::labs(
      title = "Maximum Temperature Evolution",
      subtitle = "Daily maximum temperatures showing atmospheric heating",
      caption = "Source: NOAA URMA meteorological analysis"
    ) +
    ggplot2::theme_minimal() +
    ggplot2::theme(
      axis.text = ggplot2::element_blank(),
      axis.ticks = ggplot2::element_blank(),
      axis.title = ggplot2::element_blank(),
      panel.grid = ggplot2::element_blank()
    )
}
Next, let’s look at how relative humidity patterns changed across the same period:
# Plot relative humidity progression using ggplot2 facets
if(length(available_urma_indices) > 0) {
  # Convert humidity raster to data frame for ggplot2
  rh_df <- tidyterra::as_tibble(rh_key, xy = TRUE)

  # Add proper date labels for the layers
  names(rh_df)[3:ncol(rh_df)] <- date_labels

  # Reshape for ggplot2 faceting
  rh_df_long <- rh_df |>
    tidyr::pivot_longer(cols = -c(x, y), names_to = "date_event", values_to = "humidity")

  # Convert date_event to factor with chronological order
  rh_df_long$date_event <- factor(rh_df_long$date_event, levels = date_labels)

  # Create ggplot2 visualization
  ggplot2::ggplot() +
    ggplot2::geom_raster(data = rh_df_long,
                         ggplot2::aes(x = x, y = y, fill = humidity)) +
    ggplot2::geom_sf(data = states_sf_urma, fill = NA, color = "black", size = 0.8) +
    ggplot2::scale_fill_gradientn(
      colors = rev(RColorBrewer::brewer.pal(11, "RdYlBu")),
      name = "Relative\nHumidity (%)"
    ) +
    ggplot2::facet_wrap(~ date_event, ncol = 3) +
    ggplot2::labs(
      title = "Relative Humidity Evolution",
      subtitle = "Daily mean relative humidity showing atmospheric drying",
      caption = "Source: NOAA URMA meteorological analysis"
    ) +
    ggplot2::theme_minimal() +
    ggplot2::theme(
      axis.text = ggplot2::element_blank(),
      axis.ticks = ggplot2::element_blank(),
      axis.title = ggplot2::element_blank(),
      panel.grid = ggplot2::element_blank()
    )
}
These maps clearly show the atmospheric conditions that drove the 2022 flash drought development. The temperature evolution reveals how extreme heat built up across Texas and Oklahoma from June through July, with temperatures reaching 40-50°C (104-122°F) during the peak drought period - conditions that dramatically increased atmospheric demand for water. The relative humidity patterns tell the complementary story of atmospheric drying, with humidity dropping to critically low levels (below 25%) across much of the region during the same peak period. Together, these conditions - scorching temperatures combined with bone-dry air - created the perfect atmospheric recipe for rapidly pulling moisture from soils and vegetation, driving the characteristic rapid soil moisture depletion we observed in our SPORT-LIS analysis.
Now that we understand both what happened to soil moisture and the atmospheric conditions that caused it, we need to quantify the critical link between them: evapotranspiration. This will help us measure exactly how much water the atmosphere was demanding during these extreme conditions.
Which combination of meteorological conditions creates the strongest atmospheric demand for water that can lead to flash drought development?
- Cool temperatures, high humidity, and calm winds
- Moderate temperatures, variable humidity, and strong winds
- High temperatures, low humidity, and strong winds
- High temperatures, high humidity, and calm winds
Answer: c) High temperatures, low humidity, and strong winds
Explanation: This combination maximizes evapotranspiration rates: high temperatures increase the atmosphere’s capacity to hold moisture, low humidity creates a steep moisture gradient, and strong winds continuously remove humid air and replace it with dry air.
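To make this concrete, we can compare the vapor pressure deficit (VPD), the gap between saturation and actual vapor pressure, for contrasting conditions. This is a rough sketch using a Magnus-type saturation formula; the specific values are illustrative, not taken from our URMA data:

```r
# Saturation vapor pressure (kPa) via a Magnus-type approximation
es_kpa <- function(temp_c) 0.6108 * exp((17.27 * temp_c) / (temp_c + 237.3))

# VPD = saturation vapor pressure minus actual vapor pressure
vpd <- function(temp_c, rh_pct) es_kpa(temp_c) * (1 - rh_pct / 100)

vpd(40, 20)  # hot and dry: a deficit of several kPa pulling water from the land
vpd(20, 80)  # cool and humid: a deficit well under 1 kPa, little atmospheric demand
```

The hot, dry case produces roughly an order of magnitude larger deficit; wind then sustains that deficit by constantly replacing near-surface humid air with dry air.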
Evapotranspiration Analysis
We’ve examined what happened to soil moisture during the 2022 flash drought (it plummeted) and explored the atmospheric conditions that drove it (hot, dry, windy weather). Now we need to understand the critical link between these two components: evapotranspiration. Think of evapotranspiration (ET) as the atmospheric “demand” for water—it represents how much water the atmosphere wants to pull from soils and plants on any given day. When this demand becomes extreme and sustained, it can rapidly drain soil moisture reserves, especially when rainfall fails to replenish what’s lost.
Evapotranspiration combines two fundamental processes: evaporation from soil surfaces and transpiration from plant leaves. During normal conditions, these processes operate in balance with water supply—plants get the water they need, soils maintain adequate moisture, and the atmosphere’s thirst is satisfied. But during flash drought events, this balance breaks down catastrophically. The atmosphere becomes exceptionally demanding (due to high temperatures, low humidity, and strong winds), while water supply diminishes (due to lack of rainfall). The result is a vicious cycle where high atmospheric demand rapidly depletes soil moisture, which then reduces the landscape’s ability to cool itself through evaporation, leading to even higher temperatures and greater atmospheric demand.
The Flash Drought Feedback Loop
Understanding this feedback loop is crucial for grasping why flash droughts can intensify so rapidly:
- Initial conditions: High temperature, low humidity, and clear skies increase potential ET
- Soil moisture depletion: High ET rates rapidly draw down available soil water
- Reduced actual ET: As soil dries, plants close their stomata and actual ET declines
- Energy partitioning shift: Less energy goes to latent heat (cooling through ET), more to sensible heat (warming)
- Temperature amplification: Increased sensible heat further raises air temperature
- Accelerated drying: Higher temperatures increase potential ET, perpetuating the cycle
This self-reinforcing process helps explain why the 2022 Southern Plains flash drought intensified so quickly and why it was so difficult for natural systems to recover without significant rainfall intervention.
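The six steps above can be caricatured with a toy bucket model. This is purely illustrative, with made-up coefficients rather than a calibrated land surface scheme: soil moisture limits actual ET, the unmet demand becomes sensible heat, and warming raises potential ET.

```r
# Toy flash drought feedback loop (illustrative coefficients only)
soil <- 100   # mm of plant-available soil water
temp <- 32    # air temperature (°C)
for (day in 1:30) {
  pet <- 0.2 * temp                   # potential ET rises with temperature
  aet <- pet * min(1, soil / 60)      # actual ET is throttled as the soil dries
  soil <- max(0, soil - aet)          # no rain: the bucket only drains
  temp <- temp + 0.05 * (pet - aet)   # unspent latent energy becomes sensible heat
}
round(c(soil_mm = soil, temp_c = temp), 1)  # soil largely drained, air warmer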
Calculating Atmospheric Water Demand
For our analysis, we’ll calculate reference evapotranspiration using the Hargreaves-Samani method—an approach that’s both scientifically robust and computationally practical for our educational purposes. While more complex methods like Penman-Monteith exist, Hargreaves-Samani strikes an excellent balance between accuracy and simplicity, requiring only temperature data that we already have from our URMA analysis.
The beauty of the Hargreaves-Samani approach lies in its use of diurnal temperature range (the difference between daily maximum and minimum temperatures) as a proxy for atmospheric conditions. Large day-night temperature swings typically indicate clear, dry conditions with intense solar heating during the day—exactly the conditions that create high evaporative demand. Small temperature ranges suggest cloudy, humid conditions with lower evaporative demand. This relationship makes the method particularly well-suited for flash drought analysis, where extreme diurnal temperature ranges often coincide with rapid soil moisture depletion.
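We can preview this relationship using the Hargreaves-Samani equation itself (implemented properly for rasters later in this lesson). Holding extraterrestrial radiation fixed at an illustrative mid-summer value, a wide day-night swing yields a much higher ET estimate than a narrow one:

```r
# Hargreaves-Samani: ET0 = 0.0023 * (Tmean + 17.8) * Ra * sqrt(Tmax - Tmin)
hs_et0 <- function(tmax, tmin, ra) {
  0.0023 * ((tmax + tmin) / 2 + 17.8) * ra * sqrt(tmax - tmin)
}

ra <- 16.5          # illustrative mid-summer Ra (mm/day equivalent)
hs_et0(42, 24, ra)  # clear, dry day, 18 °C swing: roughly 8 mm/day of demand
hs_et0(33, 27, ra)  # cloudy, humid day, 6 °C swing: roughly half that
```

Even though the second day has a higher minimum temperature, its small diurnal range signals cloud and moisture, and the estimated demand drops sharply.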
Plant Energy Balance During Drought
Plants manage their energy balance through four key processes: sensible heat (warming), latent heat (cooling through transpiration), storage heat (temperature changes in plant tissues), and radiative heat (absorption and reflection of solar energy). During normal conditions, plants use transpiration (latent heat flux) as their primary cooling mechanism—much like humans sweating.
Stomatal Control and Water Stress:
When soil moisture becomes limited, plants face a critical trade-off. Stomata (leaf pores) must open to allow CO₂ uptake for photosynthesis, but this also allows water vapor to escape. As drought stress increases, plants progressively close their stomata to conserve water, reducing transpiration but also limiting photosynthesis and cooling capacity.
The Drought Stress Cascade:
- Early stress: Slight stomatal closure, reduced transpiration
- Moderate stress: Significant stomatal closure, leaf temperature rises
- Severe stress: Stomatal closure, leaf wilting, potential tissue damage
- Extreme stress: Complete stomatal closure, plant survival mode
Energy Balance Implications:
When transpiration decreases due to drought stress, the energy that would normally be used for cooling (latent heat) is instead converted to sensible heat, warming both the plant and surrounding air. This contributes to the feedback loop we observe in flash drought events—as plants lose their ability to cool the landscape, air temperatures rise further, increasing atmospheric demand for the remaining soil moisture.
Connection to Our Metrics:
- Reference ET: Represents potential atmospheric demand
- Actual ET: What plants can actually provide given water limitations
- Soil Moisture Percentiles: Determine plant water availability
- Temperature: Both driver and consequence of the energy balance shift
Setting Up the Calculations
The Hargreaves-Samani method requires one key astronomical component: extraterrestrial radiation, which represents the solar energy available at the top of Earth’s atmosphere before any atmospheric interference.
Extraterrestrial radiation varies with latitude and day of year due to Earth’s orbital mechanics. This calculation follows the FAO-56 methodology and depends on three key astronomical factors: the solar constant (energy output of the sun), Earth’s distance from the sun (which varies seasonally due to our elliptical orbit), and the angle at which solar radiation strikes a given latitude on a specific day.
# Function to calculate extraterrestrial radiation for Hargreaves-Samani
# Based on FAO-56 Equation 21 (Allen et al., 1998)
calculate_ra_hargreaves <- function(latitude_deg, day_of_year) {
  # Convert latitude to radians
  lat_rad <- latitude_deg * pi / 180

  # Solar declination (radians) - FAO-56 Equation 24
  # Represents the angular position of the sun relative to Earth's equatorial plane
  solar_declination <- 0.409 * sin((2 * pi / 365) * day_of_year - 1.39)

  # Sunset hour angle (radians) - FAO-56 Equation 25
  # Determines the length of daylight hours for the given latitude and date
  sunset_hour_angle <- acos(-tan(lat_rad) * tan(solar_declination))

  # Extraterrestrial radiation (MJ m-2 day-1) - FAO-56 Equation 21
  # Uses solar constant Gsc = 0.0820 MJ m-2 min-1 (Allen et al., 1998)
  ra_mj <- (24 * 60 / pi) * 0.082 *
    (1 + 0.033 * cos(2 * pi * day_of_year / 365)) *  # Earth-sun distance correction
    (sunset_hour_angle * sin(lat_rad) * sin(solar_declination) +
       cos(lat_rad) * cos(solar_declination) * sin(sunset_hour_angle))

  # Convert to mm/day equivalent for Hargreaves-Samani (Allen et al., 1998)
  # Conversion factor 0.408 standardizes energy units to equivalent evaporation
  ra_mm <- ra_mj * 0.408

  return(ra_mm)
}

# Test the function with our study region
study_center_lat <- mean(c(sf::st_bbox(states_sf)[2], sf::st_bbox(states_sf)[4]))
test_ra <- calculate_ra_hargreaves(study_center_lat, 200)  # July 19th = day 200

cat("Extraterrestrial radiation function test:\n",
    "Study center latitude:", round(study_center_lat, 2), "degrees\n",
    "Ra for day 200 (July 19):", round(test_ra, 2), "mm/day\n")
Extraterrestrial radiation function test:
Study center latitude: 33.23 degrees
Ra for day 200 (July 19): 16.51 mm/day
Creating Daily ET Rasters
Now we’ll apply the Hargreaves-Samani calculation to create daily ET rasters for our complete time series. The method’s key innovation lies in using diurnal temperature range (√(Tmax - Tmin)) as a proxy for atmospheric conditions. Large temperature differences typically indicate clear, dry conditions with high solar radiation and low humidity - factors that increase atmospheric demand for water vapor. This makes it particularly effective for analyzing the 2022 flash drought, where extreme day-night temperature swings were a hallmark of the intensification periods.
The solar declination varies seasonally as Earth orbits the sun, ranging from approximately +23.45° at the summer solstice to -23.45° at the winter solstice. The sunset hour angle determines day length, which varies with both latitude and season. The inverse relative distance factor accounts for Earth’s elliptical orbit, with Earth closest to the sun (perihelion) around January 3rd and farthest (aphelion) around July 4th.
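To see these orbital effects in action, we can sweep our `calculate_ra_hargreaves()` function across the year at the study-center latitude. The specific days chosen below are illustrative:

```r
# Seasonal sweep of extraterrestrial radiation at roughly 33°N
seasonal_days <- c(midwinter = 15, spring = 105, solstice = 172, autumn = 288)
round(sapply(seasonal_days, function(d) calculate_ra_hargreaves(33.23, d)), 1)
# Ra peaks near the June solstice and bottoms out in midwinter
```

This simple check is a good habit before applying any astronomical formula to a full raster time series: if the seasonal shape looks wrong here, it will be wrong everywhere.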
Now let’s implement the Hargreaves-Samani calculations for our complete time series:
# Function to calculate Hargreaves-Samani ET for raster layers
# Implements equation: ET₀ = 0.0023 × (Tmean + 17.8) × Ra × √(Tmax - Tmin)
calculate_hargreaves_raster <- function(tmax_raster, tmin_raster, latitude_deg, day_of_year) {
  # Calculate mean temperature
  tmean_raster <- (tmax_raster + tmin_raster) / 2

  # Calculate extraterrestrial radiation (constant across study area for each date)
  ra_value <- calculate_ra_hargreaves(latitude_deg, day_of_year)

  # Apply Hargreaves-Samani formula
  et_raster <- 0.0023 * (tmean_raster + 17.8) * ra_value * sqrt(tmax_raster - tmin_raster)

  return(et_raster)
}

# Get processing dates and calculate day of year for each
urma_dates_available <- names(urma_daily_data)
urma_date_objects <- as.Date(urma_dates_available, format = "%Y%m%d")
day_of_year_values <- lubridate::yday(urma_date_objects)

cat("Hargreaves-Samani ET calculation setup:\n",
    "Dates to process:", length(urma_dates_available), "\n",
    "Date range:", as.character(range(urma_date_objects)), "\n",
    "Day of year range:", range(day_of_year_values), "\n")
Hargreaves-Samani ET calculation setup:
Dates to process: 27
Date range: 2022-06-01 2022-10-31
Day of year range: 152 304
The empirical coefficient 0.0023 was originally calibrated using lysimeter data from Davis, California, representing the relationship between temperature range and atmospheric demand under semi-arid conditions. The temperature offset of 17.8°C adjusts for the exponential relationship between temperature and saturated vapor pressure. Numerous validation studies have shown the method provides reliable ET estimates across diverse climates.
With our function defined, we can now calculate ET for each day in our analysis period:
# Calculate Hargreaves-Samani ET for all dates
hargreaves_et_list <- list()

for(i in 1:length(urma_dates_available)) {
  date_string <- urma_dates_available[i]
  date_obj <- urma_date_objects[i]
  day_of_year <- day_of_year_values[i]

  # Get temperature rasters for this date
  tmax_raster <- urma_daily_data[[date_string]]$tmax
  tmin_raster <- urma_daily_data[[date_string]]$tmin

  # Calculate ET raster
  et_raster <- calculate_hargreaves_raster(
    tmax_raster = tmax_raster,
    tmin_raster = tmin_raster,
    latitude_deg = study_center_lat,
    day_of_year = day_of_year
  )

  # Store result
  hargreaves_et_list[[date_string]] <- et_raster

  # Progress indicator
  if(i %% 10 == 0) cat(" ", i)
}
Now let’s organize our results and examine the patterns:
# Combine ET rasters into time series
et_hargreaves_series <- terra::rast(hargreaves_et_list)

# Verify the time series structure
cat("Hargreaves-Samani ET time series:\n",
    "Dimensions:", paste(dim(et_hargreaves_series), collapse = " x "), "\n",
    "Temporal extent:", length(urma_dates_available), "days\n")

# Calculate summary statistics using terra::global()
et_min_raw <- terra::global(et_hargreaves_series, "min", na.rm = TRUE)
et_max_raw <- terra::global(et_hargreaves_series, "max", na.rm = TRUE)
et_mean_raw <- terra::global(et_hargreaves_series, "mean", na.rm = TRUE)

# Extract overall statistics across all dates and pixels
et_min_overall <- min(et_min_raw$min)
et_max_overall <- max(et_max_raw$max)
et_mean_overall <- mean(et_mean_raw$mean)

cat("\n\nET summary statistics:\n",
    "Overall minimum ET:", round(et_min_overall, 2), "mm/day\n",
    "Overall maximum ET:", round(et_max_overall, 2), "mm/day\n",
    "Overall mean ET:", round(et_mean_overall, 2), "mm/day\n")
ET summary statistics:
Overall minimum ET: 0.1 mm/day
Overall maximum ET: 8.73 mm/day
Overall mean ET: 4.07 mm/day
These ET values represent the atmospheric demand for water during our analysis period. The range from minimum to maximum shows how dramatically this demand varied as the flash drought developed and intensified. The overall mean of 4.07 mm/day indicates substantial atmospheric water demand, while the maximum values reaching 8.73 mm/day during peak conditions represent extremely high evaporative demand—the kind of atmospheric “thirst” that can rapidly deplete soil moisture reserves when precipitation is absent.
To put this in perspective, ET values above 6-7 mm/day under drought conditions indicate severe atmospheric stress that can quickly exhaust available soil water, especially when combined with the precipitation deficits we observed during summer 2022. Next, we’ll visualize these patterns to see how atmospheric water demand evolved spatially and temporally during the 2022 event.
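A quick back-of-envelope calculation shows why such rates matter. If the root zone holds roughly 100-150 mm of plant-available water (an assumed, typical value, not something we measured here), sustained demand near the rates observed above can drain it in a couple of weeks without rain:

```r
# Rough depletion timescale: days = available water / daily atmospheric demand
available_water_mm <- 120   # assumed plant-available root-zone storage (mm)
peak_demand <- 8.73         # maximum ET from our summary statistics (mm/day)
mean_demand <- 4.07         # mean ET from our summary statistics (mm/day)
round(available_water_mm / c(peak = peak_demand, mean = mean_demand))
# roughly 14 days at peak demand vs roughly 29 days at mean demand
```

Real depletion is slower because actual ET falls below reference ET as soils dry, but the timescale explains how a few rainless, high-demand weeks produced the rapid declines we saw in the SPORT-LIS percentiles.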
Visualization of Key ET Periods
Let’s create a visualization showing how ET demand changed during our key drought dates, using the same dates we used for soil moisture analysis.
# Use the expanded key dates that include analysis boundaries
key_event_dates_expanded <- c(analysis_start, first_intensification, peak_severity,
                              partial_recovery, peak_conus_extent, analysis_end)

# Find matching indices in our URMA data
key_et_indices <- match(format(key_event_dates_expanded, "%Y%m%d"), urma_dates_available)

# Extract key dates for visualization
et_key_dates <- et_hargreaves_series[[key_et_indices]]

# Create ggplot2 visualization with state boundaries
library(tidyterra)

# Convert raster to data frame for ggplot2
et_df <- tidyterra::as_tibble(et_key_dates, xy = TRUE)

# Add proper date labels for the layers
date_labels <- format(key_event_dates_expanded, "%B %d, %Y")
names(et_df)[3:ncol(et_df)] <- date_labels

# Reshape for ggplot2 faceting
et_df_long <- et_df |>
  tidyr::pivot_longer(cols = -c(x, y), names_to = "date_event", values_to = "et_mm_day")

# Convert date_event to factor with chronological order
et_df_long$date_event <- factor(et_df_long$date_event, levels = date_labels)

# Create ggplot2 visualization
ggplot2::ggplot() +
  ggplot2::geom_raster(data = et_df_long,
                       ggplot2::aes(x = x, y = y, fill = et_mm_day)) +
  ggplot2::geom_sf(data = states_sf_urma, fill = NA, color = "black", size = 0.8) +
  ggplot2::scale_fill_gradientn(
    colors = RColorBrewer::brewer.pal(9, "YlOrRd"),
    name = "Reference ET\n(mm/day)",
    limits = c(et_min_overall, et_max_overall)
  ) +
  ggplot2::facet_wrap(~ date_event, ncol = 3) +
  ggplot2::labs(
    title = "Hargreaves-Samani Reference ET Evolution",
    subtitle = "Key dates showing atmospheric water demand",
    caption = "Source: NOAA URMA meteorological data"
  ) +
  ggplot2::theme_minimal() +
  ggplot2::theme(
    axis.text = ggplot2::element_blank(),
    axis.ticks = ggplot2::element_blank(),
    axis.title = ggplot2::element_blank(),
    panel.grid = ggplot2::element_blank()
  )
This ET evolution map tells an interesting story about atmospheric water demand during the 2022 flash drought. The deep red colors across Texas and Oklahoma during July (reaching 6-8 mm/day) represent extreme atmospheric “thirst” - conditions where the air was aggressively pulling moisture from any available source. Notice how the peak ET demand (July 19 and 22) coincides perfectly with the period when we observed the most severe soil moisture depletion in our SPORT-LIS analysis.
The progression from June through October shows how atmospheric demand intensified during the flash drought development, then gradually decreased as temperatures cooled in fall. During the peak period, much of Texas and Oklahoma experienced ET rates above 6 mm/day - levels that can exhaust soil moisture reserves in just weeks when precipitation is absent. This visualization clearly demonstrates why the combination of high atmospheric demand and precipitation deficits created such devastating conditions for agriculture and natural ecosystems across the Southern Plains.
During the 2022 flash drought, when reference evapotranspiration (ET) reached 7-8 mm/day across much of Texas and Oklahoma, this indicates:
- Plants were transpiring at maximum rates
- Soil moisture was abundant
- The atmosphere was demanding large amounts of water from the landscape
- Precipitation was exceeding evaporation
Answer: c) The atmosphere was demanding large amounts of water from the landscape
Explanation: Reference ET represents the atmospheric demand for water under given meteorological conditions. High reference ET values (7-8 mm/day) indicate extreme atmospheric “thirst” that can rapidly deplete soil moisture reserves, especially when precipitation is absent.
Integrated Visualization
Up to this point, we’ve examined each component of the 2022 flash drought separately: the dramatic soil moisture depletion captured by SPORT-LIS, the extreme atmospheric conditions revealed by URMA, and the intense water demand calculated through evapotranspiration. While each analysis tells part of the story, the real power comes from seeing how these variables evolved together as a complete system.
Flash drought isn’t just about soil moisture dropping or temperatures rising—it’s about the dynamic interactions between all these components that create a self-reinforcing cycle of rapid drying. By integrating our datasets into a comprehensive state-level time series, we can track how atmospheric demand, meteorological extremes, and soil moisture depletion evolved in lockstep across our six-state study region. This integrated view will reveal the temporal relationships and feedback mechanisms that made the 2022 event so devastating and help us understand why it developed with such unprecedented speed.
We already have state-level soil moisture data from our SPORT-LIS analysis. Now we need to extract state-level averages for ET and all meteorological variables. First the ET data we developed.
# Extract state-level means for ET
state_et <- exactextractr::exact_extract(
  et_hargreaves_series,
  states_sf_urma, 'mean',
  force_df = TRUE,
  full_colnames = TRUE,
  append_cols = "state_name",
  progress = FALSE
)
Now the URMA temperature variables.
# Extract state-level means for temperature variables
state_tmax <- exactextractr::exact_extract(
  urma_tmax_series,
  states_sf_urma, 'mean',
  force_df = TRUE,
  full_colnames = TRUE,
  append_cols = "state_name",
  progress = FALSE
)

state_tmin <- exactextractr::exact_extract(
  urma_tmin_series,
  states_sf_urma, 'mean',
  force_df = TRUE,
  full_colnames = TRUE,
  append_cols = "state_name",
  progress = FALSE
)

state_tmean <- exactextractr::exact_extract(
  urma_tmean_series,
  states_sf_urma, 'mean',
  force_df = TRUE,
  full_colnames = TRUE,
  append_cols = "state_name",
  progress = FALSE
)
And finally humidity and wind.
# Extract state-level means for humidity and wind
state_rh <- exactextractr::exact_extract(
  urma_rh_series,
  states_sf_urma, 'mean',
  force_df = TRUE,
  full_colnames = TRUE,
  append_cols = "state_name",
  progress = FALSE
)

state_wind <- exactextractr::exact_extract(
  urma_wind_series,
  states_sf_urma, 'mean',
  force_df = TRUE,
  full_colnames = TRUE,
  append_cols = "state_name",
  progress = FALSE
)
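Since the six extraction calls above are nearly identical, they could optionally be collapsed into one `lapply()` over a named list of rasters. This refactor sketch reuses the series objects and polygon layer already created in this lesson, so it only runs in that context:

```r
# Optional refactor sketch: one lapply() replaces six repeated calls.
# Assumes the raster series and states_sf_urma objects built earlier.
raster_list <- list(
  et    = et_hargreaves_series,
  tmax  = urma_tmax_series,
  tmin  = urma_tmin_series,
  tmean = urma_tmean_series,
  rh    = urma_rh_series,
  wind  = urma_wind_series
)

state_means <- lapply(raster_list, function(r) {
  exactextractr::exact_extract(
    r, states_sf_urma, 'mean',
    force_df = TRUE,
    full_colnames = TRUE,
    append_cols = "state_name",
    progress = FALSE
  )
})
# state_means$tmax is then equivalent to state_tmax above
```

The explicit one-call-per-variable version in the lesson is easier to read when learning; the list version scales better if you add more variables later.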
Now we’ll reshape the new meteorological and ET data into long format and combine with our existing soil moisture time series.
# Function to reshape exactextractr output
reshape_state_data <- function(data, variable_name) {
  data |>
    tidyr::pivot_longer(
      cols = starts_with("mean."),
      names_to = "date_column",
      values_to = "value"
    ) |>
    dplyr::mutate(
      date = as.Date(gsub("^mean\\.", "", date_column), format = "%Y%m%d"),
      variable = variable_name
    ) |>
    dplyr::select(state = state_name, date, variable, value)
}
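The date parsing inside that function is worth seeing in isolation. exactextractr prefixes each output column with "mean.", so stripping the prefix leaves a YYYYMMDD string that `as.Date()` can parse directly (the column name here is a hypothetical example):

```r
# Standalone sketch of the date parsing used in reshape_state_data()
col_name <- "mean.20220712"                 # hypothetical column name
date_str <- gsub("^mean\\.", "", col_name)  # "20220712"
as.Date(date_str, format = "%Y%m%d")        # "2022-07-12"
```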
# Reshape new variables
et_long    <- reshape_state_data(state_et, "Reference ET (mm/day)")
tmax_long  <- reshape_state_data(state_tmax, "Maximum Temperature (°C)")
tmin_long  <- reshape_state_data(state_tmin, "Minimum Temperature (°C)")
tmean_long <- reshape_state_data(state_tmean, "Mean Temperature (°C)")
rh_long    <- reshape_state_data(state_rh, "Relative Humidity (%)")
wind_long  <- reshape_state_data(state_wind, "Wind Speed (m/s)")
Next, prepare the soil moisture data to match this format and combine everything.
# Prepare soil moisture data to match the new format
soil_moisture_long <- state_timeseries_long |>
  dplyr::mutate(variable = "Soil Moisture Percentile") |>
  dplyr::rename(value = soil_moisture_percentile) |>
  dplyr::select(state, date, variable, value)
# Combine all variables into a single data frame
integrated_timeseries <- dplyr::bind_rows(
  soil_moisture_long, et_long, tmax_long, tmin_long, tmean_long, rh_long, wind_long
)

# Set factor levels to control top-to-bottom order in facet_grid
variable_order <- c(
  "Soil Moisture Percentile",
  "Reference ET (mm/day)",
  "Maximum Temperature (°C)",
  "Minimum Temperature (°C)",
  "Mean Temperature (°C)",
  "Relative Humidity (%)",
  "Wind Speed (m/s)"
)
integrated_timeseries$variable <- factor(integrated_timeseries$variable,
                                         levels = variable_order)

head(integrated_timeseries)
The final data frame combines all the flash drought variables into a single "long format" table, where each row represents one measurement for one state on one date: the variable type is identified in the "variable" column and the measurement itself in the "value" column. This is the preferred input format for most visualization packages in R.
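The contrast between long and wide layouts is easiest to see on a toy table (values below are made up for illustration). Long format - one row per measurement - is what faceting expects; wide format - one column per variable - is often what modeling functions expect:

```r
# Toy contrast between long and wide layouts (hypothetical values)
toy_long <- data.frame(
  state    = c("TEXAS", "TEXAS"),
  date     = as.Date(c("2022-07-01", "2022-07-01")),
  variable = c("Soil Moisture Percentile", "Reference ET (mm/day)"),
  value    = c(4.2, 7.6)
)

# Two long rows become one wide row with a column per variable
tidyr::pivot_wider(toy_long, names_from = variable, values_from = value)
```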
Now we’ll create a comprehensive strip plot using facet_grid() for better vertical label layout.
# Create integrated strip plot with vertical facet labels
ggplot2::ggplot(integrated_timeseries,
                ggplot2::aes(x = date, y = value, color = state)) +
  # linewidth replaces the deprecated size aesthetic for lines
  ggplot2::geom_line(linewidth = 1.2, alpha = 0.8) +
  ggplot2::geom_point(size = 1, alpha = 0.6) +
  ggplot2::facet_grid(variable ~ ., scales = "free_y", switch = "y") +
  ggplot2::scale_x_date(
    name = "Date",
    date_breaks = "2 weeks",
    date_labels = "%b %d",
    limits = c(analysis_start, analysis_end),
    expand = c(0, 2)  # Removes any padding beyond the limits
  ) +
  ggplot2::scale_color_brewer(
    name = "State",
    type = "qual",
    palette = "Set2"
  ) +
  ggplot2::labs(
    title = "Integrated Flash Drought Analysis - 2022 Southern Plains",
    subtitle = "Soil moisture, atmospheric demand, and meteorological drivers by state",
    caption = "Sources: NASA SPORT-LIS soil moisture, NOAA URMA meteorological analysis",
    y = ""
  ) +
  # Apply theme_minimal() first, then layer customizations on top;
  # calling theme_minimal() after theme() would wipe out earlier settings
  ggplot2::theme_minimal() +
  ggplot2::theme(
    legend.position = "bottom",
    strip.placement = "outside",
    strip.text.y.left = ggplot2::element_text(angle = 0, hjust = 1),
    axis.text.x = ggplot2::element_text(angle = 45, hjust = 1),
    panel.grid.minor = ggplot2::element_blank()
  )
This integrated strip plot reveals the complete story of how the 2022 Southern Plains flash drought developed as a coupled atmospheric-land system. The top panel shows the drought’s fundamental impact: soil moisture percentiles collapsing from near-normal conditions in early June to exceptional drought (below the 10th percentile) within just 4-6 weeks, with Texas and Oklahoma experiencing the most severe and sustained impacts.

The atmospheric drivers explain why this happened so rapidly. Reference evapotranspiration reached extreme levels of 6-8 mm/day during July, driven by maximum temperatures regularly exceeding 40°C (104°F), relative humidity dropping below 25%, and elevated wind speeds - perfect conditions for rapid moisture extraction from soils and vegetation.

What makes this visualization so powerful is the temporal alignment: all the meteorological extremes peaked simultaneously in mid-July, driving maximum atmospheric water demand at exactly the same time soil moisture was plummeting most rapidly. Unlike conventional droughts that develop gradually over months, all the key drivers aligned at once to create a perfect storm for rapid drying, giving agricultural communities little time to adapt or implement mitigation strategies.
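To quantify that visual impression, a natural follow-up is to correlate reference ET against soil moisture percentile within each state. This sketch assumes the integrated_timeseries object built above (so it only runs in this lesson's session); a strong negative correlation would be consistent with the coupling described here:

```r
# Sketch: per-state correlation between daily reference ET and soil
# moisture percentile, computed from the long-format table built above
integrated_timeseries |>
  tidyr::pivot_wider(names_from = variable, values_from = value) |>
  dplyr::group_by(state) |>
  dplyr::summarise(
    et_vs_soil_moisture = cor(`Reference ET (mm/day)`,
                              `Soil Moisture Percentile`,
                              use = "complete.obs"),
    .groups = "drop"
  )
```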
Looking at the integrated time series for the 2022 event, what pattern demonstrates the coupled nature of flash drought development?
- a) Soil moisture and temperature changed independently
- b) High reference ET, extreme temperatures, and low soil moisture percentiles all peaked simultaneously in mid-July
- c) Each state showed completely different patterns
- d) Atmospheric conditions improved while soil moisture continued to decline
Answer: b) High reference ET, extreme temperatures, and low soil moisture percentiles all peaked simultaneously in mid-July
Explanation: The synchronized peaks of all drought indicators in mid-July demonstrate flash drought’s coupled nature - atmospheric extremes (high temps, low humidity) drove maximum water demand (high ET) precisely when soil moisture was most depleted, creating a self-reinforcing cycle of rapid drying.
Agricultural Outcomes Analysis
From Physical Indicators to Real-World Impacts
Our integrated analysis has revealed the dramatic physical story of the 2022 flash drought - how soil moisture percentiles collapsed while atmospheric water demand soared, driven by the perfect storm of extreme heat, low humidity, and persistent winds. But these meteorological and hydrological variables, while scientifically essential, represent drought indicators rather than drought outcomes. They tell us about the physical conditions that create stress in natural and agricultural systems, but they don’t directly measure what happened to the farms, ranches, and rural communities that depend on those systems for their livelihoods.
The critical question becomes: how did these extreme physical conditions translate into real-world agricultural impacts? When soil moisture percentiles drop below the 10th percentile and reference evapotranspiration exceeds 7 mm/day for weeks at a time, what actually happens to crops in the field, cattle on pasture, and farming operations across the landscape? This is where the connection between atmospheric science and agricultural economics becomes clear, and where we can assess whether our sophisticated monitoring systems are actually capturing the drought stress that matters to agricultural communities. To understand the complete story of the 2022 flash drought, we need to examine agricultural outcomes - the actual impacts on crops, livestock, and farming operations that transformed meteorological extremes into one of the most costly agricultural disasters in recent U.S. history.
USDA Crop Progress and Condition Reports: Translating Physical Stress into Agricultural Impact
The United States Department of Agriculture (USDA) National Agricultural Statistics Service (NASS) provides the essential bridge between our meteorological analysis and real-world agricultural impacts through their weekly Crop Progress and Condition reports (U.S. Department of Agriculture, National Agricultural Statistics Service 2025). These reports, published consistently since the 1980s, represent one of the most comprehensive and standardized assessments of agricultural drought stress available anywhere in the world. What makes this dataset particularly valuable for our flash drought analysis is that it captures how rapidly changing physical conditions translate into observable stress in agricultural systems across all major farming states.
For flash drought analysis, pasture and rangeland conditions serve as an ideal agricultural indicator because they respond directly to the soil moisture and evapotranspiration patterns we’ve been tracking. Unlike our SPORT-LIS and URMA datasets that measure physical conditions, pasture condition surveys measure biological responses - how plants and agricultural systems actually react to the atmospheric and soil moisture stress we’ve documented. This makes them perfect for validating whether our sophisticated Earth observation systems are detecting drought impacts that matter to agricultural communities.
Why Pasture Conditions Excel as Flash Drought Indicators:
Rapid Response: Pasturelands respond to soil moisture deficits within 2-3 weeks of drought onset, making them ideal early warning indicators that match the timescales of flash drought development
Direct Exposure: Unlike irrigated crops that can buffer drought stress through water management, most pasturelands depend entirely on natural precipitation and soil moisture - the same variables driving our SPORT-LIS percentiles
Continuous Assessment: Weekly surveys throughout the growing season provide the temporal resolution needed to track rapid changes characteristic of flash drought events
Economic Connectivity: Pasture conditions directly drive livestock operations, feed costs, and rural economic stability - connecting our physical drought analysis to measurable economic outcomes
Operational Consistency: Standardized methodologies across all agricultural states enable regional comparisons that align with our six-state meteorological analysis
Survey Methodology: From Field Observation to Quantitative Assessment
USDA state agricultural statisticians collaborate with university extension services, farm organizations, and agricultural professionals to conduct systematic weekly assessments of pasture and rangeland conditions. These trained observers evaluate pasturelands using five standardized categories that capture the full spectrum from optimal growing conditions to severe drought stress:
- Excellent (>85th percentile): Abundant growth, dense stands, well above normal conditions for the time of year
- Good (65th-85th percentile): Above average growth and vigor, healthy plant color and density
- Fair (35th-65th percentile): Average growth and condition, typical appearance for the season
- Poor (15th-35th percentile): Below average growth, visible stress signs, declining plant vigor
- Very Poor (<15th percentile): Severely stressed vegetation, minimal growth, significant deterioration
Each week, statisticians report the percentage of each state’s pasturelands falling into each category, creating a quantitative time series that directly measures agricultural drought stress. This approach transforms subjective field observations into standardized data that can be compared across states, years, and drought events - providing exactly the kind of agricultural outcome data needed to validate our meteorological drought analysis.
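One common way analysts condense the five weekly percentages into a single number - shown here as an illustrative sketch, not part of the USDA methodology used in this lesson - is a weighted condition score, with Excellent weighted 5 down to Very Poor weighted 1:

```r
# Illustrative weighted condition score (5 = Excellent ... 1 = Very Poor).
# The percentages below are hypothetical for a drought-stressed week.
weights <- c(Excellent = 5, Good = 4, Fair = 3, Poor = 2, "Very Poor" = 1)
pct     <- c(Excellent = 2, Good = 10, Fair = 30, Poor = 38, "Very Poor" = 20)

# Weighted average across categories (percentages sum to 100)
condition_score <- sum(weights * pct) / 100
condition_score  # 2.36 - between "Poor" and "Fair"
```

A single score like this is convenient for plotting one line per state, though it discards the category detail the stacked bar charts later in this lesson preserve.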
USDA NASS QuickStats Technical Specifications:
- Temporal Resolution: Weekly surveys during growing season (April-October)
- Spatial Coverage: All 50 states plus Puerto Rico
- Data Source: USDA National Agricultural Statistics Service (NASS)
- Historical Archive: Available from 1980-present (40+ years)
- File Format: JSON, XML, CSV via RESTful API
- Update Frequency: Published every Monday (covering previous week)
- Sample Size: Varies by state, typically 200-500 respondents per state
Key Variables for Flash Drought Analysis:
- Pasture and rangeland conditions (5-category scale)
- Major crop conditions for corn, soybeans, wheat, cotton
- Agricultural land use and irrigation statistics
Survey Design:
- Stratified Sampling: Counties stratified by agricultural importance
- Multiple Data Sources: Producer surveys, extension agents, agricultural advisors
- Weighted Estimates: Results weighted by agricultural land area and production
- Seasonal Coverage: Intensive monitoring during critical growing periods
API Access: Base URL https://quickstats.nass.usda.gov/api, requires free API key, flexible output formats (JSON, XML, CSV)
Citation: U.S. Department of Agriculture, National Agricultural Statistics Service. USDA Crop Progress Reports. (2025).
Data Access and API Authentication
USDA NASS QuickStats API
The USDA NASS provides free access to agricultural survey data through their QuickStats API. This service allows programmatic access to the same data available through their web interface at quickstats.nass.usda.gov.
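Under the hood, the rnassqs package used below simply issues HTTP GET requests against this service. The sketch below builds such a request URL by hand; the api_GET endpoint and parameter names follow the QuickStats API documentation, but treat the exact URL shape as an assumption to verify at https://quickstats.nass.usda.gov/api before relying on it:

```r
# Hedged sketch of the raw QuickStats request that rnassqs makes for you
base_url <- "https://quickstats.nass.usda.gov/api/api_GET/"
params <- c(
  key               = "YOUR_API_KEY",   # placeholder - use your own key
  commodity_desc    = "PASTURELAND",
  statisticcat_desc = "CONDITION",
  state_name        = "TEXAS",
  year              = "2022",
  format            = "CSV"
)

# Percent-encode each value and assemble the query string
encoded <- vapply(params, utils::URLencode, character(1), reserved = TRUE)
request_url <- paste0(base_url, "?",
                      paste(names(params), encoded, sep = "=", collapse = "&"))
# The CSV response could then be read with read.csv(request_url)
```

In practice the rnassqs wrapper is preferable - it handles authentication, encoding, and pagination - but seeing the raw request demystifies what the package does.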
API Registration Process:
1. Visit https://quickstats.nass.usda.gov/api
2. Click “Get API Key” and complete the simple registration form
3. Check your email for the API key (usually arrives within minutes)
4. Store the key securely in your analysis environment
Proper API Key Management
Security Best Practices: API keys should never be written directly into code or shared publicly. They provide access to government services and should be treated as sensitive credentials.
Environment Variable Approach: The recommended method is storing API keys as environment variables, which keeps them separate from your analysis code while allowing secure access.
Professional Development Standards:
- Never commit API keys to version control (git)
- Use .env files locally (and add .env to .gitignore)
- Use environment variables in production environments
- Rotate keys periodically as a security measure
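In R, the usual home for such a variable is the user-level .Renviron file, which R reads at startup (restart R after editing it). A minimal sketch:

```r
# Sketch: .Renviron is a plain-text file in your home directory with one
# KEY=value pair per line, e.g.:
#   USDA_NASS_API=your-key-here
# For a quick interactive session you can also set the variable in-memory:
Sys.setenv(USDA_NASS_API = "demo-key-not-real")  # placeholder, not a real key

# Confirm a (non-empty) value is now visible to the session
nzchar(Sys.getenv("USDA_NASS_API"))              # TRUE once a key is set
```

The .Renviron approach is preferred for real work because the key survives restarts and never appears in your scripts.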
API Authentication Setup
Pull the USDA_NASS_API key from environment variables.
# Load API key from environment variables
nass_api_key <- Sys.getenv("USDA_NASS_API")
Pass it to the authorization function.
# Configure rnassqs package authentication
rnassqs::nassqs_auth(key = nass_api_key)

paste("rnassqs package authentication configured")
[1] "rnassqs package authentication configured"
Agricultural Data Retrieval and Pre-processing
Defining Analysis Parameters
We’ll analyze pasture conditions across all six states in our study region: Kansas, Missouri, Oklahoma, Arkansas, Texas, and Louisiana. These states represent the complete geographic scope of our 2022 flash drought analysis and experienced varying timing and severity of agricultural impacts across the region.
# Define agricultural analysis parameters - all six study states
agricultural_states <- c("KANSAS", "MISSOURI", "OKLAHOMA", "ARKANSAS", "TEXAS", "LOUISIANA")
analysis_year <- 2022

# Display our analysis scope
cat("Agricultural impact analysis parameters:\n",
    "Target states:", paste(agricultural_states, collapse = ", "), "\n",
    "Analysis year:", analysis_year, "\n",
    "Focus: Pasture and rangeland conditions\n")
Agricultural impact analysis parameters:
Target states: KANSAS, MISSOURI, OKLAHOMA, ARKANSAS, TEXAS, LOUISIANA
Analysis year: 2022
Focus: Pasture and rangeland conditions
USDA NASS Data Parameters
The USDA NASS QuickStats database uses specific parameter names and values to filter data requests. Understanding these parameters is important for retrieving the correct data.
# USDA NASS query parameters for pasture condition data
pasture_params <- list(
  commodity_desc = "PASTURELAND",    # Commodity: pasture and rangeland
  statisticcat_desc = "CONDITION",   # Statistic: condition assessments
  agg_level_desc = "STATE",          # Geographic level: state summaries
  state_name = agricultural_states,  # States: KS, MO, OK, AR, TX, LA
  year = analysis_year               # Temporal scope: 2022
)
Data Availability Check
Before downloading the full dataset, we’ll check how many records are available. This helps verify our query parameters are correct and gives us an estimate of the data volume.
# Check data availability using nassqs_record_count()
record_count <- rnassqs::nassqs_record_count(pasture_params)
Downloading: 14 B
cat("Data availability check:\n",
"Records available:", record_count$count, "\n")
Data availability check:
Records available: 1145
That seems reasonable. Let’s proceed.
Downloading Pasture Condition Data
Now we’ll download the complete pasture condition dataset for our six focus states during 2022.
# Download the complete pasture condition dataset
pasture_raw_data <- rnassqs::nassqs(pasture_params)
Examining Data Structure
Let’s explore the structure of the USDA data to understand what condition categories and temporal coverage we have. The USDA NASS database returns data in a standardized format, but understanding the column structure is essential for effective analysis.
# Examine the raw data structure
cat("Raw data structure:\n",
"Columns:", ncol(pasture_raw_data), "\n",
"Rows:", nrow(pasture_raw_data), "\n")
Raw data structure:
Columns: 39
Rows: 1145
# Display key column names
key_columns <- c("state_name", "commodity_desc", "short_desc",
                 "reference_period_desc", "Value", "week_ending")
available_columns <- key_columns[key_columns %in% names(pasture_raw_data)]

cat("Key columns available:\n")
Key columns available:
for(col in available_columns) {
cat(" -", col, "\n")
}
- state_name
- commodity_desc
- short_desc
- reference_period_desc
- Value
- week_ending
The USDA data structure includes dozens of columns, but we only need a few key ones for our analysis. The state_name column identifies the geographic unit, short_desc contains the specific condition measurement, reference_period_desc indicates the week, and Value contains the actual percentage data.
# Examine available condition categories
condition_types <- unique(pasture_raw_data$short_desc)
condition_types
[1] "PASTURELAND - CONDITION, MEASURED IN PCT EXCELLENT"
[2] "PASTURELAND - CONDITION, MEASURED IN PCT FAIR"
[3] "PASTURELAND - CONDITION, MEASURED IN PCT GOOD"
[4] "PASTURELAND - CONDITION, MEASURED IN PCT POOR"
[5] "PASTURELAND - CONDITION, MEASURED IN PCT VERY POOR"
The short_desc field reveals that USDA tracks five distinct condition categories, each reported as a percentage of total pasturelands in that state. These categories represent a continuum from excellent (optimal conditions) to very poor (severe stress). Notice that each category is reported independently - together they should sum to 100% for each state and week.
Condition Category Interpretation:
- Excellent: Pasturelands with exceptional growth and vigor, well above normal
- Good: Above-average conditions with strong growth
- Fair: Normal conditions for the time of year
- Poor: Below-average growth showing stress from drought or other factors
- Very Poor: Severely stressed pasturelands with minimal growth
For drought analysis, we focus primarily on the “Poor” and “Very Poor” categories, as these directly indicate agricultural stress from moisture deficits.
# Examine temporal coverage
time_periods <- unique(pasture_raw_data$reference_period_desc)

cat("\nTemporal coverage:\n",
    "Weeks available:", length(time_periods), "\n",
    "First week:", min(time_periods), "\n",
    "Last week:", max(time_periods), "\n")
Temporal coverage:
Weeks available: 45
First week: WEEK #03
Last week: WEEK #47
The temporal coverage shows USDA’s systematic weekly reporting throughout the 2022 growing season. The week numbering system starts with “WEEK #03” (mid-January) and continues through “WEEK #47” (late November), providing nearly year-round coverage of pasture conditions.
USDA Week Numbering System:
- Week numbers correspond to calendar weeks (Week #1 = first week of January)
- Each week typically runs Sunday through Saturday
- Pasture condition surveys are compiled weekly and published in Monday’s Crop Progress report
- Each survey represents conditions as of the weekend ending that week
The temporal structure and resolution are a good match for the weekly climate data we prepared in previous sections.
Data Quality Assessment
Before processing the data, we need to assess data quality and identify any missing or suppressed values. USDA data can contain special codes that indicate confidential information or missing data that could affect our analysis.
# Check data quality and missing values
cat("Data quality assessment:\n",
"Total records:", nrow(pasture_raw_data), "\n")
Data quality assessment:
Total records: 1145
# Count valid data values
valid_data <- !is.na(pasture_raw_data$Value) &
  pasture_raw_data$Value != "" &
  pasture_raw_data$Value != "(D)"

cat("Records with valid values:", sum(valid_data), "\n",
    "Missing or suppressed records:", sum(!valid_data), "\n",
    "Data completeness:", round(100 * sum(valid_data) / nrow(pasture_raw_data), 1), "%\n")
Records with valid values: 1145
Missing or suppressed records: 0
Data completeness: 100 %
# Check for suppressed data "(D)" which indicates confidential information
suppressed_count <- sum(pasture_raw_data$Value == "(D)", na.rm = TRUE)
if(suppressed_count > 0) {
  cat("Note:", suppressed_count, "records marked as '(D)' (confidential)\n")
} else {
  cat("✅ No confidential data suppression detected\n")
}
✅ No confidential data suppression detected
Everything looks good here. We can move on.

For reference, USDA QuickStats flags incomplete records with a few special codes:
- Blank/empty values: Data not collected or reported for that period
- (D): Data withheld to avoid disclosing confidential information (rare for state-level aggregates)
- (Z): Less than half the unit shown (uncommon in percentage data)
For pasture condition data at the state level, we typically expect high data completeness because these surveys are conducted systematically across all major agricultural states. Missing data is more common early or late in the season when pasture monitoring may be less intensive.
High data completeness (>95%) indicates reliable, consistent monitoring throughout our analysis period, which is essential for tracking the rapid changes characteristic of flash drought development.
Data Cleaning and Processing
Now we’ll clean the raw data and convert it into a format suitable for time series analysis. This involves extracting condition categories from the detailed USDA descriptions and organizing the data for comparison with our meteorological datasets.
# Clean and process the pasture condition data
pasture_clean <- pasture_raw_data |>
  # Filter to valid data only
  dplyr::filter(!is.na(Value) & Value != "" & Value != "(D)") |>
  # Convert Value to numeric
  dplyr::mutate(Value = as.numeric(Value)) |>
  # Extract condition type from description
  dplyr::mutate(
    condition_type = dplyr::case_when(
      grepl("EXCELLENT", short_desc) ~ "Excellent",
      grepl("GOOD", short_desc) ~ "Good",
      grepl("FAIR", short_desc) ~ "Fair",
      grepl("POOR", short_desc) & !grepl("VERY", short_desc) ~ "Poor",
      grepl("VERY POOR", short_desc) ~ "Very Poor",
      TRUE ~ "Unknown"
    )
  ) |>
  # Clean up week information
  dplyr::mutate(
    week_number = as.numeric(gsub("WEEK #", "", reference_period_desc))
  ) |>
  # Select and organize key columns
  dplyr::select(
    state = state_name,
    condition_type,
    week_number,
    week_desc = reference_period_desc,
    percent_condition = Value
  ) |>
  # Arrange for logical viewing
  dplyr::arrange(state, week_number, condition_type)

head(pasture_clean)
Data Transformation Explanation:
The cleaning process transforms the verbose USDA format into analysis-ready data:
- Condition Type Extraction: The short_desc field contains text like “PASTURELAND - CONDITION, MEASURED IN PCT POOR”, which we parse to extract just “Poor”
- Week Number Parsing: The reference_period_desc field contains “WEEK #27” format, which we convert to numeric week numbers for easier date calculations
- Data Validation: We filter out any records with missing values, ensuring our analysis works with complete data only
- Column Standardization: We rename and select only the columns needed for our analysis, creating a clean, focused dataset
The resulting dataset has one row per state-week-condition combination, making it easy to aggregate and visualize drought stress patterns across time and geography.
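Because each state-week now contributes exactly five rows, an easy sanity check is that their percentages sum to roughly 100, as noted earlier. The sketch below demonstrates the check on a toy stand-in (the values are hypothetical); in the lesson you would group pasture_clean the same way:

```r
# Sanity-check sketch: category percentages should sum to ~100 per state-week.
# Toy values stand in for pasture_clean.
toy_clean <- data.frame(
  state             = "TEXAS",
  week_number       = 30,
  condition_type    = c("Very Poor", "Poor", "Fair", "Good", "Excellent"),
  percent_condition = c(66, 25, 7, 2, 0)
)

# Sum percentages within each state-week
totals <- aggregate(percent_condition ~ state + week_number, toy_clean, sum)
totals$percent_condition  # 100
```

State-weeks that fall well short of 100 would suggest a missing category row worth investigating before aggregation.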
Creating Date Alignment
To compare agricultural outcomes with our meteorological analysis, we need to convert USDA week numbers to actual dates that align with our existing analysis timeframe.
# Convert week numbers to dates for alignment with meteorological data
# USDA weeks are typically Sunday-to-Saturday
# Week 1 starts the first Sunday of the year

# Calculate the first Sunday of 2022
jan_1_2022 <- as.Date("2022-01-01")
first_sunday_2022 <- jan_1_2022 + (7 - as.numeric(format(jan_1_2022, "%u"))) %% 7

# Create date mapping for each week
pasture_with_dates <- pasture_clean |>
  dplyr::mutate(
    # Calculate the ending date of each week (Saturday)
    week_end_date = first_sunday_2022 + (week_number - 1) * 7 + 6,
    # Calculate the middle date of each week for plotting
    week_mid_date = week_end_date - 3
  )

# Verify date alignment looks reasonable
date_check <- pasture_with_dates |>
  dplyr::select(week_number, week_desc, week_end_date, week_mid_date) |>
  dplyr::distinct() |>
  dplyr::arrange(week_number) |>
  head(6)

date_check
Date Conversion Methodology:
The USDA week numbering system requires careful conversion to calendar dates:
- Week Definition: USDA weeks run Sunday through Saturday (note this differs from ISO 8601 weeks, which run Monday through Sunday)
- Week 1 Calculation: We calculate the first Sunday of 2022 as our baseline (January 2, 2022)
- Week End Dates: Each week’s Saturday represents the end of the survey period
- Week Mid Dates: We use Wednesday (middle of the week) for plotting to better represent the survey period
This conversion allows us to overlay agricultural condition data with our daily meteorological and soil moisture time series. The weekly resolution of agricultural data provides excellent temporal granularity for tracking flash drought impacts, which typically develop over periods of 2-6 weeks.
Why Date Alignment Matters:
- Enables correlation analysis between meteorological drivers and agricultural outcomes
- Allows identification of lag times between weather conditions and agricultural stress
- Provides a framework for understanding cause-and-effect relationships in flash drought development
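A quick standalone check of the conversion arithmetic: WEEK #30 should land in late July (the first Sunday of 2022 is January 2):

```r
# Standalone check of the week-to-date arithmetic for WEEK #30
first_sunday_2022 <- as.Date("2022-01-02")   # first Sunday of 2022
week_number   <- 30
week_end_date <- first_sunday_2022 + (week_number - 1) * 7 + 6
week_mid_date <- week_end_date - 3

week_end_date  # "2022-07-30" (a Saturday)
week_mid_date  # "2022-07-27" (the Wednesday mid-week)
```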
This looks good; now we can prepare some visualizations.
Preview of Agricultural Drought Signal
Let’s create a quick preview of the agricultural drought signal by examining the “Poor” and “Very Poor” conditions across our six study states. This preview will help us assess the educational value of our dataset and identify the timing and magnitude of agricultural drought impacts.
# Extract drought stress indicators (Poor + Very Poor conditions)
drought_stress <- pasture_with_dates |>
  dplyr::filter(condition_type %in% c("Poor", "Very Poor")) |>
  dplyr::group_by(state, week_number, week_mid_date) |>
  dplyr::summarise(
    total_poor_percent = sum(percent_condition),
    .groups = "drop"
  ) |>
  dplyr::arrange(state, week_number)
# Display maximum drought stress by state
max_stress_by_state <- drought_stress |>
  dplyr::group_by(state) |>
  dplyr::summarise(
    max_poor_percent = max(total_poor_percent),
    peak_week = week_number[which.max(total_poor_percent)],
    peak_date = week_mid_date[which.max(total_poor_percent)],
    .groups = "drop"
  )

cat("Peak agricultural drought stress by state:\n")
Peak agricultural drought stress by state:
print(max_stress_by_state)
# A tibble: 6 × 4
state max_poor_percent peak_week peak_date
<chr> <dbl> <dbl> <date>
1 ARKANSAS 79 43 2022-10-26
2 KANSAS 81 45 2022-11-09
3 LOUISIANA 34 47 2022-11-23
4 MISSOURI 68 42 2022-10-19
5 OKLAHOMA 82 44 2022-11-02
6 TEXAS 91 30 2022-07-27
# Quick assessment of educational value
overall_max <- max(max_stress_by_state$max_poor_percent)
cat("\nOverall maximum stress:", overall_max, "% poor/very poor conditions\n")
Overall maximum stress: 91 % poor/very poor conditions
if(overall_max > 60) {
  cat("✅ EXCELLENT drought signal - very strong educational value!\n")
} else if(overall_max > 40) {
  cat("✅ STRONG drought signal - good educational value\n")
} else if(overall_max > 20) {
  cat("⚠️ MODERATE drought signal - adequate for education\n")
} else {
  cat("❌ WEAK drought signal - limited educational value\n")
}
✅ EXCELLENT drought signal - very strong educational value!
These results reveal the devastating scope and timing of agricultural impacts during the 2022 flash drought. Texas experienced the most severe pasture stress, with 91% of pasturelands rated as “Poor” or “Very Poor” by late July - an extraordinary level of agricultural deterioration that represents near-complete pastoral collapse across the state. Oklahoma and Kansas followed closely behind with peak stress levels of 82% and 81% respectively, while Missouri reached 68% stressed pasturelands. Even Arkansas, which showed more resilience in our meteorological analysis, still experienced 79% of pasturelands under stress. Only Louisiana remained below the severe impact threshold, peaking at 34% stressed pasturelands.
Interpreting Agricultural Drought Stress:
The “Poor” and “Very Poor” categories combined represent the percentage of pasturelands experiencing significant drought stress in each state. This metric translates our meteorological analysis into measurable agricultural consequences:
Direct Impact Measurement: Unlike our soil moisture percentiles and ET calculations, these values represent actual agricultural stress that directly affected farm operations, livestock management, and rural economies across the region
Validation of Physical Indicators: The peak stress levels align remarkably well with our SPORT-LIS and URMA analysis - Texas and Oklahoma, which showed the most extreme soil moisture depletion and atmospheric demand, also experienced the highest agricultural stress levels
Temporal Progression: The peak timing reveals two distinct phases of agricultural impact - Texas collapsed earliest, peaking on July 27 at the height of the flash drought, while the other states reached their worst conditions in autumn as the drought persisted and spread: Missouri (October 19), Arkansas (October 26), Oklahoma (November 2), Kansas (November 9), and finally Louisiana (November 23)
Agricultural Time Series Visualization
Stacked Bar Chart of Pasture Conditions
Now that we have our agricultural data processed and aligned with our meteorological analysis timeframe, let’s create a comprehensive visualization showing how pasture conditions changed throughout 2022. We’ll use a stacked bar chart that shows all five condition categories simultaneously, making it easy to see how conditions shifted from good to poor during the flash drought. We also need to set condition_type as an ordered factor to control the stacking order.
# Prepare data for stacked bar visualization
# Focus on our flash drought analysis period (June through October)
pasture_viz_data <- pasture_with_dates |>
  # Filter to flash drought analysis period
  dplyr::filter(week_mid_date >= analysis_start & week_mid_date <= analysis_end) |>
  # Create ordered factor for proper stacking order (worst to best from bottom up)
  dplyr::mutate(
    condition_type = factor(condition_type,
                            levels = c("Very Poor", "Poor", "Fair", "Good", "Excellent"),
                            ordered = TRUE)
  )
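The effect of the factor levels can be seen in isolation with a tiny standalone example - the levels argument, not the order of the data, determines category order (and therefore stacking order in the plot):

```r
# Standalone sketch: factor levels control category order, not data order
cond <- factor(c("Good", "Very Poor", "Fair"),
               levels = c("Very Poor", "Poor", "Fair", "Good", "Excellent"),
               ordered = TRUE)

levels(cond)  # "Very Poor" "Poor" "Fair" "Good" "Excellent"
```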
Adapting NIDIS Drought Colors for Pasture Conditions
We’ll use the official NIDIS drought color palette we developed earlier, adapting it for our five pasture condition categories. The color scheme will make drought stress (Poor/Very Poor) visually prominent while representing good conditions with cooler colors.
# Adapt the official NIDIS drought color palette for pasture conditions
# Based on the drought colors we used earlier for soil moisture analysis
pasture_condition_colors <- c(
  "Very Poor" = "#730000",  # Exceptional drought (D4) - darkest red
  "Poor"      = "#E60000",  # Extreme drought (D3) - bright red
  "Fair"      = "#FFAA00",  # Severe drought (D2) - orange
  "Good"      = "#90EE90",  # Light green (good conditions)
  "Excellent" = "#228B22"   # Forest green (lush, excellent conditions)
)
This color scheme follows drought monitoring conventions where red colors indicate stress and deteriorating conditions, orange marks fair (near-normal) conditions, and greens indicate good to excellent conditions. The visual progression from dark red through orange to green provides an intuitive read of agricultural drought severity.
Creating the Stacked Bar Time Series
Now we’ll create a multi-panel stacked bar chart with each state in its own subplot, arranged in a 2-column by 3-row layout for easy comparison across the region.
# Create stacked bar chart faceted by state
agricultural_timeseries_plot <- ggplot2::ggplot(pasture_viz_data,
                                                ggplot2::aes(x = week_mid_date,
                                                             y = percent_condition,
                                                             fill = condition_type)) +
  ggplot2::geom_bar(stat = "identity", position = "stack", width = 6) +
  ggplot2::facet_wrap(~ state, ncol = 2, scales = "free_x") +
  ggplot2::scale_fill_manual(
    name = "Condition:",
    values = pasture_condition_colors,
    guide = ggplot2::guide_legend(reverse = TRUE)  # Show Excellent at top, Very Poor at bottom
  ) +
  # Let ggplot2 set limits automatically based on data
  ggplot2::scale_x_date(
    name = "Date",
    date_breaks = "1 month",
    date_labels = "%b",
    expand = c(0.01, 0)
  ) +
  ggplot2::scale_y_continuous(
    name = "Percentage of Pasturelands",
    limits = c(0, 100),
    breaks = c(0, 25, 50, 75, 100),
    expand = c(0, 0)
  ) +
  ggplot2::labs(
    title = "2022 Flash Drought: Pasture Condition Evolution",
    subtitle = "Weekly pasture condition surveys showing rapid deterioration during flash drought",
    caption = "Source: USDA NASS Crop Progress and Condition Reports"
  ) +
  ggplot2::theme_minimal() +
  ggplot2::theme(
    # Clean up panel spacing and borders
    panel.spacing = ggplot2::unit(0.8, "lines"),
    panel.grid.minor = ggplot2::element_blank(),
    panel.grid.major.x = ggplot2::element_blank(),
    # Improve text readability
    axis.text.x = ggplot2::element_text(angle = 45, hjust = 1, size = 10),
    axis.text.y = ggplot2::element_text(size = 10),
    axis.title = ggplot2::element_text(size = 12),
    # Style the facet labels (state names)
    strip.text = ggplot2::element_text(size = 12, face = "bold"),
    strip.background = ggplot2::element_rect(fill = "grey90", color = "grey70"),
    # Position legend appropriately
    legend.position = "bottom",
    legend.title = ggplot2::element_text(size = 11, face = "bold"),
    legend.text = ggplot2::element_text(size = 10),
    # Clean up plot titles
    plot.title = ggplot2::element_text(size = 14, face = "bold"),
    plot.subtitle = ggplot2::element_text(size = 12, color = "grey30"),
    plot.caption = ggplot2::element_text(size = 9, color = "grey50")
  )

# Display the plot
agricultural_timeseries_plot
This plot illustrates the dramatic agricultural story of the 2022 flash drought, showing how rapidly pasture conditions deteriorated across all six states. The weekly progression captures the speed and severity that define flash drought impacts on agricultural systems.
Visual Interpretation Guide:
- Dark red sections (Very Poor): Severely stressed pasturelands with minimal growth
- Red sections (Poor): Below-average growth showing visible drought stress
- Orange sections (Fair): Average conditions for the time of year
- Green sections (Good/Excellent): Above-average to abundant growth
Texas and Oklahoma show the classic flash drought pattern - rapid transitions from predominantly good conditions (green) in early June to overwhelming poor/very poor conditions (red/dark red) by late July, with minimal recovery through fall. Kansas shows significant stress developing later in the season, while Arkansas and Missouri demonstrate more variable patterns with periods of recovery. Louisiana maintained relatively stable conditions with less extreme deterioration.
This agricultural visualization directly validates our meteorological analysis - the states showing the most extreme soil moisture depletion and highest atmospheric water demand (Texas, Oklahoma, Kansas) are exactly the states experiencing the most severe and sustained agricultural stress.
Integrated Flash Drought Analysis
Connecting the Flash Drought Impact Chain
Our analysis has traced the complete pathway from atmospheric forcing to agricultural collapse during the 2022 Southern Plains flash drought. We’ve documented how extreme meteorological conditions (scorching temperatures, bone-dry air, persistent winds) drove unprecedented atmospheric water demand that rapidly depleted soil moisture reserves, ultimately manifesting as widespread agricultural stress across millions of acres of pasturelands. But understanding flash drought requires more than examining each component separately - it demands seeing how these elements functioned as an integrated system where each component amplified the others.
The power of flash drought lies in its coupled nature: when atmospheric demand skyrockets while soil moisture plummets, the result isn’t simply additive stress on agricultural systems - it’s multiplicative. Plants facing both extreme atmospheric demand and depleted soil water reserves experience compounding stress that can trigger rapid deterioration, as we observed across Texas, Oklahoma, and Kansas during summer 2022. Now we’ll create an integrated analysis that reveals these cause-and-effect relationships by tracking how soil moisture percentiles, atmospheric water demand, and pasture condition stress evolved together, demonstrating the interconnected processes that transformed meteorological extremes into one of the most devastating flash droughts in recent agricultural history.
Preparing Agricultural Drought Stress Data
First, we’ll aggregate the pasture condition data to create a single drought stress indicator - the combined percentage of pasturelands in “Poor” and “Very Poor” conditions for each state and week.
# Calculate combined Poor + Very Poor percentages by state and week
agricultural_drought_stress <- pasture_with_dates |>
  dplyr::filter(condition_type %in% c("Poor", "Very Poor")) |>
  dplyr::filter(week_mid_date >= analysis_start & week_mid_date <= analysis_end) |>
  dplyr::group_by(state, week_number, week_mid_date) |>
  dplyr::summarise(
    poor_very_poor_percent = sum(percent_condition, na.rm = TRUE),
    .groups = "drop"
  ) |>
  dplyr::rename(date = week_mid_date) |>
  dplyr::select(state, date, poor_very_poor_percent)

cat("Agricultural drought stress data prepared\n",
    "Date range:", as.character(range(agricultural_drought_stress$date)), "\n",
    "Max stress level:", round(max(agricultural_drought_stress$poor_very_poor_percent), 1), "%\n")
Agricultural drought stress data prepared
Date range: 2022-06-01 2022-10-26
Max stress level: 91 %
Aligning Soil Moisture and ET Data
We need to extract the soil moisture and ET data for the same dates as our agricultural data. Since we already have state-level time series for both variables, we’ll filter them to match the agricultural survey dates.
# Get the agricultural survey dates for alignment
# Get the agricultural survey dates for alignment
agricultural_dates <- unique(agricultural_drought_stress$date)

# Filter soil moisture data to agricultural dates
soil_moisture_weekly <- state_timeseries_long |>
  dplyr::filter(date %in% agricultural_dates) |>
  dplyr::select(state, date, soil_moisture_percentile)

paste("Soil moisture weekly data aligned:", nrow(soil_moisture_weekly), "records")
[1] "Soil moisture weekly data aligned: 132 records"
Now the ET data.
# Filter ET data to agricultural dates
# We need to reshape our existing integrated_timeseries data to get ET values
et_weekly <- integrated_timeseries |>
  dplyr::filter(variable == "Reference ET (mm/day)") |>
  dplyr::filter(date %in% agricultural_dates) |>
  dplyr::select(state, date, value) |>
  dplyr::rename(et_mm_day = value)

paste("ET weekly data aligned:", nrow(et_weekly), "records")
[1] "ET weekly data aligned: 132 records"
The state names are slightly different in the varying datasets; we need to make sure they’re the same before proceeding to standardization and plotting.
# Fix state name inconsistencies before standardization
agricultural_drought_stress <- agricultural_drought_stress |>
  dplyr::mutate(state = stringr::str_to_title(state))

soil_moisture_weekly <- soil_moisture_weekly |>
  dplyr::mutate(state = stringr::str_to_title(state))

et_weekly <- et_weekly |>
  dplyr::mutate(state = stringr::str_to_title(state))
Data Standardization for Multi-Variable Plotting
Creating an integrated visualization that reveals the relationships between soil moisture, atmospheric demand, and agricultural stress presents a significant analytical challenge: these variables exist on completely different scales and units. Soil moisture percentiles naturally range from 0-100, representing how current conditions rank against historical distributions. Reference evapotranspiration varies from roughly 2-10 mm/day based on seasonal atmospheric conditions and geographic location. Agricultural stress percentages (poor + very poor pasture conditions) span 0-100% based on the proportion of stressed pasturelands in each state.
Variable Standardization Techniques:

Min-Max Scaling (0-100): Transforms data to a specified range while preserving relative relationships
- Formula: scaled = (value - min) / (max - min) * 100
- Preserves original distribution shape
- Sensitive to outliers but maintains interpretability

Z-Score Standardization: Expresses each value as its distance from the mean in standard deviations
- Formula: z = (value - mean) / standard_deviation
- Useful when data follows normal distribution
- Less intuitive for mixed audiences

Percentile Ranking: Converts each value to its rank within the distribution
- Robust to outliers and skewed distributions
- Natural 0-100 scale interpretation
- May lose information about magnitude differences

Why We Use Min-Max Scaling Here:
- Maintains intuitive 0-100 interpretation
- Preserves temporal patterns essential for flash drought analysis
- Allows direct visual comparison across variables with different units
- Facilitates identification of synchronous patterns and lag relationships

Alternative Approaches: Could use separate y-axes, but this obscures cross-variable relationships critical for understanding coupled Earth system processes.
Attempting to plot these variables together on their native scales would render the visualization meaningless - the ET line would be barely visible compared to the soil moisture and agricultural data, making it impossible to discern the temporal relationships and feedback mechanisms that define flash drought development. To solve this problem, we need to standardize all three variables to a common 0-100 scale that preserves their relative patterns while enabling direct visual comparison. This standardization process will transform our data while maintaining the essential information about timing, intensity, and inter-variable relationships needed to understand how meteorological extremes cascaded through the Earth system to create agricultural impacts.
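The min-max formula can be sanity-checked on a toy vector before we apply it to the ET data. A minimal sketch (the numbers are arbitrary) contrasting it with z-scores:

```r
# Toy vector spanning an arbitrary range
x <- c(2, 4, 6, 10)

# Min-max scaling to 0-100: the endpoints map exactly to 0 and 100
minmax <- 100 * (x - min(x)) / (max(x) - min(x))
minmax
# 0 25 50 100

# Contrast with z-scores, which are centered on 0 with unit spread
# and so do not land on a fixed 0-100 range
z <- (x - mean(x)) / sd(x)
z
```

Note how min-max keeps the relative spacing of the values intact, which is exactly the property we rely on when comparing temporal patterns across variables.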
# Soil moisture percentiles are already 0-100, so no transformation needed
soil_moisture_standardized <- soil_moisture_weekly |>
  dplyr::mutate(
    standardized_value = soil_moisture_percentile,
    variable_type = "Soil Moisture (%)"
  ) |>
  dplyr::select(state, date, standardized_value, variable_type)
Now the same for the ET data.
# Standardize ET to 0-100 scale using min-max normalization
et_min <- min(et_weekly$et_mm_day, na.rm = TRUE)
et_max <- max(et_weekly$et_mm_day, na.rm = TRUE)

et_standardized <- et_weekly |>
  dplyr::mutate(
    standardized_value = 100 * (et_mm_day - et_min) / (et_max - et_min),
    variable_type = "ET Demand (standardized)"
  ) |>
  dplyr::select(state, date, standardized_value, variable_type)

paste("ET standardization range:", round(et_min, 2), "to", round(et_max, 2), "mm/day")
[1] "ET standardization range: 1.77 to 7.17 mm/day"
Agricultural stress is already on a 0-100 scale.
# Agricultural stress is already in percentage (0-100), so no transformation needed
agricultural_standardized <- agricultural_drought_stress |>
  dplyr::mutate(
    standardized_value = poor_very_poor_percent,
    variable_type = "Agricultural Stress (%)"
  ) |>
  dplyr::select(state, date, standardized_value, variable_type)
Now combine them together.
# Combine all standardized variables into single dataset
integrated_data <- dplyr::bind_rows(
  soil_moisture_standardized,
  et_standardized,
  agricultural_standardized
)
Color Palette for Multi-Variable Plot
We’ll create a three-color palette that complements our existing NIDIS drought colors while maintaining good visual distinction between the three variables.
# Define color palette for the three key variables
integrated_colors <- c(
  "Soil Moisture (%)"        = "#4169E1",  # Royal blue (cool, water-related)
  "ET Demand (standardized)" = "#FF8C00",  # Dark orange (warm, energy-related)
  "Agricultural Stress (%)"  = "#228B22"   # Forest green (agriculture-related)
)
Creating the Integrated Time Series Plot
Now we’ll create a multi-panel time series plot with one panel per state, arranged in a 2-column by 3-row layout, showing all three variables on standardized scales to reveal the temporal relationships between meteorological drivers, soil moisture response, and agricultural impacts.
# Create the faceted plot
integrated_timeseries_plot <- ggplot2::ggplot(integrated_data,
                                              ggplot2::aes(x = date,
                                                           y = standardized_value,
                                                           color = variable_type)) +
  ggplot2::geom_line(size = 1.2, alpha = 0.8) +
  ggplot2::geom_point(size = 1.5, alpha = 0.7) +
  ggplot2::facet_wrap(~ state, ncol = 2, scales = "fixed") +
  ggplot2::scale_color_manual(
    name = "Component:",
    values = integrated_colors
  ) +
  ggplot2::scale_x_date(
    name = "Date",
    date_breaks = "1 month",
    date_labels = "%b %d",
    expand = c(0.02, 0)
  ) +
  ggplot2::scale_y_continuous(
    name = "Standardized Values (0-100 scale)",
    limits = c(0, 100),
    breaks = c(0, 25, 50, 75, 100),
    expand = c(0.02, 0)
  ) +
  ggplot2::labs(
    title = "Integrated Drought Analysis: Drivers, Indicators, and Outcomes",
    subtitle = "Standardized time series for soil moisture, atmospheric demand, and agricultural stress",
    caption = "Sources: NASA SPORT-LIS (soil moisture), NOAA URMA (meteorology), USDA NASS (agriculture)\nNote: All variables standardized to 0-100 scale for comparison"
  ) +
  ggplot2::theme_minimal() +
  ggplot2::theme(
    panel.spacing = ggplot2::unit(0.8, "lines"),
    panel.grid.minor = ggplot2::element_blank(),
    panel.grid.major.x = ggplot2::element_blank(),
    axis.text.x = ggplot2::element_text(angle = 45, hjust = 1, size = 10),
    axis.text.y = ggplot2::element_text(size = 10),
    axis.title = ggplot2::element_text(size = 12, face = "bold"),
    strip.text = ggplot2::element_text(size = 12, face = "bold"),
    strip.background = ggplot2::element_rect(fill = "grey95", color = "grey80"),
    legend.position = "bottom",
    legend.direction = "horizontal",
    legend.title = ggplot2::element_text(size = 11, face = "bold"),
    legend.text = ggplot2::element_text(size = 10),
    plot.title = ggplot2::element_text(size = 14, face = "bold"),
    plot.subtitle = ggplot2::element_text(size = 12, color = "grey30"),
    plot.caption = ggplot2::element_text(size = 9, color = "grey50", hjust = 0)
  )
integrated_timeseries_plot
This integrated visualization reveals the precise mechanisms through which the 2022 flash drought transformed atmospheric extremes into agricultural catastrophe across the Southern Plains. Texas and Oklahoma show the classic flash drought signature: soil moisture percentiles (blue line) plummeting from moderate levels to near-zero by mid-July, while ET demand (orange line) simultaneously spiked to maximum levels, creating a devastating combination that triggered immediate agricultural stress (green line) that persisted through the entire analysis period. Louisiana demonstrates resilience, with soil moisture remaining relatively stable and agricultural stress staying minimal despite elevated ET demand. Missouri shows an intermediate pattern where early-season agricultural stress developed before the major soil moisture decline, suggesting complex regional factors beyond simple soil-atmosphere coupling.
The most striking pattern emerges in Kansas, Oklahoma, and Arkansas, where agricultural systems appear to have crossed a critical threshold during the peak ET demand period (July-August) from which they never recovered. In these states, even when soil moisture percentiles began to improve in late summer and fall, agricultural stress remained elevated or continued to worsen, suggesting that pasture systems had entered a state of irreversible deterioration. This pattern reveals a fundamental characteristic of flash drought impacts: once vegetation crosses certain physiological thresholds under extreme atmospheric demand, recovery becomes limited even when soil moisture conditions improve. The sustained agricultural stress despite recovering soil moisture demonstrates why flash drought can have such long-lasting economic impacts - the rapid initial deterioration can damage agricultural systems in ways that persist long after the meteorological extremes subside, explaining why the 2022 event continued to affect ranching operations well into 2023.
Quantifying Flash Drought Relationships
Our integrated visualization clearly demonstrates how soil moisture depletion, atmospheric water demand, and agricultural stress evolved together during the 2022 flash drought, revealing the threshold behaviors and temporal patterns that distinguish flash drought from conventional drought development. But visual interpretation only takes us so far - to truly understand the strength and consistency of these relationships, we need quantitative analysis that can measure precisely how these variables interact across different states and time periods.
From Visualization to Quantification:
While operational flash drought research employs sophisticated methods like machine learning algorithms, time series modeling, and threshold analysis to develop early warning systems, we’ll use correlation analysis - a foundational statistical approach that provides clear, interpretable measures of how strongly our variables relate to each other. Correlation coefficients range from -1 to +1, where values near -1 indicate strong negative relationships (as one variable increases, the other decreases), values near +1 indicate strong positive relationships (both variables increase together), and values near 0 suggest little linear relationship.
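These interpretations are easy to verify with R's cor() on toy series. A minimal sketch with invented numbers, mimicking the near-perfect negative coupling we expect between soil moisture and agricultural stress:

```r
# Declining soil moisture percentiles vs rising stress percentages (toy values)
soil_moisture <- c(80, 60, 40, 20, 10)
ag_stress     <- c(5, 20, 50, 80, 95)

cor(soil_moisture, ag_stress)  # close to -1: strong negative relationship

# A perfectly linear inverse relationship reaches the -1 bound
cor(soil_moisture, -2 * soil_moisture + 3)
```

Because correlation is scale-invariant, it gives the same answer whether we feed it raw or min-max-standardized values, which is why we can run it directly on our standardized integrated_data.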
Expected Flash Drought Relationships:
Based on our understanding of flash drought processes, we anticipate seeing negative correlations between soil moisture and agricultural stress (when soil moisture drops, agricultural stress increases), positive correlations between atmospheric demand and agricultural stress (higher ET demand drives more agricultural stress), and negative correlations between soil moisture and ET demand (high atmospheric demand depletes soil moisture). These correlation patterns will help us quantify the coupling strength between atmospheric drivers and land surface responses that makes flash drought such a devastating phenomenon for agricultural systems.
# Calculate correlations between the three flash drought variables by state
correlation_data <- integrated_data |>
  dplyr::select(state, date, variable_type, standardized_value) |>
  tidyr::pivot_wider(names_from = variable_type, values_from = standardized_value)

state_correlations <- correlation_data |>
  dplyr::group_by(state) |>
  dplyr::summarise(
    soil_vs_agricultural = cor(`Soil Moisture (%)`, `Agricultural Stress (%)`, use = "complete.obs"),
    et_vs_agricultural = cor(`ET Demand (standardized)`, `Agricultural Stress (%)`, use = "complete.obs"),
    soil_vs_et = cor(`Soil Moisture (%)`, `ET Demand (standardized)`, use = "complete.obs"),
    .groups = "drop"
  )

# Create formatted HTML table for state correlations
state_correlations |>
  knitr::kable(
    digits = 3,
    caption = "Flash Drought Variable Correlations by State",
    col.names = c("State", "Soil Moist. vs\nAg. Stress", "ET Demand vs\nAg. Stress", "Soil Moist. vs\nET Demand")
  ) |>
  kableExtra::kable_styling(bootstrap_options = c("striped", "hover", "condensed")) |>
  kableExtra::add_header_above(c(" " = 1, "Flash Drought Relationships" = 3)) |>
  kableExtra::column_spec(1, bold = TRUE) |>
  kableExtra::column_spec(2, color = ifelse(state_correlations$soil_vs_agricultural < -0.6, "red", "black")) |>
  kableExtra::column_spec(3, color = ifelse(state_correlations$et_vs_agricultural > 0.6, "blue", "black")) |>
  kableExtra::column_spec(4, color = ifelse(state_correlations$soil_vs_et < -0.6, "green", "black"))
| State | Soil Moist. vs Ag. Stress | ET Demand vs Ag. Stress | Soil Moist. vs ET Demand |
|---|---|---|---|
| Arkansas | -0.788 | -0.289 | 0.035 |
| Kansas | -0.866 | -0.627 | 0.479 |
| Louisiana | -0.350 | -0.554 | -0.207 |
| Missouri | -0.712 | -0.639 | 0.327 |
| Oklahoma | -0.401 | -0.601 | 0.047 |
| Texas | -0.786 | 0.731 | -0.661 |
These correlation results reveal fascinating differences in how flash drought processes operated across our six-state study region during 2022. Kansas shows the strongest soil moisture-agricultural stress relationship (-0.866), indicating that pasture conditions were tightly coupled to soil moisture availability - when soil moisture dropped, agricultural stress increased almost predictably. Arkansas (-0.788) and Texas (-0.786) also show strong negative relationships, confirming that soil moisture depletion directly drove agricultural deterioration in these states.
The atmospheric demand relationships tell a more complex story. Texas stands out with an unexpectedly strong positive correlation (0.731) between ET demand and agricultural stress, suggesting that atmospheric “thirst” directly translated into pasture stress. However, most other states show weaker or even negative relationships between ET and agricultural stress, indicating that soil moisture availability was more important than atmospheric demand for determining agricultural outcomes. Louisiana shows the weakest correlations across all relationships (-0.350, -0.554, -0.207), confirming our earlier observation that this state remained relatively resilient to flash drought impacts. Perhaps most intriguingly, Oklahoma shows a weak soil moisture-agricultural stress correlation (-0.401) despite experiencing severe drought conditions, suggesting that other factors beyond simple soil moisture availability - such as soil type, vegetation characteristics, or management practices - may have influenced how agricultural systems responded to the 2022 flash drought in this state.
Conclusion
This lesson provided a comprehensive exploration of the 2022 Southern Plains flash drought through an integrated multi-dataset analysis that connected atmospheric forcing to agricultural outcomes. By systematically examining NASA SPORT-LIS soil moisture percentiles, NOAA URMA meteorological data, calculated evapotranspiration demand, and USDA agricultural surveys, we traced the complete pathway from extreme atmospheric conditions to devastating agricultural impacts across six states. The analysis revealed how flash drought develops as a coupled Earth system phenomenon where atmospheric extremes, land surface responses, and agricultural systems interact through complex feedback mechanisms that can rapidly transform normal conditions into agricultural catastrophe.
Our integrated approach demonstrated the power of combining operational monitoring datasets to understand extreme events. The correlation analysis confirmed that soil moisture depletion was the primary driver of agricultural stress in most states, while revealing important regional differences in how flash drought processes operated across different landscapes and agricultural systems. The temporal analysis showed how rapidly these relationships can develop - with agricultural stress appearing within weeks of soil moisture decline and atmospheric demand spikes - highlighting why flash drought poses such significant challenges for agricultural communities who have little time to implement adaptive strategies.
Perhaps most importantly, this lesson illustrated how open science and freely accessible datasets enable comprehensive analysis of complex environmental phenomena. The NASA, NOAA, and USDA data portals we accessed represent massive investments in Earth observation infrastructure that democratize access to the information needed for understanding and responding to extreme events. By learning to work with these operational datasets, you’ve gained skills directly applicable to careers in agricultural consulting, drought monitoring, emergency management, and environmental research - while developing the quantitative and data visualization capabilities needed to communicate complex Earth system science to diverse audiences.
Key Learning Points
Congratulations! In this lesson you:
- Accessed and processed operational drought monitoring data from NASA SPORT-LIS, extracting daily soil moisture percentiles and creating time series analyses that revealed the rapid soil drying characteristic of flash drought development
- Retrieved and analyzed high-resolution meteorological data from NOAA URMA, including temperature, humidity, and wind variables that drive evapotranspiration and flash drought intensification
- Calculated reference evapotranspiration using the Hargreaves-Samani method to quantify atmospheric water demand during extreme conditions, demonstrating how meteorological variables combine to create “atmospheric thirst”
- Processed agricultural survey data from USDA NASS to examine real-world drought impacts on pasture and rangeland conditions, connecting physical drought indicators to measurable agricultural outcomes
- Created integrated visualizations combining multiple datasets with different scales and units to reveal temporal relationships between meteorological drivers, soil moisture depletion, and agricultural stress
- Applied data standardization techniques using min-max scaling to enable multi-variable comparisons while preserving essential temporal patterns and relative relationships
- Performed temporal and correlation analysis to quantify the strength of flash drought relationships across different states and identify regional variations in drought response mechanisms
- Interpreted complex multi-panel visualizations including strip plots, stacked bar charts, and time series that demonstrate the progression from atmospheric extremes to agricultural impacts
- Connected Earth system science to real-world applications by demonstrating how operational monitoring systems track flash drought development and support agricultural decision-making
- Developed professional data management practices including API authentication, environmental variable security, and reproducible analytical workflows essential for working with operational datasets