#Load necessary libraries
library(tidyverse) #this package contains most common R functions
library(epiextractr) #allows you to get CPS (e.g., Basic, ORG) extracts
library(labelled) #allows you to easily clean data; includes functions like 'to_factor'
library(zoo) #contains functions like rollmean and rollsum
library(openxlsx2) #allows you to format and export data to an Excel workbookCalculating rolling averages
This script will teach you how to calculate simple rolling averages using employment-to-population ratios (EPOPs). Note: this code will be updated soon to account for missing October 2025 CPS data.
Load required libraries
The following chunk of code loads the R libraries necessary for this exercise. You may need to install them to run this code. If you haven’t yet set up your computer to run EPI CPS microdata extracts, complete that process before running this code.
Import and clean data
Note: Don’t forget to update years to match your setup before running the script.
Running this script chunk will call the BLS Current Population Survey Basic data required to calculate rolling average EPOPs. It will also create a variable that will allow you to calculate prime-age EPOPs and set up some date formatting for the rolling average calculation later.
# Import CPS ORG data
basic <- load_basic(2018:2024, "year", "month", "basicwgt", "age", "wbhao", "selfinc", "emp") |>
#Filter to working age population with employment information
filter(age >= 16, !is.na(emp)) |>
# Create prime-age variable
mutate(prime_age = case_when(between(age, 25, 54) ~ 1, TRUE ~ 0)) |>
# Create date in yyyy-mm-dd format
mutate(date = ymd(paste0(year,'-', month,'-1'))) Calculate monthly rolling averages for EPOPs
This code chunk calculates the (prime-age) EPOP by finding a weighted average of the binary variable emp (1 = employed, 0 = unemployed or not in the labor force), sets up the rolling average, and applies an (optional) data suppression threshold. The suppression threshold is likely not necessary for national data, but it can become important at the state level or with narrow demographic cuts. Though there is no hard and fast rule for how large a sample needs to be, smaller samples will produce more noisy, less reliable data.
epop_all <- basic |>
# Optional: filter to calculate prime-age EPOPS
filter(prime_age == 1) |>
# Calculate EPOP as the weighted mean of 'emp' variable
summarise(epop = weighted.mean(emp, w = basicwgt),
n=n(),
.by=date) |>
# Calculate rolling average of previous 12 months for both EPOP and sample size
mutate(all_epop12 = rollmean(epop, k = 12, align = "right", fill = NA),
all_sample12 = rollsum(n, k = 12, align = "right", fill = NA)) |>
select(date, all_epop12, all_sample12) |>
# Suppress data with sample size of < 250
mutate(all_epop12 = if_else(is.na(all_sample12) | all_sample12 < 250,
NA_real_, all_epop12))Calculate EPOPs by race/ethnicity and format output
This section adds code that breaks out EPOPs by race/ethnicity, using the wbhao variable. It also pivots the data so that each race/ethnicity group is its own column.
epop_race <- basic |>
summarise(epop = weighted.mean(emp, w = basicwgt),
n=n(),
.by=c(wbhao, date)) |>
mutate(race = to_factor(wbhao)) |>
# Ensure correct order within each racial/ethnic category
arrange(race, date) |>
# Compute rolling mean per 'race' group
group_by(race) |>
mutate(epop12 = rollmean(epop, k = 12, align = "right", fill = NA),
sample12 = rollsum(n, k = 12, align = "right", fill = NA)) |>
ungroup() |>
select(date, race, epop12, sample12) |>
# Pivot the data to display each race/ethnicity as its own column with appropriate label
pivot_wider(
names_from = race,
values_from = c(epop12, sample12),
names_glue = "{race}_{.value}"
)Export your data to a downloadable Excel
Use openxlsx2 to create a downloadable Excel. Don’t forget to edit the file path to ensure it saves to the right location.
# Export epop tables to one workbook
wb <- wb_workbook()
wb$
# Add worksheet
add_worksheet(sheet = "US EPOPs")$
add_data(x = epop_all)$
add_worksheet(sheet = "EPOPs by race")$
add_data(x = epop_race)
wb_save(wb, file = "output/epop_tables.xlsx", overwrite = TRUE)And that’s it! You can benchmark your data to the State of Working America Data Library before filtering to specific states or demographic cuts to ensure reliability. As always, be sure to keep an eye on those sample sizes!