
This vignette previews how text, lists, and code blocks render under the epitemplate pkgdown theme.

Text

The Economic Policy Institute’s CPS extracts contain microdata from the Current Population Survey, including variables on earnings, hours, demographics, and union status. These extracts are maintained by EPI and distributed through the epiextractr R package.

Researchers can load the ORG, Basic Monthly, March, and May extracts. By default every column is loaded; selecting only the variables you need speeds up loading and reduces memory use.

Bullet lists

Key variables in the CPS ORG extracts:

  • year and month — survey reference period
  • wage — hourly wage (set to hourly earnings for hourly workers, usual weekly earnings divided by usual hours for non-hourly workers)
  • orgwgt — CPS ORG supplement weight
  • female — indicator for gender
  • wbhao — race/ethnicity (white, Black, Hispanic, Asian, other)
  • union — union membership status
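
The wage definition in the list above can be illustrated on a toy data frame. This is only a minimal sketch: the columns paid_hourly, hourly_rate, weekly_earn, and usual_hours are hypothetical stand-ins rather than variables from the EPI extracts, and the actual extracts may apply further adjustments that are not described here.

```r
# Toy records; all column names and values are made up for illustration.
workers <- data.frame(
  paid_hourly = c(TRUE, FALSE, FALSE),
  hourly_rate = c(18.50, NA, NA),
  weekly_earn = c(NA, 1200, 900),
  usual_hours = c(NA, 40, 30)
)

# Hourly workers keep their reported hourly rate; everyone else gets
# usual weekly earnings divided by usual weekly hours.
workers$wage <- ifelse(
  workers$paid_hourly,
  workers$hourly_rate,
  workers$weekly_earn / workers$usual_hours
)

workers$wage
```

Both non-hourly workers end up with the same hourly wage here even though their weekly earnings differ, because their usual hours differ too.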

Steps for a typical analysis:

  1. Download the extracts with download_cps()
  2. Load the data with load_org() or load_cps()
  3. Filter to the relevant sample
  4. Calculate weighted statistics using the survey weights

Nested list

  • CPS supplements
    • ORG (Outgoing Rotation Group)
    • Basic Monthly
    • March (ASEC)
    • May
  • Common analyses
    • Wage percentiles
    • Employment-to-population ratios
    • Union wage premiums

Numbered list with detail

  1. Load data. Use load_org() to read the ORG supplement for selected years. Specify only the variables you need to keep memory use low.

  2. Filter the sample. Drop observations with zero or missing weights. Restrict to wage earners if estimating the wage distribution.

  3. Calculate estimates. Use weighted.mean() or Hmisc::wtd.quantile() with the appropriate survey weight. Group by year, demographics, or other characteristics as needed.
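
The filter and estimate steps above can be sketched in base R on a toy data frame, with no extracts downloaded. The object names toy_org and sample_org and all values are invented for illustration; only the column names year, orgwgt, and wage follow the variable list earlier in this vignette.

```r
# Toy ORG-like records; values are invented for illustration.
toy_org <- data.frame(
  year   = c(2023, 2023, 2023, 2024, 2024, 2024),
  orgwgt = c(100, 300, 0, 200, 200, NA),
  wage   = c(20, 40, 99, 10, 30, 50)
)

# Step 2: drop zero or missing weights and missing wages.
keep <- !is.na(toy_org$orgwgt) & toy_org$orgwgt > 0 & !is.na(toy_org$wage)
sample_org <- toy_org[keep, ]

# Step 3: weighted mean wage within each year.
mean_wage <- sapply(
  split(sample_org, sample_org$year),
  function(d) weighted.mean(d$wage, w = d$orgwgt)
)

mean_wage
```

The zero-weight and missing-weight rows are removed before estimation, so the 99 and 50 wage values do not affect either year's mean.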

Code

Load a selection of years and variables from the CPS ORG extracts:

library(epiextractr)
library(dplyr)

cps_org <- load_org(2019:2024, year, month, orgwgt, female, wbhao, wage)

Calculate median wages by year:

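# Keep positive weights and non-missing wages, then take the
# weighted median within each year (.by requires dplyr 1.1.0 or later)
cps_org |>
  filter(orgwgt > 0, !is.na(wage)) |>
  summarize(
    median_wage = Hmisc::wtd.quantile(wage, orgwgt, probs = 0.5),
    .by = year
  )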
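
To see what weighting does to a median, here is a small base R sketch. The helper weighted_percentile() is hypothetical and written only for this illustration. It returns the smallest value at which the cumulative share of weight reaches the requested level, whereas Hmisc::wtd.quantile() may interpolate between observations, so the two can give different results on real data.

```r
# Hypothetical helper for illustration; not part of any package.
# Returns the smallest x at which the cumulative weight share reaches p.
weighted_percentile <- function(x, w, p = 0.5) {
  ord <- order(x)
  x <- x[ord]
  w <- w[ord]
  cum_share <- cumsum(w) / sum(w)
  x[which(cum_share >= p)[1]]
}

# Invented values: the highest wage carries most of the weight.
wages   <- c(10, 20, 30, 40)
weights <- c(1, 1, 1, 5)

median(wages)                        # ignores the weights
weighted_percentile(wages, weights)  # pulled toward the heavily weighted value
```

With these toy values the unweighted median is 25, while the weighted version returns 40 because that single observation accounts for more than half of the total weight.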

Calculate employment-to-population ratios by race/ethnicity using Basic Monthly data:

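load_cps("basic", 2010:2024, year, basicwgt, wbhao, emp) |>
  # weighted.mean() returns NA if any emp value is missing,
  # so drop missing emp along with non-positive weights
  filter(basicwgt > 0, !is.na(emp)) |>
  summarize(
    epop = weighted.mean(emp, w = basicwgt),
    .by = c(year, wbhao)
  )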
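
The weighted mean of an indicator is the same quantity as the employed share of total weight. The sketch below checks this on invented records, assuming emp is coded 1 for employed and 0 otherwise; that coding is an assumption for this example and is not confirmed here for the extracts.

```r
# Toy person records; values are invented, and the 0/1 coding of emp
# is an assumption made for this illustration.
people <- data.frame(
  emp      = c(1, 1, 0, 0),
  basicwgt = c(300, 100, 500, 100)
)

# Weighted mean of the indicator ...
epop_mean <- weighted.mean(people$emp, w = people$basicwgt)

# ... equals employed weight divided by total weight.
epop_share <- sum(people$basicwgt[people$emp == 1]) / sum(people$basicwgt)

c(epop_mean, epop_share)
```

Both expressions give 0.4 for these values, since 400 of the 1,000 units of weight belong to employed records.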

Use load_org_sample() for a small demonstration dataset that does not require downloading the full extracts. See the epiextractr documentation for details.