# install blsR (run once)
# install.packages("blsR")
# import relevant libraries
library(tidyverse)
library(blsR)
library(here)
Downloading BLS data via blsR
Loading blsR and setting API key
blsR is an R package that gives users access to Bureau of Labor Statistics (BLS) data from within R, including the Current Employment Statistics (CES), worker characteristics data from the Current Population Survey (CPS), inflation and prices (CPI), and the Job Openings and Labor Turnover Survey (JOLTS), among others.
BLS provides this data through an Application Programming Interface (API), which allows R to import data directly from BLS using series IDs, which BLS documents extensively. This automates what would otherwise be manual retrieval through the BLS series report tool and returns the data in tidy form.
The code chunk at the top of this page loads the R libraries necessary for this exercise. You may need to install them before you can run this code.
To access BLS data you must register for a unique API key. Save your key as an environment variable so it can be read into R and reused later.
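# set key for BLS API
# note: each user must register for a unique BLS API key here: https://www.bls.gov/developers/home.htm
# Sys.getenv() takes the NAME of an environment variable, not the key itself;
# "BLS_API_KEY" is an assumed name, e.g. set in ~/.Renviron as BLS_API_KEY=your-key-goes-here
bls_key <- Sys.getenv("BLS_API_KEY")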
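A minimal sketch of how the environment-variable lookup behaves, since `Sys.getenv()` looks a variable up by name and returns an empty string when that name is not set. The variable name `BLS_API_KEY`, the object `key_check`, and the placeholder value are assumptions for illustration; in practice you would more likely put the line `BLS_API_KEY=...` in your `~/.Renviron` file than call `Sys.setenv()` in a script.

```r
# "BLS_API_KEY" is an assumed variable name; the value is a placeholder, not a real key
Sys.setenv(BLS_API_KEY = "your-key-goes-here")

# look the key up by the variable's name
key_check <- Sys.getenv("BLS_API_KEY")

# TRUE when the variable is set to a non-empty value
key_is_set <- nzchar(key_check)

# an unset variable comes back as an empty string rather than an error
missing_key <- Sys.getenv("NOT_A_REAL_VARIABLE_12345")
```

Checking `nzchar()` before calling the API is a cheap way to catch a missing key early, rather than discovering it through a failed request.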
Nominal average hourly earnings of production and non-supervisory employees
You can import a single table using the blsR::get_n_series_table() function:
## use blsR to pull in nominal wages
nominal_wages <- get_n_series_table(series_ids = "CEU0500000008", api_key = bls_key,
                                    start_year = 1965, end_year = 2023,
                                    tidy = TRUE, annualaverage = TRUE) %>%
  # filter for annual data (month 13 holds the annual average)
  filter(month == 13)
Real average hourly earnings of production and non-supervisory employees
You can also import multiple series at once. For example, we can import average hourly earnings (AHE) and the Consumer Price Index (CPI) together in order to calculate real wages.
# set series codes: CPI (CUUR0000SA0) and average hourly earnings (CEU0500000008)
cpi_codes <- c("CUUR0000SA0",
               "CEU0500000008")
# set cpi base
# note: used to set base year for inflation-adjustment
cpi_base <- get_n_series_table(series_ids = "CUUR0000SA0", start_year = 2023, end_year = 2023,
                               api_key = bls_key, tidy = TRUE, annualaverage = TRUE) %>%
  # filter annual 2023 data and pull CPI value as base
  filter(month == 13) %>%
  pull(CUUR0000SA0)
# use blsR to pull CPI and AHE data
cpi_output <- get_n_series_table(series_ids = cpi_codes, start_year = 1947, end_year = 2023,
                                 api_key = bls_key, tidy = TRUE, annualaverage = TRUE) %>%
  # filter annual data
  filter(month == 13) %>%
  # rename for easier handling
  rename(ahe = CEU0500000008, cpi = CUUR0000SA0) %>%
  # calculate real wages in base-year (2023) dollars
  mutate(ahe_real = ahe * (cpi_base / cpi)) %>%
  # export to delimited file
  # note: write_csv() returns its input, so cpi_output keeps the data (write.csv() would leave it NULL)
  write_csv(here("ahe_real.csv"))
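To make the inflation adjustment concrete, here is a small worked sketch of the same formula, `ahe * (cpi_base / cpi)`, on made-up numbers. The data frame `toy` and its values are hypothetical and are not actual BLS figures; base R is used so the arithmetic is easy to follow.

```r
# made-up numbers for illustration only (not actual BLS values)
toy <- data.frame(
  year = c(2000, 2023),
  ahe  = c(10, 20),    # nominal hourly earnings
  cpi  = c(150, 300)   # price index
)

# base-year CPI: the 2023 value, so results are expressed in 2023 dollars
toy_base <- toy$cpi[toy$year == 2023]

# same formula as above: scale each year's wage by the ratio of base CPI to that year's CPI
toy$ahe_real <- toy$ahe * (toy_base / toy$cpi)

toy
```

In this toy case prices and nominal wages both double between the two years, so the real wage is the same in both rows. In the base year the ratio `cpi_base / cpi` equals 1, which is why the real and nominal values match there.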
Pulling more than 50 series
The BLS API limits each call to 50 series, which is restrictive when you are trying to pull in large datasets. Here is a trick I use in the code that runs our Jobs and Unemployment page to work around this limitation: split the series IDs into groups of at most 50, call the API once per group, and join the results.
# read in bls series codes
# note: this is a random selection of codes used in our Jobs and Unemployment figures
bls_codes <- read.csv(here("bls_codes.csv"))
# remove any blanks
bls_codes <- bls_codes$series_id[bls_codes$series_id != ""]
# use map to iteratively call the BLS API via blsR at the max number of series ids
# note: BLS restricts to max 50 series in a single call
jobs_day_df <- map(split(bls_codes, ceiling(seq_along(bls_codes) / 50)), # split codes into groups of 50
                   # call blsR using series ids sliced into groups of 50
                   ~ get_n_series_table(series_ids = .x, start_year = 1939, end_year = 2023,
                                        api_key = bls_key, tidy = TRUE)) %>%
  # map returns list, flatten by joining data
  reduce(function(df1, df2) full_join(df1, df2, by = c("year", "month"))) %>%
  # define date as the first day of each month
  # note: as.Date() on the string avoids the time-zone shift an as.POSIXct() round trip can introduce
  mutate(date = as.Date(paste(year, month, 1, sep = "-"))) %>%
  write_csv(here("jobs_day_example.csv"))
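The grouping expression `ceiling(seq_along(x) / 50)` is the heart of the workaround, so here is a sketch of it running without any API calls. The series IDs are made-up placeholders, and `fake_fetch()` is a hypothetical stand-in for `get_n_series_table()`. Base R's `lapply()`, `Reduce()`, and `merge()` are swapped in for `map()`, `reduce()`, and `full_join()` so the sketch runs without extra packages.

```r
# 120 made-up series IDs (placeholders, not real BLS codes)
ids <- sprintf("FAKE%04d", 1:120)

# same grouping expression as above: positions 1-50 -> group 1, 51-100 -> group 2, ...
chunks <- split(ids, ceiling(seq_along(ids) / 50))

# number of IDs in each group; none may exceed the 50-series limit
chunk_sizes <- lengths(chunks)

# hypothetical stand-in for get_n_series_table(): one row per month, one column per series ID
fake_fetch <- function(chunk) {
  out <- data.frame(year = 2023, month = 1:2)
  for (id in chunk) out[[id]] <- seq_len(nrow(out))
  out
}

# one "call" per group, then join all results on year and month
tables <- lapply(chunks, fake_fetch)
combined <- Reduce(
  function(df1, df2) merge(df1, df2, by = c("year", "month"), all = TRUE),
  tables
)

dim(combined)
```

Joining on `year` and `month` with `all = TRUE` (the base R counterpart of a full join) keeps every period that appears in any group, which matters when series start in different years.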