This vignette explores advanced uses of the npi package.
npi is an R package that gives R users access to the U.S. National Provider
Identifier (NPI) Registry API, maintained by the Centers for Medicare &
Medicaid Services (CMS). The package makes it easy to obtain
administrative data linked to a specific individual or organizational
healthcare provider. Users can also perform advanced searches
based on provider name, location, type of service, credentials, and many
other attributes.
See the npi::npi vignette for an introduction to the package.
Note on NPI Downloadable Files
CMS regularly releases full NPI data files for download. We
recommend downloading the data file if you need to work with
the entire dataset, or with more records than the API can return:
the API, and therefore npi_search(), returns a maximum of 1,200 records
per query. Data dissemination files
are zipped and will exceed 4GB upon decompression.
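One way to notice when a query has run into the record ceiling is to compare the number of rows returned with the cap. The helper below is a minimal sketch: `hit_record_cap()` is a hypothetical name, not part of npi, and it assumes the search result has one row per provider record.

```r
# Hypothetical helper (not part of npi): TRUE when a search result
# contains at least as many rows as the API's 1,200-record ceiling.
hit_record_cap <- function(results, cap = 1200L) {
  nrow(results) >= cap
}
```

For instance, after `res <- npi_search(city = "Atlanta", limit = 1200)`, a `TRUE` from `hit_record_cap(res)` suggests the query matched more providers than the API can return, in which case the downloadable file is likely the better source. The `city` value is only an example.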
Run npi_search() on multiple search terms
npi_search() searches on a defined set of query parameters. The function
is not designed to search on multiple values of the same argument at
once, for example multiple NPI numbers in a single function call.
However, users can still run searches serially over multiple values of a
single query parameter by using npi in combination with the purrr
package. In the example below, we search multiple NPI numbers. The
purrr::map() function applies npi_search() to each element of the
vector, and dplyr::bind_rows() then combines the resulting list of data
frames into a single tibble containing the matching records.
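library(npi)
library(dplyr) # provides %>% and bind_rows()

npis <- c(1992708929, 1831192848, 1699778688, 1111111111) # Last element doesn't exist
out <- npis %>%
  purrr::map(~ npi_search(number = .x)) %>% # one search per NPI number
  dplyr::bind_rows()                        # stack the results into one tibble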
#> 10 records requested
#> Requesting records 0-10...
#> 10 records requested
#> Requesting records 0-10...
#> 10 records requested
#> Requesting records 0-10...
#> 10 records requested
#> Requesting records 0-10...
npi_summarize(out)
#> # A tibble: 2 × 6
#> npi name enumeration_type primary_practice_add…¹ phone primary_taxonomy
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 19927089… NOVA… Organization 3200 DOWNWOOD CIR NW … 404-… Orthopaedic Sur…
#> 2 18311928… MATT… Individual 3672 MARATHON CIRCLE … 770-… Clinic/Center, …
#> # ℹ abbreviated name: ¹primary_practice_address
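When many values are searched in sequence, a single failed request (for example, because of a network problem) would stop the whole pipeline. The sketch below shows one way to guard against that with `purrr::possibly()`. The helper name `search_npis_safely()` and its `search_fn` argument are hypothetical and not part of npi; `search_fn` exists only so the helper can be tried offline with a stand-in function.

```r
library(purrr)
library(dplyr)

# Hypothetical helper: search several NPI numbers and skip any request
# that throws an error. By default it calls npi::npi_search().
search_npis_safely <- function(npis, search_fn = npi::npi_search) {
  # possibly() returns `otherwise` (here NULL) instead of stopping
  # when the wrapped function fails
  safe_search <- possibly(search_fn, otherwise = NULL)

  npis %>%
    map(~ safe_search(number = .x)) %>%
    bind_rows() # NULL elements are ignored when the list is combined
}
```

Calling `search_npis_safely(npis)` should then return the same kind of tibble as the pipeline above, minus any searches that failed. Note that failures are dropped silently in this sketch, so it may be worth comparing the NPIs returned against the NPIs requested.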
Here we apply the same approach to search multiple ZIP codes in Los Angeles County.
codes <- c(90210, 90211, 90212)
zip_3 <- codes %>%
  purrr::map(~ npi_search(postal_code = .x)) %>%
  dplyr::bind_rows()
#> 10 records requested
#> Requesting records 0-10...
#> 10 records requested
#> Requesting records 0-10...
#> 10 records requested
#> Requesting records 0-10...
npi_flatten(zip_3)
#> # A tibble: 104 × 47
#> npi basic_organization_n…¹ basic_organizational…² basic_enumeration_date
#> <chr> <chr> <chr> <chr>
#> 1 1073703… A R T REPRODUCTIVE CE… NO 2007-07-25
#> 2 1073703… A R T REPRODUCTIVE CE… NO 2007-07-25
#> 3 1154507… AARON M. PERLMUTTER, … NO 2008-01-11
#> 4 1154507… AARON M. PERLMUTTER, … NO 2008-01-11
#> 5 1154507… AARON M. PERLMUTTER, … NO 2008-01-11
#> 6 1154507… AARON M. PERLMUTTER, … NO 2008-01-11
#> 7 1154507… AARON M. PERLMUTTER, … NO 2008-01-11
#> 8 1154507… AARON M. PERLMUTTER, … NO 2008-01-11
#> 9 1154507… AARON M. PERLMUTTER, … NO 2008-01-11
#> 10 1154507… AARON M. PERLMUTTER, … NO 2008-01-11
#> # ℹ 94 more rows
#> # ℹ abbreviated names: ¹basic_organization_name, ²basic_organizational_subpart
#> # ℹ 43 more variables: basic_last_updated <chr>, basic_status <chr>,
#> # basic_authorized_official_first_name <chr>,
#> # basic_authorized_official_last_name <chr>,
#> # basic_authorized_official_middle_name <chr>,
#> # basic_authorized_official_telephone_number <chr>, …
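The ZIP codes above are stored as numbers, which works for these values but would drop the leading zero from codes such as 02139. A minimal sketch of restoring the five-digit form as character strings follows; the second code is only an illustration, and passing character values to `postal_code` is assumed to be supported by `npi_search()`.

```r
# Numeric ZIP codes lose leading zeros: 02139 is stored as 2139.
codes_numeric <- c(90210, 2139)

# Pad back to five digits and keep the result as character strings.
codes <- sprintf("%05d", as.integer(codes_numeric))
codes
#> [1] "90210" "02139"
```

The resulting character vector can be passed through the same purrr::map() pipeline in place of the numeric vector.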
Consult the R for Data Science chapter on iteration to
learn more about using the purrr package.
If you are unfamiliar with the tidyverse approach, you can use a simple for loop instead.
npis <- c(1992708929, 1831192848, 1699778688, 1111111111) # Last element doesn't exist
combined_df <- data.frame()
for (i in npis) {
  # Append the records found for this NPI number to the running result
  combined_df <- rbind(combined_df, npi_search(number = i))
}
#> 10 records requested
#> Requesting records 0-10...
#> 10 records requested
#> Requesting records 0-10...
#> 10 records requested
#> Requesting records 0-10...
#> 10 records requested
#> Requesting records 0-10...
npi_summarize(combined_df)
#> # A tibble: 2 × 6
#> npi name enumeration_type primary_practice_add…¹ phone primary_taxonomy
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 19927089… NOVA… Organization 3200 DOWNWOOD CIR NW … 404-… Orthopaedic Sur…
#> 2 18311928… MATT… Individual 3672 MARATHON CIRCLE … 770-… Clinic/Center, …
#> # ℹ abbreviated name: ¹primary_practice_address
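Growing a data frame inside a loop copies it on every iteration, which can become slow for long vectors of search values. A base R variation, sketched below, collects the results in a list and binds them once at the end. The helper name `combine_searches()` and its `search_fn` argument are hypothetical and not part of npi; `search_fn` defaults to `npi::npi_search()` and can be replaced by a stand-in function for offline testing.

```r
# Hypothetical helper: run one search per NPI number, then bind once.
combine_searches <- function(npis, search_fn = npi::npi_search) {
  # lapply() returns a list with one search result per NPI number
  results <- lapply(npis, function(n) search_fn(number = n))

  # A single rbind() call over the whole list avoids repeated copying
  do.call(rbind, results)
}
```

With the vector defined above, `combine_searches(npis)` is expected to give the same records as the for loop.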