
rfema: Getting Started
Dylan Turner
2025-08-09
Source: vignettes/getting_started.Rmd
Introduction
This vignette gives a brief overview of using the rfema package to obtain data from the OpenFEMA API. It covers how to install the package and then works through examples of using the package to obtain data for various objectives.
Installation
Right now, the best way to install and use the rfema package is to install it directly from rOpenSci using install.packages("rfema", repos = "https://ropensci.r-universe.dev").
The FEMA API does not require an API key, so no further setup is needed to start using the package.
Available Datasets
For those unfamiliar with the data sets available through the FEMA API, a good starting place is the FEMA API documentation page. However, if you are already familiar with the data and want to quickly reference the data set names or another piece of meta data, the fema_data_sets() function is a convenient option: it returns a tibble of the available data sets along with their associated meta data.
# store the available data sets as an object in your R environment that can be referenced later
data_sets <- fema_data_sets()
# view data
data_sets
## # A tibble: 46 × 35
## identifier name title description webService dataDictionary keyword modified
## <chr> <chr> <chr> <chr> <chr> <chr> <list> <chr>
## 1 openfema-… Publ… Publ… "This data… https://w… https://www.f… <list> 2024-10…
## 2 openfema-… Haza… Haza… "This data… https://w… https://www.f… <list> 2024-04…
## 3 openfema-… Miss… Miss… "A mission… https://w… https://www.f… <list> 2024-08…
## 4 openfema-… Ipaw… IPAW… "The Integ… https://w… https://www.f… <list> 2024-05…
## 5 openfema-… Fima… FIMA… "Congress … https://w… https://www.f… <list> 2023-03…
## 6 openfema-… Haza… Haza… "This data… https://w… https://www.f… <list> 2024-11…
## 7 openfema-… Fema… FEMA… "This data… https://w… https://www.f… <list> 2023-01…
## 8 openfema-1 Publ… Publ… "FEMA prov… https://w… https://www.f… <list> 2024-01…
## 9 openfema-… Indi… Indi… "Individua… https://w… https://www.f… <list> 2024-01…
## 10 openfema-… Fema… FEMA… "Provides … https://w… https://www.f… <list> 2023-09…
## # ℹ 36 more rows
## # ℹ 27 more variables: publisher <chr>, contactPoint <chr>, mbox <chr>,
## # accessLevel <chr>, landingPage <list>, temporal <list>, api <lgl>,
## # version <int>, bureauCode <chr>, programCode <chr>, license <list>,
## # theme <list>, dataQuality <chr>, accrualPeriodicity <chr>, language <chr>,
## # references <list>, issued <chr>, recordCount <list>, depDate <chr>,
## # depApiMessage <chr>, depWebMessage <chr>, depNewURL <chr>, hash <chr>, …
# print out just the titles of the available data sets without all the other meta data
data_sets$title
## [1] "Public Assistance Grant Award Activities"
## [2] "Hazard Mitigation Assistance Mitigated Properties"
## [3] "Mission Assignments"
## [4] "IPAWS Archived Alerts"
## [5] "FIMA NFIP Redacted Claims"
## [6] "Hazard Mitigation Assistance Projects Financial Transactions"
## [7] "FEMA Web Disaster Summaries"
## [8] "Public Assistance Funded Project Summaries"
## [9] "Individual Assistance Housing Registrants - Large Disasters"
## [10] "FEMA Regions"
## [11] "Housing Assistance Program Data - Owners"
## [12] "HMA Subapplications By NFIP CRS Communities"
## [13] "Public Assistance Applicants"
## [14] "FEMA Web Disaster Declarations"
## [15] "OpenFEMA Data Set Fields"
## [16] "OpenFEMA Dataset Codes"
## [17] "FIMA NFIP Redacted Policies"
## [18] "NFIP Community Layer Comprehensive"
## [19] "HMA Subapplications Congressional Districts"
## [20] "Public Assistance Funded Projects Details"
## [21] "NFIP Community Layer No Overlaps Split"
## [22] "NFIP Residential Penetration Rates"
## [23] "Individual Assistance Multiple Loss Flood Properties"
## [24] "NFIP Community Layer No Overlaps Whole"
## [25] "NFIP Multiple Loss Properties"
## [26] "Housing Assistance Program Data - Renters"
## [27] "Individuals and Households Program - Valid Registrations"
## [28] "Emergency Management Performance Grants"
## [29] "Non-Disaster and Assistance to Firefighter Grants"
## [30] "HMA Subapplications Project Site Inventories"
## [31] "HMA Subapplications"
## [32] "Declaration Denials"
## [33] "Public Assistance Applicants Program Deliveries"
## [34] "Hazard Mitigation Assistance Projects"
## [35] "Registration Intake and Individuals Household Program (RI-IHP)"
## [36] "HMA Subapplications Financial Transactions"
## [37] "Hazard Mitigation Assistance Projects by NFIP CRS Communities"
## [38] "Hazard Mitigation Plan Statuses"
## [39] "Hazard Mitigation Grant Program - Disaster Summaries"
## [40] "Individuals and Households Program - Valid Registrations"
## [41] "OpenFEMA Data Sets"
## [42] "Public Assistance Funded Projects Details"
## [43] "Disaster Declarations Summaries"
## [44] "Public Assistance Second Appeals Tracker"
## [45] "FEMA Web Declaration Areas"
## [46] "NFIP Community Status Book"
Example Workflow
Once we know which data set we want to access, or if we want to know more about what data is available in a given data set, we can use the fema_data_fields() function to look at the available data fields. To do so, set the “data_set” parameter to one of the values in the “name” column of the data frame returned by the fema_data_sets() function.
# obtain all the data fields for the NFIP Policies data set
df <- fema_data_fields(data_set = "fimaNfipPolicies")
## Obtaining Data: 1 out of 2 iterations (50% complete)
## Obtaining Data: 2 out of 2 iterations (100% complete)
# Note: the data_set argument is not case sensitive, meaning you do not need to
# use camel case names despite that being the convention in the FEMA documentation.
df <- fema_data_fields(data_set = "fimanfippolicies")
# view the data fields
df
## # A tibble: 81 × 16
## datasetId openFemaDataSet datasetVersion name title description type
## <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 openfema-74 FimaNfipPolicies 2 building… Buil… Estimated … bigi…
## 2 openfema-74 FimaNfipPolicies 2 baseFloo… Base… Base Flood… deci…
## 3 openfema-74 FimaNfipPolicies 2 lowestAd… Lowe… Lowest nat… deci…
## 4 openfema-74 FimaNfipPolicies 2 lowestFl… Lowe… A building… deci…
## 5 openfema-74 FimaNfipPolicies 2 cancella… Canc… Reason cod… text
## 6 openfema-74 FimaNfipPolicies 2 basicBui… Basi… Basic buil… deci…
## 7 openfema-74 FimaNfipPolicies 2 addition… Addi… Additional… deci…
## 8 openfema-74 FimaNfipPolicies 2 basicCon… Basi… Basic cont… deci…
## 9 openfema-74 FimaNfipPolicies 2 Addition… Addi… Additional… deci…
## 10 openfema-74 FimaNfipPolicies 2 agricult… Agri… Indicates … bool…
## # ℹ 71 more rows
## # ℹ 9 more variables: sortOrder <chr>, isSearchable <chr>,
## # isNestedObject <chr>, isNullable <chr>, primaryKey <chr>, id <chr>,
## # lastRefresh <chr>, hash <chr>, srid <chr>
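As an illustration of what case-insensitive matching means here, the sketch below lowercases both the requested name and a small, hand-typed vector of data set names before comparing them. It is only a toy model under our own assumptions: the helper match_data_set() is hypothetical and not part of rfema, and how the package resolves names internally is not shown in this vignette.

```r
# Hand-typed examples of data set names (not retrieved from the API)
known_names <- c("FimaNfipClaims", "FimaNfipPolicies", "HousingAssistanceRenters")

# Hypothetical helper: return the canonical name matching `x`, ignoring case
match_data_set <- function(x, choices) {
  hit <- match(tolower(x), tolower(choices))
  if (is.na(hit)) stop("unknown data set: ", x)
  choices[hit]
}

# Both spellings resolve to the same canonical name
camel <- match_data_set("fimaNfipPolicies", known_names)
lower <- match_data_set("fimanfippolicies", known_names)
identical(camel, lower)
```

Normalizing case on both sides keeps the comparison symmetric, so neither the user nor the reference list has to follow a particular convention.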
The FEMA API limits the number of records returned by a single query to 1000, so retrieving more observations than that requires looping over multiple API calls. The open_fema() function handles this process automatically. By default it first issues a message stating how many records match your criteria and how many API calls it will take to retrieve them, and asks you to confirm the request before it starts retrieving data (this behavior can be turned off by setting the ask_before_call argument to FALSE). It also reports an estimated time to give you a sense of how long the request will take. For example, requesting the entire NFIP claims data set via open_fema(data_set = "fimaNfipClaims") will yield the following output in the R console.
Calculating estimated API call time...
2600579 matching records found. At 1000 records per call, it will take 2601 individual API calls to get all matching records. It's estimated that this will take approximately 2.12 hours. Continue?
1 - Yes, get that data!, 0 - No, let me rethink my API call:
Note that the estimated time is based on network conditions at the moment the call is made and may not be accurate for large data requests that take long enough for network conditions to potentially change significantly during the request.
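The number of API calls in the message above follows directly from the 1000-record limit. The short base R sketch below reproduces that arithmetic using the figures copied from the console output; the variable names are our own and nothing here contacts the API.

```r
# Figures copied from the console message above
matching_records <- 2600579
records_per_call <- 1000
estimated_hours  <- 2.12

# Each call returns at most `records_per_call` records, so round up
n_calls <- ceiling(matching_records / records_per_call)
n_calls

# Average time per call implied by the estimate, in seconds
seconds_per_call <- estimated_hours * 3600 / n_calls
seconds_per_call
```

Rounding up matters because the final call still has to be made even when it returns fewer than 1000 records.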
Alternatively, we could specify the top_n argument to limit the number of records returned. Specifying a top_n greater than 1000 triggers the same message letting you know how many iterations it will take to get your data. If top_n is less than 1000, the API call is carried out automatically. In the case below, we return the first 10 records from the NFIP Claims data.
df <- open_fema(data_set = "fimaNfipClaims", top_n = 10)
df
## # A tibble: 10 × 73
## agricultureStructure…¹ asOfDate basementEnclosureCra…² policyCount
## <chr> <dttm> <chr> <chr>
## 1 FALSE 2025-07-01 00:00:00 NULL 1
## 2 FALSE 2025-07-01 00:00:00 NULL 1
## 3 FALSE 2025-07-01 00:00:00 NULL 1
## 4 FALSE 2025-07-01 00:00:00 NULL 1
## 5 FALSE 2025-07-01 00:00:00 0 1
## 6 FALSE 2025-07-01 00:00:00 NULL 1
## 7 FALSE 2025-07-01 00:00:00 4 1
## 8 FALSE 2025-07-01 00:00:00 NULL 1
## 9 FALSE 2025-07-01 00:00:00 1 1
## 10 FALSE 2025-07-01 00:00:00 1 1
## # ℹ abbreviated names: ¹agricultureStructureIndicator,
## # ²basementEnclosureCrawlspaceType
## # ℹ 69 more variables: crsClassificationCode <chr>, dateOfLoss <dttm>,
## # elevatedBuildingIndicator <chr>, elevationCertificateIndicator <chr>,
## # elevationDifference <chr>, baseFloodElevation <chr>, ratedFloodZone <chr>,
## # houseWorship <chr>, locationOfContents <chr>, lowestAdjacentGrade <chr>,
## # lowestFloorElevation <chr>, numberOfFloorsInTheInsuredBuilding <chr>, …
If we wanted to limit the columns returned, we could do so by passing a character vector of data fields to the select argument of open_fema(). The data fields for a given data set can be retrieved using the fema_data_fields() function.
data_fields <- fema_data_fields("fimanfipclaims")
data_fields
## # A tibble: 73 × 16
## datasetId openFemaDataSet datasetVersion name title description type
## <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 openfema-73 FimaNfipClaims 2 asOfDate As o… The effect… date…
## 2 openfema-73 FimaNfipClaims 2 amountPai… Amou… Dollar amo… deci…
## 3 openfema-73 FimaNfipClaims 2 amountPai… Amou… Dollar amo… deci…
## 4 openfema-73 FimaNfipClaims 2 amountPai… Amou… ICC covera… deci…
## 5 openfema-73 FimaNfipClaims 2 netBuildi… Net … Net buildi… deci…
## 6 openfema-73 FimaNfipClaims 2 netConten… Net … Net conten… deci…
## 7 openfema-73 FimaNfipClaims 2 agricultu… Agri… Indicates … bool…
## 8 openfema-73 FimaNfipClaims 2 basementE… Base… Basement i… smal…
## 9 openfema-73 FimaNfipClaims 2 policyCou… Poli… Insured un… smal…
## 10 openfema-73 FimaNfipClaims 2 crsClassi… CRS … The Commun… smal…
## # ℹ 63 more rows
## # ℹ 9 more variables: sortOrder <chr>, isSearchable <chr>,
## # isNestedObject <chr>, isNullable <chr>, primaryKey <chr>, id <chr>,
## # lastRefresh <chr>, hash <chr>, srid <chr>
In this case we will return only the policyCount and countyCode columns, that is, select = c("policyCount", "countyCode").
## # A tibble: 10 × 2
## policyCount countyCode
## <chr> <chr>
## 1 1 34009
## 2 1 12086
## 3 1 12086
## 4 1 34031
## 5 1 48167
## 6 1 22109
## 7 1 36103
## 8 1 36059
## 9 1 34039
## 10 1 29189
If we want to limit the rows returned rather than the columns, we can apply filters by specifying values of the columns to return. To quickly see the set of variables that can be used to filter API queries, we can use the valid_parameters() function, which returns a tibble containing the variables that are “searchable” for a particular data set.
params <- valid_parameters(data_set = "fimaNfipClaims")
params
## # A tibble: 63 × 1
## params
## <chr>
## 1 asOfDate
## 2 amountPaidOnBuildingClaim
## 3 amountPaidOnContentsClaim
## 4 amountPaidOnIncreasedCostOfComplianceClaim
## 5 netBuildingPaymentAmount
## 6 netContentsPaymentAmount
## 7 basementEnclosureCrawlspaceType
## 8 policyCount
## 9 crsClassificationCode
## 10 dateOfLoss
## # ℹ 53 more rows
Among the searchable variables returned are waterDepth and ratedFloodZone. We can therefore specify a list that contains the values of each variable that we want returned. Before doing that, however, it can be useful to learn a bit more about each parameter by using the parameter_values() function.
# get more information on the "ratedFloodZone" parameter from the NFIP Claims data set
parameter_values(data_set = "fimaNfipClaims", data_field = "ratedFloodZone")
## Data Set: FimaNfipClaims
## Data Field: ratedFloodZone
## Data Field Description: Formerly called floodZone. NFIP Flood Zone derived from the Flood Insurance Rate Map (FIRM) used to rate the insured property.A - Special Flood with no Base Flood Elevation on FIRM; AE, A1-A30 - Special Flood with Base Flood Elevation on FIRM; A99 - Special Flood with Protection Zone; AH, AHB* - Special Flood with Shallow Ponding; AO, AOB* - Special Flood with Sheet Flow; X, B - Moderate Flood from primary water source. Pockets of areas subject to drainage problems; X, C - Minimal Flood from primary water source. Pockets of areas subject to drainage problems; D - Possible Flood; V - Velocity Flood with no Base Flood Elevation on FIRM; VE, V1-V30 - Velocity Flood with Base Flood Elevation on FIRM; AE, VE, X - New zone designations used on new maps starting January 1, 1986, in lieu of A1-A30, V1-V30, and B and C; AR - A Special Flood Hazard Area that results from the decertification of a previously accredited flood protection system that is determined to be in the process of being restored to provide base flood protection;AR Dual Zones - (AR/AE, AR/A1-A30, AR/AH, AR/AO, AR/A) Areas subject to flooding from failure of the flood protection system (Zone AR) which also overlap an existing Special Flood Hazard Area as a dual zone; *AHB, AOB, ARE, ARH, ARO, and ARA are not risk zones shown on a map, but are acceptable values for rating purposes
## Data Field Example Values: c("AE", "A13", "X", "A04", "C", "A99")
## More Information Available at: https://www.fema.gov/about/openfema/data-sets
As can be seen, parameter_values() returns the data set name, the data field (i.e. the searchable parameter), a description of the data field, and a vector of example values, which can be useful for seeing how the values are formatted in the data.
We can see from the above that ratedFloodZone is stored as a character in the data, and from the description we know that “AE” and “X” are both valid values for the ratedFloodZone parameter. We can thus define a filter to return only records from AE and X flood zones.
# construct a filter that limits records to those in AE and X flood zones
my_filters <- list(ratedFloodZone = c("AE", "X"))
# pass the filter to the open_fema function
df <- open_fema(data_set = "fimaNfipClaims", top_n = 10,
                select = c("policyCount", "ratedFloodZone"),
                filters = my_filters)
df
## # A tibble: 10 × 2
## policyCount ratedFloodZone
## <chr> <chr>
## 1 1 AE
## 2 1 AE
## 3 1 AE
## 4 1 X
## 5 1 X
## 6 1 AE
## 7 1 X
## 8 1 X
## 9 1 AE
## 10 1 AE
More Examples
Example: Return the first 100 NFIP claims for Florida that happened between 2010 and 2020.
df <- open_fema(data_set = "fimaNfipClaims",
                top_n = 100,
                filters = list(state = "FL",
                               yearOfLoss = ">= 2010",
                               yearOfLoss = "<= 2020"))
df
## # A tibble: 100 × 73
## agricultureStructure…¹ asOfDate basementEnclosureCra…² policyCount
## <chr> <dttm> <chr> <chr>
## 1 FALSE 2025-07-01 00:00:00 NULL 1
## 2 FALSE 2025-07-01 00:00:00 NULL 1
## 3 FALSE 2025-07-01 00:00:00 2 73
## 4 FALSE 2025-07-01 00:00:00 0 1
## 5 FALSE 2025-07-01 00:00:00 NULL 1
## 6 FALSE 2025-07-01 00:00:00 NULL 1
## 7 FALSE 2025-07-01 00:00:00 0 1
## 8 FALSE 2025-07-01 00:00:00 NULL 1
## 9 FALSE 2025-07-01 00:00:00 NULL 1
## 10 FALSE 2025-07-01 00:00:00 NULL 1
## # ℹ 90 more rows
## # ℹ abbreviated names: ¹agricultureStructureIndicator,
## # ²basementEnclosureCrawlspaceType
## # ℹ 69 more variables: crsClassificationCode <chr>, dateOfLoss <dttm>,
## # elevatedBuildingIndicator <chr>, elevationCertificateIndicator <chr>,
## # elevationDifference <chr>, baseFloodElevation <chr>, ratedFloodZone <chr>,
## # houseWorship <chr>, locationOfContents <chr>, lowestAdjacentGrade <chr>, …
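The filter in this example uses the name yearOfLoss twice, once for each bound of the range. R lists allow repeated names, which the base R sketch below illustrates. It only inspects the list and makes no API call; how open_fema() translates the list into a query is not shown in this vignette.

```r
# The same filter list used in the Florida example above
filters <- list(state = "FL",
                yearOfLoss = ">= 2010",
                yearOfLoss = "<= 2020")

# Names may repeat within a list
filter_names <- names(filters)
filter_names

# Logical indexing keeps every element named yearOfLoss (both bounds)
year_bounds <- filters[filter_names == "yearOfLoss"]
length(year_bounds)

# `[[` with a name returns only the first match, so it would drop the upper bound
first_bound <- filters[["yearOfLoss"]]
first_bound
```

When inspecting such a filter list by hand, index by position or by a logical test on the names rather than with `[[`, so that neither bound is overlooked.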
Example: Get data on all Hazard Mitigation Assistance Projects associated with flood mitigation in Florida.
# see which parameters can be used for filtering the Hazard Mitigation Assistance Projects data set
valid_parameters("HazardMitigationAssistanceProjects")
## # A tibble: 31 × 1
## params
## <chr>
## 1 initialObligationDate
## 2 initialObligationAmount
## 3 federalShareObligated
## 4 subrecipientAdminCostAmt
## 5 srmcObligatedAmt
## 6 recipientAdminCostAmt
## 7 costSharePercentage
## 8 benefitCostRatio
## 9 netValueBenefits
## 10 numberOfFinalProperties
## # ℹ 21 more rows
# see how values of "programArea" are formatted
params <- parameter_values(data_set = "HazardMitigationAssistanceProjects", data_field = "programArea", message = FALSE)
params
## # A tibble: 1 × 4
## `Data Set` `Data Field` Data Field Descripti…¹ Data Field Example V…²
## <chr> <chr> <chr> <list>
## 1 HazardMitigationAs… programArea Hazard Mitigation Ass… <tibble [2 × 1]>
## # ℹ abbreviated names: ¹`Data Field Description`, ²`Data Field Example Values`
# check to see how "state" is formatted
params <- parameter_values(data_set = "HazardMitigationAssistanceProjects", data_field = "state", message = FALSE)
params
## # A tibble: 1 × 4
## `Data Set` `Data Field` Data Field Descripti…¹ Data Field Example V…²
## <chr> <chr> <chr> <list>
## 1 HazardMitigationAs… state Full name of the Stat… <tibble [6 × 1]>
## # ℹ abbreviated names: ¹`Data Field Description`, ²`Data Field Example Values`
# construct a list containing filters for Flood Mitigation Assistance projects in Florida
filter_list <- list(programArea = c("FMA"),
                    state = c("Florida"))
# pass filter_list to the open_fema function to retrieve data
df <- open_fema(data_set = "HazardMitigationAssistanceProjects", filters = filter_list,
                ask_before_call = FALSE)
df
## # A tibble: 686 × 33
## projectIdentifier programArea programFy region state stateNumberCode county
## <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 FMA-PJ-04-FL-2013-… FMA 2013 4 Flor… 12 Clay
## 2 FMA-PJ-04-FL-2018-… FMA 2018 4 Flor… 12 Clay
## 3 FMA-PJ-04-FL-2007-… FMA 2007 4 Flor… 12 Volus…
## 4 FMA-PJ-04-FL-2023-… FMA 2023 4 Flor… 12 Volus…
## 5 FMA-PL-04-FL-2014-… FMA 2014 4 Flor… 12 Flagl…
## 6 FMA-PJ-04-FL-2013-… FMA 2013 4 Flor… 12 Clay
## 7 FMA-PJ-04-FL-2013-… FMA 2013 4 Flor… 12 Volus…
## 8 FMA-PJ-04-FL-2023-… FMA 2023 4 Flor… 12 Volus…
## 9 FMA-PJ-04-FL-2006-… FMA 2006 4 Flor… 12 Volus…
## 10 FMA-PJ-04-FL-2013-… FMA 2013 4 Flor… 12 Clay
## # ℹ 676 more rows
## # ℹ 26 more variables: countyCode <chr>, disasterNumber <chr>,
## # projectCounties <chr>, projectType <chr>, status <chr>, recipient <chr>,
## # recipientTribalIndicator <chr>, subrecipient <chr>,
## # subrecipientTribalIndicator <chr>, dataSource <chr>, dateApproved <dttm>,
## # dateClosed <dttm>, dateInitiallyApproved <dttm>, projectAmount <chr>,
## # initialObligationDate <dttm>, initialObligationAmount <chr>, …
Example: Determine how much money was awarded by FEMA for rental assistance following Hurricane Irma.
Get a description of the HousingAssistanceRenters data set to see if it is the right data set for the question.
# get meta data for the `HousingAssistanceRenters` data set
ds <- fema_data_sets()
ds <- ds[which(ds$name == "HousingAssistanceRenters"),]
# a data set can be listed more than once when several versions exist;
# check how many entries matched (here only one) and keep the most recent version
nrow(ds)
## [1] 1
ds <- ds[which(ds$version == max(as.numeric(ds$version))),]
# now print out the data set description and make sure it is the data set
# that is applicable to our research question
print(ds$description)
## [1] "This dataset was generated by FEMA's Individual Assistance (IA) reporting team to share data on FEMA's Housing Assistance program for house renters within the state, county, and zip code where the registration is valid for the declarations, starting with disaster declaration DR1439 (declared in 2002). It contains aggregated, non-PII data. Core data elements include number of applicants, county, zip code, inspections, severity of damage, and assistance provided. \n\nData is self-reported and subject to human error. For example, when an applicant registers online, they enter their street and city address. While the county is inferred by the system, it may be overridden by the applicant. Similarly, with a call center registration, the Human Services Specialist (HSS) representatives are instructed to ask in what county the applicant resides, but the applicant has the right to choose the county. To learn more about disaster assistance please visit https://www.fema.gov/individual-disaster-assistance.\n\nThe financial information is derived from NEMIS and not FEMA's official financial systems. Due to differences in reporting periods, status of obligations and how business rules are applied, this financial information may differ slightly from official publication on public websites such as usaspending.gov; this dataset is not intended to be used for any official federal financial reporting.\n\nFEMA's terms and conditions and citation requirements for datasets (API usage or file downloads) can be found on the OpenFEMA Terms and Conditions page: https://www.fema.gov/about/openfema/terms-conditions\n\nFor answers to Frequently Asked Questions (FAQs) about the OpenFEMA program, API, and publicly available datasets, please visit: https://www.fema.gov/about/openfema/faq\n\nIf you have media inquiries about this dataset, please email the FEMA Press Office at FEMA-Press-Office@fema.dhs.gov or call (202) 646-3272. 
For inquiries about FEMA's data and Open Government program, please email the OpenFEMA team at OpenFEMA@fema.dhs.gov."
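The version-selection step above has no visible effect when only one entry matches. The sketch below shows the same idiom on a toy data frame with two versions; the rows are invented for illustration and do not come from the API.

```r
# Toy meta data with two versions of the same data set (invented rows)
ds_toy <- data.frame(
  name    = c("HousingAssistanceRenters", "HousingAssistanceRenters"),
  version = c("1", "2"),
  stringsAsFactors = FALSE
)

# Keep only the row(s) carrying the highest version number
latest <- ds_toy[which(as.numeric(ds_toy$version) == max(as.numeric(ds_toy$version))), ]
latest
```

Converting version to numeric on both sides of the comparison avoids relying on implicit coercion when the column is stored as character.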
See which columns we can filter on to select just the grants related to Hurricane Irma.
# see which parameters can be used for filtering the Housing Assistance for Renters data set
valid_parameters("HousingAssistanceRenters")
## # A tibble: 21 × 1
## params
## <chr>
## 1 disasterNumber
## 2 state
## 3 county
## 4 city
## 5 zipCode
## 6 validRegistrations
## 7 totalInspected
## 8 totalInspectedWithNoDamage
## 9 totalWithModerateDamage
## 10 totalWithMajorDamage
## # ℹ 11 more rows
The only event identifier in this data set is disasterNumber. Thus, to filter on a specific disaster we have to load the FemaWebDisasterDeclarations data set and find the disaster number associated with the event we are interested in.
# call the disaster declarations
dd <- rfema::open_fema(data_set = "FemaWebDisasterDeclarations", ask_before_call = FALSE)
## Obtaining Data: 1 out of 6 iterations (16.67% complete)
## Obtaining Data: 2 out of 6 iterations (33.33% complete)
## Obtaining Data: 3 out of 6 iterations (50% complete)
## Obtaining Data: 4 out of 6 iterations (66.67% complete)
## Obtaining Data: 5 out of 6 iterations (83.33% complete)
## Obtaining Data: 6 out of 6 iterations (100% complete)
# filter disaster declarations to those with "hurricane" in the name
# (dplyr provides %>%, filter(), select(), and distinct())
library(dplyr)
hurricanes <- distinct(dd %>% filter(grepl("hurricane", tolower(disasterName))) %>% select(disasterName, disasterNumber))
hurricanes
## # A tibble: 410 × 2
## disasterName disasterNumber
## <chr> <chr>
## 1 "HURRICANE MILTON" 4844
## 2 "HURRICANE MILTON " 4834
## 3 "HURRICANE MILTON" 3623
## 4 "HURRICANE HELENE" 4830
## 5 "HURRICANE HELENE" 4828
## 6 "HURRICANE DEBBY" 4806
## 7 "HURRICANE DEBBY" 3607
## 8 "HURRICANE DEBBY" 3606
## 9 "HURRICANE BERYL " 4798
## 10 "HURRICANE & FLOODS" 45
## # ℹ 400 more rows
We can see immediately that a single disaster number does not identify a whole event, since multiple disaster declarations may be issued for the same event in different locations. Thus, to filter on a particular event, we need to collect all the disaster declaration numbers corresponding to that event (in this case Hurricane Irma).
# get all disaster declarations associated with hurricane irma.
# notice the use of grepl() which picked up a disaster declaration name
# that was different than all the others.
dd_irma <- hurricanes %>% filter(grepl("irma",tolower(disasterName)))
dd_irma
## # A tibble: 13 × 2
## disasterName disasterNumber
## <chr> <chr>
## 1 HURRICANE IRMA 4346
## 2 HURRICANE IRMA - SEMINOLE TRIBE OF FLORIDA 4341
## 3 HURRICANE IRMA 4338
## 4 HURRICANE IRMA 3389
## 5 HURRICANE IRMA 4337
## 6 HURRICANE IRMA 4336
## 7 HURRICANE IRMA 3388
## 8 HURRICANE IRMA 3387
## 9 HURRICANE IRMA 3386
## 10 HURRICANE IRMA 4335
## 11 HURRICANE IRMA 3385
## 12 HURRICANE IRMA 3384
## 13 HURRICANE IRMA 3383
# get a vector of just the disaster declaration numbers
dd_nums_irma <- dd_irma$disasterNumber
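To see why pattern matching is used here instead of an exact comparison, the base R sketch below works on a few rows copied by hand from the outputs above (including the name with a trailing space). It is a toy illustration only and makes no API call.

```r
# A few rows copied from the outputs above; not a live API result
toy <- data.frame(
  disasterName = c("HURRICANE MILTON", "HURRICANE MILTON ", "HURRICANE IRMA",
                   "HURRICANE IRMA - SEMINOLE TRIBE OF FLORIDA"),
  disasterNumber = c("4844", "4834", "4346", "4341"),
  stringsAsFactors = FALSE
)

# An exact comparison misses the declaration with the longer name
exact_nums <- toy$disasterNumber[toy$disasterName == "HURRICANE IRMA"]
exact_nums

# A case-insensitive pattern match picks up both Irma declarations
pattern_nums <- toy$disasterNumber[grepl("irma", toy$disasterName, ignore.case = TRUE)]
pattern_nums

# Trailing whitespace makes otherwise identical names look distinct
n_raw     <- length(unique(toy$disasterName))
n_trimmed <- length(unique(trimws(toy$disasterName)))
c(raw = n_raw, trimmed = n_trimmed)
```

A pattern match is more forgiving of naming variations, at the cost of possibly matching unrelated declarations, so it is worth reviewing the matched names before using the numbers as a filter.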
Now we are ready to filter our API call for the HousingAssistanceRenters data set.
# construct filter list
filter_list <- list(disasterNumber = dd_nums_irma)
# make the API call to get individual assistance grants awarded to renters for Hurricane Irma damages
assistance_irma <- open_fema(data_set = "HousingAssistanceRenters", filters = filter_list, ask_before_call = FALSE)
## Obtaining Data: 1 out of 6 iterations (16.67% complete)
## Obtaining Data: 2 out of 6 iterations (33.33% complete)
## Obtaining Data: 3 out of 6 iterations (50% complete)
## Obtaining Data: 4 out of 6 iterations (66.67% complete)
## Obtaining Data: 5 out of 6 iterations (83.33% complete)
## Obtaining Data: 6 out of 6 iterations (100% complete)
Check out the returned data
# check out the returned data
assistance_irma
## # A tibble: 5,361 × 21
## disasterNumber state county city zipCode validRegistrations totalInspected
## <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 4335 VI St. Joh… ST J… 00083 1 0
## 2 4335 VI St. Tho… CHAL… 00801 1 0
## 3 4335 VI St. Tho… CHAR… 00801 1 1
## 4 4335 VI St. Tho… CHAR… 00801 10 8
## 5 4335 VI St. Tho… CHAR… 00801 1 1
## 6 4335 VI St. Tho… CHAR… 00801 1 1
## 7 4335 VI St. Tho… SAIN… 00801 7 6
## 8 4335 VI St. Tho… STT 00801 1 1
## 9 4335 VI St. Tho… STTH… 00801 6 4
## 10 4335 VI St. Tho… ST T… 00801 219 181
## # ℹ 5,351 more rows
## # ℹ 14 more variables: totalInspectedWithNoDamage <chr>,
## # totalWithModerateDamage <chr>, totalWithMajorDamage <chr>,
## # totalWithSubstantialDamage <chr>, approvedForFemaAssistance <chr>,
## # totalApprovedIhpAmount <chr>, repairReplaceAmount <chr>,
## # rentalAmount <chr>, otherNeedsAmount <chr>, approvedBetween1And10000 <chr>,
## # approvedBetween10001And25000 <chr>, approvedBetween25001AndMax <chr>, …
Now we can answer our original question: How much did FEMA award for rental assistance following Hurricane Irma?
# sum the rentalAmount Column
rent_assistance <- sum(as.numeric(assistance_irma$rentalAmount))
# scale to millions
rent_assistance <- rent_assistance/1000000
print(paste0("$",round(rent_assistance,2),
" million was awarded by FEMA for rental assistance following Hurricane Irma"))
## [1] "$314.64 million was awarded by FEMA for rental assistance following Hurricane Irma"
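The amounts are returned as character, which is why as.numeric() is applied before summing. As a defensive variant, the sketch below assumes a column could contain a non-numeric string such as "NULL" (a value that appears in other columns printed earlier in this vignette); whether rentalAmount ever does is not something shown here. The values are invented for illustration.

```r
# Invented character amounts, including one non-numeric entry
amounts <- c("1200.50", "0", "NULL", "349.5")

# Non-numeric strings become NA (the coercion warning is silenced here)
amounts_num <- suppressWarnings(as.numeric(amounts))

# A plain sum propagates the NA
naive_total <- sum(amounts_num)
naive_total

# Dropping missing values gives the total of the valid entries
safe_total <- sum(amounts_num, na.rm = TRUE)
safe_total

# Count how many entries were dropped so the omission is not silent
n_dropped <- sum(is.na(amounts_num))
n_dropped
```

Reporting the number of dropped entries alongside the total makes it clear how much of the data the figure is based on.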
Clean one of the data sets with a nested structure
Some data sets that get returned from the FEMA API will be in a nested format. Data from the Integrated Public Alert & Warning System (IPAWS) is one such example of this. See for example the first column of the IPAWS data set, which is XML data returned as a character. Most of the useful information from this data set is in that first column, but isn’t in a form that will be useful for most R users.
# get the first 100 entries from the IPAWS data set
ipaws <- rfema::open_fema("IpawsArchivedAlerts", top_n = 100)
ipaws
## # A tibble: 100 × 18
## originalMessage identifier sender sent status msgType source scope
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 "<?xml version=\"1.0\" e… NWS-IDP-P… w-nws… 2018… Actual Alert NULL Publ…
## 2 "<?xml version=\"1.0\" e… NWS-IDP-P… w-nws… 2018… Actual Update NULL Publ…
## 3 "<?xml version=\"1.0\" e… NWS-IDP-P… w-nws… 2018… Actual Update NULL Publ…
## 4 "<?xml version=\"1.0\" e… NWS-IDP-P… w-nws… 2018… Actual Update NULL Publ…
## 5 "<?xml version=\"1.0\" e… NWS-IDP-P… w-nws… 2018… Actual Alert NULL Publ…
## 6 "<?xml version=\"1.0\" e… NWS-IDP-P… w-nws… 2018… Actual Update NULL Publ…
## 7 "<?xml version=\"1.0\" e… NWS-IDP-P… w-nws… 2018… Actual Update NULL Publ…
## 8 "<?xml version=\"1.0\" e… NWS-IDP-P… w-nws… 2018… Actual Update NULL Publ…
## 9 "<?xml version=\"1.0\" e… NWS-IDP-P… w-nws… 2018… Actual Update NULL Publ…
## 10 "<?xml version=\"1.0\" e… NWS-IDP-P… w-nws… 2018… Actual Update NULL Publ…
## # ℹ 90 more rows
## # ℹ 10 more variables: restriction <chr>, addresses <chr>, code <chr>,
## # note <chr>, searchGeometry <chr>, incidents <chr>, cogId <chr>, id <chr>,
## # xmlns <chr>, info <chr>
The following is one method for converting the XML data into tabular form.
library(dplyr)
library(XML)
# create function to unnest the ipaws entries
unnest_ipaws <- function(xml_entry) {
  # convert the raw xml data to a list
  xml_data <- XML::xmlToList(xml_entry)
  # get names of the list elements in xml data
  element_names <- names(xml_data)
  # get a summary of the data to id which elements are nested
  data_sum <- summary(xml_data)
  # put all the non nested elements into a data frame
  df <- data.frame(xml_data[element_names[which(as.numeric(data_sum[, 1]) == 1)]])
  # get vector of elements that need to be unnested
  needs_unnesting <- which(as.numeric(data_sum[, 1]) > 1)
  # loop over the elements identified above
  for (k in needs_unnesting) {
    # unlist the nested data
    unlisted_data <- t(unlist(xml_data[k], recursive = TRUE, use.names = TRUE))
    # store the unlisted data as a data frame
    temp_df <- data.frame(unlisted_data)
    # add the unnested data frame to the existing "df" data frame
    df <- cbind.data.frame(df, temp_df)
  }
  return(df)
}
# get the first 100 entries from the IPAWS alerts data set
ipaws <- rfema::open_fema("IpawsArchivedAlerts", top_n = 100)
# apply the `unnest_ipaws` function over all the XML entries in the returned `ipaws` object
ipaws_list <- sapply(ipaws$originalMessage, unnest_ipaws, simplify = TRUE)
# convert the `ipaws_list` into a data frame
ipaws_df <- dplyr::bind_rows(ipaws_list)
## New names:
## New names:
## New names:
## New names:
## New names:
## New names:
## New names:
## • `info.language` -> `info.language...8`
## • `info.category` -> `info.category...9`
## • `info.event` -> `info.event...10`
## • `info.responseType` -> `info.responseType...11`
## • `info.urgency` -> `info.urgency...12`
## • `info.severity` -> `info.severity...13`
## • `info.certainty` -> `info.certainty...14`
## • `info.eventCode.valueName` -> `info.eventCode.valueName...15`
## • `info.eventCode.value` -> `info.eventCode.value...16`
## • `info.eventCode.valueName.1` -> `info.eventCode.valueName.1...17`
## • `info.eventCode.value.1` -> `info.eventCode.value.1...18`
## • `info.effective` -> `info.effective...19`
## • `info.onset` -> `info.onset...20`
## • `info.expires` -> `info.expires...21`
## • `info.senderName` -> `info.senderName...22`
## • `info.headline` -> `info.headline...23`
## • `info.description` -> `info.description...24`
## • `info.instruction` -> `info.instruction...25`
## • `info.web` -> `info.web...26`
## • `info.parameter.valueName` -> `info.parameter.valueName...27`
## • `info.parameter.value` -> `info.parameter.value...28`
## • `info.parameter.valueName.1` -> `info.parameter.valueName.1...29`
## • `info.parameter.value.1` -> `info.parameter.value.1...30`
## • `info.parameter.valueName.2` -> `info.parameter.valueName.2...31`
## • `info.parameter.value.2` -> `info.parameter.value.2...32`
## • `info.parameter.valueName.3` -> `info.parameter.valueName.3...33`
## • `info.parameter.value.3` -> `info.parameter.value.3...34`
## • `info.parameter.valueName.4` -> `info.parameter.valueName.4...35`
## • `info.parameter.value.4` -> `info.parameter.value.4...36`
## • `info.parameter.valueName.5` -> `info.parameter.valueName.5...37`
## • `info.parameter.value.5` -> `info.parameter.value.5...38`
## • `info.parameter.valueName.6` -> `info.parameter.valueName.6...39`
## • `info.parameter.value.6` -> `info.parameter.value.6...40`
## • `info.parameter.valueName.7` -> `info.parameter.valueName.7...41`
## • `info.parameter.value.7` -> `info.parameter.value.7...42`
## • `info.parameter.valueName.8` -> `info.parameter.valueName.8...43`
## • `info.parameter.value.8` -> `info.parameter.value.8...44`
## • `info.parameter.valueName.9` -> `info.parameter.valueName.9...45`
## • `info.parameter.value.9` -> `info.parameter.value.9...46`
## • `info.parameter.valueName.10` -> `info.parameter.valueName.10...47`
## • `info.parameter.value.10` -> `info.parameter.value.10...48`
## • `info.parameter.valueName.11` -> `info.parameter.valueName.11...49`
## • `info.parameter.value.11` -> `info.parameter.value.11...50`
## • `info.parameter.valueName.12` -> `info.parameter.valueName.12...51`
## • `info.parameter.value.12` -> `info.parameter.value.12...52`
## • `info.area.areaDesc` -> `info.area.areaDesc...53`
## • `info.area.polygon` -> `info.area.polygon...54`
## • `info.area.geocode.valueName` -> `info.area.geocode.valueName...55`
## • `info.area.geocode.value` -> `info.area.geocode.value...56`
## • `info.area.geocode.valueName.1` -> `info.area.geocode.valueName.1...57`
## • `info.area.geocode.value.1` -> `info.area.geocode.value.1...58`
## • `info.language` -> `info.language...59`
## • `info.category` -> `info.category...60`
## • `info.event` -> `info.event...61`
## • `info.responseType` -> `info.responseType...62`
## • `info.urgency` -> `info.urgency...63`
## • `info.severity` -> `info.severity...64`
## • `info.certainty` -> `info.certainty...65`
## • `info.eventCode.valueName` -> `info.eventCode.valueName...66`
## • `info.eventCode.value` -> `info.eventCode.value...67`
## • `info.eventCode.valueName.1` -> `info.eventCode.valueName.1...68`
## • `info.eventCode.value.1` -> `info.eventCode.value.1...69`
## • `info.effective` -> `info.effective...70`
## • `info.onset` -> `info.onset...71`
## • `info.expires` -> `info.expires...72`
## • `info.senderName` -> `info.senderName...73`
## • `info.headline` -> `info.headline...74`
## • `info.description` -> `info.description...75`
## • `info.instruction` -> `info.instruction...76`
## • `info.web` -> `info.web...77`
## • `info.parameter.valueName` -> `info.parameter.valueName...78`
## • `info.parameter.value` -> `info.parameter.value...79`
## • `info.parameter.valueName.1` -> `info.parameter.valueName.1...80`
## • `info.parameter.value.1` -> `info.parameter.value.1...81`
## • `info.parameter.valueName.2` -> `info.parameter.valueName.2...82`
## • `info.parameter.value.2` -> `info.parameter.value.2...83`
## • `info.parameter.valueName.3` -> `info.parameter.valueName.3...84`
## • `info.parameter.value.3` -> `info.parameter.value.3...85`
## • `info.parameter.valueName.4` -> `info.parameter.valueName.4...86`
## • `info.parameter.value.4` -> `info.parameter.value.4...87`
## • `info.parameter.valueName.5` -> `info.parameter.valueName.5...88`
## • `info.parameter.value.5` -> `info.parameter.value.5...89`
## • `info.parameter.valueName.6` -> `info.parameter.valueName.6...90`
## • `info.parameter.value.6` -> `info.parameter.value.6...91`
## • `info.parameter.valueName.7` -> `info.parameter.valueName.7...92`
## • `info.parameter.value.7` -> `info.parameter.value.7...93`
## • `info.parameter.valueName.8` -> `info.parameter.valueName.8...94`
## • `info.parameter.value.8` -> `info.parameter.value.8...95`
## • `info.parameter.valueName.9` -> `info.parameter.valueName.9...96`
## • `info.parameter.value.9` -> `info.parameter.value.9...97`
## • `info.parameter.valueName.10` -> `info.parameter.valueName.10...98`
## • `info.parameter.value.10` -> `info.parameter.value.10...99`
## • `info.parameter.valueName.11` -> `info.parameter.valueName.11...100`
## • `info.parameter.value.11` -> `info.parameter.value.11...101`
## • `info.parameter.valueName.12` -> `info.parameter.valueName.12...102`
## • `info.parameter.value.12` -> `info.parameter.value.12...103`
## • `info.area.areaDesc` -> `info.area.areaDesc...104`
## • `info.area.polygon` -> `info.area.polygon...105`
## • `info.area.geocode.valueName` -> `info.area.geocode.valueName...106`
## • `info.area.geocode.value` -> `info.area.geocode.value...107`
## • `info.area.geocode.valueName.1` -> `info.area.geocode.valueName.1...108`
## • `info.area.geocode.value.1` -> `info.area.geocode.value.1...109`
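The `New names:` messages above indicate that some column names, such as `info.language`, appear more than once, so position suffixes (`...8`, `...59`) are appended to keep them unique. If more predictable names are preferred, one option is to make the names unique within each list element before binding. The following is a minimal sketch, not part of the package: `toy_list` and its columns are made-up stand-ins for `ipaws_list`, and base R's `make.unique()` is swapped in for the automatic repair.

```r
# hypothetical stand-in for ipaws_list: the first element repeats a column name
toy_list <- list(
  data.frame(id = 1, info.event = "Flood", info.event = "Aviso", check.names = FALSE),
  data.frame(id = 2, info.event = "Gale", check.names = FALSE)
)

# list the names that are repeated within each element
lapply(toy_list, function(df) unique(names(df)[duplicated(names(df))]))

# make the names unique within each element, then bind the rows
toy_list <- lapply(toy_list, function(df) {
  names(df) <- make.unique(names(df))
  df
})
toy_combined <- dplyr::bind_rows(toy_list)
names(toy_combined)
```

Note that `make.unique()` appends `.1`, `.2`, and so on, so the resulting names differ from the `...N` suffixes shown in the messages above.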
# the number of columns grows quickly because every unique piece of information
# in the "info" element is added as its own column
# dropping the geocode columns helps simplify the data frame
ipaws_df <- ipaws_df %>% select(-contains("geocode"))
# dropping the "parameter.value" columns also helps if they are not needed
# (contains() matches substrings, so the "parameter.valueName" columns are dropped too)
ipaws_df <- ipaws_df %>% select(-contains("parameter.value"))
# view the final data
as_tibble(ipaws_df)
## # A tibble: 100 × 123
## identifier sender sent status msgType scope code info.category info.event
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 NWS-IDP-PRO… w-nws… 2018… Actual Alert Publ… IPAW… Met Special W…
## 2 NWS-IDP-PRO… w-nws… 2018… Actual Update Publ… IPAW… Met Winter St…
## 3 NWS-IDP-PRO… w-nws… 2018… Actual Update Publ… IPAW… Met Gale Warn…
## 4 NWS-IDP-PRO… w-nws… 2018… Actual Update Publ… IPAW… Met Flood Adv…
## 5 NWS-IDP-PRO… w-nws… 2018… Actual Alert Publ… IPAW… Met Dense Fog…
## 6 NWS-IDP-PRO… w-nws… 2018… Actual Update Publ… IPAW… Met Small Cra…
## 7 NWS-IDP-PRO… w-nws… 2018… Actual Update Publ… IPAW… Met Winter We…
## 8 NWS-IDP-PRO… w-nws… 2018… Actual Update Publ… IPAW… Met Flood War…
## 9 NWS-IDP-PRO… w-nws… 2018… Actual Update Publ… IPAW… Met Flood War…
## 10 NWS-IDP-PRO… w-nws… 2018… Actual Update Publ… IPAW… Met Flood War…
## # ℹ 90 more rows
## # ℹ 114 more variables: info.responseType <chr>, info.urgency <chr>,
## # info.severity <chr>, info.certainty <chr>, info.eventCode.valueName <chr>,
## # info.eventCode.value <chr>, info.effective <chr>, info.onset <chr>,
## # info.expires <chr>, info.senderName <chr>, info.headline <chr>,
## # info.description <chr>, info.instruction <chr>, info.web <chr>,
## # info.area.areaDesc <chr>, …
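To see what the two `select()` calls above remove, it can help to try them on a small example first. The sketch below uses a made-up tibble (`toy` and its column names are illustrative only, not real IPAWS data) and also shows that `contains()` accepts a character vector, so both patterns can be dropped in one step.

```r
library(dplyr)

# hypothetical stand-in for the flattened IPAWS data frame
toy <- tibble(
  identifier = c("a", "b"),
  info.event = c("Flood Warning", "Gale Warning"),
  info.parameter.valueName = c("VTEC", "VTEC"),
  info.parameter.value = c("x", "y"),
  info.area.geocode.valueName = c("SAME", "SAME"),
  info.area.geocode.value = c("001", "002")
)

# same two steps as above
slim <- toy %>%
  select(-contains("geocode")) %>%
  select(-contains("parameter.value"))

# equivalent single step: contains() takes the union of several patterns
slim_one_step <- toy %>%
  select(-contains(c("geocode", "parameter.value")))

# columns that were dropped
setdiff(names(toy), names(slim))
```

Because `contains()` matches literal substrings, `"parameter.value"` removes both the `parameter.value` and the `parameter.valueName` columns. If only one of the two should be dropped, a stricter helper such as `ends_with()` or `matches()` may be a better fit.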