General Introduction
rrricanes is intended to give easy access to hurricane archives. It is a web-scraping tool that parses the National Hurricane Center’s (NHC) archives to get storm data. Data is available for storms dating back to 1998.
Data is available for two basins: the north Atlantic (“AL”) and the northeastern Pacific (“EP”). The northeastern Pacific basin typically covers the area from the west coast of North America to 140°W (-140° longitude).
Get Storms
By default, get_storms will return all storms that have developed for the current year in both basins. If no storms have developed, an error will be generated. For this example, we’ll use 2012.
library(rrricanes)
df.al_2012 <- get_storms(years = 2012, basins = "AL")
Getting Storm Data
get_storm_data can be used to retrieve one or multiple products for one or more cyclones. A list of dataframes is returned.
library(dplyr)  # provides %>% and filter()
df.al_18_2012_fstadv <- df.al_2012 %>%
  filter(Name == "Hurricane Sandy") %>%
  .$Link %>%
  get_storm_data(products = "fstadv")
We can get the forecast/advisory data and wind speed probabilities at once:
df.al_18_2012 <- df.al_2012 %>%
  filter(Name == "Hurricane Sandy") %>%
  .$Link %>%
  get_storm_data(c("fstadv", "wndprb"))
df.al_18_2012 now contains two dataframes for Hurricane Sandy: fstadv and wndprb.
Forecast/Advisory Product (fstadv)
The core of a storm’s dataset is located in the Forecast/Advisory product, fstadv. This product contains current location, forecast position, movement and structural details of the cyclone.
To access only this product, we can use get_fstadv:
df.al_18_2012_fstadv <- df.al_2012 %>%
  filter(Name == "Hurricane Sandy") %>%
  .$Link %>%
  get_fstadv()
As you may have noticed above, the dataframe is very wide at 149 variables. There are four groups of variables in this dataset: current details, current wind radii, forecast positions, and forecast wind radii.
Current Details
Let’s look at an example of the current details.
## tibble [31 × 18] (S3: tbl_df/tbl/data.frame)
## $ Status : chr [1:31] "Tropical Depression" "Tropical Storm" "Tropical Storm" "Tropical Storm" ...
## $ Name : chr [1:31] "Eighteen" "Sandy" "Sandy" "Sandy" ...
## $ Adv : num [1:31] 1 2 3 4 5 6 7 8 9 10 ...
## $ Date : POSIXct[1:31], format: "2012-10-22 15:00:00" "2012-10-22 21:00:00" ...
## $ Key : chr [1:31] "AL182012" "AL182012" "AL182012" "AL182012" ...
## $ Lat : num [1:31] 13.5 12.5 12.7 13.3 13.8 14.3 15.2 16.3 17.1 18.3 ...
## $ Lon : num [1:31] -78 -78.5 -78.6 -78.6 -77.8 -77.6 -77.2 -77 -76.7 -76.6 ...
## $ Wind : num [1:31] 25 35 40 40 45 45 50 60 70 70 ...
## $ Gust : num [1:31] 35 45 50 50 55 55 60 75 85 85 ...
## $ Pressure: num [1:31] 1003 999 998 998 993 ...
## $ PosAcc : num [1:31] 45 50 25 40 30 30 20 20 20 20 ...
## $ FwdDir : num [1:31] 230 NA NA 360 20 20 15 10 15 10 ...
## $ FwdSpeed: num [1:31] 4 NA NA 3 4 5 9 12 11 12 ...
## $ Eye : num [1:31] NA NA NA NA NA NA 25 NA NA NA ...
## $ SeasNE : num [1:31] NA 60 60 70 75 90 180 180 180 180 ...
## $ SeasSE : num [1:31] NA 60 45 80 60 90 180 180 240 240 ...
## $ SeasSW : num [1:31] NA 0 0 0 0 0 0 45 0 0 ...
## $ SeasNW : num [1:31] NA 0 45 50 50 50 0 100 90 90 ...
The most important variable in this dataset is Key. Key is a unique identifier for each storm that develops in either basin. It is formatted as “AABBCCCC”, where “AA” is the basin abbreviation (AL or EP), “BB” is the storm’s number within the year (left-padded with a zero), and “CCCC” is the year of the storm.
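A minimal sketch of how a Key could be split into its parts with base R. The helper name parse_key is hypothetical and is not part of rrricanes; the second key, "AL122005", is inferred from the naming pattern for Hurricane Katrina rather than copied from package output.

```r
# Hypothetical helper (not part of rrricanes): split keys shaped like
# "AABBCCCC" into basin, storm number and year.
parse_key <- function(key) {
  stopifnot(is.character(key), all(nchar(key) == 8))
  data.frame(
    Key    = key,
    Basin  = substr(key, 1, 2),              # "AL" or "EP"
    Number = as.integer(substr(key, 3, 4)),  # storm number, left-padded
    Year   = as.integer(substr(key, 5, 8)),  # four-digit year
    stringsAsFactors = FALSE
  )
}

# "AL182012" appears in the output above; "AL122005" is an assumed key.
parsed <- parse_key(c("AL182012", "AL122005"))
print(parsed)
```

Converting the storm number with as.integer drops the left padding, which makes it easier to sort or filter storms numerically.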
Adv is the second-most important variable here. For regularly-scheduled advisories, advisory numbers are always numeric, which is how they appear in the output above. However, when watches and warnings are in effect, intermediate advisories are issued which are given alpha suffixes (e.g., 1, 2, 3, 3A, 4, 4A, 4B, 5), so in some products Adv is in character format.
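When Adv is stored as character, it can help to separate the numeric part from the suffix. The following is only a sketch using base R regular expressions on the example sequence above; the column names Number, Suffix and Intermediate are made up for illustration.

```r
# Advisory identifiers, including intermediate advisories with suffixes
adv <- c("1", "2", "3", "3A", "4", "4A", "4B", "5")

adv_number <- as.integer(sub("[A-Za-z]+$", "", adv))  # numeric part
adv_suffix <- sub("^[0-9]+", "", adv)                 # "" for regular advisories

advisories <- data.frame(
  Adv          = adv,
  Number       = adv_number,
  Suffix       = adv_suffix,
  Intermediate = nzchar(adv_suffix),  # TRUE for 3A, 4A, 4B
  stringsAsFactors = FALSE
)

# Order by number, then suffix, so "4A" sorts after "4" and before "5"
advisories <- advisories[order(advisories$Number, advisories$Suffix), ]
print(advisories)
```

Sorting on the two derived columns avoids the pitfall of sorting the raw strings, where "10" would sort before "2".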
Only the Public Advisory (public) will be issued more frequently. All other regular products (discus, fstadv, prblty, wndprb) are generally issued every six hours.
Status lists the current designation of the cyclone, i.e., Tropical Depression, Tropical Storm, etc. A Name is given once a storm crosses the threshold of Tropical Storm; that is, winds greater than 33kts.
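As a rough illustration of the threshold, wind speeds can be binned into designations with cut(). This sketch assumes the standard 34kt (tropical storm) and 64kt (hurricane) thresholds and ignores other designations such as subtropical or post-tropical cyclones; it is not how the package derives Status, which is read from the product text.

```r
# A few Wind values (knots) from the Hurricane Sandy output above
wind <- c(25, 35, 40, 60, 70)

# Intervals are (-Inf, 33], (33, 63] and (63, Inf]
status <- cut(
  wind,
  breaks = c(-Inf, 33, 63, Inf),
  labels = c("Tropical Depression", "Tropical Storm", "Hurricane")
)
status <- as.character(status)

print(data.frame(Wind = wind, Status = status))
```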
Lat and Lon are the current position of the storm within PosAcc nautical miles. All distance measurements are in nautical miles.
Wind and Gust are current one-minute sustained wind speeds in knots (kts). You can use the function knots_to_mph to convert them. All wind speed values are in knots.
Pressure is the lowest atmospheric pressure of the cyclone, either measured or estimated. Its value is in millibars, but you can use mb_to_in() to convert to inches.
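For readers who want to see the arithmetic behind these conversions, here is a small sketch. The two functions are stand-ins written for this example and are not the package's knots_to_mph or mb_to_in, whose exact rounding behaviour is not shown here. It assumes 1 knot = 1.150779 mph and 1 millibar = 0.02953 inches of mercury.

```r
# Stand-in conversion helpers (assumed constants, not the package's code)
kts_to_mph_sketch  <- function(kts) kts * 1.150779
mb_to_inhg_sketch  <- function(mb)  mb * 0.0295299830714

# Values taken from the current-details output above
wind_kts    <- c(25, 35, 70)
pressure_mb <- c(1003, 999, 998)

print(round(kts_to_mph_sketch(wind_kts), 1))
print(round(mb_to_inhg_sketch(pressure_mb), 2))
```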
FwdDir and FwdSpeed show the compass direction and speed of the forward movement of the cyclone. NA values indicate the storm is stationary or drifting. FwdSpeed is measured in knots.
In some cases, where hurricanes have an identifiable Eye, its diameter in nautical miles will also be listed.
Lastly, the Seas variables will exist for a storm of at least tropical storm strength. They give the distance from the center of circulation at which 12ft seas can be found in each quadrant. The measurement is in nautical miles.
Helper function tidy_adv will subset this data to a narrow dataframe.
tidy_adv(df.al_18_2012_fstadv)
## # A tibble: 31 × 18
## Key Adv Date Status Name Lat Lon Wind Gust Pressure
## <chr> <dbl> <dttm> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 AL18… 1 2012-10-22 15:00:00 Tropi… Eigh… 13.5 -78 25 35 1003
## 2 AL18… 2 2012-10-22 21:00:00 Tropi… Sandy 12.5 -78.5 35 45 999
## 3 AL18… 3 2012-10-23 03:00:00 Tropi… Sandy 12.7 -78.6 40 50 998
## 4 AL18… 4 2012-10-23 09:00:00 Tropi… Sandy 13.3 -78.6 40 50 998
## 5 AL18… 5 2012-10-23 15:00:00 Tropi… Sandy 13.8 -77.8 45 55 993
## 6 AL18… 6 2012-10-23 21:00:00 Tropi… Sandy 14.3 -77.6 45 55 993
## 7 AL18… 7 2012-10-24 03:00:00 Tropi… Sandy 15.2 -77.2 50 60 989
## 8 AL18… 8 2012-10-24 09:00:00 Tropi… Sandy 16.3 -77 60 75 986
## 9 AL18… 9 2012-10-24 15:00:00 Hurri… Sandy 17.1 -76.7 70 85 973
## 10 AL18… 10 2012-10-24 21:00:00 Hurri… Sandy 18.3 -76.6 70 85 970
## # … with 21 more rows, and 8 more variables: PosAcc <dbl>, FwdDir <dbl>,
## # FwdSpeed <dbl>, Eye <dbl>, SeasNE <dbl>, SeasSE <dbl>, SeasSW <dbl>,
## # SeasNW <dbl>
Wind Radius
Any cyclone of at least tropical storm-strength will have associated wind radius values. This is the distance from the center of circulation that a specified wind speed (34kts, 50kts, 64kts) can be found in each quadrant. Measurement is in nautical miles.
## tibble [31 × 12] (S3: tbl_df/tbl/data.frame)
## $ NE64: num [1:31] NA NA NA NA NA NA NA NA 20 25 ...
## $ SE64: num [1:31] NA NA NA NA NA NA NA NA 20 20 ...
## $ SW64: num [1:31] NA NA NA NA NA NA NA NA 0 0 ...
## $ NW64: num [1:31] NA NA NA NA NA NA NA NA 0 0 ...
## $ NE50: num [1:31] NA NA NA NA NA NA 0 50 50 50 ...
## $ SE50: num [1:31] NA NA NA NA NA NA 80 70 60 60 ...
## $ SW50: num [1:31] NA NA NA NA NA NA 0 0 30 40 ...
## $ NW50: num [1:31] NA NA NA NA NA NA 0 0 30 40 ...
## $ NE34: num [1:31] NA 50 50 70 70 80 90 100 110 110 ...
## $ SE34: num [1:31] NA 60 60 80 80 90 120 120 120 120 ...
## $ SW34: num [1:31] NA 0 0 0 0 0 0 45 60 70 ...
## $ NW34: num [1:31] NA 0 0 0 0 0 30 45 60 60 ...
A helper function, tidy_wr, will reorganize this data into a narrow, tidy format. Rows in which every wind radius value is NA are removed for efficiency.
tidy_wr(df.al_18_2012_fstadv)
## # A tibble: 77 × 8
## Key Adv Date WindField NE SE SW NW
## <chr> <dbl> <dttm> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 AL182012 2 2012-10-22 21:00:00 34 50 60 0 0
## 2 AL182012 3 2012-10-23 03:00:00 34 50 60 0 0
## 3 AL182012 4 2012-10-23 09:00:00 34 70 80 0 0
## 4 AL182012 5 2012-10-23 15:00:00 34 70 80 0 0
## 5 AL182012 6 2012-10-23 21:00:00 34 80 90 0 0
## 6 AL182012 7 2012-10-24 03:00:00 34 90 120 0 30
## 7 AL182012 7 2012-10-24 03:00:00 50 0 80 0 0
## 8 AL182012 8 2012-10-24 09:00:00 34 100 120 45 45
## 9 AL182012 8 2012-10-24 09:00:00 50 50 70 0 0
## 10 AL182012 9 2012-10-24 15:00:00 34 110 120 60 60
## # … with 67 more rows
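To make the reshaping concrete, the sketch below performs a similar wide-to-narrow transformation in base R on three advisories copied from the output above (only the 34kt and 50kt fields, to keep it short). It is an illustration of the idea, not the implementation of tidy_wr.

```r
# Wide sample: advisories 1, 2 and 7 from the wind radius output above
wide <- data.frame(
  Adv  = c(1, 2, 7),
  NE34 = c(NA, 50, 90), SE34 = c(NA, 60, 120),
  SW34 = c(NA, 0, 0),   NW34 = c(NA, 0, 30),
  NE50 = c(NA, NA, 0),  SE50 = c(NA, NA, 80),
  SW50 = c(NA, NA, 0),  NW50 = c(NA, NA, 0)
)

quadrants <- c("NE", "SE", "SW", "NW")

# Build one block of rows per wind field, then stack the blocks
long <- do.call(rbind, lapply(c(34, 50), function(field) {
  block <- wide[paste0(quadrants, field)]
  names(block) <- quadrants
  cbind(Adv = wide$Adv, WindField = field, block)
}))

# Drop rows where all four quadrants are NA, then sort
long <- long[rowSums(!is.na(long[quadrants])) > 0, ]
long <- long[order(long$Adv, long$WindField), ]
rownames(long) <- NULL
print(long)
```

Advisory 1 disappears entirely and advisory 2 keeps only its 34kt row, which mirrors how the tidy output above starts at advisory 2.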
Forecast
Most Forecast/Advisory products will have forecast data associated with them unless the storm has dissipated or is no longer tropical. There may be up to seven forecast positions. These positions are issued at 12-hour intervals through 48 hours and at 24-hour intervals after that: 12, 24, 36, 48, 72, 96 and 120 hours.
Notice each variable begins with the prefix “Hrn”, where n is the forecast period as noted above. Only Date, Lat, Lon, Wind, Gust and wind radius (discussed shortly) are given for forecast periods.
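The “Hrn” prefix makes it straightforward to pick out forecast columns and recover the forecast period from a column name. In this sketch only Hr12FcstDate is a name confirmed by this document; the other forecast column names are assumed to follow the same pattern.

```r
# Column names: Hr12FcstDate is shown in this document, the other
# "Hr" names are assumptions that follow the same pattern.
cols <- c("Key", "Adv", "Date", "Hr12FcstDate", "Hr24FcstDate", "Hr120FcstDate")

is_fcst    <- grepl("^Hr[0-9]+", cols)
fcst_cols  <- cols[is_fcst]
fcst_hours <- as.integer(sub("^Hr([0-9]+).*$", "\\1", fcst_cols))

print(data.frame(Column = fcst_cols, Hours = fcst_hours))
```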
Use tidy_fcst to tidy forecast data.
tidy_fcst(df.al_18_2012_fstadv)
## # A tibble: 216 × 8
## Key Adv Date FcstDate Lat Lon Wind Gust
## <chr> <dbl> <dttm> <dttm> <dbl> <dbl> <dbl> <dbl>
## 1 AL1820… 1 2012-10-22 15:00:00 2012-10-23 00:00:00 13.7 -78.3 35 45
## 2 AL1820… 1 2012-10-22 15:00:00 2012-10-23 12:00:00 14.3 -78.1 45 55
## 3 AL1820… 1 2012-10-22 15:00:00 2012-10-24 00:00:00 15.7 -77.6 55 65
## 4 AL1820… 1 2012-10-22 15:00:00 2012-10-24 12:00:00 17.4 -77 60 75
## 5 AL1820… 1 2012-10-22 15:00:00 2012-10-25 12:00:00 20.5 -76 55 65
## 6 AL1820… 1 2012-10-22 15:00:00 2012-10-26 12:00:00 24.5 -74.5 55 65
## 7 AL1820… 1 2012-10-22 15:00:00 2012-10-27 12:00:00 27 -73 50 60
## 8 AL1820… 2 2012-10-22 21:00:00 2012-10-23 06:00:00 13.6 -78.5 35 45
## 9 AL1820… 2 2012-10-22 21:00:00 2012-10-23 18:00:00 14.9 -78.3 45 55
## 10 AL1820… 2 2012-10-22 21:00:00 2012-10-24 06:00:00 16.4 -77.8 55 65
## # … with 206 more rows
Forecast Dates/Times
A note about forecast times.
## # A tibble: 1 × 2
## Date Hr12FcstDate
## <dttm> <dttm>
## 1 2012-10-22 15:00:00 2012-10-23 00:00:00
Notice the Date of this advisory is Oct 22 at 15:00 UTC. The Hr12FcstDate is Oct 23, 00:00 UTC. This difference, obviously, is not 12 hours. What gives? Forecast/Advisory products are issued with two “current” positions: one that is current (and provided in the dataset) and a position from three hours prior. So, in this specific advisory the text would contain the position of the storm for Oct 22, 12:00 UTC. The forecast points are based on this earlier position. I do not know why.
Therefore, while officially the forecast periods are 12, 24, 36, … hours, in reality they are 9, 21, 33, … hours from the issuance time of the product.
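The arithmetic can be checked directly with the two timestamps shown above. This is just a worked example in base R; the variable names are made up for illustration.

```r
issued <- as.POSIXct("2012-10-22 15:00:00", tz = "UTC")  # Date
hr12   <- as.POSIXct("2012-10-23 00:00:00", tz = "UTC")  # Hr12FcstDate

# Lead time measured from the issuance time of the product
lead_from_issuance <- as.numeric(difftime(hr12, issued, units = "hours"))

# Lead time measured from the position three hours before issuance
prior_position  <- issued - 3 * 3600
lead_from_prior <- as.numeric(difftime(hr12, prior_position, units = "hours"))

# Nominal forecast periods and their effective lead from issuance
nominal   <- c(12, 24, 36, 48, 72, 96, 120)
effective <- nominal - 3

print(c(from_issuance = lead_from_issuance, from_prior = lead_from_prior))
print(effective)
```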
Forecast Wind Radius
Some forecast positions may also contain wind radius information (only up to 72 hours).
Again, these variables carry the prefix “Hrn”, where n denotes the forecast period.
tidy_fcst_wr will tidy this subset of data.
tidy_fcst_wr(df.al_18_2012_fstadv)
## # A tibble: 337 × 9
## Key Adv Date FcstDate WindField NE SE
## <chr> <dbl> <dttm> <dttm> <dbl> <dbl> <dbl>
## 1 AL182012 1 2012-10-22 15:00:00 2012-10-23 00:00:00 34 40 30
## 2 AL182012 1 2012-10-22 15:00:00 2012-10-23 12:00:00 34 50 60
## 3 AL182012 1 2012-10-22 15:00:00 2012-10-24 00:00:00 34 80 80
## 4 AL182012 1 2012-10-22 15:00:00 2012-10-24 00:00:00 50 30 40
## 5 AL182012 1 2012-10-22 15:00:00 2012-10-24 12:00:00 34 90 90
## 6 AL182012 1 2012-10-22 15:00:00 2012-10-24 12:00:00 50 40 40
## 7 AL182012 1 2012-10-22 15:00:00 2012-10-25 12:00:00 34 200 180
## 8 AL182012 1 2012-10-22 15:00:00 2012-10-25 12:00:00 50 50 40
## 9 AL182012 2 2012-10-22 21:00:00 2012-10-23 06:00:00 34 50 60
## 10 AL182012 2 2012-10-22 21:00:00 2012-10-23 18:00:00 34 50 60
## # … with 327 more rows, and 2 more variables: SW <dbl>, NW <dbl>
Please see the National Hurricane Center’s website for more information on understanding the Forecast/Advisory product.
Strike Probabilities (prblty)
Strike probabilities were discontinued after the 2005 hurricane season (replaced by Wind Speed Probabilities; wndprb). For this example, we’ll look at Hurricane Katrina, using the function get_prblty.
df.al_12_2005_prblty <- get_storms(years = 2005, basins = "AL") %>%
  filter(Name == "Hurricane Katrina") %>%
  .$Link %>%
  get_prblty()
str(df.al_12_2005_prblty)
## tibble [937 × 10] (S3: tbl_df/tbl/data.frame)
## $ Status : chr [1:937] "Tropical Depression" "Tropical Depression" "Tropical Depression" "Tropical Depression" ...
## $ Name : chr [1:937] "Twelve" "Twelve" "Twelve" "Twelve" ...
## $ Adv : chr [1:937] "1" "1" "1" "1" ...
## $ Date : POSIXct[1:937], format: "2005-08-23 21:00:00" "2005-08-23 21:00:00" ...
## $ Location: chr [1:937] "25.0N 77.7W" "JACKSONVILLE FL" "25.7N 78.5W" "SAVANNAH GA" ...
## $ A : num [1:937] 50 0 36 0 19 0 0 0 0 0 ...
## $ B : num [1:937] 0 0 0 0 6 0 1 0 0 5 ...
## $ C : num [1:937] 0 2 0 0 1 0 0 0 0 3 ...
## $ D : num [1:937] 0 9 1 6 1 3 3 2 2 5 ...
## $ E : num [1:937] 50 11 37 6 27 3 4 2 2 13 ...
This dataframe contains the probability of a cyclone passing within 65 nautical miles of Location. The variables A, B, C, D, and E are as they appear in the products and were left as-is to avoid confusion. Their definitions are as follows:
- A: current through 12 hours.
- B: within the next 12-24 hours.
- C: within the next 24-36 hours.
- D: within the next 36-48 hours.
- E: total probability from current through 48 hours.
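In the rows shown in the str() output above, E equals the sum of A through D. The sketch below checks this on the first four rows; it is only a sanity check on that sample, and the relationship may not hold exactly for every row of every product.

```r
# First four rows of the strike probability output above
prblty_sample <- data.frame(
  Location = c("25.0N 77.7W", "JACKSONVILLE FL", "25.7N 78.5W", "SAVANNAH GA"),
  A = c(50, 0, 36, 0),
  B = c(0, 0, 0, 0),
  C = c(0, 2, 0, 0),
  D = c(0, 9, 1, 6),
  E = c(50, 11, 37, 6),
  stringsAsFactors = FALSE
)

# Sum the four period probabilities and compare with the reported total
prblty_sample$Total <- rowSums(prblty_sample[c("A", "B", "C", "D")])
matches <- prblty_sample$Total == prblty_sample$E

print(prblty_sample)
print(all(matches))
```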
Many values in the text product may be “X” for less than 1% chance of a strike. These values are converted to 0 as the fields are numeric.
The strike probability products did not contain Key, which is the unique identifier for every cyclone. So the best way to do any joins will be by Name, Adv and Date.
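A sketch of such a join using base R's merge(). The prblty rows are copied from the output above; the lookup table keys is hypothetical (in practice it might be derived from another product that does contain Key), and the key value "AL122005" is assumed from the naming pattern. Adv is converted to character on both sides so the join columns have the same type.

```r
utc <- function(x) as.POSIXct(x, tz = "UTC")

# Two rows from the strike probability output above
prblty <- data.frame(
  Name     = c("Twelve", "Twelve"),
  Adv      = c("1", "1"),
  Date     = utc(c("2005-08-23 21:00:00", "2005-08-23 21:00:00")),
  Location = c("25.0N 77.7W", "JACKSONVILLE FL"),
  E        = c(50, 11),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table that carries Key (assumed value)
keys <- data.frame(
  Name = "Twelve",
  Adv  = 1,
  Date = utc("2005-08-23 21:00:00"),
  Key  = "AL122005",
  stringsAsFactors = FALSE
)

# Align the type of Adv before joining
keys$Adv <- as.character(keys$Adv)

joined <- merge(prblty, keys, by = c("Name", "Adv", "Date"))
print(joined)
```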
Strike Probabilities may not exist for most Pacific cyclones.
Wind Speed Probabilities (wndprb)
df.al_18_2012_wndprb <- df.al_2012 %>%
  filter(Name == "Hurricane Sandy") %>%
  .$Link %>%
  get_wndprb()
str(df.al_18_2012_wndprb)
## tibble [2,227 × 18] (S3: tbl_df/tbl/data.frame)
## $ Key : chr [1:2227] "AL182012" "AL182012" "AL182012" "AL182012" ...
## $ Adv : num [1:2227] 1 1 1 1 1 1 1 1 1 1 ...
## $ Date : POSIXct[1:2227], format: "2012-10-22 15:00:00" "2012-10-22 15:00:00" ...
## $ Location : chr [1:2227] "FT PIERCE FL" "W PALM BEACH" "MIAMI FL" "MARATHON FL" ...
## $ Wind : num [1:2227] 34 34 34 34 34 34 34 50 64 34 ...
## $ Wind12 : num [1:2227] 0 0 0 0 0 0 0 0 0 0 ...
## $ Wind24 : num [1:2227] 0 0 0 0 1 0 0 0 0 0 ...
## $ Wind24Cum : num [1:2227] 0 0 0 0 1 0 0 0 0 0 ...
## $ Wind36 : num [1:2227] 0 0 0 0 1 0 0 0 0 0 ...
## $ Wind36Cum : num [1:2227] 0 0 0 0 2 0 0 0 0 0 ...
## $ Wind48 : num [1:2227] 0 0 0 0 1 0 0 0 0 0 ...
## $ Wind48Cum : num [1:2227] 0 0 0 0 3 0 0 0 0 0 ...
## $ Wind72 : num [1:2227] 0 0 0 0 0 0 3 0 0 5 ...
## $ Wind72Cum : num [1:2227] 0 0 0 0 3 0 3 0 0 5 ...
## $ Wind96 : num [1:2227] 1 2 3 2 0 5 10 3 1 12 ...
## $ Wind96Cum : num [1:2227] 1 2 3 2 3 5 13 3 1 17 ...
## $ Wind120 : num [1:2227] 2 2 1 1 0 3 5 2 0 3 ...
## $ Wind120Cum: num [1:2227] 3 4 4 3 3 8 18 5 1 20 ...
Wind Speed Probabilities are a bit more advanced than their predecessor. The Wind variable gives the wind speed (34kt, 50kt or 64kt) expected within a specific time period.
The remaining variables each cover a specific time frame (12, 24, 36, 48, 72, 96 and 120 hours), giving both the probability for that time frame alone and the cumulative probability.
For example, Wind24 is the chance of Wind between 12-24 hours. Wind24Cum is the cumulative probability from Date through 24 hours.
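In the sample output above, each cumulative value equals the running sum of the per-period values. The sketch below reproduces this for the fifth row of the str() output; it is a check on that one row, and rounding in the original products may mean the relationship is not exact everywhere.

```r
hours <- c(12, 24, 36, 48, 72, 96, 120)

# Per-period probabilities (Wind12 ... Wind120) for the fifth row above
per_period <- c(0, 1, 1, 1, 0, 0, 0)

# Running total; the dataset has no Wind12Cum column, so the first
# element is only kept here for completeness
cumulative <- cumsum(per_period)
names(cumulative) <- paste0("Wind", hours, "Cum")

print(cumulative)
```

The values from Wind24Cum onward (1, 2, 3, 3, 3, 3) match the fifth entries of the cumulative columns shown above.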
As with strike probabilities, an “X” in the original text product meant less than 0.5% chance for the specified wind in the specified time period. “X” has been replaced by 0 in this package.
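The replacement of “X” can be pictured with a few lines of base R. This is a sketch of the idea on made-up raw values, not the package's actual parsing code.

```r
# Made-up raw probability fields as they might appear in a text product
raw <- c("X", "1", "X", "12", "5")

# Treat "X" (below the reporting threshold) as 0, then convert to numeric
probs <- as.numeric(ifelse(raw == "X", "0", raw))

print(probs)
```

Keep in mind that a 0 in these datasets therefore means "below the reporting threshold" rather than strictly no chance.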
Wind Speed Probabilities may not exist for most Pacific cyclones.
See Tropical Cyclone Wind Speed Probabilities Products for more information.
Other products
Other products are available:
- get_public: Public Advisory statements. Think general information for the public audience. May not exist for some Pacific cyclones. Additionally, when watches and warnings are issued, these are issued every 3 hours (and, in some cases, every two).
- get_discus: Storm Discussions. These are more technical statements on the structure of a storm, forecast model tendencies and satellite presentation.
- get_update: brief update statements issued when something considerable has changed in the cyclone or if the cyclone is making landfall.
- get_posest: Position estimates. These are generally issued when a storm is making landfall and may be issued hourly.
Hurricane Ike, 2008, has both updates and position estimates.
At this time none of these products are parsed. Only the content of the product is returned.