General Introduction

rrricanes is intended to give easy access to hurricane archives. It is a web-scraping tool that parses the National Hurricane Center’s (NHC) archives to get storm data. Data is available for storms dating back to 1998.

Data is available for two basins: the north Atlantic (“AL”) and the northeastern Pacific (“EP”). The northeastern Pacific basin typically extends from the west coast of North America to -140° longitude (140°W).

Get Storms

By default, get_storms will return all storms that have developed for the current year in both basins. If no storms have developed, an error will be generated. For this example, we’ll use 2012.

df.al_2012 <- get_storms(years = 2012, basins = "AL")

Getting Storm Data

get_storm_data can be used to retrieve one or multiple products for one or more cyclones. A list of dataframes is returned.

df.al_18_2012_fstadv <- df.al_2012 %>% 
    filter(Name == "Hurricane Sandy") %>% 
    .$Link %>% 
    get_storm_data(products = "fstadv")

We can get the forecast/advisory data and wind speed probabilities at once:

df.al_18_2012 <- df.al_2012 %>% 
    filter(Name == "Hurricane Sandy") %>% 
    .$Link %>% 
    get_storm_data(c("fstadv", "wndprb"))

df.al_18_2012 now contains two dataframes for Hurricane Sandy: fstadv and wndprb.

Forecast/Advisory Product (fstadv)

The core of a storm’s dataset is located in the Forecast/Advisory product, fstadv. This product contains current location, forecast position, movement and structural details of the cyclone.

To access only this product, we can use get_fstadv:

df.al_18_2012_fstadv <- df.al_2012 %>% 
    filter(Name == "Hurricane Sandy") %>% 
    .$Link %>% 
    get_fstadv()

The resulting dataframe is very wide, at 149 variables. There are four groups of variables in this dataset: current details, current wind radii, forecast positions, and forecast wind radii.

Current Details

Let’s look at an example of the current details.

str(df.al_18_2012_fstadv %>% select(Status:Eye, SeasNE:SeasNW))
## tibble [31 × 18] (S3: tbl_df/tbl/data.frame)
##  $ Status  : chr [1:31] "Tropical Depression" "Tropical Storm" "Tropical Storm" "Tropical Storm" ...
##  $ Name    : chr [1:31] "Eighteen" "Sandy" "Sandy" "Sandy" ...
##  $ Adv     : num [1:31] 1 2 3 4 5 6 7 8 9 10 ...
##  $ Date    : POSIXct[1:31], format: "2012-10-22 15:00:00" "2012-10-22 21:00:00" ...
##  $ Key     : chr [1:31] "AL182012" "AL182012" "AL182012" "AL182012" ...
##  $ Lat     : num [1:31] 13.5 12.5 12.7 13.3 13.8 14.3 15.2 16.3 17.1 18.3 ...
##  $ Lon     : num [1:31] -78 -78.5 -78.6 -78.6 -77.8 -77.6 -77.2 -77 -76.7 -76.6 ...
##  $ Wind    : num [1:31] 25 35 40 40 45 45 50 60 70 70 ...
##  $ Gust    : num [1:31] 35 45 50 50 55 55 60 75 85 85 ...
##  $ Pressure: num [1:31] 1003 999 998 998 993 ...
##  $ PosAcc  : num [1:31] 45 50 25 40 30 30 20 20 20 20 ...
##  $ FwdDir  : num [1:31] 230 NA NA 360 20 20 15 10 15 10 ...
##  $ FwdSpeed: num [1:31] 4 NA NA 3 4 5 9 12 11 12 ...
##  $ Eye     : num [1:31] NA NA NA NA NA NA 25 NA NA NA ...
##  $ SeasNE  : num [1:31] NA 60 60 70 75 90 180 180 180 180 ...
##  $ SeasSE  : num [1:31] NA 60 45 80 60 90 180 180 240 240 ...
##  $ SeasSW  : num [1:31] NA 0 0 0 0 0 0 45 0 0 ...
##  $ SeasNW  : num [1:31] NA 0 45 50 50 50 0 100 90 90 ...

The most important variable in this dataset is Key. Key is a unique identifier for each storm that develops in either basin. It is formatted as “AABBCCCC”, where “AA” is the basin abbreviation (AL or EP), “BB” is the storm's number within the year, left-padded with a zero, and “CCCC” is the four-digit year of the storm.
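As a minimal sketch of the Key layout described above, the following base R snippet splits a key into its parts. The helper name parse_key is hypothetical and not part of rrricanes; "AL012012" is included only to illustrate the left-padding.

```r
# Hypothetical helper (not part of rrricanes): split a Key into basin,
# storm number and year, assuming the 8-character layout described above.
parse_key <- function(key) {
  stopifnot(all(nchar(key) == 8))
  data.frame(
    Basin = substr(key, 1, 2),
    StormNumber = as.integer(substr(key, 3, 4)),
    Year = as.integer(substr(key, 5, 8)),
    stringsAsFactors = FALSE
  )
}

key_parts <- parse_key(c("AL182012", "AL012012"))
print(key_parts)
```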

Adv is the second-most important variable here. Regularly scheduled advisories always carry numeric advisory numbers, and Adv appears as numeric in the output above. However, when watches and warnings are in effect, intermediate advisories are issued with alphabetic suffixes, e.g., 1, 2, 3, 3A, 4, 4A, 4B, 5. Adv may therefore be stored as character in some products.
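If advisory numbers arrive as character values with suffixes, one possible way to separate the numeric part from the suffix is sketched below in base R. The variable names are illustrative; this is not how rrricanes itself handles advisory numbers.

```r
# Advisory numbers as they might appear, including intermediate advisories.
adv <- c("1", "2", "3", "3A", "4", "4A", "4B", "5")

# Numeric part: strip any trailing letters.
adv_num <- as.integer(sub("[A-Za-z]+$", "", adv))

# Suffix: strip the leading digits; empty string for regular advisories.
adv_suffix <- sub("^[0-9]+", "", adv)

# Flag intermediate advisories.
is_intermediate <- adv_suffix != ""
```

Sorting by adv_num and then adv_suffix keeps intermediate advisories right after the regular advisory they follow.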

Only the Public Advisory (public) will be issued more frequently. All other regular products (discus, fstadv, prblty, wndprb) are generally issued every six hours.

Status lists the current designation of the cyclone, e.g., Tropical Depression, Tropical Storm, etc. A Name is given once a storm crosses the threshold of Tropical Storm; that is, winds greater than 33 kts.

Lat and Lon are the current position of the storm within PosAcc nautical miles. All distance measurements are in nautical miles.

Wind is the current one-minute sustained wind speed and Gust is the current gust speed, both in knots (kts). You can use the function knots_to_mph to convert these to miles per hour. All wind speed values are in knots.

Pressure is the lowest atmospheric pressure of the cyclone, either measured or estimated. Its value is in millibars, but you can use mb_to_in() to convert it to inches of mercury.
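For readers who want to see the arithmetic behind the wind and pressure conversions, here is a rough base R sketch. The function names kt_to_mph_sketch and mb_to_inhg_sketch are hypothetical stand-ins written for this example; they are not the package's knots_to_mph or mb_to_in, whose exact rounding may differ.

```r
# Hypothetical stand-ins for illustration only.
# 1 nautical mile = 1852 m; 1 statute mile = 1609.344 m.
kt_to_mph_sketch <- function(kt) kt * 1852 / 1609.344

# 1 inch of mercury is about 33.8639 millibars.
mb_to_inhg_sketch <- function(mb) mb / 33.8639

wind_kt <- c(25, 35, 40, 70)      # Wind values from the output above
pressure_mb <- c(1003, 999, 998)  # Pressure values from the output above

wind_mph <- kt_to_mph_sketch(wind_kt)
pressure_in <- mb_to_inhg_sketch(pressure_mb)
```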

FwdDir and FwdSpeed show the compass direction and speed of the forward movement of the cyclone. NA values indicate the storm is stationary or drifting. FwdSpeed is measured in knots.

In some cases, where hurricanes have an identifiable Eye, its diameter in nautical miles will also be listed.

Lastly, the Seas variables will exist for a storm of at least tropical storm-strength. This is the distance from the center of circulation that 12ft seas can be found in each quadrant. The measurement is in nautical miles.

Helper function tidy_adv will subset this data to a narrow dataframe.

tidy_adv(df.al_18_2012_fstadv)
## # A tibble: 31 × 18
##    Key     Adv Date                Status Name    Lat   Lon  Wind  Gust Pressure
##    <chr> <dbl> <dttm>              <chr>  <chr> <dbl> <dbl> <dbl> <dbl>    <dbl>
##  1 AL18…     1 2012-10-22 15:00:00 Tropi… Eigh…  13.5 -78      25    35     1003
##  2 AL18…     2 2012-10-22 21:00:00 Tropi… Sandy  12.5 -78.5    35    45      999
##  3 AL18…     3 2012-10-23 03:00:00 Tropi… Sandy  12.7 -78.6    40    50      998
##  4 AL18…     4 2012-10-23 09:00:00 Tropi… Sandy  13.3 -78.6    40    50      998
##  5 AL18…     5 2012-10-23 15:00:00 Tropi… Sandy  13.8 -77.8    45    55      993
##  6 AL18…     6 2012-10-23 21:00:00 Tropi… Sandy  14.3 -77.6    45    55      993
##  7 AL18…     7 2012-10-24 03:00:00 Tropi… Sandy  15.2 -77.2    50    60      989
##  8 AL18…     8 2012-10-24 09:00:00 Tropi… Sandy  16.3 -77      60    75      986
##  9 AL18…     9 2012-10-24 15:00:00 Hurri… Sandy  17.1 -76.7    70    85      973
## 10 AL18…    10 2012-10-24 21:00:00 Hurri… Sandy  18.3 -76.6    70    85      970
## # … with 21 more rows, and 8 more variables: PosAcc <dbl>, FwdDir <dbl>,
## #   FwdSpeed <dbl>, Eye <dbl>, SeasNE <dbl>, SeasSE <dbl>, SeasSW <dbl>,
## #   SeasNW <dbl>

Wind Radius

Any cyclone of at least tropical storm-strength will have associated wind radius values. This is the distance from the center of circulation that a specified wind speed (34kts, 50kts, 64kts) can be found in each quadrant. Measurement is in nautical miles.

str(df.al_18_2012_fstadv %>% select(NE64:NW34))
## tibble [31 × 12] (S3: tbl_df/tbl/data.frame)
##  $ NE64: num [1:31] NA NA NA NA NA NA NA NA 20 25 ...
##  $ SE64: num [1:31] NA NA NA NA NA NA NA NA 20 20 ...
##  $ SW64: num [1:31] NA NA NA NA NA NA NA NA 0 0 ...
##  $ NW64: num [1:31] NA NA NA NA NA NA NA NA 0 0 ...
##  $ NE50: num [1:31] NA NA NA NA NA NA 0 50 50 50 ...
##  $ SE50: num [1:31] NA NA NA NA NA NA 80 70 60 60 ...
##  $ SW50: num [1:31] NA NA NA NA NA NA 0 0 30 40 ...
##  $ NW50: num [1:31] NA NA NA NA NA NA 0 0 30 40 ...
##  $ NE34: num [1:31] NA 50 50 70 70 80 90 100 110 110 ...
##  $ SE34: num [1:31] NA 60 60 80 80 90 120 120 120 120 ...
##  $ SW34: num [1:31] NA 0 0 0 0 0 0 45 60 70 ...
##  $ NW34: num [1:31] NA 0 0 0 0 0 30 45 60 60 ...

A helper function, tidy_wr, will reorganize this data into a narrow, tidy format. Rows in which every wind radius value is NA are removed for efficiency.

tidy_wr(df.al_18_2012_fstadv)
## # A tibble: 77 × 8
##    Key        Adv Date                WindField    NE    SE    SW    NW
##    <chr>    <dbl> <dttm>                  <dbl> <dbl> <dbl> <dbl> <dbl>
##  1 AL182012     2 2012-10-22 21:00:00        34    50    60     0     0
##  2 AL182012     3 2012-10-23 03:00:00        34    50    60     0     0
##  3 AL182012     4 2012-10-23 09:00:00        34    70    80     0     0
##  4 AL182012     5 2012-10-23 15:00:00        34    70    80     0     0
##  5 AL182012     6 2012-10-23 21:00:00        34    80    90     0     0
##  6 AL182012     7 2012-10-24 03:00:00        34    90   120     0    30
##  7 AL182012     7 2012-10-24 03:00:00        50     0    80     0     0
##  8 AL182012     8 2012-10-24 09:00:00        34   100   120    45    45
##  9 AL182012     8 2012-10-24 09:00:00        50    50    70     0     0
## 10 AL182012     9 2012-10-24 15:00:00        34   110   120    60    60
## # … with 67 more rows
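To make the wide-to-narrow reshaping concrete, the sketch below reproduces the idea in base R on a three-row toy dataframe built from values shown above. It is only an approximation of what tidy_wr returns, not the package's implementation.

```r
# Toy wide data using a few values from the output above (advisories 1, 2, 7).
wide <- data.frame(
  Key = "AL182012",
  Adv = c(1, 2, 7),
  NE50 = c(NA, NA, 0), SE50 = c(NA, NA, 80),
  SW50 = c(NA, NA, 0), NW50 = c(NA, NA, 0),
  NE34 = c(NA, 50, 90), SE34 = c(NA, 60, 120),
  SW34 = c(NA, 0, 0), NW34 = c(NA, 0, 30),
  stringsAsFactors = FALSE
)

quads <- c("NE", "SE", "SW", "NW")

# Stack one block of rows per wind field, renaming NE34 -> NE, etc.
long <- do.call(rbind, lapply(c(34, 50), function(field) {
  out <- wide[, c("Key", "Adv")]
  out$WindField <- field
  out[quads] <- wide[paste0(quads, field)]
  out
}))

# Drop rows where every quadrant is NA, then order by advisory and field.
long <- long[rowSums(!is.na(long[quads])) > 0, ]
long <- long[order(long$Adv, long$WindField), ]
rownames(long) <- NULL
print(long)
```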

Forecast

Most Forecast/Advisory products will have forecast data associated with them unless the storm has dissipated or is no longer tropical. There may be up to seven forecast positions. These positions are issued at 12-hour intervals through 48 hours and at 24-hour intervals thereafter: 12, 24, 36, 48, 72, 96 and 120 hours.

str(df.al_18_2012_fstadv %>% select(Hr12FcstDate:Hr12Gust))

Notice each variable begins with the prefix “Hrn”, where n is the forecast period as noted above. Only Date, Lat, Lon, Wind, Gust and wind radius (discussed below) are given for forecast periods.

Use tidy_fcst to tidy forecast data.

tidy_fcst(df.al_18_2012_fstadv)
## # A tibble: 216 × 8
##    Key       Adv Date                FcstDate              Lat   Lon  Wind  Gust
##    <chr>   <dbl> <dttm>              <dttm>              <dbl> <dbl> <dbl> <dbl>
##  1 AL1820…     1 2012-10-22 15:00:00 2012-10-23 00:00:00  13.7 -78.3    35    45
##  2 AL1820…     1 2012-10-22 15:00:00 2012-10-23 12:00:00  14.3 -78.1    45    55
##  3 AL1820…     1 2012-10-22 15:00:00 2012-10-24 00:00:00  15.7 -77.6    55    65
##  4 AL1820…     1 2012-10-22 15:00:00 2012-10-24 12:00:00  17.4 -77      60    75
##  5 AL1820…     1 2012-10-22 15:00:00 2012-10-25 12:00:00  20.5 -76      55    65
##  6 AL1820…     1 2012-10-22 15:00:00 2012-10-26 12:00:00  24.5 -74.5    55    65
##  7 AL1820…     1 2012-10-22 15:00:00 2012-10-27 12:00:00  27   -73      50    60
##  8 AL1820…     2 2012-10-22 21:00:00 2012-10-23 06:00:00  13.6 -78.5    35    45
##  9 AL1820…     2 2012-10-22 21:00:00 2012-10-23 18:00:00  14.9 -78.3    45    55
## 10 AL1820…     2 2012-10-22 21:00:00 2012-10-24 06:00:00  16.4 -77.8    55    65
## # … with 206 more rows

Forecast Dates/Times

A note about forecast times.

df.al_18_2012_fstadv %>% select(Date, Hr12FcstDate) %>% slice(1)
## # A tibble: 1 × 2
##   Date                Hr12FcstDate       
##   <dttm>              <dttm>             
## 1 2012-10-22 15:00:00 2012-10-23 00:00:00

Notice the Date of this advisory is Oct 22 at 15:00 UTC. The Hr12FcstDate is Oct 23, 00:00 UTC. This difference, obviously, is not 12 hours. What gives? Forecast/Advisory products are issued with two “current” positions: one that is current (and provided in the dataset) and a position from three hours prior. So, in this specific advisory the text would contain the position of the storm for Oct 22, 12:00 UTC. It is from this position the forecast points are based. I do not know why.

Therefore, while officially the forecast periods are 12, 24, 36, … hours, in reality they are 9, 21, 33, … hours from the issuance time of the product.
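The offset can be checked with a little date arithmetic. This sketch uses the two timestamps from the output above and assumes, as described, that every nominal forecast period is shifted by the same three hours.

```r
# Advisory issuance time and its "12-hour" forecast time, both UTC.
adv_date <- as.POSIXct("2012-10-22 15:00:00", tz = "UTC")
hr12_fcst_date <- as.POSIXct("2012-10-23 00:00:00", tz = "UTC")

# Actual lead time from issuance, in hours.
actual_lead <- as.numeric(difftime(hr12_fcst_date, adv_date, units = "hours"))

# Nominal forecast periods and their effective lead times from issuance.
nominal <- c(12, 24, 36, 48, 72, 96, 120)
effective <- nominal - 3
```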

Forecast Wind Radius

Some forecast positions may also contain wind radius information (only up to 72 hours).

str(df.al_18_2012_fstadv %>% select(Hr12NE64:Hr12NW34))

Again, these variables are prepended with the prefix “Hrn”, where n denotes the forecast period.

tidy_fcst_wr will tidy this subset of data.

tidy_fcst_wr(df.al_18_2012_fstadv)
## # A tibble: 337 × 9
##    Key        Adv Date                FcstDate            WindField    NE    SE
##    <chr>    <dbl> <dttm>              <dttm>                  <dbl> <dbl> <dbl>
##  1 AL182012     1 2012-10-22 15:00:00 2012-10-23 00:00:00        34    40    30
##  2 AL182012     1 2012-10-22 15:00:00 2012-10-23 12:00:00        34    50    60
##  3 AL182012     1 2012-10-22 15:00:00 2012-10-24 00:00:00        34    80    80
##  4 AL182012     1 2012-10-22 15:00:00 2012-10-24 00:00:00        50    30    40
##  5 AL182012     1 2012-10-22 15:00:00 2012-10-24 12:00:00        34    90    90
##  6 AL182012     1 2012-10-22 15:00:00 2012-10-24 12:00:00        50    40    40
##  7 AL182012     1 2012-10-22 15:00:00 2012-10-25 12:00:00        34   200   180
##  8 AL182012     1 2012-10-22 15:00:00 2012-10-25 12:00:00        50    50    40
##  9 AL182012     2 2012-10-22 21:00:00 2012-10-23 06:00:00        34    50    60
## 10 AL182012     2 2012-10-22 21:00:00 2012-10-23 18:00:00        34    50    60
## # … with 327 more rows, and 2 more variables: SW <dbl>, NW <dbl>

Please see the National Hurricane Center’s website for more information on understanding the Forecast/Advisory product.

Strike Probabilities (prblty)

Strike probabilities were discontinued after the 2005 hurricane season (replaced by Wind Speed Probabilities; wndprb). For this example, we’ll look at Hurricane Katrina. For this we use the function get_prblty.

df.al_12_2005_prblty <- get_storms(years = 2005, basins = "AL") %>% 
    filter(Name == "Hurricane Katrina") %>% 
    .$Link %>% 
    get_prblty()
str(df.al_12_2005_prblty)
## tibble [937 × 10] (S3: tbl_df/tbl/data.frame)
##  $ Status  : chr [1:937] "Tropical Depression" "Tropical Depression" "Tropical Depression" "Tropical Depression" ...
##  $ Name    : chr [1:937] "Twelve" "Twelve" "Twelve" "Twelve" ...
##  $ Adv     : chr [1:937] "1" "1" "1" "1" ...
##  $ Date    : POSIXct[1:937], format: "2005-08-23 21:00:00" "2005-08-23 21:00:00" ...
##  $ Location: chr [1:937] "25.0N  77.7W" "JACKSONVILLE FL" "25.7N  78.5W" "SAVANNAH GA" ...
##  $ A       : num [1:937] 50 0 36 0 19 0 0 0 0 0 ...
##  $ B       : num [1:937] 0 0 0 0 6 0 1 0 0 5 ...
##  $ C       : num [1:937] 0 2 0 0 1 0 0 0 0 3 ...
##  $ D       : num [1:937] 0 9 1 6 1 3 3 2 2 5 ...
##  $ E       : num [1:937] 50 11 37 6 27 3 4 2 2 13 ...

This dataframe contains the probability of a cyclone passing within 65 nautical miles of Location. The variables A, B, C, D, and E are as they appear in the products and were left as-is to avoid confusion. Their definitions are as follows:

  • A - current through 12 hours
  • B - within the next 12-24 hours
  • C - within the next 24-36 hours
  • D - within the next 36-48 hours
  • E - total probability from current through 48 hours

Many values in the text product may be “X” for less than 1% chance of a strike. These values are converted to 0 as the fields are numeric.
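The “X”-to-zero conversion can be sketched in a couple of lines of base R. The raw vector here is made up for illustration, and this is not necessarily how the package performs the conversion internally.

```r
# Made-up raw values as they might appear in the text product.
raw <- c("50", "X", "2", "X", "11")

# Replace "X" (less than 1% chance) with "0", then convert to numeric.
prob <- as.numeric(replace(raw, raw == "X", "0"))
```

Keep in mind that a 0 in the parsed data can therefore mean "less than 1%" rather than "no chance".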

The strike probability products did not contain Key which is the unique identifier for every cyclone. So the best way to do any joins will be by Name, Adv and Date.
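A join on Name, Adv and Date might look like the base R sketch below. Both dataframes are small stand-ins: the strike probability rows use values from the output above, while the fstadv-like row (including the Key "AL122005") is assumed for illustration. Note that Adv is character in the strike probability output but numeric in the fstadv output shown earlier, so one side has to be coerced first.

```r
# Stand-in for forecast/advisory data (assumed row, for illustration).
fstadv_like <- data.frame(
  Key = "AL122005", Name = "Twelve", Adv = 1,
  Date = as.POSIXct("2005-08-23 21:00:00", tz = "UTC"),
  stringsAsFactors = FALSE
)

# Stand-in for strike probabilities (values from the output above).
prblty_like <- data.frame(
  Name = "Twelve", Adv = "1",
  Date = as.POSIXct("2005-08-23 21:00:00", tz = "UTC"),
  Location = c("JACKSONVILLE FL", "SAVANNAH GA"),
  E = c(11, 6),
  stringsAsFactors = FALSE
)

# Align the Adv types, then join on the three shared columns.
fstadv_like$Adv <- as.character(fstadv_like$Adv)
joined <- merge(prblty_like, fstadv_like, by = c("Name", "Adv", "Date"))
```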

Strike Probabilities may not exist for most Pacific cyclones.

Wind Speed Probabilities (wndprb)

df.al_18_2012_wndprb <- df.al_2012 %>% 
    filter(Name == "Hurricane Sandy") %>% 
    .$Link %>% 
    get_wndprb()
str(df.al_18_2012_wndprb)
## tibble [2,227 × 18] (S3: tbl_df/tbl/data.frame)
##  $ Key       : chr [1:2227] "AL182012" "AL182012" "AL182012" "AL182012" ...
##  $ Adv       : num [1:2227] 1 1 1 1 1 1 1 1 1 1 ...
##  $ Date      : POSIXct[1:2227], format: "2012-10-22 15:00:00" "2012-10-22 15:00:00" ...
##  $ Location  : chr [1:2227] "FT PIERCE FL" "W PALM BEACH" "MIAMI FL" "MARATHON FL" ...
##  $ Wind      : num [1:2227] 34 34 34 34 34 34 34 50 64 34 ...
##  $ Wind12    : num [1:2227] 0 0 0 0 0 0 0 0 0 0 ...
##  $ Wind24    : num [1:2227] 0 0 0 0 1 0 0 0 0 0 ...
##  $ Wind24Cum : num [1:2227] 0 0 0 0 1 0 0 0 0 0 ...
##  $ Wind36    : num [1:2227] 0 0 0 0 1 0 0 0 0 0 ...
##  $ Wind36Cum : num [1:2227] 0 0 0 0 2 0 0 0 0 0 ...
##  $ Wind48    : num [1:2227] 0 0 0 0 1 0 0 0 0 0 ...
##  $ Wind48Cum : num [1:2227] 0 0 0 0 3 0 0 0 0 0 ...
##  $ Wind72    : num [1:2227] 0 0 0 0 0 0 3 0 0 5 ...
##  $ Wind72Cum : num [1:2227] 0 0 0 0 3 0 3 0 0 5 ...
##  $ Wind96    : num [1:2227] 1 2 3 2 0 5 10 3 1 12 ...
##  $ Wind96Cum : num [1:2227] 1 2 3 2 3 5 13 3 1 17 ...
##  $ Wind120   : num [1:2227] 2 2 1 1 0 3 5 2 0 3 ...
##  $ Wind120Cum: num [1:2227] 3 4 4 3 3 8 18 5 1 20 ...

Wind Speed Probabilities are a bit more advanced than their predecessor. The Wind variable is for 34kt, 50kt and 64kt winds expected within a specific time period.

The remaining variables give the probability for specific forecast periods (12, 24, 36, 48, 72, 96 and 120 hours), both for the individual period and cumulatively.

For example, Wind24 is the chance of Wind between 12-24 hours. Wind24Cum is the cumulative probability from Date through 24 hours.
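In the rows shown above, the cumulative columns match a running sum of the per-period values. The sketch below checks this for the seventh row of the output; it is an observation about this sample, and rounding in the source products could make other rows differ slightly.

```r
# Per-period probabilities for the seventh row of the output above.
period <- c(Wind12 = 0, Wind24 = 0, Wind36 = 0, Wind48 = 0,
            Wind72 = 3, Wind96 = 10, Wind120 = 5)

# Running total across forecast periods.
cumulative <- cumsum(period)

# Cumulative values reported in the same row (Wind24Cum ... Wind120Cum).
reported <- c(0, 0, 0, 3, 13, 18)
matches <- all(cumulative[-1] == reported)
```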

As with strike probabilities, an “X” in the original text product meant less than 0.5% chance for the specified wind in the specified time period. “X” has been replaced by 0 in this package.

Wind Speed Probabilities may not exist for most Pacific cyclones.

See Tropical Cyclone Wind Speed Probabilities Products for more information.

Other products

Other products are available:

  • get_public for Public Advisory statements. Think general information for the public audience. May not exist for some Pacific cyclones. Additionally, when watches and warnings are in effect, these are issued every three hours (and, in some cases, every two hours).

  • get_discus for Storm Discussions. These are more technical statements on the structure of a storm, forecast model tendencies and satellite presentation.

  • get_update for Updates. These are brief update statements issued when something considerable has changed in the cyclone or if the cyclone is making landfall.

  • get_posest for Position Estimates. These are generally issued when a storm is making landfall and may be issued hourly.

Hurricane Ike, 2008, has both updates and position estimates.

At this time none of these products are parsed. Only the content of the product is returned.