Choosing a *clifro* Datatype
Blake Seers
2024-10-28
Source:vignettes/choose-datatype.Rmd
Introduction
The cf_datatype function is all that is required to select clifro datatypes. Called without any arguments, it takes the user through interactive menus; alternatively, the datatypes may be chosen programmatically if the menu options are known in advance. Whether the intention is to choose one datatype or many, this vignette details the various methods of choosing them.
Using the menus interactively to choose a datatype
Those familiar with the cliflo datatype selection menu will recall the myriad datatypes and options available in the National Climate Database. Selecting a datatype requires navigating through trees of menus, check boxes and combo boxes. By default, i.e. when no arguments are passed to it, the cf_datatype function mimics this (tedious) behaviour: the datatypes, menus and options are all identical to (in fact scraped from) the datatype selection menu.
A minimal example
Let’s say the datatype we are interested in is 9am surface wind in knots.
surfaceWind.dt = cf_datatype()
# If you prefer pointing and clicking - turn the graphics option on:
surfaceWind.dt = cf_datatype(graphics = TRUE)
Daily and Hourly Observations
## Daily and Hourly Observations
##
## 1: Combined Observations
## 2: Wind
## 3: Precipitation
## 4: Temperature and Humidity
## 5: Sunshine and Radiation
## 6: Weather
## 7: Pressure
## 8: Clouds
## 9: Evaporation / soil moisture
The first menu that appears when the above line of code is run in R is ‘Daily and Hourly Observations’. We are interested in ‘Wind’, so we type the number of our selection (or select it with the mouse if graphics = TRUE), in this case option 2.
Submenu for the given datatype
## Wind
##
## 1: Surface wind
## 2: Max Gust
The next menu prompts us for the type of wind we are interested in. In this case we want surface wind, which is option 1.
Options for the given datatype
## Surface wind options
##
## 1: WindRun
## 2: HlyWind
## 3: 3HlyWind
## 4: 9amWind
The next menu lists the options for the chosen datatype, of which we may choose more than one. If more than one option for a given datatype is sought, the options must be chosen one at a time. This is made possible by a menu that asks, every time an option is chosen, whether or not we would like to choose another option.
## Choose another option?
##
## 1: yes
## 2: no
In this example we are interested only in the surface wind at 9am, so we don’t choose another option after choosing option 4.
Final options
## Units
##
## 1: m/s
## 2: km/hr
## 3: knots
This final options menu is typically (although not always) associated with the units of the datatype and, depending on the datatype, is sometimes not needed at all. For this example we do have a final menu; it prompts us for the units we are interested in, and we choose option 3.
The surface wind datatype and the associated options are now saved in R as an object called surfaceWind.dt.
surfaceWind.dt
## dt.name dt.type dt.options dt.combo
## dt1 Wind Surface wind [9amWind] knots
Choosing a datatype without the menus
The option numbers chosen in the minimal example above show the menu order and selections needed to choose the 9am surface wind in knots datatype, i.e. 2, 1, 4, 3. In general, if we know the selections needed for each of the four menus, we can choose any datatype without using the menus, making datatype selection a lot faster and much less tedious.
A minimal example
To repeat our minimal example without the use of the menus, we just pass the selections as arguments to the cf_datatype function. These arguments are the selections made at each of the four menus (in order), separated by commas.
surfaceWind.dt = cf_datatype(2, 1, 4, 3)
surfaceWind.dt
## dt.name dt.type dt.options dt.combo
## dt1 Wind Surface wind [9amWind] knots
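To make the link between the four arguments and the menus explicit, the following base R sketch looks up each selection in a copy of the menu labels printed earlier in this vignette. The helper `describe_selection` and the four label vectors are hypothetical names used for illustration only, they cover only the Wind branch, and they are not how clifro stores or resolves its menus.

```r
# Menu labels copied from the menus shown earlier in this vignette.
# Illustration only: this is NOT clifro's internal representation.
first_menu <- c("Combined Observations", "Wind", "Precipitation",
                "Temperature and Humidity", "Sunshine and Radiation",
                "Weather", "Pressure", "Clouds",
                "Evaporation / soil moisture")
wind_menu <- c("Surface wind", "Max Gust")
surface_wind_options <- c("WindRun", "HlyWind", "3HlyWind", "9amWind")
wind_units <- c("m/s", "km/hr", "knots")

# Hypothetical helper: translate the four selections (Wind branch only)
# into the labels they correspond to.
describe_selection <- function(first, second, third, combo) {
  list(dt.name    = first_menu[first],
       dt.type    = wind_menu[second],
       dt.options = surface_wind_options[third],
       dt.combo   = wind_units[combo])
}

selection <- describe_selection(2, 1, 4, 3)
str(selection)
```

Each argument is simply the position of the desired entry in the corresponding menu, which is why the order of the arguments must follow the order of the menus.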
Selecting more than one option for a given datatype
Recall that we may choose more than one option at the third menu, equivalent to the check boxes on the cliflo database query form. Using the menu to choose more than one option is an iterative process; however, we can simply update the third function argument to deal with this in a more time-efficient manner.
surfaceWind.dt = cf_datatype(2, 1, c(2, 4), 3)
surfaceWind.dt
## dt.name dt.type dt.options dt.combo
## dt1 Wind Surface wind [HlyWind, 9amWind] knots
surfaceWind.dt now contains the surface wind datatype (in knots) with both 9am wind and hourly wind. Notice how all the other function arguments remain the same.
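As a rough illustration of what a vector in the third argument means, the base R sketch below indexes a copy of the surface wind option labels with c(2, 4) and formats the result like the dt.options column above. The variable names are hypothetical, and the formatting is only an imitation of the printed output, not clifro's own code.

```r
# Option labels copied from the 'Surface wind options' menu shown earlier.
surface_wind_options <- c("WindRun", "HlyWind", "3HlyWind", "9amWind")

# Several options are requested at once by passing a vector of positions.
chosen <- c(2, 4)

# Guard against positions that do not exist in the menu.
stopifnot(all(chosen %in% seq_along(surface_wind_options)))

chosen_labels <- surface_wind_options[chosen]

# Imitate the bracketed display used in the dt.options column.
display <- paste0("[", paste(chosen_labels, collapse = ", "), "]")
display
```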
Selecting multiple datatypes
Most applications involving the environmental data contained within the National Climate Database will require selection of more than one option for more than one datatype. This is where the true advantages of using the clifro package become apparent.
An extended example
Let us consider an application where we are now interested in hourly and 9am surface wind along with hourly and daily rainfall, hourly counts of lightning flashes and daily and hourly grass temperature extremes.
There are a few ways to choose all of these datatypes. Firstly, you could go through the menu options one by one, selecting the corresponding datatypes and options and saving the resulting datatypes as different R objects. A less laborious alternative is to create each of these datatypes without the menus. This does of course assume we know the selections at each branch of the datatype selection menus.
# Hourly and 9am surface wind (knots)
surfaceWind.dt = cf_datatype(2, 1, c(2, 4), 3)
surfaceWind.dt
## dt.name dt.type dt.options dt.combo
## dt1 Wind Surface wind [HlyWind, 9amWind] knots
# Hourly and daily rainfall
rainfall.dt = cf_datatype(3, 1, c(1, 2))
rainfall.dt
## dt.name dt.type dt.options dt.combo
## dt1 Precipitation Rain (fixed periods) [Daily , Hourly]
# Hourly counts of lightning flashes
lightning.dt = cf_datatype(6, 1, 1)
lightning.dt
## dt.name dt.type dt.options dt.combo
## dt1 Weather Lightning [Ltng]
# Daily and hourly grass temperature extremes
temperatureExtremes.dt = cf_datatype(4, 2, c(5, 6))
temperatureExtremes.dt
# Note: only the surface wind datatype requires combo options
## dt.name dt.type dt.options dt.combo
## dt1 Temperature and Humidity Max_min_temp [DlyGrass, HlyGrass]
This results in four separate objects in R containing the datatypes and their corresponding options. If we wanted to submit a query using all of these datatypes at once, having four separate objects is less than optimal. The following table shows the selections at each of the menus for the datatypes we are interested in.
| Menu | Surface wind | Rainfall | Lightning | Temperature |
|---|---|---|---|---|
| First selection | 2 | 3 | 6 | 4 |
| Second selection | 1 | 1 | 1 | 2 |
| Third selection(s) | 2 & 4 | 1 & 2 | 1 | 5 & 6 |
| Combo box selection | 3 | NA | NA | NA |
Each row of the table supplies the values for one argument of the cf_datatype function, and each column corresponds to one datatype. We can therefore pass the rows, in order, to cf_datatype to return a single R object containing all of our datatypes and options.
query1.dt = cf_datatype(c(2, 3, 6, 4),
                        c(1, 1, 1, 2),
                        list(c(2, 4), c(1, 2), 1, c(5, 6)),
                        c(3, NA, NA, NA))
query1.dt
## dt.name dt.type dt.options dt.combo
## dt1 Wind Surface wind [HlyWind, 9amWind] knots
## dt2 Precipitation Rain (fixed periods) [Daily , Hourly]
## dt3 Weather Lightning [Ltng]
## dt4 Temperature and Humidity Max_min_temp [DlyGrass, HlyGrass]
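If the selections are easier to keep per datatype (one entry per column of the table above), they can be rearranged into the four row-wise arguments with base R. This is only a sketch: the `selections` list and the `*_arg` names are hypothetical, and the final call to cf_datatype is left as a comment because it assumes clifro is installed and loaded.

```r
# One entry per column of the table above; NA marks "no combo box".
selections <- list(
  surface_wind = list(first = 2, second = 1, third = c(2, 4), combo = 3),
  rainfall     = list(first = 3, second = 1, third = c(1, 2), combo = NA),
  lightning    = list(first = 6, second = 1, third = 1,       combo = NA),
  temperature  = list(first = 4, second = 2, third = c(5, 6), combo = NA)
)

# Collect each row of the table into a single argument.
first_arg  <- unname(vapply(selections, function(s) s$first, numeric(1)))
second_arg <- unname(vapply(selections, function(s) s$second, numeric(1)))
# The third row can hold several values per datatype, so it stays a list.
third_arg  <- unname(lapply(selections, function(s) s$third))
combo_arg  <- unname(vapply(selections, function(s) as.numeric(s$combo),
                            numeric(1)))

# With clifro loaded, these could then be passed on in menu order:
# query1.dt = cf_datatype(first_arg, second_arg, third_arg, combo_arg)
```

Keeping the third argument as a list is what allows a different number of options for each datatype, which matches the list(...) used in the call above.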
We can also easily combine separate cfDatatype objects in R using the addition symbol +, to produce an identical result. This may be useful when you want to conduct multiple queries which include a subset of these datatypes.
query1.dt = surfaceWind.dt + rainfall.dt + lightning.dt +
  temperatureExtremes.dt
query1.dt
## dt.name dt.type dt.options dt.combo
## dt1 Wind Surface wind [HlyWind, 9amWind] knots
## dt2 Precipitation Rain (fixed periods) [Daily , Hourly]
## dt3 Weather Lightning [Ltng]
## dt4 Temperature and Humidity Max_min_temp [DlyGrass, HlyGrass]
Extras
# To add another datatype using the menu:
query1.dt + cf_datatype()
# Is equivalent to:
query1.dt + cf_datatype(NA, NA, NA, NA)
# Therefore is equivalent to adding a column of NA's to the above table:
query1.dt = cf_datatype(c(2, 3, 6, 4, NA),
                        c(1, 1, 1, 2, NA),
                        list(c(2, 4), c(1, 2), 1, c(5, 6), NA),
                        c(3, NA, NA, NA, NA))
# A partially known wind datatype, i.e. we know first selection = 2 but
# nothing further (the remaining selections are made using the menus):
wind.dt = cf_datatype(2) # Or cf_datatype(2, NA, NA, NA)
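The comments above treat omitted trailing selections as NA. The base R sketch below mimics that padding so the equivalence is easy to see; `pad_selection` is a hypothetical helper written for this illustration and is not part of clifro.

```r
# Hypothetical helper: pad the known selections with NA up to the four menus.
pad_selection <- function(...) {
  known <- c(...)
  stopifnot(length(known) <= 4)
  c(known, rep(NA, 4 - length(known)))
}

# Only the first selection (Wind) is known.
partial <- pad_selection(2)
partial

# Nothing is known, which corresponds to using the menus for everything.
unknown <- pad_selection()
unknown
```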