Connect to a NetCDF source and allow use of hyper_* verbs for slicing with hyper_filter(), extracting data with hyper_array() and hyper_tibble() from an activated grid. By default the largest grid encountered is activated, see activate().
Usage
tidync(x, what, ...)
# S3 method for class 'character'
tidync(x, what, ...)
# S3 method for class 'tidync_data'
tidync(x, what, ...)
Arguments
- x
path to a NetCDF file
- what
(optional) character name of grid (see ncmeta::nc_grids) or (bare) name of variable (see ncmeta::nc_vars) or index of grid to activate
- ...
reserved for arguments to methods, currently ignored
Details
The print method for tidync includes a lot of information about which variables exist on which dimensions, and if any slicing (hyper_filter()) operations have occurred these are summarized as 'start' and 'count' modifications relative to the dimension lengths. See print for these details, and hyper_vars for programmatic access to this information.
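As a minimal sketch of the 'start' and 'count' behaviour described above, the following slices the bundled SeaWiFS example file by coordinate value and then inspects the result. The longitude and latitude bounds are arbitrary values chosen for illustration, and the snippet assumes the tidync package and its example data are installed.

```r
library(tidync)

# Example file shipped with tidync (same file as in the Examples section)
f <- "S20080012008031.L3m_MO_CHL_chlor_a_9km.nc"
l3file <- system.file("extdata/oceandata", f, package = "tidync")

# Slice the active grid by coordinate value; the bounds are illustrative only.
# No array data is read at this point, only the dimension index is narrowed.
sliced <- tidync(l3file) %>%
  hyper_filter(lon = lon > 100 & lon < 160,
               lat = lat > -50 & lat < -10)

# The print method reports the reduced 'start' and 'count' per active dimension
print(sliced)

# Programmatic access to the same information
dims <- hyper_dims(sliced)  # one row per active dimension, with start and count
vars <- hyper_vars(sliced)  # one row per variable on the active grid
```

Comparing the `count` column of `dims` with its `length` column shows how much of each dimension the filter retained.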
Many NetCDF forms are supported and tidync tries to reduce the interpretation applied to a given source. The NetCDF system defines a 'grid' for storing array data, where 'grid' is the array 'shape' (or 'set of dimensions'). There may be several grids in a single source and so we introduce the concept of grid 'activation'. Once activated, all downstream tasks apply to the set of variables that exist on that grid.
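Grid activation can be sketched as follows, again using the bundled SeaWiFS file and assuming tidync is installed. The grid names `D1,D0` and `D0` are taken from the printed summary in the Examples section; other sources will have different grid names.

```r
library(tidync)

f <- "S20080012008031.L3m_MO_CHL_chlor_a_9km.nc"
l3file <- system.file("extdata/oceandata", f, package = "tidync")

# On connection the largest grid is active; here that is D1,D0 (holding chlor_a)
tnc <- tidync(l3file)
default_grid <- active(tnc)

# Activate the 1D grid that holds the latitude coordinate variable instead
lat_grid <- activate(tnc, "D0")

# Downstream verbs now apply only to variables that exist on D0
lat_vars <- hyper_vars(lat_grid)
print(lat_vars)
```

Because activation only changes which grid later verbs operate on, switching grids does not by itself read any array data.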
NetCDF sources with numeric types are chosen by default, even if existing 'NC_CHAR' type variables are on the largest grid. When read, any 'NC_CHAR' type variables are exploded into single character elements so that dimensions match the source.
Grids
A grid is an instance of a particular set of dimensions, which can be shared by more than one variable. This is not the 'rank' of a variable (the number of dimensions) since a single data set may have many 3D variables composed of different sets of axes/dimensions. There's no formality around the concept of 'shape', as far as we know.
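To see that grids are distinct sets of dimensions rather than ranks, one option is to list the grids of a source with hyper_grids(). This is a small sketch against the bundled SeaWiFS file and assumes tidync is installed; the expected number of grids comes from the printed summary in the Examples section.

```r
library(tidync)

f <- "S20080012008031.L3m_MO_CHL_chlor_a_9km.nc"
l3file <- system.file("extdata/oceandata", f, package = "tidync")

# One row per grid: its name (the dimension set), its rank, and
# how many variables share it
grids <- hyper_grids(tidync(l3file))
print(grids)
```

In this file the two 2D grids (`D1,D0` and `D3,D2`) have the same rank but are different grids, because they are built from different dimensions.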
A dimension may have length zero, but we think this is a special case for a "measure" dimension. (It doesn't mean the product of the dimensions is zero, for example.)
Limitations
Files with compound types are not yet supported and should fail gracefully. Groups are not yet supported.
We haven't yet explored 'HDF5' in detail, so any feedback is appreciated. Major use of compound types is made by https://github.com/sosoc/croc.
Examples
## a SeaWiFS (S) Level-3 Mapped (L3m) monthly (MO) chlorophyll-a (CHL)
## remote sensing product at 9km resolution (at the equator)
## from the NASA ocean colour group in NetCDF4 format (.nc)
## for 31 day period January 2008 (S20080012008031)
f <- "S20080012008031.L3m_MO_CHL_chlor_a_9km.nc"
l3file <- system.file("extdata/oceandata", f, package= "tidync")
## skip on Solaris
if (!tolower(Sys.info()[["sysname"]]) == "sunos") {
  tnc <- tidync(l3file)
  print(tnc)
}
#>
#> Data Source (1): S20080012008031.L3m_MO_CHL_chlor_a_9km.nc ...
#>
#> Grids (4) <dimension family> : <associated variables>
#>
#> [1] D1,D0 : chlor_a **ACTIVE GRID** ( 9331200 values per variable)
#> [2] D3,D2 : palette
#> [3] D0 : lat
#> [4] D1 : lon
#>
#> Dimensions 4 (2 active):
#>
#>   dim   name  length   min   max start count  dmin  dmax unlim coord_dim
#>   <chr> <chr>  <dbl> <dbl> <dbl> <int> <int> <dbl> <dbl> <lgl> <lgl>
#> 1 D0    lat     2160 -90.0  90.0     1  2160 -90.0  90.0 FALSE TRUE
#> 2 D1    lon     4320 -180.  180.     1  4320 -180.  180. FALSE TRUE
#>
#> Inactive dimensions:
#>
#>   dim   name          length   min   max unlim coord_dim
#>   <chr> <chr>          <dbl> <dbl> <dbl> <lgl> <lgl>
#> 1 D2    rgb                3     1     3 FALSE FALSE
#> 2 D3    eightbitcolor    256     1   256 FALSE FALSE
## very simple Unidata example file, with one dimension
if (FALSE) { # \dontrun{
uf <- system.file("extdata/unidata", "test_hgroups.nc", package = "tidync")
recNum <- tidync(uf) %>% hyper_tibble()
print(recNum)
} # }
## a raw grid of Southern Ocean sea ice concentration from IFREMER
## it is 12.5km resolution passive microwave concentration values
## on a polar stereographic grid, on 2 October 2017, displaying the
## "hole in the ice" made famous here:
## https://tinyurl.com/ycbchcgn
ifr <- system.file("extdata/ifremer", "20171002.nc", package = "tidync")
ifrnc <- tidync(ifr)
#> Warning: Function `CFtimestamp()` is deprecated. Use `as_timestamp()` instead.
ifrnc %>% hyper_tibble(select_var = "concentration")
#> # A tibble: 207,195 × 4
#>    concentration    ni    nj time
#>            <int> <int> <int> <chr>
#>  1             0   291     5 2017-10-02 12:00:00
#>  2             0   292     5 2017-10-02 12:00:00
#>  3             0   293     5 2017-10-02 12:00:00
#>  4             0   294     5 2017-10-02 12:00:00
#>  5             0   295     5 2017-10-02 12:00:00
#>  6             0   296     5 2017-10-02 12:00:00
#>  7             0   297     5 2017-10-02 12:00:00
#>  8             0   298     5 2017-10-02 12:00:00
#>  9             0   299     5 2017-10-02 12:00:00
#> 10             0   300     5 2017-10-02 12:00:00
#> # ℹ 207,185 more rows