Tidy NetCDF

Connect to a NetCDF source and allow use of the hyper_* verbs: slicing with hyper_filter(), and extracting data with hyper_array() and hyper_tibble() from an activated grid. By default the largest grid encountered is activated; see activate().
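As a rough sketch of how these verbs fit together, the following uses the SeaWiFS file bundled with the package (the same file used in the Examples section). The filter bounds are arbitrary values chosen for illustration only, and the example assumes the default active grid is the one holding chlor_a, as the printed output in the Examples section suggests.

```r
library(tidync)

# Example file shipped with tidync (see the Examples section)
f <- "S20080012008031.L3m_MO_CHL_chlor_a_9km.nc"
l3file <- system.file("extdata/oceandata", f, package = "tidync")

# Slice the active grid by coordinate value; the bounds are illustrative
sub <- tidync(l3file) %>%
  hyper_filter(lon = lon > 100, lat = lat < -30)

# Extract the slice as a list of arrays, one element per variable
arr <- hyper_array(sub, select_var = "chlor_a")

# Or extract the same slice as a data frame with one row per cell
tib <- hyper_tibble(sub, select_var = "chlor_a")
```

hyper_array() suits matrix-style computation, while hyper_tibble() is convenient for joins and plotting.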

Usage

tidync(x, what, ...)

# S3 method for class 'character'
tidync(x, what, ...)

# S3 method for class 'tidync_data'
tidync(x, what, ...)

Arguments

x: path to a NetCDF file
what: (optional) character name of grid (see ncmeta::nc_grids) or (bare) name of variable (see ncmeta::nc_vars) or index of grid to activate
...: reserved for arguments to methods, currently ignored

Details

The print method for tidync includes a lot of information about which variables exist on which dimensions, and if any slicing (hyper_filter()) operations have occurred these are summarized as 'start' and 'count' modifications relative to the dimension lengths. See print for these details, and hyper_vars() for programmatic access to this information.
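The effect of slicing on the printed summary can be seen with a small sketch like the one below. It assumes the bundled SeaWiFS file from the Examples section, and the filter condition is an arbitrary illustration; no expected output is shown because the exact 'start' and 'count' values depend on the condition used.

```r
library(tidync)

f <- "S20080012008031.L3m_MO_CHL_chlor_a_9km.nc"
l3file <- system.file("extdata/oceandata", f, package = "tidync")

# Apply a slice; this only records the subset, no data are read yet
sliced <- tidync(l3file) %>% hyper_filter(lat = lat > 0)

# The Dimensions table in the printout now reports a modified
# 'start' and 'count' for lat, relative to its full 'length'
print(sliced)

# The variables on the active grid, as a data frame
vars <- hyper_vars(sliced)
print(vars)
```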

Many NetCDF forms are supported and tidync tries to reduce the interpretation applied to a given source. The NetCDF system defines a 'grid' for storing array data, where 'grid' is the array 'shape' (or 'set of dimensions'). There may be several grids in a single source and so we introduce the concept of grid 'activation'. Once activated, all downstream tasks apply to the set of variables that exist on that grid.
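A minimal sketch of grid activation, assuming the bundled SeaWiFS file from the Examples section. The grid name "D3,D2" is taken from the printed output shown there (the grid holding the palette variable); for another source the grid names will differ.

```r
library(tidync)

f <- "S20080012008031.L3m_MO_CHL_chlor_a_9km.nc"
l3file <- system.file("extdata/oceandata", f, package = "tidync")

# By default the largest grid is active
tnc <- tidync(l3file)

# Switch to a different grid by name; "D3,D2" comes from the
# Grids listing in the printed summary of this particular file
pal <- activate(tnc, "D3,D2")
print(pal)

# Downstream verbs now apply to the variables on the newly active grid
print(hyper_vars(pal))
```

The same choice can be made up front with the what argument of tidync().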

NetCDF sources with numeric types are chosen by default, even if existing 'NC_CHAR' type variables are on the largest grid. When read, any 'NC_CHAR' type variables are exploded into single character elements so that dimensions match the source.

Grids

A grid is an instance of a particular set of dimensions, which can be shared by more than one variable. This is not the 'rank' of a variable (the number of dimensions) since a single data set may have many 3D variables composed of different sets of axes/dimensions. There's no formality around the concept of 'shape', as far as we know.

A dimension may have length zero, but we think this is a special case for a "measure" dimension. (It doesn't mean the product of the dimensions is zero, for example.)
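To see which grids and variables a source contains without building a tidync object, the ncmeta functions named in the Arguments section can be called directly. This is a sketch only, again assuming the bundled SeaWiFS file; the exact columns returned may vary between ncmeta versions.

```r
f <- "S20080012008031.L3m_MO_CHL_chlor_a_9km.nc"
l3file <- system.file("extdata/oceandata", f, package = "tidync")

# Grids: each distinct set of dimensions and the variables that use it
grids <- ncmeta::nc_grids(l3file)
print(grids)

# Variables: one row per variable, including its number of dimensions
vars <- ncmeta::nc_vars(l3file)
print(vars)
```

Two variables with the same number of dimensions can still appear under different grids here if they use different dimensions.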

Limitations

Files with compound types are not yet supported and should fail gracefully. Groups are not yet supported.

We haven't yet explored 'HDF5' in detail, so any feedback is appreciated. Major use of compound types is made by https://github.com/sosoc/croc.

Examples

## a SeaWiFS (S) Level-3 Mapped (L3m) monthly (MO) chlorophyll-a (CHL)
## remote sensing product at 9km resolution (at the equator)
## from the NASA ocean colour group in NetCDF4 format (.nc)
## for 31 day period January 2008 (S20080012008031) 
f <- "S20080012008031.L3m_MO_CHL_chlor_a_9km.nc"
l3file <- system.file("extdata/oceandata", f, package= "tidync")
## skip on Solaris
if (tolower(Sys.info()[["sysname"]]) != "sunos") {
  tnc <- tidync(l3file)
  print(tnc)
}
#> 
#> Data Source (1): S20080012008031.L3m_MO_CHL_chlor_a_9km.nc ...
#> 
#> Grids (4) <dimension family> : <associated variables> 
#> 
#> [1]   D1,D0 : chlor_a    **ACTIVE GRID** ( 9331200  values per variable)
#> [2]   D3,D2 : palette
#> [3]   D0    : lat
#> [4]   D1    : lon
#> 
#> Dimensions 4 (2 active): 
#>   
#>   dim   name  length    min   max start count   dmin  dmax unlim coord_dim 
#>   <chr> <chr>  <dbl>  <dbl> <dbl> <int> <int>  <dbl> <dbl> <lgl> <lgl>     
#> 1 D0    lat     2160  -90.0  90.0     1  2160  -90.0  90.0 FALSE TRUE      
#> 2 D1    lon     4320 -180.  180.      1  4320 -180.  180.  FALSE TRUE      
#>   
#> Inactive dimensions:
#>   
#>   dim   name          length   min   max unlim coord_dim 
#>   <chr> <chr>          <dbl> <dbl> <dbl> <lgl> <lgl>     
#> 1 D2    rgb                3     1     3 FALSE FALSE     
#> 2 D3    eightbitcolor    256     1   256 FALSE FALSE     

## very simple Unidata example file, with one dimension
if (FALSE) { # \dontrun{
uf <- system.file("extdata/unidata", "test_hgroups.nc", package = "tidync")
recNum <- tidync(uf) %>% hyper_tibble()
print(recNum)
} # }
## a raw grid of Southern Ocean sea ice concentration from IFREMER
## it is 12.5km resolution passive microwave concentration values
## on a polar stereographic grid, on 2 October 2017, displaying the 
## "hole in the ice" made famous here:
## https://tinyurl.com/ycbchcgn
ifr <- system.file("extdata/ifremer", "20171002.nc", package = "tidync")
ifrnc <- tidync(ifr)
#> Warning: Function `CFtimestamp()` is deprecated. Use `as_timestamp()` instead.
ifrnc %>% hyper_tibble(select_var = "concentration")
#> # A tibble: 207,195 × 4
#>    concentration    ni    nj time               
#>            <int> <int> <int> <chr>              
#>  1             0   291     5 2017-10-02T12:00:00
#>  2             0   292     5 2017-10-02T12:00:00
#>  3             0   293     5 2017-10-02T12:00:00
#>  4             0   294     5 2017-10-02T12:00:00
#>  5             0   295     5 2017-10-02T12:00:00
#>  6             0   296     5 2017-10-02T12:00:00
#>  7             0   297     5 2017-10-02T12:00:00
#>  8             0   298     5 2017-10-02T12:00:00
#>  9             0   299     5 2017-10-02T12:00:00
#> 10             0   300     5 2017-10-02T12:00:00
#> # ℹ 207,185 more rows
