
Vetting popler
Aldo Compagnoni, Sam Levin
2023-03-06
Source:vignettes/vetting-popler.Rmd
Introduction: identifying groups of data sets
The popler R package was built to foster scientific synthesis using LTER long-term population data. Such synthesis relies on data from many research projects that share characteristics of scientific interest. To identify projects sharing salient attributes, popler uses the metadata associated with each LTER project. In particular, it is fairly easy to select projects based on one or more of the following features:
- Replication, temporal or spatial.
- Taxonomic group(s).
- Study characteristics.
- Geographic location.
Vetting the database based on these criteria is intuitive. However, popler also facilitates identifying data sets in other ways. Below we provide several examples of how to select LTER data based on the four types of features described above. In the final section we also show how to carry out more complicated types of searches.
1. Replication
Temporal replication
If you are interested in long-term data, you will likely want to select projects based on the number of years over which data were collected. This is straightforward:
library(popler)
pplr_browse(duration_years > 10)
## # A tibble: 163 × 20
## # Groups: title, proj_metadata_key, lterid, datatype, structured_data,
## # studytype, duration_years, community, studystartyr, studyendyr,
## # structured_type_1, structured_type_2, structured_type_3, structured_type_4,
## # treatment_type_1, treatment_type_2, treatment_type_3, lat_lter, lng_lter
## # [163]
## title proj_…¹ lterid datat…² struc…³ study…⁴ durat…⁵ commu…⁶ study…⁷ study…⁸
## * <chr> <int> <chr> <chr> <chr> <chr> <int> <chr> <chr> <chr>
## 1 SBC L… 1 SBC indivi… no obs 14 no 2000.0 2014.0
## 2 SBC L… 2 SBC count no obs 14 yes 2000.0 2014.0
## 3 SBC L… 3 SBC count yes obs 14 yes 2000.0 2014.0
## 4 SBC L… 4 SBC cover no obs 14 yes 2000.0 2014.0
## 5 SBC L… 12 SBC density no obs 21 yes 1990.0 2011.0
## 6 SBC L… 13 SBC count no obs 28 yes 1982.0 2010.0
## 7 SBC L… 14 SBC cover no obs 28 yes 1982.0 2010.0
## 8 SBC L… 15 SBC biomass no obs 24 yes 1982.0 2006.0
## 9 SBC L… 17 SBC biomass no obs 50 no 1957.0 2007.0
## 10 Long-… 21 SEV count yes obs 21 yes 1992.0 2013.0
## # … with 153 more rows, 10 more variables: structured_type_1 <chr>,
## # structured_type_2 <chr>, structured_type_3 <chr>, structured_type_4 <chr>,
## # treatment_type_1 <chr>, treatment_type_2 <chr>, treatment_type_3 <chr>,
## # lat_lter <dbl>, lng_lter <dbl>, taxas <named list>, and abbreviated
## # variable names ¹proj_metadata_key, ²datatype, ³structured_data, ⁴studytype,
## # ⁵duration_years, ⁶community, ⁷studystartyr, ⁸studyendyr
Note that most LTER projects sample at a yearly or sub-yearly frequency, so studies longer than 10 years usually provide a longitudinal series of 10 or more observations. The `duration_years` variable is calculated as `studyendyr - studystartyr`, which says nothing about how often sampling took place. For this reason, an additional variable named `samplefreq` characterizes the approximate sampling frequency of each study.
pplr_dictionary(samplefreq)
## $`samplefreq (NA)`
## [1] "year" "yr" "season:yr" "biweekly" "month"
## [6] "month:year" "monthly" "season:year" "bimonthly" "NaN"
## [11] "biennial" "quadrennial" "irregular" "quinquennial" "day"
pplr_browse(samplefreq == "monthly")
## # A tibble: 1 × 20
## # Groups: title, proj_metadata_key, lterid, datatype, structured_data,
## # studytype, duration_years, community, studystartyr, studyendyr,
## # structured_type_1, structured_type_2, structured_type_3, structured_type_4,
## # treatment_type_1, treatment_type_2, treatment_type_3, lat_lter, lng_lter
## # [1]
## title proj_…¹ lterid datat…² struc…³ study…⁴ durat…⁵ commu…⁶ study…⁷ study…⁸
## * <chr> <int> <chr> <chr> <chr> <chr> <int> <chr> <chr> <chr>
## 1 SBC LT… 20 SBC count no obs 1 yes 2008.0 2009.0
## # … with 10 more variables: structured_type_1 <chr>, structured_type_2 <chr>,
## # structured_type_3 <chr>, structured_type_4 <chr>, treatment_type_1 <chr>,
## # treatment_type_2 <chr>, treatment_type_3 <chr>, lat_lter <dbl>,
## # lng_lter <dbl>, taxas <named list>, and abbreviated variable names
## # ¹proj_metadata_key, ²datatype, ³structured_data, ⁴studytype,
## # ⁵duration_years, ⁶community, ⁷studystartyr, ⁸studyendyr
Note that `samplefreq` is not among the default variables shown by `pplr_dictionary()` or `pplr_browse()`. To view it, specify the `full_tbl = TRUE` argument in either function.
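The relationship between `duration_years`, `studystartyr`, and `studyendyr` can be checked by hand. The sketch below is a minimal base R illustration, not popler code: `toy_meta` is a hypothetical stand-in for the metadata table, filled with three rows copied from tables printed in this vignette. It assumes the year columns are character strings such as `"2000.0"`, as the `<chr>` column types in the printed output suggest.

```r
# Hypothetical stand-in for the metadata table. Year columns are character
# strings, matching the <chr> types shown in the printed output.
toy_meta <- data.frame(
  proj_metadata_key = c(1, 17, 133),
  studystartyr      = c("2000.0", "1957.0", "2001.0"),
  studyendyr        = c("2014.0", "2007.0", "2001.0"),
  stringsAsFactors  = FALSE
)

# duration_years = studyendyr - studystartyr, after converting to numbers
toy_meta$duration_years <- as.integer(
  as.numeric(toy_meta$studyendyr) - as.numeric(toy_meta$studystartyr)
)

# The same condition as pplr_browse(duration_years > 10), applied to the toy table
long_term <- toy_meta[toy_meta$duration_years > 10, ]
print(long_term$proj_metadata_key)
```

A study that starts and ends in the same year has a duration of 0 under this definition, which is one reason to check `samplefreq` as well as `duration_years`.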
Spatial replication
Before downloading data
If you wish to select data sets based on their spatial replication, you need to consider that popler organizes data in nested spatial levels. For example, in many plant studies data is collected at the plot level, which can be nested within block, which in turn can be nested within site. popler labels spatial levels using numbers. Spatial level 1 is the coarsest level of replication, which contains all other spatial replicates. In the example above, spatial level 1 is site, spatial level 2 is block, and spatial level 3 is plot. popler allows for up to 5 spatial levels. Given the above, you can select studies based on three criteria:
- The total number of spatial replicates.
- The number of replicates within a specific spatial level.
- The number of nested spatial levels.
Below we provide one example for each of these three cases.
pplr_browse(tot_spat_rep > 100)
## # A tibble: 158 × 20
## # Groups: title, proj_metadata_key, lterid, datatype, structured_data,
## # studytype, duration_years, community, studystartyr, studyendyr,
## # structured_type_1, structured_type_2, structured_type_3, structured_type_4,
## # treatment_type_1, treatment_type_2, treatment_type_3, lat_lter, lng_lter
## # [158]
## title proj_…¹ lterid datat…² struc…³ study…⁴ durat…⁵ commu…⁶ study…⁷ study…⁸
## * <chr> <int> <chr> <chr> <chr> <chr> <int> <chr> <chr> <chr>
## 1 SBC L… 1 SBC indivi… no obs 14 no 2000.0 2014.0
## 2 SBC L… 2 SBC count no obs 14 yes 2000.0 2014.0
## 3 SBC L… 3 SBC count yes obs 14 yes 2000.0 2014.0
## 4 SBC L… 5 SBC indivi… no exp 6 no 2008.0 2014.0
## 5 SBC L… 6 SBC count yes exp 6 yes 2008.0 2014.0
## 6 SBC L… 7 SBC count no exp 6 yes 2008.0 2014.0
## 7 SBC L… 12 SBC density no obs 21 yes 1990.0 2011.0
## 8 SBC L… 13 SBC count no obs 28 yes 1982.0 2010.0
## 9 SBC L… 14 SBC cover no obs 28 yes 1982.0 2010.0
## 10 SBC L… 15 SBC biomass no obs 24 yes 1982.0 2006.0
## # … with 148 more rows, 10 more variables: structured_type_1 <chr>,
## # structured_type_2 <chr>, structured_type_3 <chr>, structured_type_4 <chr>,
## # treatment_type_1 <chr>, treatment_type_2 <chr>, treatment_type_3 <chr>,
## # lat_lter <dbl>, lng_lter <dbl>, taxas <named list>, and abbreviated
## # variable names ¹proj_metadata_key, ²datatype, ³structured_data, ⁴studytype,
## # ⁵duration_years, ⁶community, ⁷studystartyr, ⁸studyendyr
pplr_browse(spatial_replication_level_5_number_of_unique_reps > 1)
## # A tibble: 4 × 20
## # Groups: title, proj_metadata_key, lterid, datatype, structured_data,
## # studytype, duration_years, community, studystartyr, studyendyr,
## # structured_type_1, structured_type_2, structured_type_3, structured_type_4,
## # treatment_type_1, treatment_type_2, treatment_type_3, lat_lter, lng_lter
## # [4]
## title proj_…¹ lterid datat…² struc…³ study…⁴ durat…⁵ commu…⁶ study…⁷ study…⁸
## * <chr> <int> <chr> <chr> <chr> <chr> <int> <chr> <chr> <chr>
## 1 Plant … 141 AND cover no obs 51 yes 1962.0 2013.0
## 2 e093: … 287 CDR cover no exp 13 yes 1991.0 2004.0
## 3 Macroi… 862 PIE count no exp 10 yes 2003.0 2013.0
## 4 Meiofa… 868 PIE count no exp 6 yes 2003.0 2009.0
## # … with 10 more variables: structured_type_1 <chr>, structured_type_2 <chr>,
## # structured_type_3 <chr>, structured_type_4 <chr>, treatment_type_1 <chr>,
## # treatment_type_2 <chr>, treatment_type_3 <chr>, lat_lter <dbl>,
## # lng_lter <dbl>, taxas <named list>, and abbreviated variable names
## # ¹proj_metadata_key, ²datatype, ³structured_data, ⁴studytype,
## # ⁵duration_years, ⁶community, ⁷studystartyr, ⁸studyendyr
pplr_browse(n_spat_levs == 3)
## # A tibble: 96 × 20
## # Groups: title, proj_metadata_key, lterid, datatype, structured_data,
## # studytype, duration_years, community, studystartyr, studyendyr,
## # structured_type_1, structured_type_2, structured_type_3, structured_type_4,
## # treatment_type_1, treatment_type_2, treatment_type_3, lat_lter, lng_lter
## # [96]
## title proj_…¹ lterid datat…² struc…³ study…⁴ durat…⁵ commu…⁶ study…⁷ study…⁸
## * <chr> <int> <chr> <chr> <chr> <chr> <int> <chr> <chr> <chr>
## 1 SBC L… 13 SBC count no obs 28 yes 1982.0 2010.0
## 2 SBC L… 15 SBC biomass no obs 24 yes 1982.0 2006.0
## 3 SBC L… 16 SBC count no obs 7 no 2003.0 2010.0
## 4 Long-… 21 SEV count yes obs 21 yes 1992.0 2013.0
## 5 Roden… 25 SEV count yes obs 8 yes 1990.0 1998.0
## 6 Burn … 28 SEV indivi… no exp 2 yes 1991.0 1993.0
## 7 Nitro… 29 SEV cover no exp 12 yes 2004.0 2016.0
## 8 Pino … 33 SEV count no obs 2 yes 2000.0 2002.0
## 9 Warmi… 34 SEV cover no exp 10 yes 2006.0 2016.0
## 10 Lives… 35 SEV cover no exp 3 yes 2004.0 2007.0
## # … with 86 more rows, 10 more variables: structured_type_1 <chr>,
## # structured_type_2 <chr>, structured_type_3 <chr>, structured_type_4 <chr>,
## # treatment_type_1 <chr>, treatment_type_2 <chr>, treatment_type_3 <chr>,
## # lat_lter <dbl>, lng_lter <dbl>, taxas <named list>, and abbreviated
## # variable names ¹proj_metadata_key, ²datatype, ³structured_data, ⁴studytype,
## # ⁵duration_years, ⁶community, ⁷studystartyr, ⁸studyendyr
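To make the nested levels concrete, the following base R sketch counts replicates on a hypothetical site/block/plot design. The data frame `toy_obs` and its labels are invented for illustration. How popler itself derives `tot_spat_rep` is not described in this vignette, so the count of distinct combinations below is only one plausible reading of "total number of spatial replicates".

```r
# Hypothetical raw data: 2 sites, 2 blocks per site, 2 plots per block
toy_obs <- expand.grid(
  spatial_replication_level_3 = c("plot1", "plot2"),
  spatial_replication_level_2 = c("blockA", "blockB"),
  spatial_replication_level_1 = c("site1", "site2"),
  stringsAsFactors = FALSE
)

level_cols <- paste0("spatial_replication_level_", 1:3)

# Number of unique labels within each spatial level
unique_reps <- vapply(toy_obs[level_cols],
                      function(x) length(unique(x)),
                      integer(1))

# Number of nested spatial levels in use
n_levels <- length(level_cols)

# Number of distinct site/block/plot combinations (an assumed definition
# of the total number of spatial replicates)
n_combinations <- nrow(unique(toy_obs[level_cols]))

print(unique_reps)
print(n_combinations)
```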
After downloading data
Users can also explore the spatial and temporal replication of the data more explicitly after downloading it with `pplr_get_data()`, through two functions: `pplr_site_rep()` and `pplr_site_rep_plot()`.
`pplr_site_rep()` provides two options for exploring data that meet temporal replication requirements at a given spatial resolution. The user can filter data by specifying a minimum number of samples per year and a minimum number of years sampled at that frequency. Because this function uses the sampling dates to calculate the frequency, it provides additional information that is not contained in the `samplefreq` column of the main metadata table.
# download some data (note: this download is >100MB)
SEV <- pplr_get_data(proj_metadata_key == 21)
# create a summary table containing names of replication levels that contain 2 samples per year for 10 years
SEV_long_studies <- pplr_site_rep(SEV,
                                  freq = 2,
                                  duration = 10,
                                  return_logical = FALSE)
# you can also subset the data directly by asking the function to return a logical vector
subset_vec <- pplr_site_rep(SEV,
                            freq = 2,
                            duration = 10,
                            return_logical = TRUE)
# store subset of data
SEV_long_data <- SEV[subset_vec, ]
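The idea behind filtering on a minimum frequency and a minimum duration can be sketched in base R. This is not popler's implementation: `toy_samples`, `min_freq`, and `min_duration` are hypothetical names, and the sketch does not require the qualifying years to be consecutive, which the real function may or may not do.

```r
# Hypothetical sampling records: one row per sampling event
toy_samples <- data.frame(
  site = c(rep("A", 6), rep("B", 4)),
  year = c(2001, 2001, 2002, 2002, 2003, 2003,  # site A: 2 samples in each of 3 years
           2001, 2002, 2002, 2003),             # site B: 2 samples only in 2002
  stringsAsFactors = FALSE
)

min_freq     <- 2  # minimum number of samples per year
min_duration <- 3  # minimum number of years reaching min_freq

# Count samples per site and year
per_year <- aggregate(n ~ site + year,
                      data = transform(toy_samples, n = 1),
                      FUN = sum)

# Count, for each site, the years that reach the minimum frequency
ok_years <- per_year[per_year$n >= min_freq, ]
years_ok <- table(ok_years$site)

# Sites meeting both requirements, and a logical row index for subsetting
keep_sites <- names(years_ok)[as.vector(years_ok) >= min_duration]
keep_rows  <- toy_samples$site %in% keep_sites
toy_long   <- toy_samples[keep_rows, ]

print(keep_sites)
```

The logical vector `keep_rows` plays the same role as `subset_vec` above: it can be used directly to subset the rows of the data.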
Users can also visualize the frequency of sampling at the coarsest level of spatial replication using the `pplr_site_rep_plot()` function. This generates a `ggplot` that denotes whether or not a particular site was sampled in a particular year. Note that the coarsest level of spatial replication is called site and is contained in the variable `spatial_replication_level_1`.
library(ggplot2)
# return the plot object w/ return_plot = TRUE
pplr_site_rep_plot(SEV_long_data, return_plot = TRUE) +
  ggtitle("Long Term Data from Sevilleta LTER")
# or return an invisible copy of the input data and keep piping
library(dplyr)
SEV_long_data %>%
  pplr_site_rep_plot(return_plot = FALSE) %>%
  pplr_report_metadata()
2. Taxonomic group
popler is not limited to specific taxonomic groups, although it currently contains mostly data on animals and plants. To select information based on taxonomy, specify the taxonomic rank and the taxon you wish to select. By default, popler provides seven taxonomic ranks in each request: kingdom, phylum, class, order, family, genus, and species. The column `sppcode` provides the identifier, usually an alphanumeric code, associated with each taxonomic entity in the original dataset. Note that not all LTER studies provide full taxonomic information; hence, browsing studies by taxonomic information will return partial results (in the example below, not all insect studies will be identified).
pplr_dictionary(class)
## $`class (class)`
## [1] "Phaeophycea" "Actinopterygii" "Chondrichthyes"
## [4] "Osteichthes" "Asteroidea" "Gastropoda"
## [7] "Anthozoa" "Cephalopoda" "Malacostraca"
## [10] "Phaeophyceae" "Bivalvia" "Holothuroidea"
## [13] "Echinoidea" "Ascidiacea" "Demospongiae"
## [16] "Polychaeta" "Ophiuroidea" "Ascidiacae"
## [19] "Rhodophyceae" "Hydrozoa" "Gymnolaemata"
## [22] "Liliopsida" "Ascidacea" "Chlorophyceae"
## [25] "Bacillariophyta" "Maxillopoda" "Calcarea"
## [28] "Ophiuroidea/Asteroidea" "Ophiuroidae" "Floriophyccae"
## [31] "Mammalia" "Bacillariophyceae" "Conoidasida"
## [34] "Secernentea" "Cestoda" "Archiacanthocephala"
## [37] "cestode" "Adenophorea" "Insecta"
## [40] "Arachnida" "Catenotaeniidae" "Insect"
## [43] "Reptilia" "Aves" "Collembola"
## [46] "Clitellata" "Hexapoda" "Lecanoromycetes"
## [49] "Turbellaria" "Ostracoda" "Branchiobdellida"
## [52] "Branchiopoda" "Hirudinea" "Oligochaeta"
## [55] "Pelecypoda" "Entogatha" "Annelida"
## [58] "Crustacea" "Nematoda" "Hydracarina"
## [61] "Phylum Nemertea" "Phylum Nematoda" "Phylum Cnidaria"
pplr_browse(class == "Insecta")
## # A tibble: 7 × 20
## # Groups: title, proj_metadata_key, lterid, datatype, structured_data,
## # studytype, duration_years, community, studystartyr, studyendyr,
## # structured_type_1, structured_type_2, structured_type_3, structured_type_4,
## # treatment_type_1, treatment_type_2, treatment_type_3, lat_lter, lng_lter
## # [7]
## title proj_…¹ lterid datat…² struc…³ study…⁴ durat…⁵ commu…⁶ study…⁷ study…⁸
## * <chr> <int> <chr> <chr> <chr> <chr> <int> <chr> <chr> <chr>
## 1 Rodent… 25 SEV count yes obs 8 yes 1990.0 1998.0
## 2 Effect… 43 SEV count no obs 2 yes 2008.0 2010.0
## 3 Small … 60 SEV count no exp 10 yes 1995.0 2005.0
## 4 SGS-LT… 86 SGS count no obs 8 yes 1998.0 2006.0
## 5 Aquati… 133 AND count no obs 0 yes 2001.0 2001.0
## 6 Bonanz… 156 BNZ count no obs 3 yes 2010.0 2013.0
## 7 North … 822 NTL count no obs 34 yes 1981.0 2015.0
## # … with 10 more variables: structured_type_1 <chr>, structured_type_2 <chr>,
## # structured_type_3 <chr>, structured_type_4 <chr>, treatment_type_1 <chr>,
## # treatment_type_2 <chr>, treatment_type_3 <chr>, lat_lter <dbl>,
## # lng_lter <dbl>, taxas <named list>, and abbreviated variable names
## # ¹proj_metadata_key, ²datatype, ³structured_data, ⁴studytype,
## # ⁵duration_years, ⁶community, ⁷studystartyr, ⁸studyendyr
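The dictionary output above also contains spelling variants of the same taxon (for example `"Insecta"` and `"Insect"`), so an exact comparison matches only one of them. The base R sketch below shows the difference on a plain character vector whose labels are copied from that output. Whether `pplr_browse()` accepts pattern-matching expressions of this kind is not shown in this vignette, so the sketch deliberately stays outside popler.

```r
# Class labels copied from the pplr_dictionary(class) output
class_labels <- c("Mammalia", "Insecta", "Arachnida", "Insect", "Aves")

# Exact comparison: matches a single spelling
exact_match <- class_labels == "Insecta"

# Pattern match: any label starting with "Insect"
pattern_match <- grepl("^Insect", class_labels)

print(class_labels[exact_match])
print(class_labels[pattern_match])
```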
Note that the taxonomic information returned by `pplr_browse()` is housed in a data structure called a list column. Each entry of this list column is itself a list that contains a `data.frame` with eight columns. Users can access this information using the following syntax.
insects <- pplr_browse(class == 'Insecta')
# access the taxonomic table from the first project in the insects object
insects$taxas[[1]]
## # A tibble: 7 × 8
## sppcode species kingdom phylum class order family genus
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 cune neomexicana Animalia Arthropoda Insecta Diptera Oestridae Cutere…
## 2 cune neomexicana Animalia Arthropoda Insecta Diptera Oestridae Cutere…
## 3 cuau austeni Animalia Arthropoda Insecta Diptera Oestridae Cutere…
## 4 flea sp Animalia Arthropoda Insecta Siphonaptera NA NA
## 5 cuau austeni Animalia Arthropoda Insecta Diptera Oestridae Cutere…
## 6 flea sp Animalia Arthropoda Insecta Siphonaptera NA NA
## 7 cusp species Animalia Arthropoda Insecta Diptera Oestridae Cutere…
# second table (etc.)
insects$taxas[[2]]
## # A tibble: 205 × 8
## sppcode species kingdom phylum class order family genus
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 ANPERPUL NA Animalia Arthropoda Insecta Hymenoptera NA NA
## 2 APHABMOR morrisoni Animalia Arthropoda Insecta Hymenoptera APIDAE Habr…
## 3 HAAGAANG angelicus Animalia Arthropoda Insecta Hymenoptera HALICTIDAE Agap…
## 4 APDIAENA NA Animalia Arthropoda Insecta Hymenoptera NA NA
## 5 MEOSMTIT titusi Animalia Arthropoda Insecta Hymenoptera MEGACHILIDAE Osmia
## 6 ANPER005 5 Animalia Arthropoda Insecta Hymenoptera ANDRENIDAE Perd…
## 7 HALASCOA NA Animalia Arthropoda Insecta Hymenoptera NA NA
## 8 APTETALB NA Animalia Arthropoda Insecta Hymenoptera NA NA
## 9 APANTPHE NA Animalia Arthropoda Insecta Hymenoptera NA NA
## 10 HASPH002 2 Animalia Arthropoda Insecta Hymenoptera HALICTIDAE Sphe…
## # … with 195 more rows
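Because `taxas` is a list column, it can be summarized with the usual list tools. The sketch below uses a hypothetical object, `toy_browse`, that mimics the structure of the `pplr_browse()` result with two of the eight taxonomic columns; the values are copied from the tables printed above. It is a base R illustration of working with list columns, not a popler feature.

```r
# Hypothetical stand-in for the object returned by pplr_browse()
toy_browse <- data.frame(proj_metadata_key = c(25, 43))
toy_browse$taxas <- list(
  data.frame(sppcode = c("cune", "cuau", "flea"),
             order   = c("Diptera", "Diptera", "Siphonaptera"),
             stringsAsFactors = FALSE),
  data.frame(sppcode = c("ANPERPUL", "APHABMOR"),
             order   = c("Hymenoptera", "Hymenoptera"),
             stringsAsFactors = FALSE)
)

# Number of rows in each project's taxonomic table
n_taxa <- vapply(toy_browse$taxas, nrow, integer(1))

# Stack all taxonomic tables, tagging each row with its project key
all_taxa <- do.call(
  rbind,
  Map(function(key, tbl) cbind(proj_metadata_key = key, tbl),
      toy_browse$proj_metadata_key,
      toy_browse$taxas)
)

print(n_taxa)
print(all_taxa)
```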
3. Study characteristics
Metadata information provides a few variables to select studies based on their design. In particular:
- `studytype`: indicates whether the study is observational or experimental. Options are `obs` or `exp` for observational and experimental studies, respectively.
- `treatment_type`: type of treatments, if the study is experimental.
- `community`: indicates whether the project provides data on multiple species. Options are `yes` or `no`.
- `structured_data`: indicates whether the project provides information on population structure. For example, a population can be sub-divided into age, size, or developmental classes. Options are `yes` or `no`.
Below we show how to use three of these fields.
pplr_dictionary(community)
## $`community (NA)`
## [1] "no" "yes"
pplr_browse(community == "no") # 43 single-species studies
## # A tibble: 43 × 20
## # Groups: title, proj_metadata_key, lterid, datatype, structured_data,
## # studytype, duration_years, community, studystartyr, studyendyr,
## # structured_type_1, structured_type_2, structured_type_3, structured_type_4,
## # treatment_type_1, treatment_type_2, treatment_type_3, lat_lter, lng_lter
## # [43]
## title proj_…¹ lterid datat…² struc…³ study…⁴ durat…⁵ commu…⁶ study…⁷ study…⁸
## * <chr> <int> <chr> <chr> <chr> <chr> <int> <chr> <chr> <chr>
## 1 SBC L… 1 SBC indivi… no obs 14 no 2000.0 2014.0
## 2 SBC L… 5 SBC indivi… no exp 6 no 2008.0 2014.0
## 3 SBC L… 16 SBC count no obs 7 no 2003.0 2010.0
## 4 SBC L… 17 SBC biomass no obs 50 no 1957.0 2007.0
## 5 SBC L… 18 SBC count yes obs 4 no 2007.0 2011.0
## 6 Popul… 44 SEV indivi… no obs 4 no 2005.0 2009.0
## 7 Gunni… 47 SEV indivi… no exp 5 no 2010.0 2015.0
## 8 SGS-L… 84 SGS indivi… no obs 7 no 1999.0 2006.0
## 9 Densi… 90 VCR density no exp 6 no 2007.0 2013.0
## 10 Spruc… 158 BNZ indivi… no exp 7 no 2003.0 2010.0
## # … with 33 more rows, 10 more variables: structured_type_1 <chr>,
## # structured_type_2 <chr>, structured_type_3 <chr>, structured_type_4 <chr>,
## # treatment_type_1 <chr>, treatment_type_2 <chr>, treatment_type_3 <chr>,
## # lat_lter <dbl>, lng_lter <dbl>, taxas <named list>, and abbreviated
## # variable names ¹proj_metadata_key, ²datatype, ³structured_data, ⁴studytype,
## # ⁵duration_years, ⁶community, ⁷studystartyr, ⁸studyendyr
pplr_dictionary(treatment)
## $`treatment (type of treatment)`
## [1] "observational" "removal"
## [3] "fire" "resource"
## [5] "temp(T); precip(P); resources(N)" "consumer"
## [7] "precip" "precipitation"
## [9] "density" "disturbance"
## [11] "exclosure" "temperature"
## [13] "competition" "diversity"
## [15] "restoration"
nrow( pplr_browse(treatment == "fire") ) # 18 fire studies
## [1] 18
pplr_dictionary(studytype)
## $`studytype (NA)`
## [1] "obs" "exp"
nrow( pplr_browse(studytype == "obs") ) # 183 observational studies
## [1] 183
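The printed tables list three treatment columns (`treatment_type_1` to `treatment_type_3`). How `pplr_browse()` maps the `treatment` variable onto these columns is not documented in this section. As an illustration only, the base R sketch below flags rows in which any of the three columns equals `"fire"`; the table `toy_meta` and its contents are hypothetical.

```r
# Hypothetical metadata rows with up to three treatments per study
toy_meta <- data.frame(
  proj_metadata_key = 1:4,
  studytype         = c("obs", "exp", "exp", "exp"),
  treatment_type_1  = c(NA, "fire", "removal", "precip"),
  treatment_type_2  = c(NA, NA, "fire", NA),
  treatment_type_3  = rep(NA_character_, 4),
  stringsAsFactors  = FALSE
)

trt_cols <- paste0("treatment_type_", 1:3)

# TRUE when at least one treatment column equals "fire" (NAs are ignored)
has_fire  <- rowSums(toy_meta[trt_cols] == "fire", na.rm = TRUE) > 0
fire_keys <- toy_meta$proj_metadata_key[has_fire]

print(fire_keys)
```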
4. Geographic location
To select studies based on the latitude and longitude of the LTER headquarters around which datasets were, or are being, collected, use the `lat_lter` and `lng_lter` numeric variables:
pplr_dictionary( lat_lter, lng_lter )
## $`lat_lter (NA)`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -77.00000 33.43000 39.09000 35.65512 45.40000 66.63000
##
## $`lng_lter (NA)`
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -149.8300 -119.8400 -106.7400 -103.4849 -93.2000 162.5200
pplr_browse( lat_lter > 40 & lng_lter < -100 ) # studies at LTER sites north of 40 degrees latitude and west of -100 degrees longitude
## # A tibble: 58 × 20
## # Groups: title, proj_metadata_key, lterid, datatype, structured_data,
## # studytype, duration_years, community, studystartyr, studyendyr,
## # structured_type_1, structured_type_2, structured_type_3, structured_type_4,
## # treatment_type_1, treatment_type_2, treatment_type_3, lat_lter, lng_lter
## # [58]
## title proj_…¹ lterid datat…² struc…³ study…⁴ durat…⁵ commu…⁶ study…⁷ study…⁸
## * <chr> <int> <chr> <chr> <chr> <chr> <int> <chr> <chr> <chr>
## 1 SGS-L… 63 SGS cover no obs 8 yes 1999.0 2007.0
## 2 SGS-L… 65 SGS biomass no obs 25 yes 1983.0 2008.0
## 3 Open … 66 SGS cover no exp 4 yes 1997.0 2001.0
## 4 SGS-L… 69 SGS count no exp 9 yes 1997.0 2006.0
## 5 SGS-L… 70 SGS cover no exp 8 yes 1997.0 2005.0
## 6 SGS-L… 71 SGS cover no exp 42 yes 1977.0 2019.0
## 7 SGS-L… 72 SGS count no exp 36 yes 1975.0 2011.0
## 8 SGS-L… 73 SGS basal_… no exp 29 yes 1982.0 2011.0
## 9 SGS-L… 74 SGS cover no exp 16 yes 1992.0 2008.0
## 10 SGS-L… 76 SGS count no exp 12 yes 1998.0 2010.0
## # … with 48 more rows, 10 more variables: structured_type_1 <chr>,
## # structured_type_2 <chr>, structured_type_3 <chr>, structured_type_4 <chr>,
## # treatment_type_1 <chr>, treatment_type_2 <chr>, treatment_type_3 <chr>,
## # lat_lter <dbl>, lng_lter <dbl>, taxas <named list>, and abbreviated
## # variable names ¹proj_metadata_key, ²datatype, ³structured_data, ⁴studytype,
## # ⁵duration_years, ⁶community, ⁷studystartyr, ⁸studyendyr
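The sign convention matters when writing such conditions: negative longitudes lie west of the prime meridian and negative latitudes lie south of the equator. The following base R sketch applies the same condition to a hypothetical table, `toy_sites`, whose site labels and coordinates are invented for illustration.

```r
# Hypothetical site coordinates in decimal degrees
toy_sites <- data.frame(
  lterid   = c("site_a", "site_b", "site_c", "site_d"),
  lat_lter = c(40.8, 34.4, 45.4, -77.0),
  lng_lter = c(-104.8, -119.8, -93.2, 162.5),
  stringsAsFactors = FALSE
)

# North of 40 degrees latitude AND west of -100 degrees longitude
north_west <- toy_sites[toy_sites$lat_lter > 40 & toy_sites$lng_lter < -100, ]

print(north_west$lterid)
```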
5. More complicated searches
popler supports more complicated searches in two ways: i) searching several types of metadata variables simultaneously, and ii) searching for studies that match a string pattern. In the first case, provide `pplr_browse()` with a logical statement involving more than one metadata variable. For example, if you want studies on plants with 4 nested spatial levels and more than 10 years of data:
pplr_browse(kingdom == "Plantae" & n_spat_levs == 4 & duration_years > 10)
In the second case, the `keyword` argument of `pplr_browse()` searches for string patterns within the metadata of each study. For example, to find studies that use traps:
pplr_browse(keyword = 'trap')
Note that the keyword argument works with regular expressions as well:
# look for studies that include the words "trap" or "spatial"
pplr_browse(keyword = 'trap|spatial')
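The `|` in the pattern above is regular-expression alternation: a string matches if it contains either alternative. The base R sketch below shows this behaviour on a hypothetical vector of study titles. Which metadata fields `keyword` searches, and whether the search is case-sensitive, are not stated here, so the use of `ignore.case = TRUE` is an assumption made for the illustration.

```r
# Hypothetical study titles
toy_titles <- c("Pitfall trap arthropod counts",
                "Spatial patterns of shrub cover",
                "Stream fish density")

# TRUE when a title contains "trap" or "spatial", ignoring case
hits <- grepl("trap|spatial", toy_titles, ignore.case = TRUE)

print(toy_titles[hits])
```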