Randomly sample some proportion of taxa from a taxonomy() or taxmap() object. Weights can be specified for taxa or the observations assigned to them. See dplyr::sample_frac() for the inspiration for this function.

obj$sample_frac_taxa(size, taxon_weight = NULL,
  obs_weight = NULL, obs_target = NULL,
  use_subtaxa = TRUE, collapse_func = mean, ...)
sample_frac_taxa(obj, size, taxon_weight = NULL,
  obs_weight = NULL, obs_target = NULL,
  use_subtaxa = TRUE, collapse_func = mean, ...)

Arguments

obj

(taxonomy() or taxmap()) The object to sample from.

size

(numeric of length 1) The proportion of taxa to sample.
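
As a rough sketch of what a proportion means here, the base R snippet below turns size into a count of taxa and draws that many without replacement. The taxon IDs are made up, and rounding with round() is an assumption for illustration, not something this page states about sample_frac_taxa().

```r
# Hypothetical taxon IDs; a real taxonomy() or taxmap() object supplies its own
taxon_ids <- c("b", "c", "d", "e", "f", "g")
size <- 0.5

# Assumption: the proportion is converted to a count by rounding
n_to_sample <- round(size * length(taxon_ids))

set.seed(1)  # only to make the sketch reproducible
picked <- sample(taxon_ids, size = n_to_sample)
print(picked)
```

With six taxa and size = 0.5, three distinct taxa are drawn.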

taxon_weight

(numeric) Non-negative sampling weights of each taxon. If obs_weight is also specified, the two weights are multiplied (after obs_weight for each taxon is calculated).

obs_weight

(numeric) This option only applies to taxmap() objects. Sampling weights of each observation. The weights of the observations assigned to a given taxon are supplied to collapse_func to get the taxon weight. If use_subtaxa is TRUE, the observations assigned to all of its subtaxa are also used. Any variable name that appears in all_names() can be used as if it were a vector on its own. If taxon_weight is also specified, the two weights are multiplied (after obs_weight for each taxon is calculated). obs_target must be used with this option.
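
The way observation weights become taxon weights can be sketched in base R. This is an illustration of the description above, not the package's implementation: the taxon IDs, the weights, and the use of split() and vapply() are all assumptions made for the example.

```r
# Hypothetical data: five observations assigned to three taxa
obs_taxon  <- c("d", "d", "i", "o", "o")  # taxon ID of each observation
obs_weight <- c(4, 2, 4, 1, 3)            # one weight per observation

# Collapse the observation weights of each taxon into one taxon weight
# (mean is the default collapse_func shown in the usage above)
collapse_func <- mean
from_obs <- vapply(split(obs_weight, obs_taxon), collapse_func, numeric(1))

# Optional per-taxon weights, multiplied in after the collapse
taxon_weight <- c(d = 1, i = 2, o = 0.5)
combined <- from_obs[names(taxon_weight)] * taxon_weight
print(combined)

# Weighted draw of two taxa without replacement
set.seed(1)  # only to make the sketch reproducible
picked <- sample(names(combined), size = 2, prob = combined)
print(picked)
```

Here taxon d gets mean(c(4, 2)) = 3 from its observations, and the combined weights are 3, 8, and 1 for d, i, and o.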

obs_target

(character of length 1) This option only applies to taxmap() objects. The name of the data set in obj$data that the values in obs_weight correspond to. Must be used when obs_weight is used.

use_subtaxa

(logical or numeric of length 1) Affects how the obs_weight option is used. If TRUE, the weights for each taxon in an observation's classification are multiplied to get the observation weight. If FALSE, only the taxonomic level the observation is assigned to is considered. Positive numbers indicate the number of ranks below the target taxa to return. 0 is equivalent to FALSE. Negative numbers are equivalent to TRUE.
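
The description of obs_weight says that, when use_subtaxa is TRUE, observations assigned to a taxon's subtaxa also count toward that taxon's weight. A minimal base R sketch of that pooling follows. The tree, the observations, and the helper functions subtaxa_of() and weight_for() are hypothetical and exist only for this illustration.

```r
# Hypothetical tree: b is the parent of d and e; e is the parent of i
parent <- c(d = "b", e = "b", i = "e")  # names are the child taxa

# Recursively collect all subtaxa of a taxon (hypothetical helper)
subtaxa_of <- function(taxon) {
  children <- names(parent)[parent == taxon]
  c(children, unlist(lapply(children, subtaxa_of)))
}

# Hypothetical observations and their weights
obs_taxon  <- c("b", "d", "i", "i")
obs_weight <- c(1, 3, 5, 7)

# Taxon weight with or without the observations of subtaxa (hypothetical helper)
weight_for <- function(taxon, use_subtaxa, collapse_func = mean) {
  taxa <- if (use_subtaxa) c(taxon, subtaxa_of(taxon)) else taxon
  collapse_func(obs_weight[obs_taxon %in% taxa])
}

with_subtaxa    <- weight_for("b", use_subtaxa = TRUE)
without_subtaxa <- weight_for("b", use_subtaxa = FALSE)
print(c(with_subtaxa = with_subtaxa, without_subtaxa = without_subtaxa))
```

For taxon b, pooling its subtaxa gives mean(c(1, 3, 5, 7)) = 4, while using only its own observation gives 1.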

collapse_func

(function of length 1) If obs_weight is used, the weights of the observations assigned to each taxon (and to its subtaxa, if use_subtaxa is TRUE) are supplied to collapse_func to get the taxon weight. This function should take a numeric vector and return a single number.

...

Additional options are passed to filter_taxa().

Value

An object of type taxonomy() or taxmap()

Examples

# Sample half of the taxa
sample_frac_taxa(ex_taxmap, 0.5, supertaxa = TRUE)
#> <Taxmap>
#>   13 taxa: b. Mammalia, c. Plantae ... p. sapiens, r. tuberosum
#>   13 edges: NA->b, NA->c, b->d, b->e ... g->l, j->o, k->p, l->r
#>   4 data sets:
#>     info:
#>       # A tibble: 6 x 4
#>         taxon_id name  n_legs dangerous
#>         <chr>    <chr>  <dbl> <lgl>
#>       1 d        tiger      4 TRUE
#>       2 i        cat        4 FALSE
#>       3 o        mole       4 FALSE
#>       # … with 3 more rows
#>     phylopic_ids: a named vector of 'character' with 6 items
#>        d. e148eabb-f138-43[truncated] ... r. 63604565-0406-46[truncated]
#>     foods: a list of 6 items named by taxa:
#>       d, i, o, p, l, r
#>     abund:
#>       # A tibble: 8 x 5
#>         taxon_id code  sample_id count taxon_index
#>         <chr>    <fct> <fct>     <dbl>       <int>
#>       1 d        T     A             1           1
#>       2 i        C     A             2           2
#>       3 o        M     B             5           3
#>       # … with 5 more rows
#>   1 functions:
#>     reaction