Skip to contents

This function implements a Bayesian model that integrates data from paired eDNA and traditional surveys, as described in Keller et al. (2022) https://doi.org/10.1002/eap.2561. The model estimates parameters including the expected species catch rate and the probability of false positive eDNA detection. This function allows for optional model variations, like inclusion of site-level covariates that scale the sensitivity of eDNA sampling relative to traditional sampling, as well as estimation of catchability coefficients when multiple traditional gear types are used. Model is implemented using Bayesian inference using the rstan package, which uses Hamiltonian Monte Carlo to simulate the posterior distributions. See more examples in the Package Vignette.

Usage

jointModel(
  data,
  cov = NULL,
  family = "poisson",
  p10priors = c(1, 20),
  q = FALSE,
  phipriors = NULL,
  multicore = TRUE,
  initial_values = NULL,
  n.chain = 4,
  n.iter.burn = 500,
  n.iter.sample = 2500,
  thin = 1,
  adapt_delta = 0.9,
  verbose = TRUE,
  seed = NULL
)

Arguments

data

A list containing data necessary for model fitting. Valid tags are qPCR.N, qPCR.K, count, count.type, and site.cov. qPCR.N and qPCR.K are matrices or data frames with first dimension equal to the number of sites (i) and second dimension equal to the maximum number of eDNA samples at a given site (m). qPCR.N contains the total number of qPCR replicates per site and eDNA sample, and qPCR.K contains the total number of positive qPCR detections per site and eDNA sample. count is a matrix or data frame of traditional survey count data, with first dimension equal to the number of sites (i) and second dimension equal to the maximum number of traditional survey replicates at a given site (j). count.type is an optional matrix or data frame of integers indicating gear type used in corresponding count data, with first dimension equal to the number of sites (i) and second dimension equal to the maximum number of traditional survey replicates at a given site. Values should be integers beginning with 1 (referring to the first gear type) to n (last gear type). site.cov is an optional matrix or data frame of site-level covariate data, with first dimension equal to the number of sites (i). site.cov should include informative column names. Empty cells should be NA and will be removed during processing. Sites, i, should be consistent in all qPCR, count, and site covariate data.

cov

A character vector indicating the site-level covariates to include in the model. Default value is NULL.

family

The distribution class used to model traditional survey count data. Options include poisson ('poisson'), negative binomial ('negbin'), and gamma ('gamma'). Default value is 'poisson'.

p10priors

A numeric vector indicating beta distribution hyperparameters (alpha, beta) used as the prior distribution for the eDNA false positive probability (p10). Default vector is c(1,20).

q

A logical value indicating whether to estimate a catchability coefficient, q, for traditional survey gear types (TRUE) or to not estimate a catchability coefficient, q, for traditional survey gear types (FALSE). Default value is FALSE.

phipriors

A numeric vector indicating gamma distribution hyperparameters (shape, rate) used as the prior distribution for phi, the overdispersion in the negative binomial distribution for traditional survey gear data. Used when family = 'negbin.' If family = 'negbin', then default vector is c(0.25,0.25), otherwise, default is NULL.

multicore

A logical value indicating whether to parallelize chains with multiple cores. Default is TRUE.

initial_values

A list of lists of initial values to use in MCMC. The length should equal the number of MCMC chains. Initial values can be provided for parameters: beta, p10 (log-scale), mu, q, alpha. If no initial values are provided, default random values are drawn.

n.chain

Number of MCMC chains. Default value is 4.

n.iter.burn

Number of warm-up MCMC iterations. Default value is 500.

n.iter.sample

Number of sampling MCMC iterations. Default value is 2500.

thin

A positive integer specifying the period for saving samples. Default value is 1.

adapt_delta

Numeric value between 0 and 1 indicating target average acceptance probability used in rstan::sampling. Default value is 0.9.

verbose

Logical value controlling the verbosity of output (i.e., warnings, messages, progress bar). Default is TRUE.

seed

A positive integer seed used for random number generation in MCMC. Default is NULL, which means the seed is generated from 1 to the maximum integer supported by R.

Value

A list of:

  • a model object of class stanfit returned by rstan::sampling

  • initial values used in MCMC

Note

Before fitting the model, this function checks to ensure that the model specification is possible given the data files. These checks include:

  • All tags in data are valid (i.e., include qPCR.N, qPCR.K, count, count.type, and site.cov).

  • Dimensions of qPCR.N and qPCR.K are equal, and dimensions of count and count.type are equal (if count.type provided).

  • Number of sites in qPCR and count data are equal.

  • All data are numeric (i.e., integer or NA).

  • Empty data cells (NA) match in qPCR.N and qPCR.K and in count and count.type.

  • family is either 'poisson', 'negbin', or 'gamma'.

  • p10priors and phipriors (if used) is a vector of two numeric values.

  • site.cov has same number of rows as qPCR.N and count, if present

  • site.cov is numeric, if present

  • cov values match column names in site.cov, if present

If any of these checks fail, the function returns an error message.

Examples

# \donttest{
# Ex. 1: Implementing the joint model

# Load data
data(gobyData)

# Examine data in list
names(gobyData)
#> [1] "qPCR.N"   "qPCR.K"   "count"    "site.cov"

# Note that the surveyed sites (rows) should match in all data
dim(gobyData$qPCR.N)[1]
#> [1] 39
dim(gobyData$count)[1]
#> [1] 39

# Fit a basic model with paired eDNA and traditional survey data.
# Count data is modeled using a poisson distribution.
fit <- jointModel(data = gobyData, family = "poisson", p10priors = c(1,20),
                  multicore = FALSE)
#> 
#> SAMPLING FOR MODEL 'joint_binary_pois' NOW (CHAIN 1).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 4.6e-05 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.46 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: Iteration:    1 / 3000 [  0%]  (Warmup)
#> Chain 1: Iteration:  500 / 3000 [ 16%]  (Warmup)
#> Chain 1: Iteration:  501 / 3000 [ 16%]  (Sampling)
#> Chain 1: Iteration: 1000 / 3000 [ 33%]  (Sampling)
#> Chain 1: Iteration: 1500 / 3000 [ 50%]  (Sampling)
#> Chain 1: Iteration: 2000 / 3000 [ 66%]  (Sampling)
#> Chain 1: Iteration: 2500 / 3000 [ 83%]  (Sampling)
#> Chain 1: Iteration: 3000 / 3000 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 0.96 seconds (Warm-up)
#> Chain 1:                1.765 seconds (Sampling)
#> Chain 1:                2.725 seconds (Total)
#> Chain 1: 
#> 
#> SAMPLING FOR MODEL 'joint_binary_pois' NOW (CHAIN 2).
#> Chain 2: 
#> Chain 2: Gradient evaluation took 4.4e-05 seconds
#> Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.44 seconds.
#> Chain 2: Adjust your expectations accordingly!
#> Chain 2: 
#> Chain 2: 
#> Chain 2: Iteration:    1 / 3000 [  0%]  (Warmup)
#> Chain 2: Iteration:  500 / 3000 [ 16%]  (Warmup)
#> Chain 2: Iteration:  501 / 3000 [ 16%]  (Sampling)
#> Chain 2: Iteration: 1000 / 3000 [ 33%]  (Sampling)
#> Chain 2: Iteration: 1500 / 3000 [ 50%]  (Sampling)
#> Chain 2: Iteration: 2000 / 3000 [ 66%]  (Sampling)
#> Chain 2: Iteration: 2500 / 3000 [ 83%]  (Sampling)
#> Chain 2: Iteration: 3000 / 3000 [100%]  (Sampling)
#> Chain 2: 
#> Chain 2:  Elapsed Time: 0.984 seconds (Warm-up)
#> Chain 2:                1.651 seconds (Sampling)
#> Chain 2:                2.635 seconds (Total)
#> Chain 2: 
#> 
#> SAMPLING FOR MODEL 'joint_binary_pois' NOW (CHAIN 3).
#> Chain 3: 
#> Chain 3: Gradient evaluation took 4.3e-05 seconds
#> Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.43 seconds.
#> Chain 3: Adjust your expectations accordingly!
#> Chain 3: 
#> Chain 3: 
#> Chain 3: Iteration:    1 / 3000 [  0%]  (Warmup)
#> Chain 3: Iteration:  500 / 3000 [ 16%]  (Warmup)
#> Chain 3: Iteration:  501 / 3000 [ 16%]  (Sampling)
#> Chain 3: Iteration: 1000 / 3000 [ 33%]  (Sampling)
#> Chain 3: Iteration: 1500 / 3000 [ 50%]  (Sampling)
#> Chain 3: Iteration: 2000 / 3000 [ 66%]  (Sampling)
#> Chain 3: Iteration: 2500 / 3000 [ 83%]  (Sampling)
#> Chain 3: Iteration: 3000 / 3000 [100%]  (Sampling)
#> Chain 3: 
#> Chain 3:  Elapsed Time: 1.02 seconds (Warm-up)
#> Chain 3:                1.818 seconds (Sampling)
#> Chain 3:                2.838 seconds (Total)
#> Chain 3: 
#> 
#> SAMPLING FOR MODEL 'joint_binary_pois' NOW (CHAIN 4).
#> Chain 4: 
#> Chain 4: Gradient evaluation took 4.4e-05 seconds
#> Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.44 seconds.
#> Chain 4: Adjust your expectations accordingly!
#> Chain 4: 
#> Chain 4: 
#> Chain 4: Iteration:    1 / 3000 [  0%]  (Warmup)
#> Chain 4: Iteration:  500 / 3000 [ 16%]  (Warmup)
#> Chain 4: Iteration:  501 / 3000 [ 16%]  (Sampling)
#> Chain 4: Iteration: 1000 / 3000 [ 33%]  (Sampling)
#> Chain 4: Iteration: 1500 / 3000 [ 50%]  (Sampling)
#> Chain 4: Iteration: 2000 / 3000 [ 66%]  (Sampling)
#> Chain 4: Iteration: 2500 / 3000 [ 83%]  (Sampling)
#> Chain 4: Iteration: 3000 / 3000 [100%]  (Sampling)
#> Chain 4: 
#> Chain 4:  Elapsed Time: 0.89 seconds (Warm-up)
#> Chain 4:                1.748 seconds (Sampling)
#> Chain 4:                2.638 seconds (Total)
#> Chain 4: 
#> Refer to the eDNAjoint guide for visualization tips:  https://bookdown.org/abigailkeller/eDNAjoint_vignette/tips.html#visualization-tips 

# Ex. 2: Implementing the joint model with site-level covariates

# With the same data, fit a model including 'Filter_time' and 'Salinity'
# site-level covariates
# These covariates will scale the sensitivity of eDNA sampling relative to
# traditional surveys
# Count data is modeled using a poisson distribution.
fit.cov <- jointModel(data = gobyData, cov = c('Filter_time','Salinity'),
                      family = "poisson", p10priors = c(1,20),
                      multicore = FALSE)
#> 
#> SAMPLING FOR MODEL 'joint_binary_cov_pois' NOW (CHAIN 1).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 4.9e-05 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.49 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: Iteration:    1 / 3000 [  0%]  (Warmup)
#> Chain 1: Iteration:  500 / 3000 [ 16%]  (Warmup)
#> Chain 1: Iteration:  501 / 3000 [ 16%]  (Sampling)
#> Chain 1: Iteration: 1000 / 3000 [ 33%]  (Sampling)
#> Chain 1: Iteration: 1500 / 3000 [ 50%]  (Sampling)
#> Chain 1: Iteration: 2000 / 3000 [ 66%]  (Sampling)
#> Chain 1: Iteration: 2500 / 3000 [ 83%]  (Sampling)
#> Chain 1: Iteration: 3000 / 3000 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 0.941 seconds (Warm-up)
#> Chain 1:                1.743 seconds (Sampling)
#> Chain 1:                2.684 seconds (Total)
#> Chain 1: 
#> 
#> SAMPLING FOR MODEL 'joint_binary_cov_pois' NOW (CHAIN 2).
#> Chain 2: 
#> Chain 2: Gradient evaluation took 4.5e-05 seconds
#> Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.45 seconds.
#> Chain 2: Adjust your expectations accordingly!
#> Chain 2: 
#> Chain 2: 
#> Chain 2: Iteration:    1 / 3000 [  0%]  (Warmup)
#> Chain 2: Iteration:  500 / 3000 [ 16%]  (Warmup)
#> Chain 2: Iteration:  501 / 3000 [ 16%]  (Sampling)
#> Chain 2: Iteration: 1000 / 3000 [ 33%]  (Sampling)
#> Chain 2: Iteration: 1500 / 3000 [ 50%]  (Sampling)
#> Chain 2: Iteration: 2000 / 3000 [ 66%]  (Sampling)
#> Chain 2: Iteration: 2500 / 3000 [ 83%]  (Sampling)
#> Chain 2: Iteration: 3000 / 3000 [100%]  (Sampling)
#> Chain 2: 
#> Chain 2:  Elapsed Time: 4.921 seconds (Warm-up)
#> Chain 2:                1.741 seconds (Sampling)
#> Chain 2:                6.662 seconds (Total)
#> Chain 2: 
#> 
#> SAMPLING FOR MODEL 'joint_binary_cov_pois' NOW (CHAIN 3).
#> Chain 3: 
#> Chain 3: Gradient evaluation took 6.2e-05 seconds
#> Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.62 seconds.
#> Chain 3: Adjust your expectations accordingly!
#> Chain 3: 
#> Chain 3: 
#> Chain 3: Iteration:    1 / 3000 [  0%]  (Warmup)
#> Chain 3: Iteration:  500 / 3000 [ 16%]  (Warmup)
#> Chain 3: Iteration:  501 / 3000 [ 16%]  (Sampling)
#> Chain 3: Iteration: 1000 / 3000 [ 33%]  (Sampling)
#> Chain 3: Iteration: 1500 / 3000 [ 50%]  (Sampling)
#> Chain 3: Iteration: 2000 / 3000 [ 66%]  (Sampling)
#> Chain 3: Iteration: 2500 / 3000 [ 83%]  (Sampling)
#> Chain 3: Iteration: 3000 / 3000 [100%]  (Sampling)
#> Chain 3: 
#> Chain 3:  Elapsed Time: 0.598 seconds (Warm-up)
#> Chain 3:                1.75 seconds (Sampling)
#> Chain 3:                2.348 seconds (Total)
#> Chain 3: 
#> 
#> SAMPLING FOR MODEL 'joint_binary_cov_pois' NOW (CHAIN 4).
#> Chain 4: 
#> Chain 4: Gradient evaluation took 4.5e-05 seconds
#> Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.45 seconds.
#> Chain 4: Adjust your expectations accordingly!
#> Chain 4: 
#> Chain 4: 
#> Chain 4: Iteration:    1 / 3000 [  0%]  (Warmup)
#> Chain 4: Iteration:  500 / 3000 [ 16%]  (Warmup)
#> Chain 4: Iteration:  501 / 3000 [ 16%]  (Sampling)
#> Chain 4: Iteration: 1000 / 3000 [ 33%]  (Sampling)
#> Chain 4: Iteration: 1500 / 3000 [ 50%]  (Sampling)
#> Chain 4: Iteration: 2000 / 3000 [ 66%]  (Sampling)
#> Chain 4: Iteration: 2500 / 3000 [ 83%]  (Sampling)
#> Chain 4: Iteration: 3000 / 3000 [100%]  (Sampling)
#> Chain 4: 
#> Chain 4:  Elapsed Time: 0.938 seconds (Warm-up)
#> Chain 4:                1.696 seconds (Sampling)
#> Chain 4:                2.634 seconds (Total)
#> Chain 4: 
#> Refer to the eDNAjoint guide for visualization tips:  https://bookdown.org/abigailkeller/eDNAjoint_vignette/tips.html#visualization-tips 


# Ex. 3: Implementing the joint model with multiple traditional gear types

# Load data
data(greencrabData)

# Examine data in list
names(greencrabData)
#> [1] "qPCR.N"     "qPCR.K"     "count"      "count.type"

# Note that the surveyed sites (rows) should match in all data
dim(greencrabData$qPCR.N)[1]
#> [1] 20
dim(greencrabData$count)[1]
#> [1] 20

# Fit a model estimating a catchability coefficient for traditional survey
# gear types.
# This model does not assume all traditional survey methods have the same
# catchability.
# Count data is modeled using a negative binomial distribution.
fit.q <- jointModel(data = greencrabData, cov = NULL, family = "negbin",
                    p10priors = c(1,20), q = TRUE, phipriors = c(0.25,0.25),
                    multicore = FALSE, initial_values = NULL,
                    n.chain = 4, n.iter.burn = 500,
                    n.iter.sample = 2500, thin = 1, adapt_delta = 0.9,
                    verbose = TRUE, seed = 123)
#> 
#> SAMPLING FOR MODEL 'joint_binary_catchability_negbin' NOW (CHAIN 1).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 0.000296 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 2.96 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: Iteration:    1 / 3000 [  0%]  (Warmup)
#> Chain 1: Iteration:  500 / 3000 [ 16%]  (Warmup)
#> Chain 1: Iteration:  501 / 3000 [ 16%]  (Sampling)
#> Chain 1: Iteration: 1000 / 3000 [ 33%]  (Sampling)
#> Chain 1: Iteration: 1500 / 3000 [ 50%]  (Sampling)
#> Chain 1: Iteration: 2000 / 3000 [ 66%]  (Sampling)
#> Chain 1: Iteration: 2500 / 3000 [ 83%]  (Sampling)
#> Chain 1: Iteration: 3000 / 3000 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 3.469 seconds (Warm-up)
#> Chain 1:                10.342 seconds (Sampling)
#> Chain 1:                13.811 seconds (Total)
#> Chain 1: 
#> 
#> SAMPLING FOR MODEL 'joint_binary_catchability_negbin' NOW (CHAIN 2).
#> Chain 2: 
#> Chain 2: Gradient evaluation took 0.000336 seconds
#> Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 3.36 seconds.
#> Chain 2: Adjust your expectations accordingly!
#> Chain 2: 
#> Chain 2: 
#> Chain 2: Iteration:    1 / 3000 [  0%]  (Warmup)
#> Chain 2: Iteration:  500 / 3000 [ 16%]  (Warmup)
#> Chain 2: Iteration:  501 / 3000 [ 16%]  (Sampling)
#> Chain 2: Iteration: 1000 / 3000 [ 33%]  (Sampling)
#> Chain 2: Iteration: 1500 / 3000 [ 50%]  (Sampling)
#> Chain 2: Iteration: 2000 / 3000 [ 66%]  (Sampling)
#> Chain 2: Iteration: 2500 / 3000 [ 83%]  (Sampling)
#> Chain 2: Iteration: 3000 / 3000 [100%]  (Sampling)
#> Chain 2: 
#> Chain 2:  Elapsed Time: 3.639 seconds (Warm-up)
#> Chain 2:                8.814 seconds (Sampling)
#> Chain 2:                12.453 seconds (Total)
#> Chain 2: 
#> 
#> SAMPLING FOR MODEL 'joint_binary_catchability_negbin' NOW (CHAIN 3).
#> Chain 3: 
#> Chain 3: Gradient evaluation took 0.000315 seconds
#> Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 3.15 seconds.
#> Chain 3: Adjust your expectations accordingly!
#> Chain 3: 
#> Chain 3: 
#> Chain 3: Iteration:    1 / 3000 [  0%]  (Warmup)
#> Chain 3: Iteration:  500 / 3000 [ 16%]  (Warmup)
#> Chain 3: Iteration:  501 / 3000 [ 16%]  (Sampling)
#> Chain 3: Iteration: 1000 / 3000 [ 33%]  (Sampling)
#> Chain 3: Iteration: 1500 / 3000 [ 50%]  (Sampling)
#> Chain 3: Iteration: 2000 / 3000 [ 66%]  (Sampling)
#> Chain 3: Iteration: 2500 / 3000 [ 83%]  (Sampling)
#> Chain 3: Iteration: 3000 / 3000 [100%]  (Sampling)
#> Chain 3: 
#> Chain 3:  Elapsed Time: 3.559 seconds (Warm-up)
#> Chain 3:                11.379 seconds (Sampling)
#> Chain 3:                14.938 seconds (Total)
#> Chain 3: 
#> 
#> SAMPLING FOR MODEL 'joint_binary_catchability_negbin' NOW (CHAIN 4).
#> Chain 4: 
#> Chain 4: Gradient evaluation took 0.000305 seconds
#> Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 3.05 seconds.
#> Chain 4: Adjust your expectations accordingly!
#> Chain 4: 
#> Chain 4: 
#> Chain 4: Iteration:    1 / 3000 [  0%]  (Warmup)
#> Chain 4: Iteration:  500 / 3000 [ 16%]  (Warmup)
#> Chain 4: Iteration:  501 / 3000 [ 16%]  (Sampling)
#> Chain 4: Iteration: 1000 / 3000 [ 33%]  (Sampling)
#> Chain 4: Iteration: 1500 / 3000 [ 50%]  (Sampling)
#> Chain 4: Iteration: 2000 / 3000 [ 66%]  (Sampling)
#> Chain 4: Iteration: 2500 / 3000 [ 83%]  (Sampling)
#> Chain 4: Iteration: 3000 / 3000 [100%]  (Sampling)
#> Chain 4: 
#> Chain 4:  Elapsed Time: 3.438 seconds (Warm-up)
#> Chain 4:                10.971 seconds (Sampling)
#> Chain 4:                14.409 seconds (Total)
#> Chain 4: 
#> Refer to the eDNAjoint guide for visualization tips:  https://bookdown.org/abigailkeller/eDNAjoint_vignette/tips.html#visualization-tips 
# }