Acoustic templates correlator using time-frequency cross-correlation
Source: R/template_correlator.R
template_correlator estimates template cross-correlation across multiple sound files.
Usage
template_correlator(templates, files = NULL, hop.size = 11.6, wl = NULL, ovlp = 0,
  wn = "hanning", cor.method = "pearson", cores = 1, path = ".",
  pb = TRUE, type = "fourier", fbtype = "mel", ...)
Arguments
- templates
'selection_table', 'extended_selection_table' (warbleR package formats, see selection_table) or data frame with time and frequency information of the sound event(s) to be used as templates (1 template per row). The object must contain columns for sound files (sound.files), selection number (selec), and start and end time of the sound event (start and end). If frequency range columns are included ('bottom.freq' and 'top.freq', in kHz), the correlation will be run on those frequency ranges. All templates must have the same sampling rate, and the sound files in 'files' (in which to find the templates) must also have that sampling rate.
- files
Character vector with the names of the files in which to run the cross-correlation with the supplied template(s). Supported file formats: '.wav', '.mp3', '.flac' and '.wac'. If not supplied, the function will work on all sound files (in the supported formats) in 'path'.
- hop.size
A numeric vector of length 1 specifying the time window duration (in ms). Default is 11.6 ms, which is equivalent to a window length ('wl') of 512 samples at a 44.1 kHz sampling rate. Ignored if 'wl' is supplied.
- wl
A numeric vector of length 1 specifying the window length of the spectrogram. Default is NULL. If supplied, 'hop.size' is ignored.
- ovlp
Numeric vector of length 1 specifying the % of overlap between two consecutive windows, as in spectro. Default is 0. High values of ovlp slow down the function but may produce more accurate results.
- wn
A character vector of length 1 specifying the window name as in ftwindow.
- cor.method
A character vector of length 1 specifying the correlation method as in cor.
- cores
Numeric. Controls whether parallel computing is applied. It specifies the number of cores to be used. Default is 1 (i.e. no parallel computing).
- path
Character string containing the directory path where the sound files are located. The current working directory is used as default.
- pb
Logical argument to control the progress bar. Default is TRUE.
- type
A character vector of length 1 specifying the type of cross-correlation: "fourier" (i.e. spectrographic cross-correlation using Fourier transform; internally using spectro; default), "mfcc" (auditory scale coefficient matrix cross-correlation; internally using melfcc) or "mel-auditory" (cross-correlation of auditory spectrum, i.e. spectrum after transformation to an auditory scale; internally using melfcc). The argument 'fbtype' controls the auditory scale to be used. Note that the last 2 methods have not been widely used in this context, so they can be regarded as experimental.
- fbtype
Character vector indicating the auditory frequency scale to use: "mel", "bark", "htkmel" or "fcmel". Default is "mel".
- ...
Additional arguments to be passed to melfcc for further customization when using auditory scales.
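The relation between 'hop.size' and 'wl' described above can be checked with a little arithmetic. The sketch below is only an illustration: hop_to_wl is a hypothetical helper (not a function of the package), and it assumes the conversion is a plain rounding of sampling rate times duration; the package may apply further adjustments internally.

```r
# Hypothetical helper (not part of the package): convert a window duration
# in ms into a window length in samples for a sampling rate given in Hz.
# Assumption: the result is simply rounded to the nearest integer.
hop_to_wl <- function(hop.size, sampling.rate) {
  round(sampling.rate * hop.size / 1000)
}

# 44100 * 11.6 / 1000 = 511.56, which rounds to 512 samples
wl_44k <- hop_to_wl(hop.size = 11.6, sampling.rate = 44100)

# the same duration at 22.05 kHz covers about half as many samples
wl_22k <- hop_to_wl(hop.size = 11.6, sampling.rate = 22050)

print(c(wl_44k, wl_22k))
```

Expressing the window as a duration keeps the time resolution comparable across recordings with different sampling rates, whereas a fixed 'wl' does not.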
Value
The function returns an object of class 'template_correlations', which is a list with the correlation scores for each combination of templates and files. 'template_correlations' objects can then be used to infer sound event occurrences using template_detector or to graphically explore template correlations across sound files using full_spectrograms.
Details
This function calculates the similarity of acoustic templates across sound files by means of time-frequency cross-correlation. Fourier spectrograms or time-frequency representations from auditory scales (including cepstral coefficients) can be used. Several templates can be run over several sound files. Note that template-based detection is divided into two steps: template correlation (using this function) and template detection (or peak detection, as it infers detections from peak correlation scores; using the function template_detector). The output of this function (an object of class 'template_correlations') must therefore be passed to template_detector to infer sound event occurrences. optimize_template_detector can be used to optimize template detection.
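The two steps described above can be illustrated with a small, self-contained sketch in base R. It is not the package's implementation: slide_correlation is a hypothetical helper, the "spectrogram" is synthetic noise, and the threshold of 0.7 is an arbitrary value chosen for the illustration.

```r
# Hypothetical helper: slide a template matrix (frequency bins x time frames)
# along a longer matrix and compute one correlation score per time offset.
slide_correlation <- function(template, target, method = "pearson") {
  n_tmpl <- ncol(template)
  n_steps <- ncol(target) - n_tmpl + 1
  vapply(seq_len(n_steps), function(i) {
    window <- target[, i:(i + n_tmpl - 1), drop = FALSE]
    cor(as.vector(template), as.vector(window), method = method)
  }, numeric(1))
}

set.seed(42)
# synthetic "spectrogram": 20 frequency bins x 100 time frames of noise
target <- matrix(runif(20 * 100), nrow = 20)
# use frames 31 to 40 as the template, so the true match starts at frame 31
template <- target[, 31:40]

# step 1 (template correlation): one score per time offset
scores <- slide_correlation(template, target)

# step 2 (peak detection): keep offsets whose score reaches a threshold
threshold <- 0.7
detections <- which(scores >= threshold)
print(detections)
```

Keeping the score vector separate from the thresholding step is what allows different thresholds to be evaluated without recomputing the correlations.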
References
Araya-Salas, M., Smith-Vidaurre, G., Chaverri, G., Brenes, J. C., Chirino, F., Elizondo-Calvo, J., & Rico-Guevara, A. (2022). ohun: an R package for diagnosing and optimizing automatic sound event detection. bioRxiv, 2022.12.13.520253. https://doi.org/10.1101/2022.12.13.520253
Khanna, H., Gaunt, S. L. L., & McCallum, D. A. (1997). Digital spectrographic cross-correlation: tests of sensitivity. Bioacoustics, 7(3), 209-234.
Lyon, R. H., & Ordubadi, A. (1982). Use of cepstra in acoustical signal analysis. Journal of Mechanical Design, 104(2), 303-306.
Author
Marcelo Araya-Salas (marcelo.araya@ucr.ac.cr)
Examples
{
# load example data
data("lbh1", "lbh2", "lbh_reference")
# save sound files
tuneR::writeWave(lbh1, file.path(tempdir(), "lbh1.wav"))
tuneR::writeWave(lbh2, file.path(tempdir(), "lbh2.wav"))
# create template
templ <- lbh_reference[4, ]
templ2 <- warbleR::selection_table(templ,
  extended = TRUE,
  path = tempdir()
)
# fourier spectrogram
(tc_fr <- template_correlator(templates = templ, path = tempdir(), type = "fourier"))
# mel auditory spectrograms
(tc_ma <- template_correlator(templates = templ, path = tempdir(), type = "mel-auditory"))
# mfcc spectrograms
(tc_mfcc <- template_correlator(templates = templ, path = tempdir(), type = "mfcc"))
# similar results (but not exactly the same) are found with the 3 methods
# these are the correlations between the correlation score vectors
# fourier vs mel-auditory
cor(
  tc_fr$`lbh2.wav-4/lbh2.wav`$correlation.scores,
  tc_ma$`lbh2.wav-4/lbh2.wav`$correlation.scores
)
# fourier vs mfcc
cor(
  tc_fr$`lbh2.wav-4/lbh2.wav`$correlation.scores,
  tc_mfcc$`lbh2.wav-4/lbh2.wav`$correlation.scores
)
# mel-auditory vs mfcc
cor(
  tc_ma$`lbh2.wav-4/lbh2.wav`$correlation.scores,
  tc_mfcc$`lbh2.wav-4/lbh2.wav`$correlation.scores
)
# using an extended selection table
templ_est <- warbleR::selection_table(templ,
  extended = TRUE,
  path = tempdir()
)
tc_fr_est <- template_correlator(templates = templ_est, path = tempdir(), type = "fourier")
# produces the same result as templates in a regular data frame
cor(
  tc_fr$`lbh2.wav-4/lbh2.wav`$correlation.scores,
  tc_fr_est$`lbh2.wav_4-1/lbh2.wav`$correlation.scores
)
}
#> all selections are OK
#>
#> all selections are OK
#>
#> [1] 1