
Visual inspection and classification of signals

Marcelo Araya-Salas and Grace Smith Vidaurre
2018-02-10


The warbleR workflow continues with visualization of selection signals for quality control filtering and classification. For more details about function arguments, input or output, please read the documentation for the function in question (e.g. ?lspec). warbleR is available on both CRAN and GitHub. The GitHub repository will always contain the latest functions and updates. We also published an article on warbleR documenting the package workflow [1].

Please note that most tools in warbleR use functions from the seewave, monitoR, tuneR and dtw packages internally. warbleR has been designed to make the analyses more accessible to average R-users. However, acoustic analysis with warbleR would not be possible without the tools provided by these additional packages. These packages should be given credit when using warbleR by including the appropriate citations in publications (e.g. citation("seewave")).

Nearly all warbleR functions contain options for parallel processing, which can greatly speed up analyses. See [1] for more details about parallel processing performance. Use parallel processing with caution, particularly if you use a Windows operating system. warbleR dependencies for parallel processing have not yet been optimized for Windows.
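As a rough sketch of how a core count might be chosen, the snippet below uses the base parallel package and leaves one core free. The names n_available and n_cores are hypothetical, X stands in for your own selection table, and the commented call assumes the parallel argument that warbleR functions such as specreator expose.

```r
# count the cores available on this machine (may return NA on some systems)
n_available <- parallel::detectCores()

# leave one core free for the rest of the system; fall back to 1 if unknown
n_cores <- if (is.na(n_available)) 1 else max(1, n_available - 1)
n_cores

# hypothetical usage, with X standing in for your own selection table:
# specreator(X, wl = 300, flim = c(2, 10), it = "jpeg", parallel = n_cores)
```

Starting with a single core and increasing the number only once the analysis runs cleanly is a cautious option, particularly on Windows.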

This vignette can be run without an advanced understanding of R, as long as you know how to run code in your R console. However, knowing more about basic R coding would be very helpful to modify the code for your research questions.

In the previous vignette we downloaded and filtered recordings from the open-access xeno-canto database, and discussed methods of automated and manual signal selection, as well as options to use warbleR with other bioacoustics software. Make sure to set your working directory prior to running this vignette, especially if RStudio has been closed since running the last vignette.

library(warbleR)

# set your working directory appropriately
# setwd("/path/to/working directory")

# run this if you have restarted RStudio between vignettes without saving your workspace
# assumes that you are in your /home/username directory
setwd(file.path(getwd(),"warbleR_example"))

# Check your location
getwd()
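
If you are unsure whether the example directory exists, a small guard can stop early with a clear message. This is only a sketch: safe_setwd is a hypothetical helper, not a warbleR function, and a temporary directory stands in for warbleR_example.

```r
# hypothetical helper: move into a directory only if it exists
safe_setwd <- function(dir) {
  if (!dir.exists(dir)) {
    stop("Directory not found: ", dir, call. = FALSE)
  }
  setwd(dir)          # switch to the requested directory
  invisible(getwd())  # return the new location invisibly
}

# example with a temporary directory standing in for warbleR_example
old_wd <- getwd()
example_dir <- file.path(tempdir(), "warbleR_example")
dir.create(example_dir, showWarnings = FALSE)
safe_setwd(example_dir)
basename(getwd())
setwd(old_wd)  # restore the original location
```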


Quality control filtering of selections

Short spectrograms of selections

specreator generates spectrograms of individual selected signals. These image files are a great way to filter out selections that were poorly made or represent signals that are not relevant to your analysis. This filtering step is particularly important after running autodetec, or if signals were manually selected by various users.

Phae.hisnr <- read.csv("Phae_hisnr.csv", header = TRUE)

specreator(Phae.hisnr, wl = 300, flim = c(2, 10), it = "jpeg", res = 150, osci = TRUE, ovlp = 90)

Inspect the spectrograms and delete poor-quality image files to prepare for later steps. Make sure you are working in a directory that only has image files associated with this vignette. Delete the image files corresponding to recording 154138 selection 39 and 154161 selection 145, as the start coordinates for these selections are not accurate.

Use filtersels to remove selections with missing image files

# remove selections after deleting corresponding image files
Phae.hisnr2 <- filtersels(Phae.hisnr, it = "jpeg", incl.wav = TRUE)
nrow(Phae.hisnr2) # 23 selections left
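
To make the idea behind this step concrete, the base R sketch below keeps only the rows of a selection table whose image file still exists. It is not the filtersels implementation. The mock table, the temporary folder and the file naming pattern (sound file name, a dash, the selection number, then .jpeg) are assumptions made for illustration.

```r
# mock selection table standing in for Phae.hisnr (hypothetical values)
sels <- data.frame(
  sound.files = c("rec-A.wav", "rec-A.wav", "rec-B.wav"),
  selec = c(1, 2, 1),
  stringsAsFactors = FALSE
)

# temporary folder standing in for the working directory
img_dir <- file.path(tempdir(), "filtersels_sketch")
dir.create(img_dir, showWarnings = FALSE)

# assumed naming convention: <sound file>-<selec>.jpeg
img_names <- paste0(sels$sound.files, "-", sels$selec, ".jpeg")

# pretend the image for the second selection was deleted during inspection
file.create(file.path(img_dir, img_names[-2]))

# keep only the selections whose image file is still present
kept <- sels[file.exists(file.path(img_dir, img_names)), ]
nrow(kept)
```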

After removing the poorest quality selections or signals, there are some other quality control steps that may be helpful.


Check that selections can be read, and if not, fix sound files

The function checksels tests whether selections can be read by downstream functions. It also returns a data frame with columns for duration, minimum samples, sampling rate, channels and bits.

# if selections can be read, "OK" will be printed to check.res column
checksels(Phae.hisnr2, check.header = FALSE)

If selections cannot be read, it is possible the sound files are corrupt. If so, use the fixwavs function to repair wav files.
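
One possible way to act on the check is to split the result by the check.res column. The sketch below uses a mock data frame rather than real checksels output, and the text of the failure message is a made-up placeholder. Only the "OK" value comes from this vignette.

```r
# mock of a checksels() result; real output has further columns
cs <- data.frame(
  sound.files = c("rec-A.wav", "rec-A.wav", "rec-B.wav"),
  selec = c(1, 2, 1),
  check.res = c("OK", "OK", "sound file can't be read"),  # placeholder message
  stringsAsFactors = FALSE
)

# selections that passed the check
readable <- cs[cs$check.res == "OK", ]

# sound files with at least one failed selection, candidates for repair
to_fix <- unique(cs$sound.files[cs$check.res != "OK"])
to_fix
```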

Cut selections into individual sound files

Hearing is as important as seeing for acoustic analysis, and the function cut_sels can be very useful for aural comparison of selected signals. Selected signals can be played as individual sounds rather than having to open up entire sound files. As a word of caution, generating cuts of sound files will also propagate any naming errors present in the original files. In general, it is usually better to avoid creating many cuts, but if you must do so, just proceed carefully.

cut_sels can also be used to your advantage if your original recordings are long (over 10-15 minutes). Some warbleR functions, in particular manualoc, will run slowly with long recordings, so input of shorter duration is desirable. You can make selections of shorter pieces of long original recordings, either in Raven or Syrinx, and use cut_sels to generate shorter segments for smoother signal detection in warbleR.

cut_sels(Phae.hisnr2, labels = c("sound.files", "selec"))
# known bug: the cut_sels call above fails with the following error
# Error in apply(X[, sapply(X, is.factor)], 2, as.character) : 
#   dim(X) must have a positive length
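
A plausible reading of this error, offered as a guess rather than a confirmed diagnosis, is that the selection table has exactly one factor column. In that case the subsetting step returns a plain vector instead of a data frame, and apply stops because a vector has no dimensions. The base R sketch below reproduces the message with a mock table; it does not call cut_sels.

```r
# mock selection table with a single factor column, as read.csv
# produced by default in R versions before 4.0
X <- data.frame(
  sound.files = factor(c("rec-A.wav", "rec-B.wav")),
  selec = c(1, 2)
)

# selecting one column with [ , ] drops the data frame to a plain vector
one_col <- X[, sapply(X, is.factor)]
is.null(dim(one_col))

# apply() needs an object with dimensions, so this reproduces the message
msg <- tryCatch(
  apply(one_col, 2, as.character),
  error = function(e) conditionMessage(e)
)
msg

# keeping the data frame structure with drop = FALSE avoids the error
ok <- apply(X[, sapply(X, is.factor), drop = FALSE], 2, as.character)
dim(ok)
```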

Tailor selections that were not well-selected

Sometimes the start and end times of selected signals need fine-tuned adjustments. This is particularly true when signals are found within bouts of closely delivered sounds that may be hard to pull apart, such as duets, or if multiple researchers use different rules-of-thumb to select signals. seltailor provides an interactive interface similar to manualoc for tailoring the temporal coordinates of selections.

If you check out the image files generated by running specreator above, you’ll see that some of the selections made during the automatic detection process with autodetec do not have accurate start and/or end coordinates.

For instance:

The start of this signal is not well selected.
[Figure: Start coordinate not well-selected]

The end of this signal is not well selected.
[Figure: End coordinate not well-selected]


The temporal coordinates for the tailored signals will be saved in a .csv file called seltailor_output.csv that can be read back into R to continue downstream analyses.

seltailor(Phae.hisnr2, wl = 300, flim = c(2,10), wn = "hanning", mar = 0.1,
 osci = TRUE, title = c("sound.files", "selec"))

Phae.hisnrt <- read.csv("seltailor_output.csv", header = TRUE)
str(Phae.hisnrt)
## 'data.frame':    23 obs. of  6 variables:
##  $ sound.files: Factor w/ 5 levels "Phaethornis-longirostris-154070.wav",..: 1 1 1 1 1 2 2 2 2 2 ...
##  $ selec      : int  11 13 15 19 29 2 36 76 126 148 ...
##  $ start      : num  6.95 8.08 9.23 11.61 17.73 ...
##  $ end        : num  7.08 8.21 9.35 11.74 17.86 ...
##  $ SNR        : num  11 12 12.4 12.1 11.1 ...
##  $ tailored   : Factor w/ 1 level "y": 1 1 1 1 1 1 1 1 1 1 ...
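
After tailoring, a quick sanity check on the coordinates can catch slips made in the interactive window. The sketch below rebuilds the first five rows from the str output above and checks their durations. The object names and the 50% tolerance are arbitrary choices for illustration.

```r
# first five rows of the tailored table, copied from the str() output above
tailored <- data.frame(
  selec = c(11, 13, 15, 19, 29),
  start = c(6.95, 8.08, 9.23, 11.61, 17.73),
  end   = c(7.08, 8.21, 9.35, 11.74, 17.86)
)

# duration of each selection in seconds
tailored$duration <- tailored$end - tailored$start

# every selection should end after it starts
all(tailored$duration > 0)

# flag durations far from the median (the 50% tolerance is arbitrary)
med <- median(tailored$duration)
flagged <- tailored$selec[abs(tailored$duration - med) > 0.5 * med]
flagged
```

The same check could be run on the full Phae.hisnrt data frame once seltailor_output.csv has been read in.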

Visual classification of selected signals

Visual classification of signals is fundamental to vocal repertoire analysis, and can also be useful for other studies. If your research focuses on assessing variation between individuals or groups, several warbleR functions can provide you with important information about how to steer your analysis. If there is obvious variation in vocalization structure across groups (e.g. treatments or geographic regions), you can focus your analysis on visual classification of vocalizations.

Highlight spectrogram regions with color.spectro

color.spectro allows you to highlight selections you’ve made within a short region of a spectrogram. In the example below we will use color.spectro to highlight neighboring songs. This function has a wide variety of uses, and could be especially useful for analysis of duets or coordinated singing bouts. This example is taken directly from the color.spectro documentation. If working with your own data frame of selections, make sure to calculate the frequency range for your selections beforehand using the function frange, which will come up in the next vignette.

# we will use Phaethornis songs and selections from the warbleR package
data(list = c("Phae.long1", "selec.table"))
writeWave(Phae.long1, "Phae.long1.wav") # save the sound file

# subset selection table
# already contains the frequency range for these signals
st <- selec.table[selec.table$sound.files == "Phae.long1.wav",]
 
# read wave file as an R object
sgnl <- tuneR::readWave(as.character(st$sound.files[1]))
 
# create color column
st$colors <- c("red2", "blue", "green")
 
# highlight selections
color.spectro(wave = sgnl, wl = 300, ovlp = 90, flim = c(1, 8.6), collevels = seq(-90, 0, 5), 
              dB = "B", X = st, col.clm = "colors", base.col = "skyblue",  t.mar = 0.07, f.mar = 0.1)
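
The example above assigns three colors by hand, which presumably matches the number of rows in st. With your own selection table the number of rows will differ, so one option is to recycle a short palette to the right length. In this sketch st_mock and palette_cols are hypothetical names and the rows are made up.

```r
# mock selection table standing in for st (hypothetical rows)
st_mock <- data.frame(
  sound.files = rep("my_recording.wav", 5),
  selec = 1:5
)

# recycle a short palette so there is exactly one color per selection
palette_cols <- c("red2", "blue", "green")
st_mock$colors <- rep_len(palette_cols, nrow(st_mock))
st_mock$colors
```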

Create lexicons with catalog

This section on catalog is taken directly from Marcelo Araya-Salas’s bioacoustics GitHub blog with slight modifications. When we are interested in geographic variation of acoustic signals, we usually want to compare spectrograms from different individuals and sites. This can be challenging when working with large numbers of signals, individuals and/or sites. catalog aims to simplify this task.

This is how it works:

  • catalog plots a matrix of spectrograms from signals listed in a selection table
    • similar to the example data frame selec.table in warbleR, run data(selec.table) after loading warbleR
  • Graphs are saved as image files in the working directory (or path provided)
  • Several images are generated if the signals do not all fit in a single file
  • Spectrograms can be labeled or color-tagged to facilitate exploring variation related to the parameter of interest (e.g. site or song type if already classified)
  • A legend can be added to help match colors with tag levels
    • different color palettes can be used for each tag
  • The duration of the signals can be “fixed” such that all the spectrograms have the same duration
    • facilitates comparisons
  • You can control the number of rows and columns as well as the width and height of the output image
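
As a back-of-the-envelope check of how many image files to expect, the arithmetic below uses the 23 selections left after filtering and the 4 by 3 layout used later in this vignette. It assumes each image is filled completely before a new one is started; the variable names are hypothetical.

```r
n_signals <- 23  # selections left after quality control filtering
n_row <- 4       # rows of spectrograms per image
n_col <- 3       # columns of spectrograms per image

# number of spectrograms that fit in one image
per_image <- n_row * n_col

# number of image files needed to show every signal
n_images <- ceiling(n_signals / per_image)
n_images

# number of spectrograms on the last, partly filled image
n_last <- n_signals - (n_images - 1) * per_image
n_last
```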

Recent updates to catalog allow you to group signals into biologically relevant groups by coloring the background of selected spectrograms accordingly. There is also an option to add hatching to tag labels, as well as filling the catalog with spectrograms by rows or columns of the selection table data frame, among other additional arguments. Check out Marcelo’s post on new updates to the catalog function.

Note the use of the move.imgs function, which can come in handy when creating multiple catalogs to avoid overwriting previous image files, or when working through rounds of other image files. In this case, the first catalog we create has signals labeled, tagged and grouped with respective color and hatching levels. The second catalog we create will not have any grouping of signals whatsoever, and could be used for a test of inter-observer reliability. move.imgs helps us move the first catalog into another directory to save it from being overwritten when creating the second catalog.

# create a column of recording IDs for friendlier catalog labels
# file names look like "Phaethornis-longirostris-154070.wav":
# keep the third "-" separated field and strip the ".wav" extension
rec_ID <- sapply(strsplit(as.character(Phae.hisnrt$sound.files), split = "-", fixed = TRUE), function(x){
  sub("\\.wav$", "", x[3])
})

Phae.hisnrt$rec_ID <- rec_ID
str(Phae.hisnrt)

# set color palette
# alpha controls transparency for softer colors
cmc <- function(n) cm.colors(n, alpha = 0.8)

catalog(X = Phae.hisnrt, flim = c(1, 10), nrow = 4, ncol = 3, height = 10, width = 10, tag.pal = list(cmc), cex = 0.8, same.time.scale = TRUE, mar = 0.01, wl = 300, gr = FALSE, labels = "rec_ID", tags = "rec_ID", hatching = 1, group.tag = "rec_ID", spec.mar = 0.4, lab.mar = 0.8, max.group.cols = 5)

catalog2pdf(keep.img = FALSE, overwrite = TRUE)

# assuming we are working from the warbleR_example directory
# the ~/ format does not apply to Windows
# make sure you have already moved or deleted all other pdf files
move.imgs(from = ".", it = "pdf", create.folder = TRUE, folder.name = "Catalog_image_files")

[Figure: Catalog with labels, tags and groups]


# now create a catalog without labels, tags, groups or axes
Phae.hisnrt$no_label <- ""

catalog(X = Phae.hisnrt, flim = c(1, 10), nrow = 4, ncol = 3, height = 10, width = 10, cex = 0.8, same.time.scale = TRUE, mar = 0.01, wl = 300, spec.mar = 0.4, rm.axes = TRUE, labels = "no_label")

catalog2pdf(keep.img = FALSE, overwrite = TRUE)

[Figure: Catalog without labels, tags and groups]


Next vignette: Acoustic (dis)similarity and coordinated singing

Here we have finished the second phase of the warbleR workflow, which includes many options for quality control filtering or visualization that can be used to your advantage during acoustic analysis. After running the code in this second vignette, you should now have an idea of how to:

  • use spectrograms for quality control filtering
  • check selections for wav file compatibility
  • create wav files of selections
  • tailor temporal coordinates of selections
  • use different methods for visual classification of signals, including:
    • long spectrograms
    • highlighted regions within spectrograms
    • catalogues or lexicons of individual signals

The next vignette will cover the third phase of the warbleR workflow, which includes methods to perform acoustic measurements as a batch process, an example of how to use these measurements for an analysis of geographic variation, and coordinated singing analysis.


References

  1. Araya-Salas, M. and G. Smith-Vidaurre. 2016. warbleR: an R package to streamline analysis of animal acoustic signals. Methods in Ecology and Evolution. doi: 10.1111/2041-210X.12624