Automated acoustic analysis is increasingly used in behavioural ecology, and determining caller identity is a key element of many investigations. However, variability in feature extraction and classification methods limits the comparability of results across species and studies, constraining the conclusions we can draw about the ecology and evolution of the groups under study. We investigated the impact of using different feature extraction methods (spectro-temporal measurements, linear and Mel-frequency cepstral coefficients, and highly comparative time-series analysis) and classification methods (discriminant function analysis, neural networks, random forests, and support vector machines) on the consistency of caller identity classification accuracy across 16 mammalian datasets. We found that Mel-frequency cepstral coefficients and random forests yield consistently reliable results across datasets, facilitating a standardised approach across species that generates directly comparable data. These findings remained consistent across vocalisation sample sizes and numbers of individuals considered. We offer guidelines for processing and analysing mammalian vocalisations, fostering greater comparability and advancing our understanding of the evolutionary significance of acoustic communication in diverse mammalian species.