What makes a poet, a performer, a politician, or a preacher compelling to listen to? When we describe a speaker as neutral, dramatic, or monotonous, we are implicitly describing intonation (the rise and fall of the voice) and pitch range. The field of sound studies is developing apace, voice recognition and voice profiling are used in surveillance and employee recruitment, and debates about “NPR voice” and “poet voice” proliferate on social media. Yet quantitative analysis of the voice is still uncommon among humanist scholars concerned with the enormous audio archive of vocal recordings. Pitch-tracking is routinely employed in corpus linguistics, a field that overlaps with the digital humanities, particularly in the U.K., but its use is novel in sound studies and DH in the U.S. It has the potential to defamiliarize texts, as Tanya Clement has argued about other DH methods, opening them to new angles of scholarly inquiry.
This interactive demonstration of a new pitch-tracking tool will allow the audience to see, in real time, visualizations of the pitch range and intonation patterns of short vocal recordings of poems and speech, displayed as line graphs and figures. The tool derives from the audio analysis program ARLO (Adaptive Recognition with Layered Optimization), developed through the NEH-sponsored project High Performance Sound Technologies for Access and Scholarship (HiPSTAS). ARLO rivals programs commonly used by linguists, such as Praat, which are difficult to learn and have trouble tracking pitch in longer recordings of poor audio quality, a characteristic of much of the audio archive. The simple user interface for the ARLO pitch-tracker is the result of my collaboration, supported by an ACLS Digital Innovation Fellowship in 2015-16, with phonetician Georgia Zellou; sound and media artist and designer Dave Cerf; and David Tcheng, a machine learning scientist and senior audio signal analyst for GoPro, formerly of the Illinois Informatics Institute. (A current version of the interface is attached; it may change somewhat by May.)