Listening Scientifically with Whales

Written by Éadin O’Mahony, Marie Comuzzo, Eric R. Snyder and Alex South

To listen with whales scientifically can mean many things. Scientists listen through sound, using hydrophones to record their songs and calls. They listen through images, using spectrograms that translate sound into visual patterns. They listen through machines, which help detect and classify whale vocalizations across vast datasets. And they listen through DNA, reading genetic sequences that carry traces of ancestry, survival, and loss. Each of these offers a different way of encountering whales, not just in the present but also across time, space, and more-than-human culture. This deepens our understanding of whales and the world they inhabit. It shapes the ways in which we tell their stories, and in turn, how their stories influence our culture.

We can listen in the most literal sense: using our ears, and by extension underwater microphones (hydrophones), to hear and record the sounds that whales produce to communicate with each other and to navigate their environment.

In their popular writings, scientists find as much wonder and beauty in whale sounds as the rest of us. When “on duty,” however, scientists attend to whale song as a stream of auditory objects, to be classified, measured, and analyzed in pursuit of a variety of research questions, monitoring purposes, and conservation goals.

We’ve used recordings to learn that sperm whales form matriarchal clans who are known to each other through symbolic markers of cultural identity: specific patterns of clicks called “codas” that signal membership in a clan (Hersh et al., 2022).

We’ve learned that male humpback whales co-compose songs that may be shared across entire ocean basins: when one whale introduces a change, others follow, resulting in an ever-changing song (Payne & Payne, 1985). This accumulation of small changes is known as “cultural evolution.” In the South Pacific, the rapid population-wide replacement of whole songs has also been observed in events labelled “cultural revolutions” (Garland et al., 2011).

According to the founding myth of humpback whale song studies, it was only when bioacousticians Roger and Katy Payne studied Scott and Hella McVay’s spectrograms that they came to realize that these vocalizations exhibited the long-range repetitive structure that legitimated their description as “song” in a biological sense (Bridge, 2025).

However, reflecting on their first time listening to humpbacks, Katy Payne said that it felt like “looking through a window and seeing the rest of her life” (Angier, 1983, p. 44, from the Ocean Alliance Archive), and Roger wrote that it was “the first time [he] had ever heard the abyss,” continuing: “That’s what whales do; they give the ocean its voice, and the voice they give it is ethereal and unearthly” (Payne, 1995, p. 145). Yet listening scientifically has never relied on the ear alone.

Spectrograms turn sound into an image that can be measured. They allow scientists to view patterns of sounds: how long they last and how the pitches rise and fall. This makes it possible to see, rather than hear, how phrases repeat over time. The scientific processes of measurement, categorization, annotation, and so on almost inevitably use spectrograms in the investigation of field recordings. In making sound visible, spectrograms allow scientists to analyze whale vocalizations with a level of precision that can feel more objective than listening alone, even as this process shifts attention away from an embodied experience of sound, allowing scientists to “transcend” what are seen as the limitations of the physical senses.
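To make the idea of a “sound picture” concrete, here is a minimal sketch of how a spectrogram can be computed. It is an illustration only, not the method of any study cited here: the signal is a synthetic rising tone rather than a whale recording, and the function names, frame size, and sample rate are our own assumptions. The recording is cut into short overlapping frames, and for each frame we measure how much energy sits at each frequency.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of magnitudes (one per frequency bin) for each time frame."""
    # A Hann window tapers the edges of each frame to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        magnitudes = []
        # Naive discrete Fourier transform, keeping non-negative frequencies only.
        for k in range(frame_size // 2 + 1):
            total = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n in range(frame_size))
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames


def peak_frequencies(frames, sample_rate, frame_size=64):
    """Return the loudest frequency (in Hz) of each frame."""
    return [max(range(len(m)), key=m.__getitem__) * sample_rate / frame_size
            for m in frames]


# Hypothetical test signal: a one-second tone sweeping upward from 100 Hz to 400 Hz.
sample_rate = 1000  # samples per second (assumed)
f0, f1, duration = 100.0, 400.0, 1.0
samples = [
    math.sin(2 * math.pi * (f0 * t + 0.5 * (f1 - f0) * t * t / duration))
    for t in (i / sample_rate for i in range(int(sample_rate * duration)))
]

frames = spectrogram(samples)
peaks = peak_frequencies(frames, sample_rate)
print(f"{len(frames)} frames; loudest frequency rises from "
      f"{peaks[0]:.0f} Hz to {peaks[-1]:.0f} Hz")
```

Plotting `frames` as an image, with time on one axis and frequency on the other, would show the upsweep as a rising line. In practice, researchers rely on optimized library routines rather than a hand-written transform like this one.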

As is the case for musicians and notated scores, bioacousticians can come to hear the sounds they read on the screen in their “inner ear.” And yet, as the critique of musicology’s focus on scores has shown (e.g., Hasty, 1997), the experience of reading visual representations of sounds is very different to that of listening to them. For one thing, it gives us the sounds “all at once,” failing to replicate the ebb and flow of anticipation and effect that accompanies typical human listenings to structured sounds such as those of music. Recordings can last for months or years, making real-time listening impractical. Instead, scientists often turn the recordings into images so hours of data can be examined in a single figure (Wiggins and Hildebrand, 2006).

These “sound pictures” revolutionized the study of more-than-human sounds when they were first introduced in the mid-twentieth century, seemingly circumventing human perceptual limits and allowing the identification of sound patterns occurring at timescales faster or slower than our usual frame of reference.

Where individuals listen to and categorize sounds – and this has been demonstrated as efficacious in many cases (e.g., recognizing the signature whistles of different bottlenose dolphins, distinguishing among killer whale dialects, or coding the nested repetitions of humpback whale song) – studies typically involve multiple scientists to ensure the reliability of the result.

However, as underwater recording technologies have become more robust, more reliable, and cheaper, the sheer amount of data has necessitated automated detection and classification, facilitated by tools such as PAMGuard, a leading software for the detection, classification, and localization of marine mammal and other animal sounds (Gillespie et al., 2026).

As recordings accumulated, listening began to exceed the capacity of human perception alone. Because there are not enough trained experts to label the sheer number of recordings, recent work has called for enlisting citizen scientists (members of the public) and using majority voting, in which multiple annotators label the same sound and the most common classification is retained, to improve annotation quality (Dubus et al., 2024), drawing on the so-called “wisdom of the crowd.”
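Majority voting can be sketched in a few lines. This is a simplified illustration, not the procedure used by Dubus et al.: the clip names and labels are invented, and the choice to leave ties and weak majorities unlabelled is our own assumption about how ambiguous cases might be handled.

```python
from collections import Counter


def majority_vote(labels, min_agreement=0.5):
    """Return (winning_label, agreement) for one clip.

    The label is None when there is a tie for first place or when the most
    common label does not exceed the min_agreement share of the votes.
    """
    if not labels:
        return None, 0.0
    ranked = Counter(labels).most_common()
    top_label, top_count = ranked[0]
    agreement = top_count / len(labels)
    tied = len(ranked) > 1 and ranked[1][1] == top_count
    if tied or agreement <= min_agreement:
        return None, agreement
    return top_label, agreement


# Hypothetical annotations: each clip was labelled by several volunteers.
annotations = {
    "clip_001": ["humpback", "humpback", "boat noise"],
    "clip_002": ["humpback", "minke"],
    "clip_003": ["minke", "minke", "minke", "humpback"],
}

results = {clip: majority_vote(votes) for clip, votes in annotations.items()}
for clip, (label, agreement) in results.items():
    print(clip, label, f"{agreement:.2f}")
```

Clips that come back unlabelled, such as the evenly split `clip_002`, could then be passed to a trained expert, so that scarce expert time is spent where the crowd disagrees.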

However, much scientific listening has become “machine listening,” in which human ears are primarily enlisted for the purposes of the initial annotation of datasets, the training of models, and their final validation.

While machine learning tools offer necessary steps in the testing of scientific hypotheses, and we don’t want to imply that bioacousticians have lost their ear for the often wondrous strangeness of the sounds they study, we want to note that when we move past listening to, we may also lose our chance to listen with whales.

The methodologies used are as diverse as the study aims, but – at least in the early stages – typically involve human researchers spending many hours listening to field recordings, usually assisted by the simultaneous viewing of spectrograms, followed by the use of machine learning tools to visualize, categorize, and analyze the sounds they hear. When coupled with machine learning tools, “listening scientifically” to whales can become a somewhat detached process, becoming less about hearing their sounds, and more about analyzing visual patterns, structures, and data that stand in for the sounds themselves.

Listening to whale sounds engages the body differently than seeing their visual representation. One process that might emerge through sustained listening is entrainment. Entrainment is the process through which brainwaves and the body’s internal rhythms start synchronizing with the rhythms in the sounds listened to. For example, neurons can synchronize their patterns to the temporal structure of sound, and listeners often find themselves breathing differently, anticipating repetitions or beats, or moving their bodies in ways that predict the next sound. While humans and whales inhabit very different sensory worlds, listening to their sounds can bring human perception into closer temporal and physiological proximity to whales. Entrainment potentially moves the listener from listening to, to listening with, as human bodies, however partially, begin to share the inner rhythms that shape whales’ lives. Thus, listening to whales with our ears and bodies, especially through sustained, attentive listening, may offer insights into how these sounds are produced, experienced, and how they shape relationships among whales and other beings.

Sound is crucial to a whale’s experience in the world. Whales depend on sound to communicate and observe their surroundings. Human activities, like shipping, oil exploration, construction, and military activities, have drastically altered the acoustic world of whales. Human sonic activities have been shown to alter whale behavior, increase stress levels, and even cause mass strandings (Bernaldo de Quirós et al., 2019). Our acoustic observations of the ocean have only existed in a post-industrial revolution world, where human activities had already greatly impacted the soundscape (ZoBell et al., 2024). Scientists are “listening with whales” by seeking to understand how our sonic activities have impacted their world.

While bioacoustics allows scientists to listen across distances and time, revealing some of the cultural transformations happening since 1949 (the date of the oldest recordings we have), genetics offers another kind of listening, one that reaches far across generations, tracing lives through the molecular archives carried within their cells.

Thus we can also listen in a more metaphorical sense. Every being carries within its cells a record of its life and ancestry, written in DNA – a molecular code made of four letters (A, T, C, and G). By listening through genetic sampling, we can draw intergenerational histories that tell us about the overall health of entire species. We can also listen to their past: encoded in their genomes (the complete set of genetic instructions that makes an organism what it is) lie signals of the history of commercial whaling, in which large populations were pushed by humans to the brink of extinction. Genetic signals today reveal the lasting effects of this collective trauma. When whale populations were drastically reduced, individuals bred with close relatives. These severe levels of inbreeding are observed through what are known as “runs of homozygosity”: long stretches of the genome that are formed when both parents pass down the same segment of DNA.
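As a rough illustration of what a run of homozygosity looks like in data, the sketch below scans a toy sequence of genotypes (the pair of letters inherited from the two parents at each position) and reports stretches where the two letters match at every position. The genotypes, the function name, and the minimum run length are invented for illustration. Real analyses work over millions of positions, measure runs in base pairs, and tolerate occasional mismatches caused by sequencing error.

```python
def runs_of_homozygosity(genotypes, min_length=4):
    """Return (start, end) index pairs, inclusive, for each run of at least
    min_length consecutive positions where both inherited letters match."""
    runs = []
    start = None
    for i, (from_mother, from_father) in enumerate(genotypes):
        if from_mother == from_father:
            if start is None:
                start = i  # a new homozygous stretch begins here
        else:
            if start is not None and i - start >= min_length:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(genotypes) - start >= min_length:
        runs.append((start, len(genotypes) - 1))
    return runs


# Hypothetical genotypes at ten consecutive positions along a chromosome.
genotypes = [tuple(pair) for pair in "AA AT CC CC GG TT AA CG TT TT".split()]

runs = runs_of_homozygosity(genotypes)
roh_fraction = sum(end - start + 1 for start, end in runs) / len(genotypes)
print(runs, roh_fraction)
```

The share of positions falling inside such runs (`roh_fraction` here) is one simple way to summarize how inbred an individual is: the more of the genome that sits in long matching stretches, the more closely related its parents are likely to have been.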

Whale baleen retains a record of their history; much like tree rings document the environment over a tree’s lifespan, the slow-growing baleen of mysticetes (baleen whales) contains insights into the environmental changes whales have experienced. Some species live for hundreds of years, and chemical and isotopic analysis of baleen reveals historical properties of the oceans, or how populations may have altered their feeding behaviors over time (Teixeira et al., 2022). These historical analyses underline the impact humans have had on the oceans. By uncovering these traces of embodied history, we can equip ourselves with the necessary knowledge to help whales recover and thrive in perpetuity.

Traditionally, tissue samples are collected from whales using crossbows, yielding very high-quality samples that can prove extremely important for detailed genomic work. However, such high-quality data is not always needed to answer important questions: increasingly we see the adoption of non-invasive ways of sampling whales. This includes flying remotely piloted drones through the exhaled breath (the “blow”) of whales, then carefully preserving the sample to sequence the whale’s genome from the epithelial cells found in what is essentially its snot (O’Mahony et al., 2024)!

These samples contain enough DNA to answer questions important to conservation, such as how whales within the same social group are related to one another, and how genetically different various whale populations are from each other.

We suggest that we need not only to stay with the auditory experience, but also to practice a degree of embodied reflection. That is, we must acknowledge the histories held in bodies, as well as the ways bodies, and their ways of knowing, can produce knowledge. We must self-consciously recognize the empathic leap required to move from regarding the sounds we study, or the being producing them, as “object,” to asking ourselves what it might be like to hear, attend to, comprehend, and respond to the sounds as a whale, and as a human listening with that whale. This certainly means imagining as fully as possible the vocal and perceptual capacities of the individual whale to which we are listening, as well as its social and physical environment. It might also include attempting to imagine what it might be like to produce these sounds. Here we have to inform ourselves by all means possible (e.g., observations, playback experiments, Indigenous knowledges) of those differences and similarities in our perceptual and cognitive abilities that separate and unite sounding whales with listening humans.

Thus, listening scientifically in these various ways reveals how whales and humans are members of an interspecies community whose histories and cultures continuously influence each other, not only through sound but also through embodied experiences and histories that are recorded in genes. As these less invasive and more attuned ways of studying and relating to whales emerge, scientific listening itself may become listening with whales.