Google’s New AI Is Trying to Talk to Dolphins—Seriously

In a collaboration that sounds straight out of sci-fi but is very much grounded in decades of ocean science, Google has teamed up with marine biologists and AI researchers to build a large language model designed not to chat with humans, but with dolphins.
The model is DolphinGemma, a cutting-edge LLM trained to recognize, predict, and eventually generate dolphin vocalizations, in an effort to crack the code not only on how the cetaceans communicate with each other, but also on how we might be able to communicate with them ourselves. Developed in partnership with the Wild Dolphin Project (WDP) and researchers at Georgia Tech, the model represents the latest milestone in a quest that’s been swimming along for more than 40 years.
A deep dive into a dolphin community

Since 1985, WDP has run the world’s longest underwater study of dolphins. The project investigates a group of wild Atlantic spotted dolphins (S. frontalis) in the Bahamas. Over the decades, the team has non-invasively collected underwater audio and video data associated with individual dolphins in the pod, detailing aspects of the animals’ relationships and life histories.
The project has yielded an extraordinary dataset—one packed with 41 years of sound-behavior pairings like courtship buzzes, aggressive squawks used in cetacean altercations, and “signature whistles” that act as dolphin name tags.
This trove of labeled vocalizations gave Google researchers what they needed to train an AI model designed to do for dolphin sounds what ChatGPT does for words. Thus, DolphinGemma was born: a roughly 400-million parameter model built on the same research that powers Google’s Gemini models.
DolphinGemma is audio-in, audio-out—the model “listens” to dolphin vocalizations and predicts what sound comes next—essentially learning the structure of dolphin communication.
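The idea of "predicting what sound comes next" can be illustrated with a toy sketch. This is not Google's actual model (DolphinGemma is a ~400-million-parameter neural network operating on audio); it is a minimal bigram-frequency predictor over made-up sound labels like "whistle_A", all of which are hypothetical, just to show what next-token prediction over a vocalization sequence means:

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count, for each sound token, which token tends to follow it."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, token):
    """Return the most frequent follower of `token`, or None if unseen."""
    if token not in counts:
        return None
    return counts[token].most_common(1)[0][0]

# Hypothetical recording session: a signature whistle that is usually
# answered by the same reply whistle.
session = ["whistle_A", "whistle_B", "click_burst", "whistle_A", "whistle_B"]
model = train_bigram(session)
print(predict_next(model, "whistle_A"))  # -> whistle_B
```

A real model replaces the raw counts with learned representations of the audio itself, but the training objective is the same in spirit: given the sounds so far, guess the next one.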
AI and animal communication

Artificial intelligence models are changing the rate at which experts can decipher animal communication. Everything under the sun, from dog barks to bird whistles, can be fed into large language models, which then use pattern recognition and any relevant context to sift through the noise and posit what the animals are “saying.”
Last year, researchers at the University of Michigan and Mexico’s National Institute of Astrophysics, Optics and Electronics used an AI speech model to identify dog emotions, gender, and identity from a dataset of barks.
Cetaceans, a group that includes dolphins and whales, are an especially good target for AI-powered interpretation because of their lifestyles and the way they communicate. For one, whales and dolphins are sophisticated, social creatures, which means that their communication is packed with nuance. But the clicks and shrill whistles the animals use to communicate are also easy to record and feed into a model that can unpack the “grammar” of the animals’ sounds. Last May, for example, the nonprofit Project CETI used software tools and machine learning on a library of 8,000 sperm whale codas, and found patterns of rhythm and tempo that enabled the researchers to create the whales’ phonetic alphabet.
Talking to dolphins with a smartphone

The DolphinGemma model can generate new, dolphin-like sounds in the correct acoustic patterns, potentially helping humans engage in real-time, simplified back-and-forths with dolphins. This two-way communication relies on what a Google blog referred to as Cetacean Hearing Augmentation Telemetry, or CHAT—an underwater computer that generates dolphin sounds the system associates with objects the dolphins like and regularly interact with, including seagrass and researchers’ scarves.
“By demonstrating the system between humans, researchers hope the naturally curious dolphins will learn to mimic the whistles to request these items,” the Google Keyword blog stated. “Eventually, as more of the dolphins’ natural sounds are understood, they can also be added to the system.”
CHAT is installed on modified smartphones, and the researchers’ idea is to use it to create a basic shared vocabulary between dolphins and humans. If a dolphin mimics a synthetic whistle associated with a toy, a researcher can respond by handing it over—kind of like dolphin charades, with the novel tech acting as the intermediary.
Future iterations of CHAT will pack in more processing power and smarter algorithms, enabling faster responses and clearer interactions between the dolphins and their human counterparts. Of course, that’s easier said than done outside controlled environments, and it raises serious ethical questions about how to interface with dolphins in the wild should the communication methods become more sophisticated.
A summer of dolphin science

Google plans to release DolphinGemma as an open model this summer, allowing researchers studying other species, including bottlenose or spinner dolphins, to apply it more broadly. DolphinGemma could be a significant step toward scientists better understanding one of the ocean’s most familiar mammalian faces.
We’re not quite ready for a dolphin TED Talk, but the possibility of two-way communication is a tantalizing indicator of what AI models could make possible.