Enter The Aviary: Time Travel With Birds and Machines
The Aviary is a prototype developed at the 3A Institute at the Australian National University and launched on 11 November 2020. I conceived and designed it together with Memunat Ibrahim, Jacob Choi, and Mikaela Jade. This is a personal reflection on the ideas behind the work and its development.
Before Covid-19 became the crisis of 2020, and a month before I arrived in Canberra, the world was reeling from the environmental devastation wrought by a series of fires that had swept the Australian bush. Those fires served as a warning of what was to come, as wildfires would ignite across California on the flip side of this terrible year. The toll on wildlife in Australia was enormous, with nearly 3 billion animals lost, including 180 million birds.
As part of a collaborative project at the 3A Institute, we wanted to create a cyber-physical system that would serve as an artwork while aiding and documenting the recovery efforts of parks and reserves in and around Canberra, including Namadgi and Mulligans Flat. We created a virtual sound experience that could replay the sounds of birds as they were captured on cameras in these parks. It is a system of systems, which includes the following (a rough code sketch of the pipeline follows this list):
Machine learning for image classification, which sorts high-risk and most-common birds into folders to be verified by volunteers, while separating images of unidentifiable birds from those in which no birds appear.
A system to scrape metadata from the same images, including time and date stamps, which are stored in a calendar.
A system for users to click on any date and hear the sound of birds replayed at the precise time those birds were spotted by cameras.
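To make those pieces concrete, here is a minimal Python sketch of how such a pipeline could hang together: a classifier sorts each photograph into a folder, and the image’s EXIF timestamp is written into a simple calendar file. This is not the code from our repository; the classify_image stub, the folder names and the calendar.json layout are all assumptions made for the example, with a placeholder standing in where a trained model would go.

```python
import json
import random  # used only by the stub classifier below
from datetime import datetime
from pathlib import Path
from shutil import copy2

from PIL import Image, ExifTags  # Pillow, for reading EXIF timestamps

SPECIES = ["magpie", "crimson_rosella", "galah"]
BINS = SPECIES + ["unidentified_bird", "no_bird"]


def classify_image(path: Path) -> str:
    """Placeholder for a trained classifier; a real model would return
    one of the bins above based on the image contents."""
    return random.choice(BINS)


def image_timestamp(path: Path):
    """Pull the DateTime EXIF tag, if the camera recorded one."""
    exif = Image.open(path).getexif()
    for tag_id, value in exif.items():
        if ExifTags.TAGS.get(tag_id) == "DateTime":
            return datetime.strptime(value, "%Y:%m:%d %H:%M:%S")
    return None


def process(incoming: Path, sorted_root: Path, calendar_path: Path) -> None:
    calendar = {}
    for photo in incoming.glob("*.jpg"):
        label = classify_image(photo)
        dest = sorted_root / label
        dest.mkdir(parents=True, exist_ok=True)
        copy2(photo, dest / photo.name)  # these folders are what volunteers verify
        stamp = image_timestamp(photo)
        if stamp and label in SPECIES:
            day = stamp.date().isoformat()
            calendar.setdefault(day, []).append(
                {"species": label, "time": stamp.time().isoformat()}
            )
    calendar_path.write_text(json.dumps(calendar, indent=2))


if __name__ == "__main__":
    process(Path("incoming"), Path("sorted"), Path("calendar.json"))
```

The point is the shape of the flow, camera image in, labelled folder and calendar entry out, rather than any particular model or file layout.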
It is our hope that this could be linked to image archives in national parks to create a sonification of bird tracking data as Australian birds recover from the fires. Users can choose any date on a calendar and play an approximate sound of the forests based on the data collected that day, including individual bird songs played at the moment each bird was seen.
We presented the work by recreating a patch of bush with green plants and grass in a conference room. As guests entered, they stepped into a small trapezoidal space where speakers were hidden in the grass. The ambient sounds of running water, buzzing insects and distant birds filled the space. Soon, a crescendo: the call of magpies, rosellas or galahs slowly rose in volume, as if the bird were approaching and passing by. Each “rise and fall” of birdsong marked the appearance of the same species of bird at that exact moment one day prior. It was a way of experiencing the data gathered about bird populations as something more than a line rising in a chart, or a 10 turning into a 12 in a spreadsheet.
In the meantime, the back end of the system makes analyzing and sorting images from tree-mounted cameras less onerous for parks volunteers, who must identify the birds in thousands of photographs every week. By training our system on this massive archive of past images, and by having volunteers “train” it further as they confirm the accuracy of its identifications, the model should eventually become robust enough to speed up the tracking of birds, freeing volunteers’ time for more field work.
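That verification step could be as simple as walking the machine-sorted folders and asking a person to confirm each label. The sketch below is only an illustration of that loop, not the interface volunteers actually use; the sorted/ and confirmed/ folders and the command-line prompt are assumptions for the example.

```python
from pathlib import Path
from shutil import move

SORTED = Path("sorted")        # output of the classifier (hypothetical layout)
CONFIRMED = Path("confirmed")  # images whose labels a volunteer has verified


def review(species: str) -> None:
    """Ask a volunteer to confirm or correct each machine-assigned label.
    Confirmed images accumulate as trusted training data for the next model."""
    for photo in (SORTED / species).glob("*.jpg"):
        answer = input(f"{photo.name}: is this a {species}? [y/n/skip] ").strip().lower()
        if answer == "y":
            dest = CONFIRMED / species
        elif answer == "n":
            corrected = input("Correct label (species, 'unidentified_bird' or 'no_bird'): ").strip().lower()
            dest = CONFIRMED / corrected
        else:
            continue  # skip; leave the image where it is for now
        dest.mkdir(parents=True, exist_ok=True)
        move(str(photo), dest / photo.name)


if __name__ == "__main__":
    for species in ("magpie", "crimson_rosella", "galah"):
        review(species)
```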
A Cyber-Physical Time Machine, With Birds
The prototype of this system is functional, and the code, the final result of the brilliant Memunat Ibrahim, is online on GitHub for anyone to use or adapt, for any purpose. Our prototype was installed for a demo day at the 3A Institute on 11 November 2020. It consisted of two small speakers hidden within a dense “forest” of potted plants, facing a third-story window where birds (Noisy Miners) often perched. Guests were welcomed in and invited to listen to the sounds of three Australian birds: Galahs, Crimson Rosellas, and Magpies. Each was chosen for its distinct call, and the Galah was included in consultation with park rangers about which birds were of special interest.
Our hope was also to inspire a new relationship to sound. I had worked, previously, with the sound artist Jason Reinier for an exhibition he’d done at swissnex San Francisco. This piece owes a heavy debt to the conversations I had with him then, and again once our team started the project. When an artwork refocuses our attention on hearing, even for short periods of time, it lingers: as I left the space at the end of the day I was much more attuned to the natural environment surrounding me on my walk through the ANU campus. Our hope is that this lingering sense of appreciation translates into a desire to protect the natural world, or at least spend more time in places that people will then aim to preserve.
From top: An Australian Magpie, a Galah, and a Crimson Rosella.
Another inspiration was the idea that sound archives are undervalued for their imaginative potency, especially compared to images. Close your eyes and listen and you can imagine an entire world. Sound also changes in ways we forget; a dialogue I had with Jason about the changing system behind San Francisco’s (in)famous foghorns was relevant here. When San Francisco changed from air horns to pre-recorded sirens, the tone changed, but we quickly forgot. It’s easy to “see” differences between the present and past through photographs or films. It’s much harder to hear them. One might forget what a forest sounded like after a major crisis, or miss the changes that come as natural systems recover. Because we are synthesizing these forest sounds, it’s important for us to create an auditory frame of reference, one that conveys a sense of density or sparseness. As birds return, the soundscape becomes denser, thicker and more immersive. For now, the data gives us something sparse, but soothing. The forest is changing; our hope is that these comparisons can be experienced more viscerally through this artwork.
We had also considered a prototype that was a “Sonic VR” headset: a blindfold with earbuds. The intent was to ask guests to listen rather than look. In this way, we were inviting guests to rethink ideas of immersion, which are so often tied to the visual senses, and to center listening instead. While we had to leave this blindfold-as-VR-goggles prototype undeveloped because of Covid-19 concerns, we were able to keep the same message for a modified immersive experience.
It is our hope, and it would be a great success for the project, if the wall of sound eventually becomes so dense that it is pure noise! This would signify the return of massive numbers of lost birds, and suggest that our tool has created an opportunity to deploy an increasing number of wildlife cameras by saving rangers the time of manually reviewing every image they collect.
Aviary CPS: The Prototype
The next step for this system is acquiring data, potentially from park rangers. At the moment, the system is fully capable of taking in photographs, sorting and classifying them, labeling them, and creating a calendar interface for anyone in the world to choose a date and “play” the forest data back as a time-stamped collection of bird songs. We have tested and trained the system on a number of sample images publicly available from Mulligans Flat.
Each bird song is archival, taken from creative-commons-licensed recordings on the Xeno-Canto database. Roughly 6-14 songs are stored in the folder associated with each bird. When a bird has been identified, the system selects a file at random from that bird’s folder and plays it at the time the camera recorded the sighting. A longer, recurring background recording provides a steady flow of sound at a lower volume; we hope someday to link this to weather data to reflect the sound of rain or even wind.
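As a rough illustration of that playback logic, the sketch below reads one day’s sightings (in the calendar format assumed earlier), waits until each sighting’s offset within the cycle, and plays a randomly chosen clip for that species over a quiet, looping ambient bed. The pygame mixer, the songs/ folder layout and the speed factor are assumptions for this example, not a description of the installed system.

```python
import json
import random
import time
from datetime import datetime, timedelta
from pathlib import Path

import pygame  # assumed audio backend for this sketch

SONGS = Path("songs")          # songs/<species>/*.wav, e.g. clips from Xeno-Canto
AMBIENT = Path("ambient.wav")  # long background recording, looped quietly


def play_day(calendar_file: Path, day: str, speed: float = 1.0) -> None:
    """Replay one day's sightings in order. speed > 1 compresses the 24-hour cycle."""
    sightings = json.loads(calendar_file.read_text()).get(day, [])
    sightings.sort(key=lambda s: s["time"])

    pygame.mixer.init()
    pygame.mixer.music.load(str(AMBIENT))
    pygame.mixer.music.set_volume(0.3)  # keep the ambient bed low under the bird calls
    pygame.mixer.music.play(loops=-1)

    start = datetime.now()
    for sighting in sightings:
        t = datetime.strptime(sighting["time"], "%H:%M:%S")
        offset = timedelta(hours=t.hour, minutes=t.minute, seconds=t.second) / speed
        wait = (offset - (datetime.now() - start)).total_seconds()
        if wait > 0:
            time.sleep(wait)  # hold until this sighting's moment in the cycle
        clips = list((SONGS / sighting["species"]).glob("*.wav"))
        if clips:
            pygame.mixer.Sound(str(random.choice(clips))).play()


if __name__ == "__main__":
    # e.g. replay one day's data at 60x, so a full day passes in 24 minutes
    play_day(Path("calendar.json"), "2020-11-10", speed=60.0)
```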
We’re excited to see the middleware piece of this tool adapted to other uses by artists or anyone who has an interest in connecting images to an archive of sound. Our system was designed as a sonic installation, and runs on a 24-hour cycle, picking up at the hour at which a user clicks on a date in the calendar. Changes to the code could create shorter (or longer!) durations. Systems equipped with microphones could also pair recorded audio, though this would require modifying the programs. Training the model on new image libraries could also expand the range of animals covered, or generate calendar information from entirely new sets of images. And finally, with just a few modifications the code could move from triggering sounds to triggering image sequences or any other response (see the sketch below).
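One way to picture that adaptability: the part that walks the sightings can be separated from the part that responds to each one, so swapping sound for images is a matter of plugging in a different function. A tiny, hypothetical sketch of that shape, not taken from our codebase:

```python
from typing import Callable

Sighting = dict  # e.g. {"species": "galah", "time": "14:32:05"}


def run_sightings(sightings: list[Sighting], respond: Callable[[Sighting], None]) -> None:
    """Walk a day's sightings and hand each one to whatever response is plugged in."""
    for sighting in sightings:
        respond(sighting)


def play_song(sighting: Sighting) -> None:
    print(f"play a {sighting['species']} recording")  # the sonic response, as in The Aviary


def show_images(sighting: Sighting) -> None:
    print(f"show an image sequence for {sighting['species']}")  # a visual response instead


# The same data can drive either behaviour:
day = [{"species": "galah", "time": "14:32:05"}]
run_sightings(day, respond=play_song)
run_sightings(day, respond=show_images)
```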
We are also open to talking to parks that might like to use this system to create a sonic artwork for a public space. Our only major hurdle is obtaining enough image data to train our model in more refined ways. Finally, the piece could be of interest to the GLAM sector, particularly film or sound archives, as the calendar interface could create a “sound of the day” for surfacing dated sounds or video elements from the archive, or create time-stamped sounds in response to inputs from a webcam. The module is quite open-ended!
We hope to create an online version of The Aviary soon, using only sample data. In the meantime, you can find the source code below.
This project drew on a broad range of expertise within our group, from Memunat Ibrahim’s coding, ML and JS skills, to Jacob Choi’s excellent project management and camera development. We are also grateful to Mikaela Jade for her tremendous insights into how park services work with images and population data, and for keeping our idea grounded in hopeful outcomes for these recovering bird populations.
Special thanks to 3A Institute staff, Johan Michelove and Zena Assaad, for guiding our work and thinking.