Bioacoustic Provenance

The acoustic, forensic, and archival foundations of the Sondage Sound Standard, and the reasoning by which a recorded human voice is preserved as evidence rather than as media.

Why the Voice Must Be Preserved as Evidence

A recorded voice in 2026 is not what a recorded voice was in 1996. The technology that captures sound has improved steadily over those decades, but the technology that synthesizes it has improved faster, and the second arc has now caught the first. A voice clone trained on three minutes of audio can produce speech that is, to most listeners, indistinguishable from the original speaker. Generative restoration tools can rebuild a degraded recording into something that sounds cleaner than the original. Noise reduction algorithms can remove the room a voice was recorded in and leave the voice apparently unchanged.

These tools serve real purposes. They are not the problem. The problem is that none of these tools, applied to a recording, leaves any reliable trace that they have been applied. A restored recording sounds like a recording. A cloned voice sounds like the speaker. A noise-suppressed master sounds like the original. The acoustic record, after a decade of generative processing, no longer reliably distinguishes a human voice from a synthetic reconstruction of one. The historian of 2075 opening a family archive will not be able to tell, by listening alone, which recordings are evidence of a life and which are plausible inferences about it.

This is the condition the Sondage Sound Standard was designed for. The recorded voice of a Senior Fellow, captured under documented conditions and protected from synthetic processing, must remain forensically and evidentially distinct from any synthetic reconstruction of that voice for as long as the recording exists. The discipline by which that distinction is preserved at the level of the recording itself is what Sondage names bioacoustic provenance. This page describes the scientific, institutional, and archival traditions that bioacoustic provenance operates within. The next section in this pillar describes how the credentialed Legacy Sound Producer operationalizes the standard in residential settings.

The Bioacoustic Foundation

The human voice carries a signature. Not metaphorically. Acoustically, biomechanically, and physiologically, every voice is a measurable artifact of a particular vocal tract, a particular respiratory system, a particular set of articulators, and a particular history of how that body has used them across a life. The science of how these systems produce voice has been developing since the late nineteenth century, and the contemporary discipline is well established in the work of voice researchers including Ingo Titze at the National Center for Voice and Speech and the Swedish acoustician Johan Sundberg.

Titze's Principles of Voice Production (Prentice Hall, 1994; revised by NCVS, 2000) is the standard reference on the physical and physiological mechanisms of human vocal production. Sundberg's The Science of the Singing Voice (Northern Illinois University Press, 1987) established the foundational understanding of the acoustic properties of the human voice, including the formant structure and spectral characteristics that distinguish authentic vocal recordings from synthetic reconstruction. The combined research base establishes a basic claim that has not been overturned by the rise of generative voice synthesis. The human voice contains information at frequencies, at temporal resolutions, and in micro-fluctuations that no synthesis system can fully replicate, because the systems that synthesize voice operate on a model of voice rather than from a body that produces it.

This is the technical basis for the claim that an unaltered recording of a human voice, captured at sufficient bit depth and sample rate to preserve the spectral signature, remains forensically distinguishable from a synthetic version of the same voice. The signature does not reside in the words. It resides in the bioacoustic micro-structure of the production. Preserving that micro-structure is what the Sondage Sound Standard is built to do. Preserving it requires that the recording not be processed by any system that operates on a model of voice rather than from the voice itself.

The Forensic Tradition

The institutional discipline that has been doing this work the longest, and to the highest evidentiary standard, is forensic phonetics. The professional body is the International Association for Forensic Phonetics and Acoustics (IAFPA), founded in 1991, whose membership of academic phoneticians, forensic laboratories, and law enforcement audio examiners has produced the contemporary standards for speaker identification, voice comparison, and the evidentiary use of recorded voice in legal proceedings.

The relevance of forensic phonetics to a family archive is not theoretical. The same standards that make a voice recording admissible in a courtroom are the standards that make it citable as evidence in a historical archive. The IAFPA's published guidance on the conditions under which recordings may be used for speaker identification, the documentation requirements for chain of custody, and the prohibition on certain post-capture processing operations together form the immediate institutional context for Sondage's claim that a recorded voice can serve as a forensically defensible primary source.

The forensic tradition draws a hard line that the consumer recording industry has been moving away from for two decades. The line is between capture and processing. A captured recording is what arrived at the microphone. A processed recording is what an engineer or a system has done to that capture afterward. For most purposes, processing improves the listening experience. For evidentiary purposes, processing introduces uncertainty about what the recording is evidence of. The forensic tradition treats the unprocessed master as the authoritative document and any processed version as a derivative. Sondage adopts this discipline as the Authentic Capture Standard, which prohibits noise suppression, spectral repair, generative reconstruction, and any other post-capture synthetic processing of master files at any stage of the engagement.
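
One way to make the line between capture and processing checkable is a fixity record: a cryptographic digest of the master taken at the close of capture, against which any later copy can be compared. The sketch below is an illustration under stated assumptions, not a documented Sondage or IAFPA procedure. The function names master_digest and is_unaltered and the choice of SHA-256 are hypothetical. A matching digest shows only that the bytes have not changed since the digest was recorded; it says nothing about how the recording was made.

```python
import hashlib
from pathlib import Path


def master_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks.

    Hypothetical helper: the name and the hash algorithm are assumptions
    made for illustration only.
    """
    h = hashlib.sha256()
    with Path(path).open("rb") as f:
        # Read fixed-size chunks so large master files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def is_unaltered(path, recorded_digest):
    """True if the file's current digest equals the digest recorded at capture."""
    return master_digest(path) == recorded_digest.strip().lower()
```

Any derivative, however lightly processed, produces a different digest, so the master and its derivatives stay distinguishable at the level of the file even when they sound alike.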

The Archival Audio Tradition

Adjacent to the forensic tradition is the archival audio tradition, which has been developing standards for long-term sound preservation since the early twentieth century and has codified them through several professional bodies whose work governs serious institutional audio archives worldwide.

The International Association of Sound and Audiovisual Archives (IASA), founded in 1969, publishes the international standard for archival audio preservation, IASA-TC 04, Guidelines on the Production and Preservation of Digital Audio Objects. The standard specifies file format, sampling rate, bit depth, metadata structure, and preservation strategy at the level of detail required for institutional archives that intend to remain readable for centuries. Sondage adopts the IASA standards and extends them with the prohibition on post-capture synthetic processing that the institutional archival tradition has not yet fully codified for itself.

The American institutional anchor is the Association for Recorded Sound Collections (ARSC), the professional body for archivists, librarians, and curators of recorded sound, whose standards on preservation, format migration, and authentic capture inform the residential application of the discipline. The Library of Congress National Audio-Visual Conservation Center, housed in Culpeper, Virginia, is the largest institutional preservation operation for recorded sound in the United States, and its preservation methodology is the practical reference standard against which Sondage's residential standards are calibrated.

The archival audio tradition has been thinking carefully about format obsolescence for decades. Magnetic tape degrades. Optical media fail. Proprietary file formats become unreadable when their software vendors dissolve. The tradition's response is the discipline of open format archival capture, recording at uncompressed standards in file formats whose specifications are public and whose readability does not depend on any single vendor's continued operation. Sondage records to dual-track WAV at 32-bit float depth, an open archival format whose readability will not depend on Sondage's continued operation or on any vendor's continued support. The recording made in 2026 must still be openable, listenable, and verifiable in 2076 by a descendant with no technical training and no access to the platform that produced it. This is what the archival audio tradition obligates.
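
Because the WAV container is publicly specified, the format claim can be checked from the file header alone, without any vendor software. The sketch below is a minimal illustration, not Sondage tooling: it parses the "fmt " chunk of a plain RIFF/WAVE file and tests for 32-bit IEEE float samples. The function names are hypothetical, and reading "dual-track" as two channels in a single file is an assumption. It does not handle the RF64 variant used for very large files.

```python
import struct

WAVE_FORMAT_IEEE_FLOAT = 0x0003   # format code for floating-point samples
WAVE_FORMAT_EXTENSIBLE = 0xFFFE   # real format code lives in the SubFormat GUID


def read_wav_format(path):
    """Parse the 'fmt ' chunk of a RIFF/WAVE file and return its basic fields."""
    with open(path, "rb") as f:
        riff, _size, wave = struct.unpack("<4sI4s", f.read(12))
        if riff != b"RIFF" or wave != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        while True:
            header = f.read(8)
            if len(header) < 8:
                raise ValueError("no 'fmt ' chunk found")
            chunk_id, chunk_size = struct.unpack("<4sI", header)
            if chunk_id == b"fmt ":
                body = f.read(chunk_size)
                break
            # Skip other chunks; chunk data is padded to an even length.
            f.seek(chunk_size + (chunk_size & 1), 1)
    if len(body) < 16:
        raise ValueError("truncated 'fmt ' chunk")
    tag, channels, rate, _byte_rate, _align, bits = struct.unpack(
        "<HHIIHH", body[:16]
    )
    if tag == WAVE_FORMAT_EXTENSIBLE and len(body) >= 26:
        # In the extensible layout the SubFormat GUID starts at byte 24,
        # and its first two bytes carry the actual format code.
        tag = struct.unpack("<H", body[24:26])[0]
    return {
        "format_tag": tag,
        "channels": channels,
        "sample_rate": rate,
        "bits_per_sample": bits,
    }


def is_float32_two_channel(path):
    """True if the header declares two channels of 32-bit float samples."""
    info = read_wav_format(path)
    return (
        info["format_tag"] == WAVE_FORMAT_IEEE_FLOAT
        and info["channels"] == 2
        and info["bits_per_sample"] == 32
    )
```

The check reads only the declared header fields. It confirms what the file says about itself, not that the audio was captured at that depth.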

The History and Philosophy of Recorded Sound

A standard for recorded voice in the synthetic age requires more than a forensic protocol and a preservation format. It requires a position on what a recording is for and what it owes to its future listener. The intellectual tradition that has been thinking about these questions is sound studies, and the foundational text is the cultural historian Jonathan Sterne's The Audible Past, Cultural Origins of Sound Reproduction (Duke University Press, 2003).

Sterne's argument, drawn from a detailed history of nineteenth-century recording technologies and the cultural assumptions encoded into them, is that recording has never been neutral capture. Every recording technology embeds assumptions about what sound is for, who it is for, and what about the original event is worth preserving. The phonograph captured what its designers thought was important about a voice. The microphone captured what its designers thought was important. The compressed digital file captures what its codec's designers thought was important. The history of sound reproduction is the history of these assumptions made and remade across more than a century. Sondage's commitment to the unprocessed master file is itself a position within this history. It is a claim that, in the synthetic age, the assumption worth preserving is the one that says what the microphone captured was what was actually said in the room.

The contemporary sound studies field, including the work of Mara Mills at NYU on the history of sound technology and disability, Steven Feld on voice as cultural and embodied object, and the broader scholarly community organized through the journal Sound Studies and the Oxford Handbook of Sound Studies, provides the intellectual context within which Sondage's argument for acoustic honesty becomes legible to a sophisticated reader. The argument is not that processing is wrong. It is that the unprocessed capture has a categorically different evidentiary status than a processed version of itself, and that for an archive of a human life, the evidentiary status is what matters.

The Synthetic Age Threat Environment

The contemporary urgency of bioacoustic provenance is established by the rapid development of synthetic voice generation across the past five years. The leading civil society organization tracking the threat is WITNESS, particularly through the work of Sam Gregory and the WITNESS Media Lab, whose research on deepfake detection, voice cloning, and the governance of synthetic media constitutes the most developed civil society response to the threat environment Sondage operates within.

The institutional response has converged on labeling and cryptographic provenance. The Coalition for Content Provenance and Authenticity (C2PA) develops cryptographic assertions embeddable in image, video, and audio files. The International Press Telecommunications Council (IPTC) defines metadata fields for provenance, including a Digital Source Type value for synthetic media. The Library of Congress issued in 2026 a Call to Action for the Libraries, Archives, and Museums Community on content authenticity and provenance in the age of artificial intelligence. These responses are useful. None of them is sufficient on its own for an archive of recorded human voice.

The reason is structural. A label inside a file claims a status. The unprocessed master, captured under documented conditions and verified by a credentialed practitioner who signs the Season Technical Manifest, demonstrates the status. A label that says this recording was not synthetically processed is only as trustworthy as the system that wrote the label. A recording that demonstrably was not synthetically processed, because its bioacoustic micro-structure remains intact and can be forensically verified, is trustworthy at the level of the artifact itself. Sondage's commitment to bioacoustic provenance is the architectural answer to the problem that labeling alone cannot solve. The recording itself, preserved at the bioacoustic level, becomes its own evidence.

The performance theorist Diana Taylor's counterposition of the archive to the repertoire, articulated in The Archive and the Repertoire (Duke University Press, 2003) and described more fully at Heritage Curation, is the philosophical anchor. A record is not legible without a living chain of people willing to vouch for it. The recording is the archive. The credentialed Legacy Sound Producer, signing the Manifest, is the repertoire. Together they constitute bioacoustic provenance. Either alone is insufficient.

How Sondage Applies the Bioacoustic Provenance Frame

Sondage stands within these traditions and operationalizes them through a single discipline. The Sondage Sound Standard, attestable on the Season Technical Manifest at the close of every engagement, holds three commitments that together constitute bioacoustic provenance in operational form.

The first is the Authentic Capture Standard. No noise suppression, no spectral repair, no generative reconstruction, no synthetic processing of any kind is applied to the master file at any stage of the engagement. The voice is preserved as it was spoken, with its hesitations, its texture, its ambient truth. Acoustic honesty, the preservation of sound as it actually existed in the room on a specific day, is the defining quality of the work.

The second is open archival format capture. Recording is conducted in dual-track WAV at 32-bit float depth, captured in the Senior Fellow's home environment, in file formats whose specifications are public and whose readability does not depend on Sondage's continued operation or on any vendor's continued support. The recording made in 2026 must remain openable in 2076 by a descendant with no access to the original infrastructure.

The third is embodied attestation. A credentialed Legacy Sound Producer, trained in the Sondage standard and accountable for the engagement, signs the Season Technical Manifest at the close of the work. The signature carries the practitioner's name, professional standing, and disciplinary commitment forward into the archive. A label can be automated. An attestation cannot. The Manifest is the embodied repertoire that makes the recorded archive citable as a primary source rather than as media.

Together these three commitments produce what Sondage names bioacoustic provenance. The recording, captured under the Authentic Capture Standard, preserved in open archival format, and signed by the credentialed practitioner who built the room and verified the work, remains a forensically and evidentially defensible primary source for as long as the archive exists. The full operational standard, including the technical specifications of capture, the protocols by which the Sanctum is prepared, and the structure of the Manifest itself, is the subject of the credentialed Legacy Sound Producer curriculum.

The Lineage

What follows is the selective citational base of this page. Each entry corresponds to a researcher, scholar, institution, or framework named above, with the seminal work and a Sondage-aligned description of the contribution.

The Bioacoustic and Voice-Science Tradition

Ingo Titze. Principles of Voice Production. Prentice Hall, 1994; revised by the National Center for Voice and Speech, 2000. The standard reference on the physical and physiological mechanisms of human vocal production, establishing the irreducible biological signature of the human voice that constitutes the evidentiary basis for bioacoustic provenance.

Johan Sundberg. The Science of the Singing Voice. Northern Illinois University Press, 1987. The foundational text on the acoustic properties of the human voice, including the formant structure and spectral characteristics that distinguish authentic vocal recordings from synthetic reconstruction.

The Forensic Tradition

The International Association for Forensic Phonetics and Acoustics (IAFPA). Founded 1991. The professional body governing forensic voice analysis, whose standards on speaker identification, voice comparison, and the evidentiary use of recorded voice constitute the immediate institutional context for Sondage's claim that a recorded voice can serve as a forensically defensible primary source.

The Archival Audio Tradition

The International Association of Sound and Audiovisual Archives (IASA). IASA-TC 04, Guidelines on the Production and Preservation of Digital Audio Objects. The international standard for archival audio preservation, including format, sampling rate, bit depth, and metadata requirements that Sondage adopts and extends with the Authentic Capture Standard.

The Association for Recorded Sound Collections (ARSC). The American professional body for archivists, librarians, and curators of recorded sound, whose standards on preservation, format migration, and authentic capture inform the residential application of the discipline.

The Library of Congress National Audio-Visual Conservation Center. The American institutional anchor of long-term audio preservation practice, whose preservation methodology is the practical reference standard against which Sondage's residential standards are calibrated.

The History and Philosophy of Recorded Sound

Jonathan Sterne. The Audible Past, Cultural Origins of Sound Reproduction. Duke University Press, 2003. The cultural and philosophical history of recorded sound, establishing that recording has never been neutral capture and that every recording technology embeds assumptions about what sound is for and who it is for.

Mara Mills. The NYU media studies scholar whose work on the history of sound technology, disability, and the politics of audio capture extends the sound studies tradition into the contemporary infrastructure within which Sondage operates.

Steven Feld. The ethnomusicologist and acoustic anthropologist whose work on voice as cultural and embodied object, particularly Sound and Sentiment (1982), provides the anthropological foundation for the claim that recorded voice carries cultural and biographical information that aggregated metadata cannot.

The Synthetic Age Threat Environment

WITNESS and Sam Gregory. The leading civil society organization on the ethics and governance of synthetic media, whose work on deepfake detection and the threat of voice cloning constitutes the immediate threat-environment context for Sondage's bioacoustic provenance commitment.

The Coalition for Content Provenance and Authenticity (C2PA) and the International Press Telecommunications Council (IPTC). The two principal institutional bodies developing labeling and cryptographic provenance standards for AI-generated content. Sondage builds to remain compatible with both while arguing that labeling alone is insufficient for an archive of recorded human voice. See also Heritage Curation.

The Library of Congress. Content Authenticity and Provenance in the Age of Artificial Intelligence, A Call to Action for the Libraries, Archives, and Museums Community (2026). The institutional naming of the central archival challenge of the synthetic age, to which Sondage's bioacoustic provenance commitment is a structured response. See also Heritage Curation.

The Philosophical Frame

Diana Taylor. The Archive and the Repertoire, Performing Cultural Memory in the Americas. Duke University Press, 2003. The counterposition of the archive of inert documents to the repertoire of embodied transmission, with the human voice as the most repertoire-dependent of archival objects. See also Heritage Curation.

Adjacent References

Smithsonian Folkways Recordings. The institutional precedent for ethically grounded recorded-voice preservation at scale, whose Moses and Frances Asch collections demonstrate what a sustained commitment to the unprocessed recording of human voice produces over decades.

R. Murray Schafer. The Soundscape, Our Sonic Environment and the Tuning of the World. First published as The Tuning of the World, Knopf, 1977; reissued under this title by Destiny Books. The foundational text in acoustic ecology, providing the broader framework within which the Sondage commitment to room tone and the preserved acoustic environment of the Senior Fellow's home becomes legible.

Continued in the Sondage Review

The Sondage Review extends this material in essays on the synthetic voice threat environment, the archival case for unprocessed capture, the relationship between forensic phonetics and family archives, and the philosophical history of what recording means. Forthcoming essays in 2026 will deepen the connection between the bioacoustic research base and the operational practice of a Sondage Sound Standard engagement.

The Sondage Legacy Sound Producer curriculum is the structured course of study by which independent practitioners are accredited to the Sondage Guild.