The Meaning of the MP3 Format: An Interview with Jonathan Sterne

Historian Jonathan Sterne, author of MP3: The Meaning of a Format and The Audible Past: Cultural Origins of Sound Reproduction

What does the history of the MP3 format mean for those interested in ensuring long-term access to our digital cultural heritage? In this installment of the NDSA’s Insights interview series I talk with historian Jonathan Sterne about his book MP3: The Meaning of a Format. You can read the introduction to his book, titled “Format Theory,” online. You can also read an interview about the book on Pitchfork.

Trevor: The audience here tends to be folks working in and around digital preservation (curators, librarians, archivists, digital asset managers, etc.). Given that audience, what do you think would be particularly relevant to folks working to ensure long-term access to digital information?

Jonathan: Thanks for interviewing me and thanks to you, dear reader, for reading.

Here are two takeaways: compression is a major dimension of communication history, as important as reproduction or the quest for verisimilitude; and while recordings have been abundant and ephemeral for most of their history, in recent years we have been overrun with a new level of abundance of recorded sound. Obviously, archivists have it harder than your average sound nerd, because they are interested in how sound might be reproduced in an uncertain future.

On the one hand, the book is a bit of reassurance, that high definition isn’t the be-all and end-all of sound reproduction, even from a technical standpoint. On the other, the book will be troubling because so many concerns that shaped the mp3 format–from engineering decisions to industrial self-regulation–run counter to the values and goals of sound archives. And the sheer ubiquity of recordings means any archivist surveying the world now has a more present–and painful–sense of the magnitude of cultural heritage that is necessarily going to be lost to future generations.

Trevor: Given your approach to understanding MP3 as a format, I’m curious if you have any thoughts on MP3 files as a species of digital artifact. For instance, what kinds of questions do you think historians of the future will be able to ask and answer of collections of MP3s found in disk images of individuals’ hard drives, or in web archives?

Jonathan: At a basic level, they’re not that different from what historians can ask now of people’s collections or archives. People forget that for all their delicious materiality, most people understood records and tapes as ephemera for most of their histories. So the collections will be partial and fragmentary in the future just as they are now. Of course, you’ll need a lot more than good filing and climate control to keep disk images around for as long as we have kept old analog recordings around.

On another level, there is great promise because of things like metadata that are “inside” the document, searchability, comparability and a variety of other features of digital files. There is an irony here. Humanists like to argue that inside a computer, it’s all 0s and 1s and the computer doesn’t care that it was originally or will later be sound, image or text. But the sonic dimension of sound archives is proving interestingly recalcitrant. There is a lot of hope for things like Music Information Retrieval, spectral analysis, and other “digital humanities” methods for dealing with recordings.

As of yet we haven’t had any major breakthrough in the various cultural disciplines resulting from these analytical techniques. They haven’t re-answered any old questions or yet introduced important new ones. But this stuff takes time. One promising project that I’ve got my eye on is Tanya Clement’s HiPSTAS, which involves appropriating technologies from other disciplines and then trying to modify them for humanists. Her group is interested in audio archives of poetry and is actually talking with poets and poetry scholars. Also, they get extra points for the acronym.

Example of viewing the waveform of an audio recording in Audacity.

There is one other dimension to this, which is the simple availability of sound as a material for scholarship. Art historians have had slide projectors (and the wonderful dual-slide projector pedagogy for comparison) for more than 100 years now. Literary scholars have been able to bring the text into their scholarship a lot longer. In both cases, it wasn’t bespoke technologies for scholars but rather scholars appropriating technologies that were made for other people: slide projectors, overheads, mimeographs, photocopies, word processors, slideware (that one for better and worse!) and on and on.

Now we can appropriate sound editors like Audacity and sound performance software like Ableton Live to use sound in our scholarship, teaching and discussion. Sure, musicologists had record players in their classrooms, but even there it was something of an unusual technology for teaching, and it was never widely utilized outside of pretty specialized music classrooms.

In a talk or a class, I can now easily integrate a sound recording into a discussion. I can highlight parts of it in the same way one of my colleagues might zoom in on a figure in a painting in order to make a point about how “the gaze” works. That ability to point out and indicate–a sort of “hey, listen to this,” the quality that philosopher Charles Sanders Peirce called indexicality–is now available to us in large quantities for the first time.

It’s been a boon to me, even though many classrooms and meeting rooms are still not properly set up for sound. Journals like The Journal of Sonic Studies are taking advantage of this capability. But audiences are still surprised when I break out sound recordings in the middle of a talk to illustrate a point in the way a social scientist might use a figure or a diagram. It’s unusual. I hope that in the coming years it won’t be.

Trevor: In the introduction to your book, titled “Format Theory,” you ask “In an age of ever-increasing bandwidth and processing power, why is there also a proliferation of lower-definition formats?” Could you give us a brief take on your answer to that question? Further, do you think anything in your observation is more broadly relevant to the longevity/future of digital formats?

Jonathan: My answer is simple and related to that first takeaway. One of the great myths of communication history is that technologies are progressing inexorably toward increasing definition or resolution–whether “naturally” or as a program for human endeavor. People also mistakenly assume high definition is the same thing as realism or verisimilitude–which can be defined as truthfulness to life, or more accurately, “truthiness” (to credit the noted scholar Stephen Colbert). If you actually consider the development of communication technologies, space–defined as bandwidth, storage space in a medium, or even space defined as a slot in a schedule–is often the most economically valuable thing.

People who ask why we would ever use compressed audio formats now that hard drives are so big haven’t checked their wireless or internet bills lately: bandwidth is one of the most precious resources. And from an archivist’s standpoint, this “space is cheap” ideology is complete foolishness. Sure, hard drives are getting bigger and cheaper. Now let’s talk about keeping them running for five years. Now let’s talk about 50 or 100 years and it doesn’t seem cheap at all.

Archivists’ solution generally seems to be transcoding: we don’t play back old cylinders, we play back digital recordings of them. The same will eventually be true for digital files. Our first assumption in considering the future always has to be that the present day infrastructure–up to and including the internet–won’t exist.

Now here’s the thing: “lossy” formats like MP3 reveal their transcodings more easily than lossless formats. That’s because of how the encoders work–artifacts tend to compound on one another as you transcode from one format to another. Upload a low bitrate MP3 to one of the many sites that automatically transcodes audio, and you’ll hear it immediately if you’re sensitive to those things. In a sense, existing archival practice probably accounts for this–people tend to encode in the highest definition that’s feasible–but the artifacts will be there from the file’s previous travels before it got to the archivist, like scratches on a record.
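
The compounding described above can be illustrated with a toy model. This is only a minimal sketch under loud assumptions, not how an MP3 encoder works: real encoders quantize in the frequency domain under a psychoacoustic model, whereas each hypothetical "codec" here simply rounds integer sample values to a grid (`STEP_A` and `STEP_B` are made-up step sizes). The point it shows is narrow: encoding through one lossy grid and then another leaves more error than encoding straight to the final grid, while re-encoding with the same grid adds nothing new.

```python
def quantize(samples, step):
    """Toy 'lossy codec': round each integer sample to the nearest multiple of step.

    The step sizes used below are odd, so an integer sample can never sit
    exactly halfway between two grid points and no tie-breaking rule is needed.
    """
    return [((s + step // 2) // step) * step for s in samples]


def squared_error(original, processed):
    """Total squared difference between the original and processed samples."""
    return sum((o - p) ** 2 for o, p in zip(original, processed))


# Hypothetical signal: a short ramp of integer sample values.
original = list(range(35))

# Two hypothetical codecs with different (incompatible) quantization grids.
STEP_A, STEP_B = 5, 7

direct_a = quantize(original, STEP_A)       # encoded once, codec A
direct_b = quantize(original, STEP_B)       # encoded once, codec B
via_a_then_b = quantize(direct_a, STEP_B)   # codec A, then transcoded to B
reencoded_a = quantize(direct_a, STEP_A)    # codec A applied a second time

print("A only:     ", squared_error(original, direct_a))
print("B only:     ", squared_error(original, direct_b))
print("A then B:   ", squared_error(original, via_a_then_b))
print("A twice == A once:", reencoded_a == direct_a)
```

In this toy case the file that passed through codec A before reaching codec B carries more error than one encoded directly to B, which is the sense in which a file's "previous travels" stay audible no matter how carefully the archivist encodes it afterward.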

Trevor: The MP3 was created in the early 90s, but you trace its history back to the turn of the 20th century. Could you offer a bit of the reasoning for that deeper history? Do you think that kind of deeper historical approach is necessary for contextualizing and understanding other kinds of file formats?

Jonathan: I imagine it depends on the format. I didn’t plan on doing that, but I quickly discovered that the paired histories of information theory and psychoacoustics weren’t a well-understood part of our cultural history. Many engineers knew about it, but only as it affected their own work, so there were a lot of gaps to fill in. I also felt that if I were going to posit a real alternative to the fantasy of progressing toward greater definition, I needed to give an alternative story. If a pile of books already existed that gave the history the way I wanted, I could have jumped right in at the 1970s with the first steps into perceptual coding.

More generally, I believe there are multiple kinds of historical time or multiple rhythms. For MP3, I wanted to catch three at once: a century-long arc toward compression; a half-century shift in orientation toward computers, hearing and noise; and a 20-year period where people were building the technology and setting the standard. I think this way delivers a richer sense of context and significance, especially because so many trains of thought on the importance of new media derail because of historical ignorance. But of course sometimes the best story is one that takes place in a single week or year. It all depends on what you’re trying to do.

Trevor: You frame your book as part of a “general history of compression,” suggesting a need to shift from a focus on media to a focus on formats. Could you tell us a bit about that idea? Further, what other kinds of media and formats do you think really need to be best understood in terms of that history of compression and what do you think that history might mean for those interested in ensuring long-term access to digital objects?

Jonathan: Ha! Do I get in trouble if I just say “all of them”? At least since the end of World War II, English-language scholars have tended to focus on particular media, defined usually through a box or end-user technology that stands in for the whole set of institutions, technologies and practices: think of television, newspapers, radio, magazines, books as they were understood in the 1980s. These definitions and the ideas of separateness behind them are actually written into US communications law, which is why broadcasters are under such different regulations than internet services, even though the end user experience in some cases (say, watching a TV show) may be the same.

The standard tale told is one of convergence, where there were these separate things called media but now they’ve retreated inside our computers, so they no longer have their own boxes. But what if it’s the other way around? What if the separation of media we thought we saw in the mid 20th century was somewhere between an historical illusion and an exception? If that’s the case, then we need to think at different scales–above and below those end-user experiences, but also in terms of infrastructure, regulation, aesthetics, platforms, and audience/user practices all being mixed up in one another, rather than having a stable and predictable relationship amongst themselves.

So “format” calls attention to the subtler dimensions of the point where standards, aesthetics and experience meet. Obviously, for people who read and write a lot, .pdf and .doc are huge. Lisa Gitelman’s taken care of the former in her new Paper Knowledge and Matt Kirschenbaum should have some smart things to say about the latter in his forthcoming history of word processing. Oddly, there still aren’t good published cultural histories of most audio recording formats, almost all of which at least touch on compression history since the engineers almost always have limits foremost in their minds–long playing records, compact discs, audio tape. That work is coming but it’s not yet out. And I’m part of a small group of people writing on color and video, where many of the same problems come up.

Trevor: In digital preservation there is considerable consideration for the sustainability of particular digital formats, and a need for file format action plans where organizations articulate a strategy to deal with the potential, future inability to “play back” files. Could you tell us a bit about your articulation of format theory and then talk through how format theory might be relevant to considerations of obsolescence and format action plans?

Jonathan: A radical version of my argument might be that all digital formats are stuck in a permanent state as incunabula. Which is to say that the stability we expect from communication technologies like the printing press through the 33 1/3rpm record isn’t likely to come. Business models have changed to encourage obsolescence–a really noxious phenomenon when applied to culture and one on which our descendants will judge us, along with our general destruction of the environment.

A less hyperbolic argument would be that successful digital formats seem to have the same lifetime as analog formats. Sam Brylawski, who used to head up the Recorded Sound Collections at the Library of Congress, told me that he thinks all audio formats have a half-life of about a quarter-century (in the US, at least). That seems to work for CDs, LPs and cassettes. MP3s may last longer simply because many of them were never bought. But either way, you’re talking about a regular pattern of change-over to keep things available to users, while also maintaining older technologies or emulators to ensure playback. So we’re back to engineering.

Trevor: I’ve previously talked with Lori Emerson about the relationship between media archaeology and digital stewardship/preservation. She described media archaeology as, at least partly, “a theoretical framework by which to look at the particular material dimensions of machines.” I would be curious to hear you explain the relationship between your approach to something like MP3 and media archaeology.

Jonathan: There are lots of theoretical frameworks to look at the material dimensions of machines. I certainly think of media archaeologists as kindred spirits (and we both have read our Foucault), but I have an ambivalent relationship to that tradition. I learned to think about the “hard” operational dimensions of technologies as political and cultural through cultural studies, science and technology studies, feminism and the cultural history of media. For me, the fact that the psychoacoustic model “works” in the MP3 is not enough–what matters is where it comes from and why it’s there. We could say the same about the idea (for most of the 20th century) that the phone was a point-to-point technology and the radio was a one-to-many broadcast technology. At the same time, media archaeologists are often great at helping us see past naive narratives of progress, and to understand communication technology as an important part of our cultural heritage, and I am 100% behind that program.

Trevor: In digital preservation, there is considerable discussion of the somewhat problematic notion of “significant properties” or “significant characteristics” of digital content. Dappert and Farquhar’s essay Significance is in the Eye of the Stakeholder (pdf) discusses many of the issues around this concept, but the idea is generally that organizations interested in preserving digital objects for the long haul may need to change their formats to do so and as such they need to attend to the properties or characteristics of those objects that are significant for their intended future audience. Given your book presents such a deep dive into the history and ideas behind a particular format, I would be curious to hear your thoughts on both the notion of defining significant characteristics for a particular set of MP3s and more specifically, what kinds of things you think an organization should attend to in defining such properties or characteristics.

Jonathan: In terms of ensuring long-term access to digital objects, there isn’t an easy answer. But one thing I would strongly caution against is accepting the current industry rubric about “content.” That term does a lot of nefarious rhetorical work in the media industry (essentially arguing that platforms, intermediaries and infrastructures should have more cultural as well as economic importance than the culture carried and enacted in the medium).

Video games, recordings, images, letters, and movies are all cultural artifacts, attempts to make meaning in a complex world. But how that plays into practice I am not sure. If I play an old Atari game on my new laptop, am I really having the experience that someone would have had on an old console version? Of course not, but it’s hard to know which aspects might be important for scholars. If, on the other hand, we say we have to preserve all aspects of the platform in order to get at the historicity of the media practice, that means archival practice will have to have a whole new engineering dimension to it.

Here’s another thought: might it be worth preserving the traces of circulation and use that the object itself bears? We are used to this in analog media, from marginal notes in collections of people’s papers to the erosion of video tapes or records, where the “noise” becomes part of the meaning of the recording itself–just listen to all the vinyl noise in hip hop. But digital files bear traces of their circulation as well, from “track changes” to compression artifacts on YouTube videos.

Perhaps future historians will find as much in them as we find in the marks on older artifacts. This is already an issue in digitized collections of old magazines. I recently had cause to consult some Time Magazine issues from the late 1960s. My library told me it was online, but they only had “all the articles” and not even all the articles. I wanted to know what was in the magazine, though–the ads, the placement, the font, the images. Somebody cleaned all that out, and for the kind of historical work I do, that whole digital archive is now useless. Luckily, the library still had the magazines, though I suspect they won’t stick around for long as budgets for paper artifacts keep getting cut.

Please, folks, let’s be careful not to clean the history out of our digital files!
