
Can Big Data Save Us from Ourselves? A Conversation About Information, Democracy, and Dystopia


On a rainy day in late spring, a pan-Asian noodle restaurant on Pennsylvania Avenue offered the perfect nook for a spirited conversation about big data, algorithms, and the construction of our legal and social realities. Among those at the table with me were Martin Hilbert, who was a Kluge Distinguished Visiting Scholar and is Associate Professor of Communications at UC Davis, and Andrew Meade McGee, a Kluge Fellow in Digital Studies writing a book about the political history of the computer in the United States from the 1940s to the 1980s. Later, we continued the conversation online. The text of our Google Docs exchange was edited for length and clarity.

Man on computer screen and sign that reads, Smile — you’re on computer. Drawing by Herbert Block, 1998. © Herb Block Foundation, used with permission.

DT: Martin, you have suggested that societies have the capacity to develop algorithms that combat bias, and that these could be even more powerful than education in terms of their ability to change societal patterns of discrimination. Can you tell us more?

MH: Machine learning is a means of deriving artificial intelligence by discovering patterns in existing data. If a statistical machine-learning model is trained on a standard corpus of text from the World Wide Web, you will find that the resulting artificial intelligence is quite racist, biased, and intolerant.

For example, if you ask it whom to invite for a job interview, it will lean, with a two-thirds bias, toward someone with a European first name rather than someone with an African-American name, or toward a man rather than a woman. If you open up the “black box of AI” and look into it (the good thing is that we can do that, in contrast to brains), you can see how the different words and concepts line up in the multidimensional vector space that the AI uses to create meaning out of syntactics. Male names are closer to concepts like professional, salary, business, and career, while female names are closer to parents, family, and home. European names are closer to concepts like freedom, peace, diploma, and loyal, while African-American names are closer to concepts like abuse, poverty, and prison.
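The idea of names sitting "closer" to some concepts than to others in a vector space can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method behind the findings described above: the two-dimensional vectors in `TOY_VECTORS` are hypothetical numbers invented for the example (real models use hundreds of dimensions learned from text), and `association` is an illustrative helper that compares average cosine similarity to two sets of concept words.

```python
import math


def cosine(u, v):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(word_vec, concepts_a, concepts_b):
    """Mean similarity to concept set A minus mean similarity to set B.

    A positive score means the word sits closer to set A; a negative
    score means it sits closer to set B.
    """
    mean_a = sum(cosine(word_vec, c) for c in concepts_a) / len(concepts_a)
    mean_b = sum(cosine(word_vec, c) for c in concepts_b) / len(concepts_b)
    return mean_a - mean_b


# Hypothetical toy vectors, invented purely for illustration.
TOY_VECTORS = {
    "career": (1.0, 0.1),
    "salary": (0.9, 0.2),
    "family": (0.1, 1.0),
    "home": (0.2, 0.9),
    "name_a": (0.8, 0.3),  # stands in for a name the corpus ties to work
    "name_b": (0.3, 0.8),  # stands in for a name the corpus ties to home
}

career_words = [TOY_VECTORS["career"], TOY_VECTORS["salary"]]
family_words = [TOY_VECTORS["family"], TOY_VECTORS["home"]]

score_a = association(TOY_VECTORS["name_a"], career_words, family_words)
score_b = association(TOY_VECTORS["name_b"], career_words, family_words)

print(f"name_a leans toward career words: {score_a > 0}")
print(f"name_b leans toward family words: {score_b < 0}")
```

Because the score is computed directly from the stored vectors, anyone with access to the model can run this kind of check, which is one sense in which the box can be opened.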

Where did the AI get such racist and discriminatory attitudes from? From us! It simply read what we wrote over the past centuries and derived conclusions. The good thing is that we can look under the hood. This also means that we can see how exactly these biases are created.

DT: It sounds like you are saying that while we may have aspirations towards fairness, our human brains are unable to fully realize them because we are too complex, wrapped up in emotion and ingrained prejudice, whereas, if properly trained, the machine has the potential to follow through on an objective standard. Are there limitations here? The machine is rational, but what about the emotive side, or situations in which empathy changes the calculus?

AMM: I am drawn to the fundamental optimism inherent in the idea that we can program our way to a more utopian future by stripping away the worst of man’s nature (racism, sexism, bias, and discrimination in all forms) from carefully constructed algorithms. I’m dubious, however, that even carefully controlling for identifiable variables of bias can truly free algorithms from the bias inherent in any man-made creation. Code, law, institutions: all emerge from flawed societies. Technologies are in part reflections of the societies that produce them, no?

MH: I agree, and both of these information processors (brains and machines) essentially do data mining. Now, as this recent paper points out: “By definition, data mining is always a form of statistical (and therefore seemingly rational) discrimination. Indeed, the very point of data mining is to provide a rational basis upon which to distinguish between individuals and to reliably confer to the individual the qualities possessed by those who seem statistically similar.”

Information is defined as differences: a “difference that makes a difference,” as Bateson described it. So all of us need differences to make decisions, be rational, and derive conclusions. Code, law, institutions are all based on information about society, and it is up to society, not up to the algorithm, to determine which differences not to consider.

It might be that we find that some differences are “confounding variables,” leading to spurious correlations, so we, as a society, might decide not to consider items like race, gender, religious belief, or ethnic origin. Different societies define these “protected variables” differently. Once they are identified, the challenge consists in creating mechanisms that make decisions without considering them. These mechanisms can be institutional (lawsuits and courts) or algorithmic. We have seen that for algorithms, we are able to do that.
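One very simple reading of an algorithmic mechanism that decides "without considering" protected variables can be sketched as follows. Everything here is an assumption made for illustration: the field names in `PROTECTED`, the scoring weights, and the helper `decision_ignores_protected` are hypothetical and do not describe any real system. Removing a field also does nothing about other fields that may act as proxies for it, so this shows only the narrowest form of the idea.

```python
# Hypothetical list of variables a society has chosen to protect.
PROTECTED = {"race", "gender", "religion", "ethnic_origin"}


def strip_protected(applicant):
    """Return a copy of the record with protected fields removed."""
    return {k: v for k, v in applicant.items() if k not in PROTECTED}


def score(applicant):
    """Toy scoring rule that only ever sees non-protected fields.

    The weights are invented for illustration.
    """
    features = strip_protected(applicant)
    return (2.0 * features.get("years_experience", 0)
            + 1.0 * features.get("relevant_degree", 0))


def decision_ignores_protected(applicant, field, alternative):
    """Check that changing one protected field leaves the score unchanged."""
    changed = dict(applicant)
    changed[field] = alternative
    return score(applicant) == score(changed)


applicant = {
    "years_experience": 5,
    "relevant_degree": 1,
    "gender": "female",
    "ethnic_origin": "unspecified",
}

print(score(applicant))
print(decision_ignores_protected(applicant, "gender", "male"))
```

The second helper is the algorithmic counterpart of an audit: it asks whether the outcome would have been the same had only the protected attribute been different.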

AMM: I like this framing of data mining and difference. I also agree with Dan that there are elements of human judgement—sometimes irrational—that stem from emotions and empathy and cultural situating that don’t seem to have direct algorithmic analogues.

DT: I would say based on a different, or expanded, set of criteria, rather than irrational. It may still be rational; it’s just that the context mandates a different response. This is my fear with AI: if it is based on a set of rules, then it is unable to make more nuanced distinctions. If, or better, when, it becomes a learning system, then it will develop its own rules, and who is to say these will continue to reflect human intent? This is the fear that keeps cropping up in our sci-fi accounts, from “Minority Report” to “Battlestar Galactica” and “The Matrix.” It is about transparency, the ability to appeal and find recourse, and keeping outcomes under control.

AMM: That’s fair. It’s worth considering that there is a certain set of generally agreed upon computer logic practices. And as you note, these are not always in accord with social codes dictated by generations of tradition and artifacts of culture. I’d like to ask Martin to delve a little into what he sees as the key differences between algorithms as mechanisms and institutions as mechanisms. I see them as overlapping but not always parallel. To build on what Dan notes, do technologies and institutions structure human intent in the same ways?

MH: For me, institutions are social algorithms. What would you say, Dan: in what respects are they different or non-overlapping?

DT: I agree; to the degree that bureaucracies are rule-following mechanisms, they function like algorithms. But to your point, it is more difficult to map how they function. In “Reassembling the Social: An Introduction to Actor-Network-Theory” Bruno Latour points out that in the digital realm, because there is always a footprint, it is easier to trace social interactions.

AMM: There is something compelling about situating both law and algorithms on a continuum of “code.” The metaphor begins to strain in my mind, however, when we consider the limitations of both socially-constructed institutions and human-designed software.

Human institutions, like the bureaucracies Dan mentions, emerge out of a need to organize information. Civilization itself—the construction of cities, specialized occupations, and management of large numbers of people for shared purposes—came out of information management: priests recording the movements of the heavens to predict agriculture, clerks tabulating harvested bushels of grain, and so on. Subsequent political restructurings of society are in large part predicated on new forms of government rooted in newer techniques of managing large-scale enterprises and the reams of information they generate.

Technologies are intimately intertwined with this process—cuneiform tablets, printing presses, telegraph wires. Yet the ways we approach our technologies and the ways we structure our institutions don’t always map. We load into technologies aspirations of an imagined better future. We build into human systems—institutions—checks and balances because we recognize they are imperfect manifestations of the now.

Social Security board records office, Baltimore, MD. Photo by Harris & Ewing, ca. 1937.

DT: Do you think there is a greater fear that algorithms will go awry with unintended consequences? That they may be more difficult to control or to fix if they have strayed from their intended purpose?

MH: I agree that “we load into technologies aspirations of an imagined better future.” We do the same with our laws. I also agree that “we build into human systems—institutions—checks and balances because we recognize they are imperfect manifestations of the now.” So the challenge is that we should also do this for algorithms. Why not? If someone disagrees with an algorithm, just as when someone disagrees with a court ruling, why not appeal and ask for a review? Just because one algorithm reached a conclusion doesn’t mean that others will. What’s the difference between a debatable court ruling and a questionable algorithmic decision?

AMM: The underlying promise of new organizational and information processing technologies is that they offer convenience, speed, scale, and seemingly expert neutrality. The reality is that they frequently create a technocratic barrier between those policymakers charged with implementing the structures of governance and the smaller, technologically-trained group of experts capable of designing, implementing, and revising the systems (hardware, software, algorithmic, conceptual) that undergird our information society. We’ve mentioned this phraseology of a “black box” before: from the perspective of an ordinary citizen, it seems simpler to challenge the flawed decisions of a human legislator or bureaucrat than to begin to dissect the seemingly unassailable ruling of a machine. We know that outputs of technological systems are only as good as the biased and flawed inputs we give them, but our society places complex technologies on pedestals of neutrality that they haven’t entirely earned.

As a historian of the information society in the United States, I am struck by the fact that we as a nation have had variations of this conversation before. At the turn of the twentieth century, a group of reformers known loosely as the “Progressives” embraced new scientific methods of information gathering and organizational management as a corrective to the excesses of the Gilded Age and the seemingly intractable social ills of the industrial revolutions. They saw new statistical techniques as the key to making cities cleaner, improving public health, and providing order to questions of urbanization, industrialization, and immigration. At their best they helped bring order to an increasingly complex and disorienting modern society. At their worst, they enabled scientific practices of dehumanization (Fordism, Taylorism, eugenics) by reducing individual humans and their complex experiences to distillable numbers on a spreadsheet. Herbert Hoover, our first truly technocratic president, was an excellent engineer and superb businessman. He was far from an ideal and compassionate president.

MH: I agree, Andrew, that we often put too many hopes in technologies and in seemingly neutral technological mechanisms. Yet I still think that these technological advancements are key to making the world a better place. Just because our expectations are exaggerated doesn’t mean that these technologies are not better than what we have.

Consider our current institutions: the half capitalistic, half subsidized military-industrial machinery that we call the “market of innovation” here in the U.S.; the “bureaucracies” we created over the centuries to guide our daily lives; the mechanisms we created to elect the leaders whom we put in charge to “execute the law” (the government); and the algorithms we have in place to derive laws, that is, to derive the general will of the people. All of these have taken us to a world that is not sustainable and is environmentally self-destructive. Increasing inequality, socioeconomic tension, the risk of a Third World War: I’d say our institutions are far from being “good enough.” Will they allow our species to survive?

AMM: The motivations Martin describes have great historical precedent. Moving to a more recent historical moment, the 1960s and 1970s saw hearty enthusiasm for the potential of mainframe computers and the complex information processing programs whirring away on their magnetic tape reels. The country was gripped by a fascination with the potential for computerized management and the use of electronic databanks to streamline government, assist human decision making, and reduce bias in policymaking. How to make Congress more efficient? Give it a computer-using analytical arm, the Congressional Budget Office. How to solve the complex “urban crisis” of the 1960s so intrinsically woven into delicate questions of race, class, and poverty? Code it on punch cards and let the computer make the difficult decisions. How to end poverty? Let the Social Security Administration construct digital profiles on every American.

These schemes had many backers, who saw digital technologies as the only way for mankind to ascend to a higher form of social plenty, but also many detractors, who feared diminishment of autonomy, loss of privacy, and constraint of the very human impulse to guide ourselves, however flawed the result. In my work at the Library of Congress I’m tracing the correspondence of a group of highly placed national science advisors at the end of the 1960s who proposed creating a fourth branch of the American government: a systems branch that would deploy mainframe computers and expertly designed computer models to neutrally advise the other three branches of government and usher the nation into a technological utopia that could address those very questions of sustainability.

DT: What are you most hopeful about, what are you most afraid of, and what would you like to see happen societally and politically to ensure that AI, big data, and new technologies are elevating our quality of life?

MH: I don’t think algorithmic decision making is perfect. I don’t think AI is optimal. But it doesn’t have to be. It just has to be better than what we currently have. That’s how evolution works: it’s about the “fitter.” Then we evolve. I also think that we’ve learned several important lessons over the past millennia of civilization, and we should honor them. That starts with the checks and balances and the right to appeal a decision (be it made by a human judge or an AI), and it ends with the cutting-edge lessons we are learning from Andrew’s ongoing research about what happened not too long ago with our hope of using data processing to make the world a better place. I’m looking forward to his results (please don’t forget to share!).

My biggest fear is that we misguide the algorithmic institutional structure that we are currently creating. Past civilizations have set up institutional structures that didn’t make it and did much harm: the Roman Empire, for example, the Nazis’ Third Reich, Stalin’s socialism, and so on. We are currently creating an algorithmic institutional structure around the globe. It has some characteristics that worry me deeply, such as the primacy of commercial interests, which leads to the placement of ads in most modern communications structures.

I don’t think it’s out of reach to make adjustments, but I think we have to be aware that just as with the creation of past institutional structures, we need to get actively involved in order to ensure an outcome that moves us toward a sustainable and brighter future. This involvement starts by discussing and talking about it, just like we’ve been doing.


I never tire of watching them. Drawing by Herbert Block, 1978. © Herb Block Foundation, used with permission.


AMM: The fact that humans are seeking to harness complex technologies to extend the potential of our brains and bodies does make me hopeful. My fears hearken back to the tropes emergent in much recent science fiction that Dan raised. Humans operate on a level of unprogrammed logic, of creativity, of base motivations and elevated aspirations not yet fully replicable in algorithmic codes. Our needs and desires are contingent, complex, and not always understood by the computers we program and the artificial intelligences we train.

Often our fears are reflected in our popular culture. Fifty years ago, HAL 9000 from Stanley Kubrick’s film “2001: A Space Odyssey” captured one triumph of human ingenuity—a sentient companion and helpmate to navigate the vastness of space. We know how that turned out. Another film from that period is far pulpier: “Colossus: The Forbin Project.” In this movie, the world’s most advanced artificial intelligence mainframe, tasked with overseeing the United States’ nuclear arsenal, gains sentience, imprisons its scientist creators, and demands complete control over human society—for our own good. The AI concludes humans cannot be trusted with our own destiny.

It’s laughably dramatic science fiction, but to prevent the strengths of a real algorithmic future from blinding us to its excesses, we should maintain a healthy sense of skepticism—a respect for the technologies we create and our limitations as creators.
