Digital Archives, Information Storms and the Knowledge Conundrum

Editor’s note: The video clips interspersed throughout the text are meant to be watched as you read through the essay.

The altar of a temple built by the Church of Christ, Scientist in San Francisco incessantly buzzes with the sound of data flow. As you enter the building and go up the staircase, the humming becomes hypnotic. The trance leads you to the altar, where you find yourself surrounded by huge windows that stain the entire room with an intense yellow light. The anachronistic aura of the space is exaggerated by the architecture, reminiscent of some sort of Greek-Christian-Republican Disney nostalgia. Such a setting clashes with the physical materialization of digital knowledge: at the back of the altar, six towers of digital servers have been arranged in two groups of three at opposite sides of the vast room. The sea of data flowing through these hard drives—and others not visible to the public—comprises the Internet Archive (IA), the largest digital library in the world.

The ideological paradigms that place the burden of historical narrative on memory institutions such as museums, archives, and libraries find a paragon in the Internet Archive. Since the origins of western civilization, institutions such as the Library of Alexandria have had a global mandate very similar to that of the IA: to collect and preserve all knowledge. The mission of the IA reads, “Most societies place importance on preserving artifacts of their culture and heritage. Without such artifacts, civilization has no memory and no mechanism to learn from its successes and failures. Our culture now produces more and more artifacts in digital form. The Archive’s mission is to help preserve those artifacts and create an Internet library for researchers, historians, and scholars.”[1]

The fanatical efforts to collect and preserve as much digital knowledge as we possibly can, as the best way of generating public knowledge, engender the modern obsession with remembering everything as the moral imperative to create a fair and just society. What has been termed the late 20th century “memory-boom” solidified these moral principles in academia, politics, and popular culture, and inflated the value of contemporary archival institutions. These same cultural norms lie at the foundation of the IA, as they have engendered an acute anxiety about the prospect of what Michelle Krasowski calls the “digital dark age.”[2]

Forgetting that memory is intrinsically imperfect, partial and biased—and consequently so is historical narrative—our obsession with the archive as a mode of knowing becomes a negation of our very existence as historical beings, defined by our partial capacity to objectively and collectively remember the past.[3]

In his essay, “On the Uses and Disadvantages of History in Life,” Friedrich Nietzsche argues that our temporal relation to existence and the construction of both individual and collective identities is based on a balance between our ability to remember and our capacity to forget—or as Nietzsche described them, our historical and ahistorical sensitivities.[4]

A completely ahistorical being has no sense of time and space—no memory—whereas a fully historical one is stuck in its own awareness of the time and place it inhabits. Therefore, forgetting is as much a part of our social, cultural and historical consciousness as remembering is, throwing into question our increasing obsession with memory, archives, and data. Contemporary societies fail to accept the imperfection and partiality not only of what is included in the archives, but of the very forms in which we record and include things in the first place, throwing into turmoil the impossible mission of the IA: to create “universal access to all knowledge.”

History is written by its victors. Archives and the potential narratives that arise from them, however, are defined by particular modes of recording, of which the written word is the most valued. This hierarchical system excludes a significant portion of the information that societies have produced in oral forms—or that has been suppressed from existing in a written one—ultimately defining the limits of what memory institutions have deemed important to archive and pass on to the next generation.[5]

In its initial stages, the internet promised to democratize the way knowledge was produced and accessed. As of 2014, the Internet Archive had reached 50 petabytes[6] of total used storage, and in 2015 alone it added over 3 million new items to its databases. While the IA is arguably the largest digital library in the world, it constitutes only a very small portion of the vast ocean of data that populates individual users’ screens around the globe, which is estimated to be as large as 1.5 million terabytes. What once was a democratic desire has become an excess of data, making it increasingly difficult to distinguish what is real information and to determine the quality of what we find on the internet. Ultimately it raises the question: even though we have access to more information now than we have ever had before, is our society truly more informed?

One of the best examples of this paradox is the National Security Agency (NSA), which has developed a massive spying apparatus designed to collect as much data as possible from every single person. In an interview with the Huffington Post, William Binney, who spent nearly 30 years working at the NSA, openly talks about how the agency is overwhelmed by the data it collects.[7] Binney explains that the NSA’s analysts are drowning in it, to the extent that they have not been able to identify potential threats to national security. Yet more dangerous than the failure to identify threats is the anxiety produced by an inability to turn raw data into quality or even usable information. As a result, the government has contracted private companies to analyze its databases, a threat to national security and individual privacy in and of itself.[8] Ultimately the privatization of this vast archive is inevitably defined by the potential profit to private companies, as they access the bulk of data provided by what can only be deemed an Orwellian government. The connection between state power, profit, and information is astonishingly obvious.

As more and more people have access to the web, it seems like the overriding bias in the digital archiving process is the save-it-all approach. Digital networks, which we consider to be the ultimate platforms for information storage and exchange, are structurally partial and unstable. The data that comprises such digital archives is incredibly fragile, as it is in a constant state of physical change, difficult to process, and overdetermined by the increasingly tight control exercised by private companies and the government’s private contractors. There is also the added layer of fragility that emerges from the instability of the hardware—the physical housing for data—as it constantly faces issues like bit rot, deterioration and format changes.

Yet the true vantage point that shatters contemporary knowledge methodology lies in the fact that our systems of archiving have never depended so much on the current political economy. Digital information—especially our private information—is no longer just the knowledge that we can access, but the currency that has created one of the most opulent and unstable industries and markets of the twenty-first century, what has come to be defined as the tech sector.

All the things we can possibly come to know from our digital archives are determined by neoliberal market ideologies, which give power to profit-driven corporations and private companies to organize and distribute the electric impulses hosted in ever-growing data centers. The language that has developed around digital archives and platforms, however, continues to symbolically conceal data’s material and political implications. Information in our era assumes disembodied pseudo-religious connotations such as oceans of data, clouds of information storage, webs, wireless, etc. The vernacular of this contemporary phenomenon dematerializes the very physical and economic substrate of its operations, placing its functionalities in the clouds—in heaven—not in the hands of the powerful elites who control its movement and organization.

Many of us still believe the internet is information circulating in invisible airwaves from cloud to cloud, guided by some inexplicable hand of god, when the reality is an intricate set of underwater cables, strategically placed and maintained, that like serpents encircle the world. The algorithms, hardware and labor that comprise the internet emerge from our own historical present and political economy, as they are formed by and within the cultural values of the societies that produce them. The internet and its inter-webs are as human, fragile and tangible as we are, and therefore prone to corruption, misuse and manipulation.

There are two partial backups of all the IA’s data hosted in San Francisco: one is located in Alexandria, Egypt, and the other in Amsterdam, the Netherlands. These three locations, however random or circumstantial they might have been at the moment of their creation, have a fascinating resonance with the historical and material fragility of contemporary archival fever. The San Francisco Bay Area is the cradle of the new technological economies, where it has become evident that the information that circulates through digital networks can easily be turned into vast amounts of profit and power. In other words, neoliberal structures in the form of private corporations quickly filter, translate, and multiply the flow of information, frequently referred to as “Big Data,” into cash flow.[9]

On the other hand, Amsterdam illustrates a Europe deeply impacted by the rise of conservative politics and by the imminent threat of global warming. While the Dutch ultra right-wing Party for Freedom keeps gaining ground in elections, the Netherlands is increasingly threatened by rising sea levels and imminent floods. Neo-conservative rhetoric in the global north is known to deny global warming, as it easily displaces and manipulates scientific information. Alexandria serves as a clear historical quotation of the Library of Alexandria, a symbol of the loss of public knowledge, as it was partially destroyed during the rise of fundamentalist Christianity in the 4th century CE. Modern Egypt, however, has become a symbol for what came after the so-called Arab Spring, when progressive organizers were severely impacted by the government’s elimination of the country’s internet access.

These three examples point out the ways that information and power become acutely tied within a set of complex geopolitical struggles, fueled by the intervention of western economic and political interests. Where the IA’s data is stored confirms the inextricable dependence of information and knowledge on their historical and political context. Brewster Kahle once pointed out the fragility of the locations, stating, “so our earthquake zone archive is backed up in the turbulent Middle East and a flood zone. I won’t sleep well until there are five or six backup sites.”[10]

The relationship between the archive and the rising sea levels caused by global warming, which threaten the very physicality of the seemingly ephemeral space of the web, sheds light on a very interesting connection that has recently fascinated me: the language that is used to understand our digital knowledge and its deep relation to water cycles. Our world, the Blue Marble, is a vast ocean of data, where each droplet of water is a bit of potential information. In most genesis stories life originated in water, and just like our origin stories, our digital cosmology is bound up in the language of fluidity. As data flows from cloud to cloud, back into electrical currents in vast oceans of data, water is revealed to us as the initial stage of being, each particle containing the possibility of knowing ourselves—of knowing our individual and collective consciousness.[11]

As the cycle of information continues, the evaporation of information from the ocean, with its swirling cables, is forming unprecedented symbolic clouds. All climate predictions today suggest that the poles are rapidly melting, sea levels are quickly rising, and large dark clouds are forming over our global economic and social systems. Some even think that the depth of these dark digital clouds is larger than the depth of our cosmos. If all predictions are correct, what will this storm of data look like when the clouds finally break?

We live in the eye of this hurricane. It is forming right here, above us, beneath us, and in front of our eyes every time we swipe our phone or turn our screens on. In his essay “On the Concept of History,” Walter Benjamin describes an extended metaphor that sheds light on the nature of this overwhelming sensation of impotence in the face of imminent disaster: “But a storm is blowing from Paradise and has got caught in our wings; it is so strong that we can no longer close them. This storm drives us irresistibly into the future, to which our back is turned, while the pile of debris before us grows toward the sky. What we call progress is this storm.”[12] The storm of data, a pool of massive information being pushed by the winds of neoliberal capital, surveillance, and conservative rhetoric, seems almost unstoppable. The partiality and ephemerality of our archives have never been so well disguised, and yet so undeniably clear.

As global warming reshapes our world via rising sea levels and dramatic changes in weather patterns, we will also be faced with a flood of information that could paralyze us. Just as governments around the world are unwilling to shift their social and economic practices to prevent environmental disaster, their partnership with the private sector is turning our data into a dangerous hurricane whose course we cannot hope to chart. Even though institutions like the Internet Archive strive to stay outside of this conundrum, governments, private contractors and users in the so-called “information age” are increasingly overwhelmed by the impossibility of truly knowing anything at all. Edward Snowden, in an interview with The Guardian in 2015, said, “The problem is that when you collect it all, when you monitor everyone, you understand nothing.”[13] Once the storm comes, and we’re faced with the collapse and dissolution of our systems and foundations, the question remains: will we drown?


  1. “Frequently Asked Questions,” The Internet Archive, accessed June 1, 2016, https://archive.org/about/faqs.php#21.  ↩

  2. A limited and obscured knowledge of the history of the interweb.  ↩

  3. These discussions are a big part of the scholarship of both Hayden White and Dominick LaCapra, amongst others. To read more, I recommend referring to Hayden White’s The Content of the Form (Baltimore: Johns Hopkins University Press, 1987), Dominick LaCapra’s History and Criticism (Ithaca: Cornell University Press, 1985), and Jacques Rancière’s The Names of History: On the Poetics of Knowledge (Minneapolis: University of Minnesota Press, 1994).  ↩

  4. Friedrich Nietzsche, “On the Uses and Disadvantages of History in Life,” in The Collective Memory Reader, ed. Jeffrey K. Olick, Vered Vinitzky-Seroussi and Daniel Levy (New York: Oxford University Press, 2011), 73–79.  ↩

  5. Nonetheless, the development of sound and video recording, as well as instantaneous wireless communication across the world, has dramatically impacted the archival possibilities of contemporary societies and is potentially redefining historical consciousness in the digital era.  ↩

  6. 50 petabytes is equivalent to 50,000 terabytes (or 50,000,000,000,000,000 bytes), as the prefix peta indicates the 5th power of 1,000.  ↩

  7. Robert Scheer, “Scheer Intelligence: William Binney and Blowing the Whistle On the NSA,” The Huffington Post, last modified March 11, 2016, accessed June 1, 2016, http://www.huffingtonpost.com/robert-scheer/scheer-intelligence-willi_b_9443362.html.  ↩

  8. Tim Shorrock, “Meet the contractors analyzing your private data,” Salon, published June 10, 2013, accessed June 15, 2016, http://www.salon.com/2013/06/10/digital_blackwater_meet_the_contractors_who_analyze_your_personal_data/.  ↩

  9. As these corporations grow, they simultaneously displace communities on their way. For example the Latinx community in the Mission District of San Francisco hasn’t been able to keep up with the dramatic increase in rent and private services, as wealthy tech workers populate its streets, coffee shops, and housing. The displacement of other forms of knowledge through archival practices for profit manifests itself in the coercive physical removal of disempowered communities.  ↩

  10. Brewster Kahle, “Universal Access to All Knowledge,” The Long Now Foundation, filmed November 30, 2011, accessed June 1, 2016, http://longnow.org/seminars/02011/nov/30/universal-access-all-knowledge/.  ↩

  11. Thales of Miletus, one of the most renowned pre-Socratic philosophers, argued that arche—the fundamental element out of which the entire Universe emerged—was water. Curiously enough, arche is also the etymological origin of the word archive, as it refers to realms of power, authority and knowledge. This incommensurable sea of data—the arche of the 21st century—is the structure that holds us, the void where our world is suspended and where the blinking blue dots of surfers in the web are endlessly suspended.  ↩

  12. Walter Benjamin, “On the Concept of History,” in Selected Writings, Volume 4: 1938–1940, ed. Howard Eiland and Michael W. Jennings, trans. Edmund Jephcott (Cambridge, MA: Belknap Press, 2003), 389–399.  ↩

  13. Alan Rusbridger, Janine Gibson and Ewen MacAskill, “Edward Snowden: ‘When you monitor everyone, you understand nothing,’” The Guardian, last modified May 22, 2015, accessed June 15, 2016, https://www.theguardian.com/us-news/2015/may/22/edward-snowden-nsa-reform.  ↩

Bibliography

Benjamin, Walter. “On the Concept of History.” In Selected Writings, Volume 4: 1938–1940. Edited by Howard Eiland and Michael W. Jennings. Translated by Edmund Jephcott. Cambridge, MA: Belknap Press, 2003.

“Frequently Asked Questions.” The Internet Archive. Accessed June 1, 2016. https://archive.org/about/faqs.php#21.

Kahle, Brewster. “Universal Access to All Knowledge.” The Long Now Foundation. Filmed November 30, 2011. Accessed June 1, 2016. http://longnow.org/seminars/02011/nov/30/universal-access-all-knowledge/.

Nietzsche, Friedrich. “On the Uses and Disadvantages of History in Life.” In The Collective Memory Reader. Edited by Jeffrey K. Olick, Vered Vinitzky-Seroussi and Daniel Levy. New York: Oxford University Press, 2011.

Rusbridger, Alan, Janine Gibson and Ewen MacAskill. “Edward Snowden: ‘When you monitor everyone, you understand nothing.’” The Guardian. Last modified May 22, 2015. Accessed June 15, 2016. https://www.theguardian.com/us-news/2015/may/22/edward-snowden-nsa-reform.

Scheer, Robert. “Scheer Intelligence: William Binney and Blowing the Whistle On the NSA.” The Huffington Post. Last modified March 11, 2016. Accessed June 1, 2016. http://www.huffingtonpost.com/robert-scheer/scheer-intelligence-willi_b_9443362.html.

Shorrock, Tim. “Meet the contractors analyzing your private data.” Salon. Published June 10, 2013. Accessed June 15, 2016. http://www.salon.com/2013/06/10/digital_blackwater_meet_the_contractors_who_analyze_your_personal_data/.


Juan Pablo Pacheco is an artist, curator and writer currently based in Bogotá, Colombia. Focusing his practice on video and sets of conceptual exercises, Pacheco recently completed his MFA at the San Francisco Art Institute. His work and writing focus on the intersections of history, memory and archives as contested spaces for the circulation of collective narratives, identities and information.