• What does a terabyte mean, socially? (Or, DIY and infrastructure)

    I would like to talk with others about the practicalities of digital work at the scale of one or a few people. There are a lot of mysteries and mystifications around technology, and some of the ones I find myself thinking about most relate to scale. Some people outside digital humanities circles may not yet realize that a single researcher can manage tens of thousands of documents on a laptop, or that a humanities instructor can self-host WordPress blogs, without high-powered infrastructure. At the other end of things, however, Google has the resources to operate at a scale the most advanced university digital humanities centers can’t approach. From far enough outside, where it all gets lumped together as technology stuff, these things can look more alike than they do to THATCampers. But even when one gets considerably closer, there is a lot about the great, complicated, shifting middle scale that is not well charted. Many of us don’t have good intuitions about what’s easy, what’s hard, what’s impossible, what’s changing, or how to find out. I’d like to compare notes and learn more.

    Though digital storage is not really the heart of the matter, we could start with some simple questions: how big is a gigabyte, or a hundred gigabytes, or ten terabytes, in social terms? How big, relative to social context, is a humanities database with five thousand records, or five million, or five billion? What changes about human scale across the contexts of desktop machines, public servers, and mobile devices? How are the possibilities of scale affected by individual knowledge, or access to well-established professional knowledge, or the rare expertise of a highly specialized team? What requires a center, and what requires a network? What scale is an afternoon’s digital humanities exercise, and what takes years of planning, major grant funding, and a business model?
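
    As a rough, back-of-envelope illustration of the storage questions above, here is a minimal Python sketch that counts how many items of a given kind fit in a gigabyte, a hundred gigabytes, or ten terabytes. The per-item sizes (and the names `ASSUMED_ITEM_BYTES`, `items_that_fit`, and `describe`) are assumptions chosen for illustration, not measurements; real files vary widely with format and compression.

```python
# Decimal (SI) units: 1 GB = 10**9 bytes, 1 TB = 10**12 bytes.
GB = 10**9
TB = 10**12

# Illustrative per-item sizes in bytes. These are assumptions, not measurements.
ASSUMED_ITEM_BYTES = {
    "plain-text page": 2_000,            # assumed: about 2,000 characters
    "scanned page image": 1_000_000,     # assumed: about 1 MB per image
    "hour of HD video": 2_250_000_000,   # assumed: 5 megabits per second for 3,600 seconds
}


def items_that_fit(capacity_bytes, item_bytes):
    """Return how many whole items of item_bytes fit in capacity_bytes."""
    return capacity_bytes // item_bytes


def describe(capacity_bytes):
    """Map each assumed item type to the number of items that fit."""
    return {
        name: items_that_fit(capacity_bytes, size)
        for name, size in ASSUMED_ITEM_BYTES.items()
    }


if __name__ == "__main__":
    for label, capacity in [("1 GB", GB), ("100 GB", 100 * GB), ("10 TB", 10 * TB)]:
        print(label)
        for name, count in describe(capacity).items():
            print(f"  {name}: {count:,}")
```

    Under these assumptions the same capacity holds anywhere from a few hours of video to hundreds of millions of text pages, which is one reason a raw byte count says little on its own about the scale of a project.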

    Especially: how do we get our intuitions to keep up — and then make those intuitions visible and persuasive so that others can share them without mystification?

1 Comment


  1. Aaron Collie says:

    You’ve got me intrigued. I also think a lot about this scale.

    1. Computation scale (my laptop to the university’s supercomputer)
    2. Storage scale (my flash drive/homeserver/phone to storage arrays)
    3. Project scale (1 FTE to multi-university international research)
    4. Funding scale (0 – large grants / hard funding streams)

    But the issue you’ve got me intrigued about is an entirely different scale. I think of it as information density. A terabyte of numeric data has a different density than a terabyte of HD video data. Then on top of that layer is your instrument layer: what are you looking for in the data (bit rate, sampling rate, methodology, etc.)?

    I work in a library, and shelving and space cost money. We attempt to make the most of them, and now the same arguments are being applied to digital storage. But there have been centuries of vetting, reformatting, and innovation, which have settled on the book as the perfect information density for transmitting ideas.

    Don’t we need to understand the same about our storage arrays?
