Digital Media Collections Are an IT Problem But Not an IT Solution

The power and flexibility of content use and distribution in the digital realm are enabled by the ability to break everything down into the same essential components, the 1’s and 0’s that form the atomic structure of data. In its idealized form, the content and the persistent structural wholeness of digital files do not matter in the same way they do with analog materials. One would not tear pages out of a book to ship them separately in smaller envelopes, nor would one store half-second fragments of a film on separate shelves in a room. The works could be re-formed, but not easily or cleanly. Data, however, those 1’s and 0’s, is sent, received, and shuttled around in packets. Unlike with analog works, the fragmentation and compressibility of the whole support efficiency, portability, and far-reaching usability for research and creativity.

The shift to digital workflows has necessitated a major shift in how we conceptualize the use and storage of assets. Creators, owners, records managers, and archivists are no longer the sole stakeholders in how documents and materials are taken care of long term. There is now a greater need to understand data management, technological infrastructure, and the particulars of software, hardware, file types, codecs, and more. Likewise, the ease of creating and versioning digital works has led to an explosion in the number of files (as well as the number of network, local, and detachable drives to squirrel them away on), resulting in an overwhelming bevy of content to track and maintain. In a corporate or institutional environment, creators or overseers of digital assets must either educate themselves on these topics or rely to a greater degree on IT departments to help manage their materials.

Integration and collaboration between departments are essential to organizational success today -- sharing resources, eliminating redundancy, and communicating openly help prevent the waste and lack of innovation that can doom an organization to irrelevance or worse. However, the people who should be in control of setting policies for file management and for selection and implementation of asset management tools -- the archivists and records managers out there -- have ceded too much ground to a pure IT mindset.

As I see it, providing solutions to problems means applying one's areas of expertise to derive something that approaches a balanced mix of functionality, efficiency, usability, and elegance. In the world of archives and media collections, this means, among other things, making decisions about metadata, file types, storage systems, and distribution systems that support findability, longevity, and flexibility for current and future use. Under an IT mindset, solutions hinge more on things like processing speed, maximizing storage capacity, decreasing time to market or implementation, and monitoring data flows. Of course these things matter to people using or providing access to digital assets, but the paths to the end solution -- compression or low resolution, out-of-the-box asset management, decentralized or uncontrolled metadata creation, etc. -- are fraught with hazards for media. By not taking a more active role in the policy- and decision-making process, caretakers of media collections put at risk the safety and usability of their assets, as well as their own ability to meet their responsibilities to the collection and to the organization.

At their core, files are just data, but the ways we manage, use, interact with, and create with them rely on intellectual, humanistic, or organizational structures that step away from data and back into nuance, language, and user experience. When we bandy about terms such as digital archive and digital asset management, we are actually using broad categorizations to simplify references to a host of complex and distinct solutions for working with file-based collections, solutions that vary greatly depending on the avenues of access and the functional needs of the organization.

This is especially true of audiovisual content, which presents very different needs and distribution methods than plain text files, including considerations for time-based presentation, aesthetic quality, and the management of very large files. For example, distributing assets publicly over the Internet may utilize lower-quality, “access copy” versions of content in a system designed to promote simple search and playback through streaming. Distributing assets internally to a marketing or development department may instead utilize high-resolution copies of content that can be downloaded and edited into new assets, retrieved through a system that promotes advanced search and integration with editing software. But both of these solutions support only findability and usability for media collections; they do not address the preservation needs of the highest-resolution originals or preservation masters. These versions are infrequently accessed and, for audiovisual content, may run to hundreds or thousands of gigabytes per file, so solutions may include offline storage and ought to include redundancy and geographical separation of backups.
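To make the distinction between access, production, and preservation copies concrete, here is a minimal sketch in Python of how a collection record might track where each version of an asset lives and flag preservation masters that lack redundancy or geographic separation. Everything in it is an assumption made for illustration: the role names, the site and region labels, and the thresholds of two copies in two regions are hypothetical, not a professional standard or the behavior of any particular asset management product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StoredCopy:
    role: str      # hypothetical roles: "preservation_master", "production", "access"
    site: str      # hypothetical storage site label, e.g. "hq-tape-vault"
    region: str    # hypothetical geographic region label
    online: bool   # False for offline storage such as shelved tape


@dataclass
class Asset:
    identifier: str
    copies: list[StoredCopy] = field(default_factory=list)

    def masters(self) -> list[StoredCopy]:
        """Return only the copies kept as preservation masters."""
        return [c for c in self.copies if c.role == "preservation_master"]

    def preservation_gaps(self, min_copies: int = 2, min_regions: int = 2) -> list[str]:
        """List the ways the masters fall short of the assumed thresholds."""
        masters = self.masters()
        regions = {c.region for c in masters}
        gaps = []
        if len(masters) < min_copies:
            gaps.append(f"only {len(masters)} preservation master copy/copies")
        if len(regions) < min_regions:
            gaps.append(f"masters held in only {len(regions)} region(s)")
        return gaps


# A streaming access copy does nothing for preservation: the single
# master below is flagged on both counts.
film = Asset("film-0001", [
    StoredCopy("access", "public-streaming", "east", online=True),
    StoredCopy("preservation_master", "hq-tape-vault", "east", online=False),
])
print(film.preservation_gaps())

# Adding a second master in another region clears both gaps.
film.copies.append(
    StoredCopy("preservation_master", "remote-tape-vault", "west", online=False)
)
print(film.preservation_gaps())
```

The point of the sketch is that the access and production copies never enter the check. They serve findability and usability, while the health of the collection over time is judged on the masters alone.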

This is one area where interpretations of what an archive is and what an archive does come into conflict. In environments such as email programs, “archiving” has traditionally been used to mean moving data off into deep storage so it is not eating up active space needed for incoming information. This is considered to be data maintained primarily under retention policies and is not meant to be quickly searched for and called up. Deep storage has its place as a strategy, but it should not be confused with the true sense or value of an archive or collection. An archive is a living resource within an organization, maintaining legacy assets but also bringing in new creations, and providing accessibility to both, provided the proper resources and support are allocated to the archive itself. Archives are long-term investments, paying off over time by extending the usability of short-term investments, i.e., acquisition and creation of assets. Shortchanging the archive's ability to do its work now devalues past and current efforts by denying them a future.

Archivists have centuries of tradition, learning, and research that have informed the development of current practices, with an increasing focus on managing digital collections. IT professionals have their own areas of expertise, but these do not extend to all aspects of dealing with file-based materials. Tracking complex relationships among related or derivative assets... Providing accessibility at the intellectual rather than just the physical level... Selecting file formats and codecs based on potential longevity and fidelity to analog source originals... Developing metadata models that adhere to professional standards and that support the activities of collection management... These and more are areas of digital archiving that rely on data practices but that include considerations well beyond those of ground-level data management. Today’s archival professional needs to collaborate with IT -- as well as many other departments -- but we also need to step up and take back control of those aspects of our collections that rightfully belong in our care.

--- Joshua Ranger



