The Future of Publishing

Long-Term Preservation of High-Value Digital Content

About the Author

Craig Van Dyck is Executive Director of the CLOCKSS Archive. Before CLOCKSS, he was Vice President, Content Management, at Wiley from 1996 to 2015, and Senior VP and Chief Operating Officer at Springer-Verlag New York from 1986 to 1996.

Craig served as Chairman of the Enabling Technologies Committee of the Association of American Publishers from 1995 to 1998 and was instrumental in the development of the Digital Object Identifier (DOI) system and of CrossRef. He has served on the Boards of Directors of the International DOI Foundation, CLOCKSS, ORCID, CrossRef, and the Society for Scholarly Publishing, and was a member of the Portico Advisory Committee.

Craig’s portfolio has always included industry collaboration to improve the infrastructure of scholarly communications.

It’s 2040 and you want to read a decades-old book that analyzes Finnegans Wake. You can’t find the text of the book on the Web, and it’s not available on services like iTunes or Amazon. Libraries and bookstores have no print copies. There is no print-on-demand option available. The book has effectively disappeared.

This is the scenario that long-term digital preservation protects against.

Digital versions of content are now often the primary medium that users rely on, ahead of print. When print was the primary version, the community could safely assume that multiple copies of the content existed around the world, usually printed on long-lasting acid-free paper and often available for long periods in major libraries.

Now, with digital content in the forefront, libraries often do not own a digital copy. Users access the content on platforms such as Apple, Amazon, EBSCO, HighWire, or the publisher’s own platform. Who, then, is responsible for safeguarding the long-term survival of the digital content and access to it?

Publishers themselves are not necessarily considered to be reliable long-term protectors of digital content. A publisher might lose interest in the content as its marketability declines. Or a publisher might go out of business or be combined with another publisher. And publishers normally do not have robust long-term digital practices in place.

This issue is of high importance in scholarly publishing because scholarly research has a long shelf life. Scholarly content tends to be of high value and costly, and digital versions have strongly supplanted print as the primary resource for end users.

As a result, librarians at universities have lobbied publishers to participate in trusted third-party long-term digital preservation systems. This article reviews the state of play for digital preservation of high-value content and raises open issues that the community confronts.

It is Not Only About Scientific Journals

Due to the importance that academic libraries place on digital preservation, scholarly journal publishers have stepped up by depositing their content into third-party systems such as CLOCKSS. There is excellent coverage for journals generated by large and medium-sized publishers. However, there is a “long tail” of very small publishers (usually “Open Access”) who have not gotten the message. There is an ongoing effort to encourage these small publishers to participate in a preservation system.

When it comes to books, the picture is different. Even among scholarly book publishers (who often also publish journals), the coverage is not yet as strong as it should be. And there is little participation in digital preservation at more general-interest publishers. For example, book publishers for authors such as Malcolm Gladwell, magazines like The New Yorker, and newspapers like The Washington Post and The New York Times have not joined preservation systems.

The primary impetus for long-term digital preservation has come from libraries. As a result, for content whose market is strongly library-focused – like scholarly journals – preservation has become ubiquitous. But for other kinds of content, for which libraries are not the primary market, preservation is lagging. This is a problem because readers rely on the ability to go back to older content. For digital reading to become a full experience, readers should be able to access content today that they have accessed in previous years.

    Disclaimer: This is to inform readers that the views, thoughts, and opinions expressed in the article belong solely to the author, and do not reflect the views of Amnet.

    Copyright © 2020 Amnet. All rights reserved. No part of this publication may be reproduced, distributed, or transmitted in any form or by any means, including photocopying, recording, or other electronic or mechanical methods, without the prior written permission of the publisher, except in the case of brief quotations embodied in critical reviews and certain other non-commercial uses permitted by copyright law. For permission requests, write to John Purcell, Executive Editor- Amnet, addressed “Attention: Permissions” and email it to: [email protected]