The Future of Great Research Libraries

The images you see now are among the oldest known artistic expressions of our species. They are the cave paintings from Altamira, in the province of Cantabria, Spain, estimated to have been painted about 18,000 years ago. The artist made use of natural protrusions of the cave roof to give depth to the images. He or she made pigments from charcoal, from clay, and from stones, employing various techniques to apply the colors. From these we infer artistic intent. These pictures of buffalo, horses, and elk are evidence of our primordial ancestors coming to grips with their world. There are other drawings in the cave that do not speak for themselves. What do these “tectiforms” represent? Can we hypothesize that these humans spoke, that they counted, that they communicated? Of course we can, but there are only images, these human expressions, for proof.

On the other screen are some simulations and some images of planetary surfaces depicting the Voyager space probe project. The simulations obviously show the spacecraft in perspective with distant planets and moons, but the surface image of Neptune’s moon, Triton, represents image data sent back from Voyager. These craft are scientific expressions of man’s desire to know, to explore, to describe. Because each craft carries discs of gold inscribed with images of man, woman, and child as well as some of our music and math, they too are examples of human communication. We receive and interpret data sent back to us from space, as the flyby over the surface of Triton shows. And, as we all know, numerous other craft have been sent and more will be sent into space for exploration, each sending back signals of what their mechanical sensors have perceived.

Between these extremes - 18,000-year-old cave paintings, and streams of data sent back to Earth - comes the entire range of the human record of expression, all of which is potentially the responsibility of our societies’ cultural custodians, namely museum curators, librarians, and archivists. And whether such representations are transmitted on paper or by bits and bytes, they are in principle ideas worth saving by one generation, for transmission over time, to subsequent generations. This is our work, librarians' work: the selection and organization of ideas, expressions and knowledge for distribution from the past, for the education and edification of those alive now, as well as for people not yet born.

It is good to reflect on this work on the occasion of the 400th anniversary of the opening of the Bodley Library, at a time that has seen human communication and expression greatly perturbed by the rapid growth and adoption of the Internet as a new channel for expression and the transmission of ideas. This celebration of the founding of the Bodley Library and the persistence of the Bodley’s librarians, keepers, and masters at Oxford is an auspicious time to look forward as well as back. That is my task today: to look forward, to speak about the future of the great research libraries. It is a great honor to be asked to do so, and I thank Reg Carr for extending the invitation.

As I began to read and reflect on this assignment, I did the usual review of the literature. Imagine my horror to discover that there are well over 3,850 hits on a Google search of the phrase “future of libraries”, though only four hits on the phrase “future of great research libraries,” all referring to this talk at this conference. A quick search of the periodical literature produced about 500 entries for articles, notes, and so forth on the more inclusive phrase, few of which duplicated the Google search results. So, with a little license, one might say that the recent literature comprises over 4,000 entries.

While I considered my topic, I pondered what we might mean by the phrase “great research libraries” and why these merit discussion as having a future apart. I was mindful the immediate audience would include colleagues and friends who lead research libraries which they and others, myself included, might describe as “great.” In a sense, I think of this as a genuine superlative, that is, not as a comparative term.

Though there is substantial presumption that size alone makes a great research library, another measure might be longevity. In my view that list would start with the Biblioteca Apostolica Vaticana. The U.K.’s Consortium of University Research Libraries has a nice set of criteria for membership that could be applied as well.

What then makes a great research library? Here are my criteria:
• a very large collection, both in number of volumes and in number of disciplines, topics, languages, and literatures
• a distinctive, broad, and extensive collection, with prominent special collections and substantial annual rates of acquisitions
• highly qualified subject and technical professionals on staff
• a lengthy history of library operations
• currency with changing methods and materials of scholarship by collecting, organizing for access, and preserving new research resources
• an active program of preservation and conservation

With these criteria in mind, I would guess that we could today construct a list of about 40 or 50 “great research libraries” in the world, several of them national libraries, many university libraries, especially including the Bodleian, and a very few public libraries. Great research libraries also have squads of specialist staff, not limited to specialist librarians, bibliographers, and archivists, but also including exhibit curators, publishers and editors, security officers, and … conference organizers. For the purposes of this presentation, all other libraries fall into a taxonomy no one has patience to develop, to describe, or to discern with precision.

Harold Billings, our distinguished colleague who is head of the Libraries of the University of Texas at Austin, adds a telling qualification:
“Sustaining a great library does not diminish other libraries. It adds to, builds up, and enlarges the capabilities of other libraries, and is itself strengthened in turn through its collaborative association with others.”

To speculate on the future of the great research libraries, one should consider what research might be done in the future as much as determine, through our selection of materials for our collections, what research will be possible. These predictions of what some scholars might wish to do are useful in shaping the programs of great research libraries to accommodate and support them.

Here are three sample scenarios based on what scholars are asking for now and what will be possible soon. Each scenario is set in the second decade of the 21st century. There is one for the humanities, one for the social sciences, and one for the sciences. Others could easily be constructed for other domains.

The first scenario is that of a musicologist investigating the simultaneous use of jazz idioms and stylistic elements by French classical musicians of the first half of the 20th century and the use by American jazz musicians of French musical elements in the same period. In addition to the core musical questions - of properly identifying stylistic elements like harmonic structures, use of specific chords, rhythmic patterns, formal structures, distinctive decorative elements, modal melodies, and more or less clear quotation - there are social, economic, and perhaps even political questions entwined.

In order to pursue this line of research on the core musical question, the scholar will need to have listened extensively to the easily available works from the repertories in question. She will then convert exemplars of the styles to transposable, numeric “signatures” in digital form. She will then run a program comparing her signature elements to the thousands of recorded excerpts she will have collected.
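One way such a transposable, numeric “signature” might be built is sketched here, on the assumption (mine, not drawn from any existing musicological tool) that a melodic figure is reduced to its successive semitone intervals, so that the same figure is recognized in any key. The function names and the note data are hypothetical, and a real study would also need to encode rhythm and harmony.

```python
from typing import List, Sequence


def interval_signature(pitches: Sequence[int]) -> List[int]:
    """Return successive semitone intervals; transposing the melody leaves them unchanged."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def find_motif(motif: Sequence[int], excerpt: Sequence[int]) -> List[int]:
    """Return the start positions in `excerpt` where `motif` occurs in any transposition."""
    sig = interval_signature(motif)
    target = interval_signature(excerpt)
    width = len(sig)
    if width == 0:
        return []  # a single note carries no interval pattern to compare
    return [i for i in range(len(target) - width + 1) if target[i:i + width] == sig]


# Hypothetical data: MIDI note numbers for a four-note figure and a longer excerpt.
motif = [60, 63, 65, 66]            # C, E-flat, F, F-sharp
excerpt = [55, 62, 65, 67, 68, 60]  # the same figure appears here, starting on D
matches = find_motif(motif, excerpt)
print(matches)
```

Run against thousands of recorded excerpts, a comparison of this kind would only flag candidate passages; the scholar's ear and judgment would still decide which matches are musically meaningful.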

Automatic searches with the specifications of composers, performers, groups, styles, critics, performing locations, and titles of works will have been devised to search the contents of digitized newspapers of the period, as well as magazines, histories of music, memoirs and collections of letters, and archival documents of the key composers and musicians. As results from one search bring new names and places to light, her searches and analytical work will be altered to include the new names and the new pieces of music. Among the sub-topics arising in this research are those of racial boundaries in various locales, exoticism as an element of popular taste, the transmission of stylistic elements among musical communities, and compositional practice and intentions of the various individuals identified in this research.

There are several indicators that this phenomenon occurred in the works of Ellington, Ravel, and Stravinsky, to name but a few, but the larger study simply cannot be done today. One must be blessed with an unusually good musical memory and plenty of time for laborious analysis conducted piece by piece in order to deal with these concepts.

We turn now to the social scientific scenario, which concerns a team of scholars in several nations working in the field of international relations, in particular the questions of the effects of the General Agreement on Tariffs and Trade and the World Trade Organization on bi- and multi-lateral relationships among nations in the period 1950-2010 and the growth of local and national economies in developing nations in the same period. Among the subordinate questions to be addressed by this team’s research are those concerning the effects of information on decision making. A particular focus will be a select sub-set of developing nations from each area of the world.

Among the research resources needed for this line of research are many archival collections of papers from government agencies, non-governmental organizations, protest groups and labor unions. Government documents, many of which were issued only on the web, will be needed too, along with the papers of individuals involved in various of the organizations and movements. The team will also need a host of statistics in paper and machine-readable forms gathered by a range of number hunters, from individual consultants and companies, to national agencies, to the United Nations. All source materials will be digitized, searchable, and, for the quantitative data, available for extraction, manipulation, and analysis.

In order to gather this array of research resources, the team will need to include librarians, information brokers, government document curators, and social science data consultants. Source materials will be drawn from many repositories around the world and installed on a server for the whole team to use, regardless of their locations. As data flow in, they will need to be analyzed, then correlated with the news reports, public statements, and archival records. Some imaginative work will be necessary too in organizing the documents so they can be seen in chronological and topical orders as well as arranged by proposition. The work in this scenario cannot be undertaken on the grand scale suggested, primarily because the source materials are essentially unavailable for digitization and thus for analysis. We turn now to a scenario in the sciences.

The researcher in our scientific scenario is an ichthyologist seeking to understand the factors in the depletion of native oysters over the course of the past 50 years. Of concern are parasites, diseases, and the biochemistry of disease organisms. At issue as well are pollution, exploitation, changing water salinity and silting. Geographical and mathematical models will have to be developed to describe past, and to predict future, trends. The native oyster in question is Ostrea edulis, distributed naturally around the British Isles, the North Sea, the Mediterranean, and the Black Sea; it has been exported in seed stocks to North America, Australasia, and Japan.

Because of the wide distribution of this oyster over the past centuries, scientific and trade information about this oyster has been generated in numerous languages, in peer-reviewed journals, government documents and in fisheries’ publications. It is known that recent articles about oyster parasites and diseases are distributed across numerous journals covering the life sciences, but identifying them will be difficult due to variance in local naming conventions for the oysters and their diseases. Local records of estuarine salinity, perturbations of local estuaries’ banks and floors, and weather conditions will be needed, as well as reference to the surrounding land and marine topographies. The roles of improved methods of packing and transportation will be factored in the problem, as will the inadvertent distribution of exotic species, competitors, parasites, and diseases in the holds of ships. Geographical information systems and mathematical modeling will be employed to analyze the extent of oyster stocks, harvesting, predation, farming, and disease vectors, leading to a study of oyster epidemiology.

Each of these scenarios is based on research directions and methodological developments observable now. Each requires access to enormous amounts of information, only some of it in the published literature, ranging from the most popular to the most arcane. Each project features roles for technical and subject librarians to gather, analyze, and model the information. Not one of these projects can be completed without recourse to network-based communications and information as well as to high-performance computing. And all of them will require digitization of existing data. In the future, such research scenarios will be commonplace, I predict. The main point here is that we must constantly and continuously understand what our readers will require, anticipating as best we can their needs.

The implications for the great research libraries are plain. Applications of digital methods and global communications have changed and are changing the nature of research in nearly every discipline. Not a single one of our functional areas will be left unchanged as we adapt. Most significantly, our collection development and preservation programs need to accommodate the new information sources and the new formats. And it is abundantly clear from these scenarios that the additional means, the tools, agents, and methods, for broad and deep discovery and retrieval of distributed information resources will be essential. While on the one hand, the Internet and the global telecommunications systems have liberated us from our local precincts, it is true as well that they are making us more dependent upon other information sources in order to satisfy the imaginations of our scholars.

For those doubters among you, let me call your attention to the incredible generative effects the Thesaurus Linguae Graecae had on classics scholarship in the decade of the 1980s. Think as well about the impact readily available quantitative data on human behavior has had on the course of the social sciences and on the practice of politics. And what would our scientists in particle physics, almost all of the life sciences, and many fields of engineering have done to pursue their research without access to computers and to the enormous journal literature in these fields? Virtually all disciplines now produce results heavily, and in some cases entirely, dependent upon information technology to make their findings known and available.

Publishing of all sorts is now heavily dependent upon information technology and, ironically, there are more published items available and more or less accessible in printed form than ever before. In addition, for some crucial scholarly publications, for example the BMJ and the European Sociological Review, the Internet editions are the most complete versions, supplanting the primacy of the parallel print editions. And therein lies one of the great quandaries for the great research libraries: how to represent all of the records of civilizations’ marches, whether in our individual collections or collectively, perhaps even in some coordinated fashion, among our universities. From our perspective, civilizations do not decline; instead, as they proceed, they produce more documentary confusion.

It is also plain from these scenarios that libraries in general, and the great research libraries as leaders in the field, must continue and, indeed, expand their efforts to conserve the artifacts bearing information already in their care. To do so, we need to find ways to re-engineer conservation methods and practices now using craft procedures. Mass deacidification for acid-laden wood pulp paper, better methods for dealing with coated paper stock, and widespread adoption of paper-splitting and reinforcement processes – such as those pioneered by Ernst Becker at the Zentrum für Bucherhaltung in Leipzig – are examples of approaches that, if applied at a great many libraries, would preserve paper artifacts now more or less doomed to self-destruction.

And presuming that we can develop trustworthy methods of operating digital archives, a subject that will arise later, we should turn to mass digitization, in place of our microfilming projects. For this to work, we will need some new tools, ones which industrialize what is now a labor-intensive process. Because we must preserve and conserve our collections, the great research libraries should take the lead to develop the new methods to safely accomplish mass digitization. We need also to purchase services or machines on the marketplace, but only if they are efficacious, affordable, and do not harm our books.

The wisdom of the monks, scholars, and rulers who first established universities and then their libraries was not misplaced with regard to the purposes of these institutions. Any objective, scientific view of the research product of the great universities would show qualitative differences when contrasted with other sorts of universities and libraries. The number and types of connections made possible by large, well organized, and deeper collections result in qualitatively different research products. Libraries with great collections make qualitative differences in the types of research possible in their areas of specialization and across those areas. This assertion applies to the great research libraries among the national and public libraries equally well.

There is a corollary to this notion. Great research libraries are developed by staff - highly qualified curators, catalogers, arrangers, and others, whose expertise is applied both to the gathering and organizing of research resources and to the provision of those resources to scholars. They contribute significantly to the research possibilities and opportunities of the great research libraries. Great research libraries afford subject, genre, language, and technical specialists in sufficient numbers and, frankly, provide these very special people with working environments in which their expertise and often peculiar psychologies can be put to good works.

Furthermore, the three scenarios presented earlier illustrate the growing dependency of researchers on a mix of digital and traditional information resources, as well as some new research methodologies requiring computers. However, and just as in the past, scholars at the forefront of their disciplines hereafter will likely be caught up simultaneously in benefiting from – as well as requiring – extensive collections of information. Great research library collections before the Internet age stimulated and suggested lines of research, in part by patterns and leads emerging from the massive collocation of related material and in part by the work of catalogers, arrangers, and bibliographers setting meta-information in place and deliberately seeking connections and occasionally finding contradictions. As a matter of course, scholars conducting research have made known to bibliographers their requirements for books, manuscripts and archival material. Canny bookmen have contributed by keeping their own eyes open for unusual material that might be offered to libraries to support collection trends already started. This virtuous tangle of requirements, sources, scholars, and librarians has continued in the Internet age.

The great research libraries must continue to add to their collections of unusual material. As such collections grow, so does the number of planned and systematic opportunities for research. Moreover, adding unica and uncommon items to already large special collections improves the chances for serendipitous discoveries. The lively minds working in our libraries make connections intuitively, as much as from our cataloging, and follow veins of ideas in mining our stacks for the subjective glory holes of gold, different for each individual, but precious for all nonetheless.

At the Bodley, as reported in its annual reports and press releases, the continued growth of the collections of medieval manuscripts – while simultaneously adding modern political and literary papers, music manuscripts, maps, and Orientalia – marks the progress of this great research library.

We should note the addition of the late 11th- or early 12th-century Islamic scientific manuscript, “The Book of Strange Arts and Visual Delights,” which includes a previously unknown set of maps. This important manuscript is especially auspicious in this 400th year, as is the acquisition of the manuscript of the “Hebrides Overture” of Felix Mendelssohn for the already extensive archive of Mendelssohniana. Collecting at this intensity of rarities is a mark of great research libraries. The assembling of the Sackler Library is another sign of this great library looking to the future.

Selecting a New World sibling library not quite at random, to illustrate further the point, we have acquired at Stanford in the past decade the archive of the iconic Beat poet Allen Ginsberg, the papers of the Southern Pacific Railway and the Apple Corporation, the archive of the polymath engineer and architect R. Buckminster Fuller, and significant additions to our archival holdings on John Steinbeck and William Saroyan. Also residing in the Engelbart collection at Stanford is the 1964 prototype pointing device, the world’s first mouse.

And to counter those who might believe that the collecting instincts of dedicated, even obsessive, individuals will never or rarely direct them to digital materials, we have examples at Stanford of digital items coming along with traditional materials, as well as one example of a collection entirely devoted to computer and video games, the Cabrinety Collection of software, hardware, and literature.

Remember as well that the Internet Archive is a collection of Internet sites collected by and for its patron, Brewster Kahle. No doubt there will be many other such examples coming to us and, as in the past, one function of the great research libraries will be to provide aid, comfort, and fellow traveler-ship to individual collectors, including those most interested in digital objects.

Lest some mistake my theme and belief that great collections are the foundations of great research libraries for a paean solely to rare books, manuscripts, and archives, let me hasten to add that great research libraries also amass huge collections of ordinary books, government documents, films, music, and journals. Just as the cause and effect tangle is typical with regard to scholars and libraries and services, so are there yin and yang complementarities of special and ordinary collections in the great research libraries. Adding digital objects to the mix augments the networks of ideas that span the common as well as the obscure, awaiting invigoration by inquiring minds, exploring our physical and virtual collections. Unusual, special, rare collections need auras of general collections, quickly and easily available, to be fully exploited. And conversely, great, large, general collections, accumulated over time, become more and more “special” as they become deeper, wider, and more multi-lingual, and incorporate more points of view, records of more controversies, and successive layers of cultural engagement.

It is not my intention, however, to deny or belittle the development of research collections in other sorts of libraries. In fact, I believe that each library, no matter how small, has contributions to make to the development of the logical, global research library, that greatest construction of them all. Regardless of the motivation to develop a collection which might serve research purposes, whether one’s collection policies are devoted to local history, to further elaborating a collection assembled by a donor, or even to a single specialty, we can contemplate providing intellectual access to all such collections on a global scale.

I do not advocate that all libraries attempt to become research libraries. Indeed, I do not believe that most research libraries should pretend that they can become great research libraries, because there are too many examples of the diffusion and resulting waste of local and national resources to lowest common denominator acquisitions programs, altogether too plain and too vanilla, contributing too little of unique or special substance. Rather, emphasis should be upon the ways that any and all libraries can contribute uniquely to the commonwealth of knowledge.

So, now some thoughts on other roles of libraries in the future…


What has the Internet Age meant to libraries and particularly to the great research libraries? What will the great research libraries undertake because of it? The transformation and the possibilities can be seen in several functional perspectives: access, distribution, analysis, collecting new genres, preserving digital information.

The possibilities for libraries to provide intellectual access to their collections have widened considerably. From the 1970s, library staff have been able to search on-line union catalogs, such as those of OCLC and RLG, the two notable survivors of the shared cataloging stage of the use of information technology by libraries. Now we present our catalogs of our holdings via browser interfaces, sometimes hyperlinking from catalog records to digital editions as in this example of a link from an OPAC on the left to the European Journal of Biochemistry. In addition, we provide access via on-line abstracting and indexing services to the unanalyzed contents of journals and anthologies in our collections.

Some of us have begun to provide synthetic guides to the literatures of various disciplines, usually prompted by twinned forces: the desire to make readers more self-sufficient in their information discovery and retrieval activities and the need to provide guidance to readers who are novices in disciplines not their own. We are now seeing an increase in the amount of hyperlinking from cited references to the sources cited in scholarly communication and similarly from abstracting and indexing services to the Internet editions of the sources they cover. Developing this web of hyperlinking will contribute to the next great phase of the Internet, that of the Semantic Web, by increasing the amount and types of metadata. We in the great research libraries can contribute to the development and operation of the Semantic Web and should cheerfully contribute metadata for our holdings. We should develop taxonomies and ontologies to permit cross-format and multiple-genre relationships to be extracted. Also, we should report to readers, on demand, what others interested in their topics or research resources are reading.

There are other developments in which publishers, librarians, and information technologists working together have created narrowly specialized digital libraries of information and information services for widely dispersed communities of scholars. Such “knowledge environments” provide access to journal articles, newly commissioned perspectives and reviews, protocols and guidelines for conducting research, alerting and customization services, and often unique methods for displaying and navigating pertinent information services. In addition, these knowledge environments offer communication services in the form of on-line open and moderated forums in addition to personal and institutional directories. Once there are a great many knowledge environments, we will need to provide the means to perform discovery and retrieval functions among them as well as support cross-disciplinary twigging and cloning so characteristic of truly healthy evolution of scholarly enterprises. On the screens now are examples from the BoneKEy knowledge environment focused on bone and mineral research in medicine and another on cellular signal transduction with its unique connections map showing flow charts of intra-cellular biochemical processes with hyperlinks to explanatory information from each information “blob.”

Despite the wealth of access possible through various network-based metadata sources, performing a literature search across all – or even some – of them is a daunting task. One of the major tasks facing research libraries is the development of discovery and retrieval engines that will take a single search argument and apply it across a range of metadata sources, ideally a range set by the searcher. There are some early versions of such a research tool. FlashPoint from the library at Los Alamos National Laboratory searches BIOSIS, the various sections of the ISI citation indices, Engineering Index, Inspec and the e-Print arXiv. Another, the search engine at HighWire Press, simultaneously searches all of Medline and about one million articles in over 340 journals. Here also is the HighWire Topic Map. Other search engines are in development, but we will go through many generations of increasingly sophisticated versions before we satisfy our readers.

In addition, there needs to be much creative work on the effective display of the “hits” retrieved. Eventually, we will want that sort of discovery and retrieval engine to seek words and phrases in complete texts as well as metadata.
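The idea of one search argument applied across a range of metadata sources chosen by the searcher can be sketched in a few lines. This is an illustration under stated assumptions, not a description of how FlashPoint or the HighWire engine works: the source names, the records, and the functions `make_source` and `federated_search` are all hypothetical, and each "source" is simply a list of titles held in memory.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, str]
Source = Callable[[str], List[Record]]


def make_source(records: List[Record]) -> Source:
    """Wrap a list of records as a searchable source (stand-in for a remote index)."""
    def search(query: str) -> List[Record]:
        q = query.lower()
        return [r for r in records if q in r["title"].lower()]
    return search


def federated_search(
    query: str, sources: Dict[str, Source], selected: Iterable[str]
) -> List[Tuple[Record, List[str]]]:
    """Apply one query to the sources the searcher selected and merge duplicate hits."""
    merged: Dict[str, Tuple[Record, List[str]]] = {}
    for name in selected:
        for rec in sources[name](query):
            key = rec["title"].strip().lower()  # crude duplicate detection by title
            if key not in merged:
                merged[key] = (rec, [])
            merged[key][1].append(name)
    # Show hits found in the most sources first, then alphabetically by title.
    return sorted(merged.values(), key=lambda item: (-len(item[1]), item[0]["title"]))


# Hypothetical sources and records, for illustration only.
sources = {
    "index_a": make_source([{"title": "Oyster parasites in estuaries"},
                            {"title": "Salinity records"}]),
    "index_b": make_source([{"title": "Oyster Parasites in Estuaries"},
                            {"title": "Oyster farming methods"}]),
    "index_c": make_source([{"title": "Oyster trade history"}]),
}
hits = federated_search("oyster", sources, ["index_a", "index_b"])
for record, found_in in hits:
    print(record["title"], found_in)
```

Even this toy version shows where the hard work lies: deciding when two records describe the same item, and deciding in what order to display the merged hits.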

The great New Yorker cartoon defines in the broadest terms the effects of the Internet on publishing. Anyone with a computer and a link to the Internet can become a publisher. The mass of popular “publishing” presents a problem to archivists and those scholars of contemporary culture who want to preserve at least a sample of popular discourse from our age. In the arena of serious publishing, the Internet has created a maelstrom of change, threatening some traditional players, while simultaneously empowering others. Certainly scholarly and serious journals have been transformed, with many new features like hyperlinking to cited references, “prospective citations,” multiple resolutions of images, and opportunities to communicate directly with authors. And we are seeing the beginning of the transformation of the scholarly monograph along the same lines. “Classic” texts, from the ancient philosophers, to the Latin and Greek Church fathers, to Marshall McLuhan can now be read on the Internet, sometimes free of cost, sometimes not. Printed music is coming to the web. These digitized editions in their simplest form, such as the 4,000 titles in Project Gutenberg, make out-of-copyright titles widely available, but encoded digital texts allow new forms of research, involving comparisons at various levels of literality and meaning. Examples include the Miguel Cervantes Virtual Library, the Electronic Text Center at the University of Virginia, and the British National Corpus and Forced Migration Online, both at Oxford.

Libraries have new possibilities for distribution of their holdings too. Lately, as scanners with better resolution and more efficient handling of books have become available, some notable experiments in digital interlibrary loan of whole books have been undertaken. We are beginning to see more automated scanning devices that deal gently with books as objects while providing accurate conversion of the captured images. These new devices promise to make scanning of books and documents less costly and faster, thus opening the possibilities not just for lending books on-line, but also for the analysis of the texts.


E-book distributors like Ebrary offer on-line access for library patrons and readers everywhere to currently published books, albeit in small numbers as yet. They too offer good searching of the texts for words and phrases and other useful functions. Even in this early stage, searching a collection of books for words and phrases has begun to benefit our readers on the same scale as did the retrospective conversion of our card catalogs, despite the earlier caviling of Nicholson Baker.

One should not neglect to mention the factor of convenience of access to on-line information resources resulting from the Internet Age. For our local readers and staff, the provision of on-line access 7 x 24 has meant that procrastination can rise to new heights of risk, delaying until the wee hours that bit of research and writing due soon after the dawn. But, it has also meant that we can operate 24 hour libraries without burdening staff to remain in the physical facilities in the same wee hours.

Digitizing some of the treasures of the great research libraries is well underway, but more should be done. Various projects of the Bodleian itself are contributing digital facsimiles of early manuscripts from the collections of Oxford’s colleges and from the Bodley itself. Other great libraries around the world have begun projects such as these to demonstrate the possibilities for research and teaching. Where formerly we might make a microfilm master on demand, with roughly the same investment we can produce a digital copy and make it globally available. Some of us have concerns about vitiating our institutions’ cultural patrimony, but, those concerns notwithstanding, it is clear that the possibilities for assembling virtual collections of materials - related by provenance, by subject, by author, and by many other characteristics - could contribute to much more efficient scholarship than our reliance on physical copies allowed in the past.

Contemplate digitizing ALL of the items in the collections of one of the great research libraries and then making as much of the digitized material as possible available to the world. Yes, of course, there are copyright issues to be solved, but imagine the effects of making not thousands, but millions of books and perhaps tens or hundreds of millions of journal and newspaper articles available for searching and retrieval on-line. Is it too much to say that the results would be a new Enlightenment? Maybe we will see such a project underway one of these days.

I have mentioned new forms of analysis in the research scenarios and now want to point out the additional requirements for libraries to make these new methods available to our researchers. The various sorts of search and comparison functions necessary for advanced textual research methods require libraries to mount the texts and provide data searching and manipulation software. Encoded texts, whether using SGML or XML and their variants, provide more opportunities for advanced research. Many publishers, e-book distributors, and libraries are providing texts and search engines. However, occasionally, texts must be entered into new data formats to enable special sorts of examinations. In these cases, too, the great research libraries will be called upon at least to supply the desired texts to the research teams, and often to adopt the new data formats. The full repertory of such transformation of texts, prototyping, and adoption of new methods is probably without practical limit. It is important for librarians to serve on the research teams, contributing to the development of the new methods.

New analytical methods are constantly emerging. One of the least developed is that of “sonification” of data for navigation of large and complex data sets by sound. A research project at the Stanford Humanities Lab has treated oceanographic data and stock market performance in gross and by individual company equities. Will this research lead anywhere? I am not sure. But I am sure we need to be cognizant of such exploration. Another area, now in heavy demand, is analysis of social scientific data, much of it generated by government agencies. All the great research libraries offer statistical data in digital form, but using it requires specialized assistance. As the use of social scientific data increases, the great research libraries will develop web-based interfaces so that readers and researchers can get to the data they need without assistance by library specialists, who will then be available for more specialized consulting services. To conclude this section on analytical tools, I will digress significantly to a demonstration of the advanced use of geographical information systems to better understand history – in effect to see it in new ways – and ultimately to better represent our times.
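As an aside on the sonification idea mentioned above, here is a minimal sketch of how a series of numbers might be turned into sound. It is not the Stanford Humanities Lab's method, which is not described here; the linear value-to-pitch mapping, the function names, the pitch range, and the sample prices are all assumptions made for illustration.

```python
import math
import struct
import wave


def values_to_frequencies(values, low_hz=220.0, high_hz=880.0):
    """Linearly map data values onto a pitch range (an assumed mapping)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # All values equal: play every point at the lowest pitch.
        return [low_hz for _ in values]
    return [low_hz + (v - lo) / span * (high_hz - low_hz) for v in values]


def write_tones(frequencies, path, seconds_per_tone=0.2, sample_rate=8000):
    """Write one short sine tone per data point to a mono 16-bit WAV file."""
    samples_per_tone = int(seconds_per_tone * sample_rate)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        for freq in frequencies:
            frames = bytearray()
            for n in range(samples_per_tone):
                amplitude = math.sin(2 * math.pi * freq * n / sample_rate)
                frames += struct.pack("<h", int(amplitude * 20000))
            wav.writeframes(bytes(frames))


# Hypothetical closing prices, invented purely for illustration.
prices = [10.0, 12.5, 11.0, 15.0, 20.0]
tones = values_to_frequencies(prices)
```

Calling `write_tones(tones, "prices.wav")` would then produce a short audio file in which rising prices are heard as rising pitch. A real project would likely choose a perceptually motivated scale rather than this linear one.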

The work of David Rumsey, a private collector of maps, atlases, and related material, illustrates some new analytical methods, in his case ones utilizing a variety of new Geographical Information Systems (GIS) and some incremental advances he has stimulated in information management.

Rumsey has been collecting for a long time, but over the past few years he saw and then adapted the possibilities offered by information technology to make his collection accessible, to add information to digital versions of his maps, and to produce a variety of new views of those digital versions.

Rumsey uses Luna Imaging’s Insight Software and various GIS software. One can use three different means to interact with the digital images and data he provides. An ordinary web browser using Java scripts can be employed. An Insight Java client can be downloaded with advanced functions. Or, one can download a GIS browser with an applet.

David has associated himself with Berkeley, Stanford, the American Antiquarian Society, and three commercial interests, Luna Imaging, Telemorphic Software, and AMICO, the art image distribution agency for a number of important museums. Here is a screen showing the Arrowsmith map of 1844 and the numerous functions of Luna’s InSight software.

Rumsey shares his collection in many ways: He has contributed 6500 of his metadata records, with hot links to his images, to UC Berkeley’s online catalog.

Rumsey makes his maps available through many tools and intermediaries: Google, the Open Archives Initiative, Mellon’s OAIster, E.S.R.I.’s shared Geography Network, and the Electronic Cultural Atlas Initiative.

Let us look at some GIS functions that Rumsey is making possible for researchers. Here is Colton’s 1836 Map of New York City, georectified.

Then it is overlaid with modern street and shoreline data for the New York City metropolitan area.

Zoom in to the area around Central Park.

We can see changes to the shoreline on the left, landfill for the Henry Hudson Parkway, and see changes in Central Park – notably, the placement of the park itself in 1856 and the moving of the reservoir north, etc.

We can compare historical maps to each other more effectively in GIS. Here, the Colton 1836 NYC again.
And here a map physically twice the size of the Colton, the Dripps Map of New York, 1852.

When both are shown in GIS, the Dripps map is now about 1/6 the size of the Colton, because it is on a much larger scale.

Zoom in to lower Manhattan with the 1836 Colton.

Then overlay the 1852 Dripps map as a transparency.

Zoom to Tompkins Square on the 1836 Colton.

Then overlay the 1852 Dripps map of the same area and see changes over 16 years of development.

In 1802, Thomas Jefferson sent Lewis & Clark to explore the far reaches of the Louisiana Purchase. The Bodleian Library was only 200 years old then.

Rumsey is using GIS to allow enhanced interpretation of the historical maps as well as the ability to combine them with modern geospatial data. This method permits us to more accurately trace Lewis and Clark’s route of discovery.

Here is the 1814 published map of Lewis and Clark’s Journey to the Pacific, in its original form, before georectification.

Here is the same map after georectification. The amount of change is not too bad considering that Lewis and Clark were using inaccurate clocks to determine longitude and lots of estimating by eye up and down mountain ranges and valleys. This example took about 40 known points to georectify.
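To make the idea of georectifying from known points concrete, here is a minimal sketch that fits an affine transformation from scanned-map pixel coordinates to longitude and latitude by least squares. This is an assumed, simplified technique: the software Rumsey uses is not specified at this level of detail, and old maps often need higher-order or rubber-sheet warps. The function names and the four control points are hypothetical.

```python
def solve_3x3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    rows = [list(matrix[i]) + [rhs[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        if abs(rows[col][col]) < 1e-12:
            raise ValueError("control points are collinear or too few")
        for r in range(col + 1, 3):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, 4):
                rows[r][c] -= factor * rows[col][c]
    solution = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        known = sum(rows[i][j] * solution[j] for j in range(i + 1, 3))
        solution[i] = (rows[i][3] - known) / rows[i][i]
    return solution


def fit_affine(pixel_points, world_points):
    """Least-squares affine map from scanned-map pixels to (lon, lat)."""
    # Normal equations (A^T A) p = A^T b, where each row of A is (x, y, 1).
    ata = [[0.0] * 3 for _ in range(3)]
    atb_lon = [0.0] * 3
    atb_lat = [0.0] * 3
    for (x, y), (lon, lat) in zip(pixel_points, world_points):
        row = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                ata[i][j] += row[i] * row[j]
            atb_lon[i] += row[i] * lon
            atb_lat[i] += row[i] * lat
    return solve_3x3(ata, atb_lon), solve_3x3(ata, atb_lat)


def apply_affine(params, x, y):
    """Convert one pixel position to (lon, lat) using fitted parameters."""
    (a, b, c), (d, e, f) = params
    return a * x + b * y + c, d * x + e * y + f


# Hypothetical control points: pixel positions matched to known coordinates.
pixels = [(0, 0), (1000, 0), (0, 500), (1000, 500)]
world = [(-120.0, 48.0), (-110.0, 48.0), (-120.0, 43.0), (-110.0, 43.0)]
params = fit_affine(pixels, world)
lon, lat = apply_affine(params, 500, 250)
```

Three non-collinear points determine an affine fit exactly; with more points, such as the roughly 40 used for the Lewis and Clark map, least squares spreads the surveyors' errors across the whole sheet instead of trusting any single point.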

Now we can combine the old map with modern map data showing major roads, some cities, and state boundaries.

And we can even add vector data that shows all the Lewis and Clark camp sites (in yellow).
GIS can also be used to create interesting mosaic maps – here, the Lewis and Clark Map of 1814 is shown with a 30 mile buffer on each side of the trail route, which then blends into an 1879 US General Land Office Map of the same area, which then blends into the US 1970 National Atlas Map, and finally the mosaic is bordered on all four sides by current satellite imagery of the area. The following slides show how the blending works.
First, showing the blend with the 1814 Lewis and Clark Map gradually becoming more prominent.
Next, gradually showing how the 1879 US General Land Office maps are involved.
Finally, swiping in the current Satellite imagery. [Click for map blending and overlay animation]
On the stand over there is a photo of the blended map for you to inspect more closely after this presentation is concluded.

Rumsey is also working with 3D GIS. This method allows us to perceive the shape of the surface of the Earth obscured by water. Here is a 1926 maritime chart of San Francisco Bay showing depths with numbers on the chart surface.
Here is a modern bathymetric model of the same area of the bay with depths shown in GIS in exaggerated fashion.
We can combine the two and see the depths shown on the 1926 chart using 3D GIS.
Now, zooming in on the Golden Gate strait, we can see the trench gouged out by tidal action.
We change our viewpoint as if we were hovering over Oakland, looking west toward the Golden Gate.
Zoom in to Yerba Buena Island, shown on the original 1926 map as a small hilly island. In the combined data view here we can see the landfill extending the island north for the 1939 World’s Fair, now known as Treasure Island.

Another example of 3D GIS – taking this 1915 Wall Map of San Francisco by Chevalier and combining it with current Digital Elevation Models (DEMs) from the United States Geological Survey.
We “shrink wrap” the 1915 SF map using 3D GIS.
Zooming in and turning, so that the peaks to the south are at the top of the screen.
We can change the source of illumination – [move through the next slides quickly as in animation to show the changes in illumination.] Note the graphic in the lower right corner showing azimuth and altitude of the sun’s rays. We can change the illumination … not add fog …
Now let’s overlay modern streets on the 1915 map.
Here are the 1915 streets so you can compare growth of the city infrastructure. These functions allow us to see the landscape in varying conditions of sunlight, and to plot the growth and change in the layout of streets. Such representations of reality help some better understand their surroundings.
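The illumination changes described above depend on two numbers, the azimuth and altitude of the sun. As a minimal sketch of the principle, the following computes simple Lambertian shading for one patch of terrain from its slope and the sun's position. It is an assumed textbook formulation, not the routine of any particular GIS product, and the function and parameter names are hypothetical.

```python
import math


def hillshade(dz_dx, dz_dy, azimuth_deg, altitude_deg):
    """Lambertian shade (0 to 1) of a terrain cell lit from a given sun position.

    dz_dx, dz_dy: elevation change per unit distance toward east and north.
    azimuth_deg: compass bearing of the sun, clockwise from north.
    altitude_deg: angle of the sun above the horizon.
    """
    az = math.radians(azimuth_deg)
    alt = math.radians(altitude_deg)
    # Unit vector pointing from the ground toward the sun (east, north, up).
    sun = (
        math.sin(az) * math.cos(alt),
        math.cos(az) * math.cos(alt),
        math.sin(alt),
    )
    # Upward unit normal of the plane z = dz_dx * x + dz_dy * y.
    length = math.sqrt(dz_dx ** 2 + dz_dy ** 2 + 1.0)
    normal = (-dz_dx / length, -dz_dy / length, 1.0 / length)
    # Brightness is the cosine of the angle between normal and sun, floored at 0.
    return max(0.0, sum(n * s for n, s in zip(normal, sun)))


# A slope rising toward the east faces west: bright in afternoon sun from
# the west, dark in morning sun from the east.
afternoon = hillshade(1.0, 0.0, azimuth_deg=270.0, altitude_deg=45.0)
morning = hillshade(1.0, 0.0, azimuth_deg=90.0, altitude_deg=45.0)
```

Applying such a function to every cell of a Digital Elevation Model, and then varying the azimuth and altitude, gives the kind of changing relief shown in the slides.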

David Rumsey is remarkable, because he is investing heavily in making his maps available over the Internet in a variety of ways. His motive is to excite others about the beauty and historical interest of his map collection, but also to allow them to be used in digital editions he has created.

Now we turn to new genres and literatures.

Hypertext and hyperlinking provide report writers and fiction writers with new possibilities. Certainly we have seen the use by undergraduates of our facilities in inserting images, charts, graphs, tables, moving pictures, and sounds in reports and projects. Because an increasing proportion of these reports involve multiple media and texts, they often cannot be delivered on paper, but must be delivered to their teachers in digital form. And as some of these students become professional scholars, we will see more such mixed media reports and articles in scholarly communication. A few examples of this phenomenon were seen as early as June 1995 in Science Magazine, and other journals, like the Molecular Biology of the Cell, followed suit shortly thereafter. In the example on the left … the example on the right … entitled … Babbling.

However, in the realm of creative writing, hypertext has provided some new possibilities for authors, not only to tell stories, but also to engage readers in making choices in the directions a story might take. Here are a few examples of how hypertext fiction can work from Richard Holeton’s “Figurski at Findhorn on Acid.” The new forms of expression are almost certainly extrapolating from video and computer games as well. Multiple-user Dungeons and Dragons provide directed environments for playing out tales of battles, conquest, and general mayhem. Similarly, other gaming environments involve sports, historic settings, exploration of exotic places, and just plain gambling. All of these have the potential to inform and to increase the expressive possibilities for writers of all genres.

Some academics read more into this. Witness this excerpt from a prospectus by a team of scholars working on the history of video and computer games. “A spectrum of theorists … proclaim that we are on the verge of a new – albeit post-human – renaissance, an era in which the human being becomes seamlessly articulated with the intelligent machine, a condition in which there are no demarcations between bodily existence and computer simulation, between cybernetic mechanism and biological organism.” Think about the consequent opportunities for new items for our collections!

That there may not yet be a “classic” in the hypertext genre is irrelevant to the responsibility the great research libraries must bear. Here, at the birth of so many new genres of expression and communication, we must collect examples of each, perhaps widely, and then we must preserve them so that scholars, students, and pundits in the next generations might examine and evaluate them for the history of our present. We cannot distinguish the quick from the stillborn in and among these genres.

The prospect of collecting the new genres, ones like hypertext fiction and video games, raises one more function thrust upon the great research libraries from their early days, that of preserving ideas and expressions, data and doggerel born digital as well as digitized. To rehearse local history, remember that Thomas Bodley stepped into the breach caused by the gradual disappearance of the manuscripts given by Humfrey, Duke of Gloucester, to Oxford between 1435 and 1444. Bodley could not replace the manuscripts of either Duke Humfrey or the earlier benefactor, Bishop Cobham, and Oxford has recovered only a few of them over all these centuries. Thomas Bodley could and did, however, set in place a library tradition, with suitably designed and constructed facilities, involving professional, full-time librarians. He established rules that have preserved the collections for these many generations of readers and stimulated the development of the collections by gifts of his own books and those of others. By negotiating the 1610 agreement with the Stationers Company for what amounted to copyright deposit, Bodley set the scene for the development of many great research libraries.

It is now time to fill another breach, that of the disappearance of ideas and expressions born, published, and distributed digitally. There are too many examples of important information gathered and published in the past few decades simply disappearing, either because no effort was made to collect and preserve it or because the technology storing it or necessary to read it has not been maintained. We thus lack images of the Earth from the NASA Landsat photograph collection from the earliest days of the orbiting cameras. Census tapes from the 1960s are unreadable. Satellite observations of Brazil in the 1970s, critical for establishing a time-line of changes in the Amazon basin, are also lost on the now obsolete tapes to which they were written. It is not only the data that are fragile, but also the rapid passing of generations of software and hardware that will leave us unable to provide access to those digital objects. Clearly the leadership of the British Library and the establishment of the Digital Preservation Coalition here are significant steps in the creation of trusted digital archives. The National Digital Information Infrastructure and Preservation Program of the Library of Congress, now concluding a year of intensive planning, should lead to support for some prototypes of digital archives in the U.S. Work at the National Library of Australia in the PANDORA project is notable too.

Stewart Brand, founder, editor & publisher of the Whole Earth Catalog and co-chair of the Long Now Foundation, has said that “exercise is the great preserver [of data].”

It is that notion which underlies the operations of the network caching software, known as LOCKSS, which many of you have beta-tested. LOCKSS is an acronym for Lots Of Copies Keep Stuff Safe. Brand and many in this room today have good reason to fret that unless we in the great research libraries begin to preserve digital objects, we may be creating what our successors will label a digital dark age.
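The principle behind the acronym can be illustrated with a minimal sketch: several libraries each hold a copy, the copies are compared by checksum, and any copy that disagrees with the majority is replaced from a good one. This is only an illustration of redundancy with auditing, not the actual LOCKSS polling protocol; the function names and library names are hypothetical.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return a SHA-256 fingerprint of one stored copy."""
    return hashlib.sha256(content).hexdigest()


def audit_and_repair(replicas):
    """Compare copies by checksum and repair any that disagree with the majority.

    replicas: dict mapping a library name to the bytes it holds.
    Returns (repaired_replicas, names_of_repaired_libraries).
    """
    votes = Counter(digest(content) for content in replicas.values())
    winning_digest, count = votes.most_common(1)[0]
    if count <= len(replicas) / 2:
        raise ValueError("no majority agreement; manual intervention needed")
    good_copy = next(c for c in replicas.values() if digest(c) == winning_digest)
    repaired_names = []
    result = {}
    for name, content in replicas.items():
        if digest(content) == winning_digest:
            result[name] = content
        else:
            # This copy has drifted from the consensus: replace it.
            result[name] = good_copy
            repaired_names.append(name)
    return result, repaired_names


# Hypothetical holdings: one copy has suffered silent corruption.
holdings = {
    "library_a": b"Journal issue 12, article 3",
    "library_b": b"Journal issue 12, article 3",
    "library_c": b"Journal issue 12, art\x00cle 3",
}
fixed, repaired = audit_and_repair(holdings)
```

Run regularly, an audit of this kind is one way to read Brand's remark: the copies are kept sound because they are continually exercised and checked against one another.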

Another breakthrough is the recent announcement that the Koninklijke Bibliotheek, the National Library of the Netherlands, will become the first official digital archive for all 1,500 Elsevier Science journals amounting to over 7 terabytes of data. The Elsevier decision to allow the Koninklijke Bibliotheek to function as its digital archive is recognition that libraries will function in their traditional role even when the media of the objects under care change. More will follow suit, I am sure, especially when some of the prototype digital archives now contemplated have proven satisfactory.

[The Dark Side]

Optimistic prognostication about what great research libraries ought to do and logical extensions from present programs and operations notwithstanding, we must face the dark side of the Internet Age. It threatens to inhibit our future and our abilities to deliver information, ideas, and expression to our readers now and for a very long time to come. The continued imbalance of values and valuation between the originators of ideas and expression and certain irresponsible publishers is troubling. Scientists and scholars consider the reports of their work to be important for many reasons, including indirect monetary gain. Certain publishers, on the other hand, see such reports as information to be treated as a commodity. There is a resulting clash of value systems involved here, but not many librarians have recognized that in holding the purse strings, they hold what may be one of the keys to improving the present imbalance to one more favorable to scholars and their institutions. Librarians in the great research libraries must retain their selectivity and avoid reallocating funds – from the humanities and social sciences – to pay for the excessively expensive journals in the sciences, technology, and medicine. And in particular, the great research libraries need to recognize the responsible publishers among the herd and suggest that scholars send their manuscripts to them.

Librarians who have allowed themselves to be seduced by the “big deals” offered by many of the for-profit scholarly publishers find themselves stuck in long-term contracts and increasing expenditures for a single publisher’s products, spending which then results in decreased spending for other books and serials. Having succumbed to the blandishments of a lot more information for seemingly small increases in cost, some librarians are now discovering that much of the information added to their original careful selections is little used by their readers. Cancellations are forbidden or severely restricted by the long-term deals, so responsiveness to one’s primary clientele – faculty and students at one’s home institution – is sacrificed as well.

The most grievous difficulties are arising, in my view, in new laws, amendments to old laws, changed regulations, and cunningly developed international treaties, all limiting the rights of citizens in democracies to freely read, otherwise observe, and finally make use, for private purposes, of copyrighted intellectual property. In the U.S. there has been a pitched battle, fought mainly by the Disney Company and other popular media barons as well as the Association of American Publishers. The results have been the provisions of the Digital Millennium Copyright Act and various attacks on librarians. The World Intellectual Property Organization and the various attempts to “harmonize” approaches to European Community and national intellectual property codes are producing inhibitions to creative work as much as they are rewarding the actual creators of new ideas and expressions. There seem to be two central issues in this thicket of protectionism, on one side advocacy for the rights of commercial interests and, on the other, concern for the rights of citizens to exchange and use information. The first issue involves the regulation in the digital environment of who can read what, where, and for what cost.

Essentially, those who defend the presumed rights of companies already publishing for profit (lately by controlling access), and who want to keep high the barriers to entry into the publishing game, have heavy influence on legislators and legislation. The second issue is the right of the creators to establish their own means of publishing; it arises as the creators and their home institutions (universities and scholarly societies) attempt to regain their own rights to publish. Such social movements as the Public Library of Science illustrate, in the irate fervor of their adherents, the furor these issues are creating.

In the U.S., fair use, a doctrine defined in the U.S. Code, is under constant attack by those who believe that every time a digital version of anything is read, some money needs to be sent to the copyright holder. To the same parties, fair uses of media material in academic settings for teaching and research are objectionable too. There are now bills in the U.S. Congress making criminal some violations of copyright.

For those who have even a passing interest and engagement in this topic, the work of Professor Larry Lessig is essential. Lessig has written, in his book “The Future of Ideas”, that the way we as a society resolve issues of access to digital content, whether networked or not, "will determine what the 'free' means in our self-congratulatory claim that we are now, and will always be, a 'free society.'"

Lessig is notable also for establishing the “Creative Commons,” a site providing authors with a set of tools and advice on making their creative works available with terms they specify rather than those of commercial intermediaries.

The Chilling Effects Clearinghouse is one among several initiatives to address the problem of the Dark Side’s intellectual property issues from the viewpoints of citizen-readers. Chilling Effects focuses on the U.S., but there are similar concerns elsewhere. For example, there is a site by Paul Burton here in the U.K. devoted to the question of control of the Internet with examples and issues drawn from all over the world. Running out of the University of Leeds is another site devoted to cyber-rights and cyber-liberties.

Librarians and law school professors are not alone in this fight. In the U.S. there is a new digital consumer organization seeking to develop a grass roots coalition to lobby against the well-funded, well-entrenched representatives of the Disney Company and their industry. And one Congressman, Rick Boucher of Virginia, has taken up fair use as one of a series of positions protecting the interest of consumers, of ordinary citizens.

It seems that the work of the Joint Information Systems Committee and the Publishers Association in developing the Guidelines for Fair Dealing in an Electronic Environment characterizes a more collegial and straightforward attempt to deal with similar issues. Those guidelines are not without tension and stress, I know, and the Publishers Association website features its rather protectionist 1996 statement on The Use of Digitised Copyright Works in Libraries but not the jointly agreed guidelines. On the other hand, the UK Commission on Intellectual Property Rights with its focus on intellectual property regimes and their impacts on poor and developing nations is an example of an enlightened government effort to improve access to information in poorer nations while improving the observation of intellectual property rights. A segment of the statement of Libraries and Archives Copyright Alliance on copyright in the digital age strikes a balance in the copyright issues for many of us. I quote: “However, LACA maintains that overprotection of copyright could threaten democratic traditions and impact on social justice principles by unreasonably restricting access to information and knowledge. If copyright protection is too strong, competition and innovation is restricted and creativity is stifled.” Close quote.

The role of the great research libraries in this arena is to advocate for the rights of our readers now and in the future. Laws and regulations impinging upon our functions as custodians of culture need to be identified as such and we in the great and smaller research libraries need to join the coalitions of those on our side of the issues, the side of fair use and fair dealing for private, educational, and research purposes of information.


The World Wildlife Fund issued a report in July 2002 projecting that the Earth’s biological assets could be exhausted by the year 2050. Overpopulation, overfishing, pollution, and many other factors suggest that we need another planet to occupy, because we are rapidly consuming this one. If this is the case, the role of the great research libraries in capturing as much as possible of the key documentation of human activity, especially in the industrialization of agriculture as well as the clinical, nutritional, and pharmaceutical “advances” effectively increasing the number of humans alive at once, could provide the few survivors of the looming disaster with valuable information crucial to the survival of our species and culture.

[CONCLUSION]

In the few years preceding the formal opening of this library in November of 1602, Thomas Bodley could hardly have conceived of the extent of change libraries were to enjoy and endure over the centuries until this day. In Bodley’s lifetime, books were only slightly less rare and less talismanic of intellectual pursuits than they had been at the time of Duke Humfrey’s gifts of books to Oxford 150 years earlier. The chains that bound the books in Humfrey’s and Bodley’s libraries in their earliest incarnations were tangible reminders of the status of books as treasures, to be consulted in the earliest days only by senior members of the faculty. Oxford’s earliest records about its books mention several trends which continue to this day: solicitation for books and funds for the university’s library collections and facilities; inconsistent attention to support for new acquisitions and for the upkeep of the library; but, on the other hand, sporadic periods of deep concern and conscientious engagement by university officers to address and ameliorate the states of affairs of the Bodleian Library.

The Bodleian has been joined in this wonderful status by a few other great research libraries and a slightly more numerous lot of research libraries with a few great collections each. The fact that these other libraries have arisen in some sense gives comfort and aid to the Bodleian’s keepers and masters, but could as well lead to the diminution or weakening of support for the acquisitions and operations, which in the case of the greatest research libraries are qualitatively different than in all others.

It is a mistake to jumble all sorts of libraries of all sizes and conditions as an easy way of coping with changes in economic fortune, of technology, and of the nature of research. The convenience of such mélanges of libraries to policy makers and administrators, people earnestly trying to deal with recalcitrant problems of asset management, is exacerbated by the sort of thinking represented by themes like “access instead of ownership” and by the famous New Yorker cartoon “On the Internet, no one knows you are a dog.” To the contrary, however, one can discern dogs on the Internet, great collections do determine and provide access, and the functions and possibilities of great research libraries are different from those provided by all other sorts of libraries.


There are only a few themes in this presentation. Most important is that deep and broad collecting is essential among the great research libraries, especially of unusual materials and of new sources and genres of ideas and expression in digital forms. Research libraries still need to find, nurture, and take advantage of subject, language, and technical specialists, some of whom will be involved in specifying and perhaps even inventing new technologies.

Thanks to the advances in information technology and the Internet, the great research libraries have new opportunities to serve scholars by new methods of analysis of their holdings, some of which will need to be digitized, and by distributing access to their holdings globally. In taking advantage of information technology in these ways, the great research libraries will be responding to and stimulating developments in research methodologies as well as the formation and reinforcement of communities of scholars right around the Earth.

The great research libraries need to collect digital information resources for the present and the future, and they need to discover ways to protect, conserve, and make those resources accessible in the face of rapidly changing technical environments and legal and financial hurdles, with as yet untested methods. They need to cooperate with one another in operating digital archives so that there are always multiple copies of any digital resource, on the grounds that in redundancy lies long term safety. They need to make digital information resources accessible to their readers, just as they did printed and manuscript sources of earlier centuries, in part because exercise of the data ensures its longevity and in part because use of information for private, educational, and research purposes is a key factor in the persistence of democracy itself.

Today, a week after the first anniversary of the terrorist attacks in America, the Bodleian and all libraries, whether their future is bright or dim, represent and reflect the best, the highest aspirations of man. In stark contrast to terrorism, libraries and their staff protect the freedom to speak, to read, and to think. Long may it be so!

Thank you very much for your patient attention.

Michael A. Keller, University Librarian, Stanford University. A talk for "A Celebration of Libraries" on the 400th Anniversary of the Bodleian Library, Oxford, 19th of September 2002

Acknowledgements