Stanford and Google Book Search
Statement of Support and Participation
December 2005
In December 2004, Stanford University announced its intention
to provide Google with access to its book collections to be
included within Google Book Search. Stanford issues this statement
of support and participation to describe the public benefits that
will result from this monumental undertaking to provide search
and discovery access to the world’s printed works.
GOOGLE BOOK SEARCH
Google Book Search is part of Google’s overarching company
goal to “organize the world’s information.” As
part of that mission, Google determined that it would scan the
world’s printed books and make them word searchable through
Google’s Internet site. Thus a search for a particular term, such as
“recombinant DNA,” will return bibliographic information about
every work available to Google that includes the searched term.
The bibliographic information includes standard online catalog
information, such as title, author, date and place of publication,
publisher, and size. It also includes information about how to
access the full text of the physical work, either through purchase
or through a library.
Google Book Search has two key components: the Google Books Partner
Program and the Google Books Library Project. Under the Partner Program,
Google accepts digital works directly from publishers, and, according
to contractual terms with the publishers, the search results
will include not only bibliographic information, but also some
text from the digital work.
Under the Library Project, Google is scanning works in libraries.
The partner libraries include the University of Michigan, Harvard,
Oxford, the New York Public Library, and Stanford. For works in the
public domain – those works not subject to copyright – Google
will display the full text of the book with the search result.
For those works in copyright, Google will provide only bibliographic
information and a few “snippets” from the actual
digital text, usually not more than a few lines including the
search term; no full text pages will be displayed.
WHY STANFORD IS PARTICIPATING IN THE LIBRARY PROJECT
Full-text searching of all literatures is an extremely powerful
and useful tool for helping people identify and find texts of
interest to them; this is the discovery process. Discovery of
information has made the Internet the spectacularly successful
resource it has become. With similar discovery tools available
for the tens of millions of printed books held in libraries,
a very large and culturally important fraction of the world’s
information will retain its rightful place as a trusted source
of information and expression, and its value in the processes
of teaching, learning, and research will be magnified. The Google
Books Library Project, as well as Stanford’s other digitization
efforts, is a major step in ensuring that Stanford’s huge
investment in its millions of library books can provide returns
to readers of all ages, backgrounds, and locations. Keyword searching
of the contents of the Stanford library collection – as
well as the other books in the Google Book Search index, whether
from publishers or other partner libraries – will dramatically
enhance the discovery process, not only for Stanford students
and researchers, but for everyone around the country and throughout
the world with access to an Internet portal.
Many of the books on Stanford’s shelves (as well as those
of the other participants) are believed to be protected by copyright.
For this reason, Google will not display the full text of copyrighted
books to readers. In other words, in most cases, this project
is primarily supportive of the discovery process, not the delivery
process. However, and very importantly, for those books in the
public domain – those not protected by copyright – Google
will display the full text to readers. This will make a great
deal of literature and information available to all readers at
no charge, a major public good that will be of particular value
to children and teachers in primary and secondary education,
as well as to those unable to use physical libraries due to disability,
location, or other challenges.
To provide the world’s information seekers the means to
discover content, and in the case of public domain materials
to access content, is one of the primary reasons that Stanford
chose to participate in the Library Project.
WHERE STANFORD IS IN THE SCANNING PROCESS
Google began scanning works from Stanford in approximately March
of 2005. Stanford has selected its federal government collection
as the first set of works to be scanned under the project; these
works are in the public domain. Once this large collection is
scanned, Stanford will focus on providing works to Google that
were published in the United States up to 1964 and that are believed
to be in the public domain. Stanford’s current focus is
on older works in the public domain because Google will make
the entire texts of these works available to readers and researchers,
and because many of these older works are deteriorating rapidly,
making it a priority to digitize them while they remain
physically sound.
USE OF STANFORD FILES
Another major reason Stanford is part of the Google Books Library
Project is so that it can access digital copies made by Google
of books in its collections. Although Stanford is currently focusing
on works in the public domain, eventually Stanford would like
to bring more works, including copyrighted works, within the
project. Understandably, content owners and copyright holders
want information and assurances regarding Stanford’s plans
for these digital files. Stanford University respects not only
the laws of copyright but also the principles driving those laws.
Stanford’s uses of any digital works obtained through this
project will comply with both the letter and spirit of copyright
law.
WHAT STANFORD DOES INTEND TO DO WITH FILES
PRESERVATION is one of the vital social roles that libraries,
particularly research libraries such as Stanford’s, play.
Stanford’s primary intent in obtaining a digital copy is
to ensure the preservation of our library’s resources.
Library collections are vulnerable to catastrophic losses from
fire, flood, or earthquake. For example, Stanford libraries suffered
flooding in 1978 and 1998, resulting in damage or loss to many
thousands of books. These losses were minor compared with the more
recent and tragic flooding of the University of Hawaii Library
and the public and university libraries of New Orleans, or with the
1986 arson fire at the Los Angeles Public Library. A digital
copy of works ensures the future not only of Stanford Libraries’
resources but also of the collected works of our society, our civilization.
EVEN BETTER DISCOVERY TOOLS. Stanford hopes to build better
tools to discover information, such as through taxonomic mapping,
associative searching, hyperlinking, and data-mining techniques.
LINKING TO STANFORD’S ONLINE CATALOG. Stanford will add
links from Stanford University Libraries’ online catalog
to public domain works displayed in Google.
DELIVERY OF FULL-TEXT DIGITAL CONTENT TO CAMPUS. Eventually,
Stanford would like to explore legal ways to maximize our campus
readers’ use of digitized texts. Currently, Stanford purchases
or contracts access to thousands of digital publications held
in copyright by others – electronic journals being the
most common case. It honors contractual and legal limits on its
use of these publications. Stanford’s policy with regard
to books digitized by Google is exactly the same: it will respect
any copyrights and licenses and prevent abuse (including hacking,
mass downloads, and the like) by others.
WHAT STANFORD DOES NOT INTEND TO DO WITH FILES
Stanford does not intend to violate the legitimate rights of
content owners to control the distribution and exploitation of
works under copyright.
STANFORD’S REACTION TO LITIGATION AGAINST
THE GOOGLE BOOKS LIBRARY PROJECT
Stanford is saddened by the decision of various publishers and the
Authors Guild to file suit against Google in an effort to curtail
the Google Books Library Project – a project that has the
potential of providing an invaluable social good.
Stanford believes that courts reviewing these cases will conclude
that making a digital copy for the purpose of indexing and searching
works is a fair use. Historically, copyright law has allowed
the copying of works without permission where there is no harm
to the copyright holder and where the end use will benefit society.
Here, there could be nothing objectionable under copyright law
if Google were able to hire a legion of researchers to cull through
every text on the Stanford University Libraries’ shelves
to ascertain each work that includes the term “recombinant
DNA.” There could be nothing objectionable with those researchers
then sharing the results of their efforts and providing bibliographic
information about all works in Stanford’s libraries that
include this term. Through the application of well-engineered
digital technologies, Google can simulate that legion of researchers
electronically through algorithms that can return results in
seconds. A digitization of the entire work needs to be created
in order for Google to make possible the word searching process
of such value to readers, but copyright law allows that digitized
copy as a fair use. Thus, keyword searching of copyrighted texts
and providing references to those texts is permitted by existing
copyright law. The digital copies produced through this project
are necessary to automate the searching process.
Stanford has nearly 9 million volumes in its collection, many
of which are still in copyright but out of print, with no continuing
commercial viability. These so-called Orphan Works have no champions
at all: no author or publisher is available to ensure their continuing
accessibility to researchers, scholars, students, and readers.
While the publishers point out in their lawsuit that they have
created a mechanism to preserve their catalogs of in-print books
through various digital projects, what of the Orphan Works, which
represent a significant portion of works in university and research
libraries? If the Google Books Library Project is restricted
as the plaintiffs are requesting, these Orphan Works will remain
relatively inaccessible and their contents undiscovered except
through laborious manual searching by people present at our library
locations. This result would not be in the public interest.
Google provides a mechanism by which publishers and authors
may opt out of the Library Project, so copyright owners have
the ability to bypass this project. But Orphan Works have no
champions to opt in to the project. And, if the plaintiffs win
the day, the American public’s ability to discover these
works will not be realized.
CONCLUSION
It has been stated in the press and the “blogosphere” that
Google Book Search heralds a new age of access to the world’s
literatures, that it is a truly transformative development. Stanford
entered into the Library Project because it will revolutionize
everyone’s ability to discover information, from elementary
school students through post-graduate researchers at great centers
of learning. Stanford’s hope for this project is that its
9 million books will be discoverable to everyone with access
to an Internet portal. Stanford is proud to be part of this monumental
undertaking.