I am under embargo* on the details of this until later tonight, but this just came to me without me asking from a very reliable source. To honor the embargo, I will reserve my analysis and thoughts for later, and simply reprint the text of a note which was sent to certain parties at Harvard today. It describes what has been called (by the NYT) “Project Ocean” – a pilot project to scan and make searchable the contents of some of the world’s most prestigious libraries. I went into some of the issues this raises toward the bottom of this recent post (where I talk about Google Print). For the entire text of the email, click on the extended entry. Snippets:
As all of us know, Harvard’s is the world’s
preeminent university library. Its holdings of over 15 million volumes
are the result of nearly four centuries of thoughtful and comprehensive
collecting. While those holdings are of primary importance to Harvard
students and faculty, we have, for several years, been considering ways
to make the collections more useful and accessible to scholars around
the world….
Harvard University is embarking on a collaboration with Google that could
harness Google’s search technology to provide to both the Harvard community and
the larger public a revolutionary new information location tool to find
materials available in libraries. In the coming months, Google will collaborate
with Harvard’s libraries on a pilot project to digitize a substantial number of
the 15 million volumes held in the University’s extensive library system.
Google will provide online access to the full text of those works that are in
the public domain. In related agreements, Google will launch similar projects
with Oxford, Stanford, the University of Michigan, and the New York Public
Library. As of 9 am on December 14, an FAQ detailing the Harvard pilot program
with Google will be available at http://hul.harvard.edu….
If the pilot is deemed successful, Harvard will explore a long-term program with Google through
which the vast majority of the University’s library books would be digitized and
included in Google’s searchable database. Google will bear the direct costs of
digitization in the pilot project….
* To be clear, my rules on embargoes are these: I promise not to report anything I’ve been told by the organization that requests the embargo until the embargo time, but if similar information comes to me through third-party sources, I will report that information. I will not, however, use that third-party information as an excuse to disclose any information still under embargo.
December 13, 2004
Dear Colleague,
I am writing today with news of an exciting new project within the
Harvard libraries. As all of us know, Harvard’s is the world’s
preeminent university library. Its holdings of over 15 million volumes
are the result of nearly four centuries of thoughtful and comprehensive
collecting. While those holdings are of primary importance to Harvard
students and faculty, we have, for several years, been considering ways
to make the collections more useful and accessible to scholars around
the world. Now we are about to begin a project that can further that
global goal and, at the same time, can greatly enhance access to
Harvard’s vast library resources for our students and faculty.
We have agreed to a pilot project that will result in the digitization
of a substantial number of volumes from the Harvard libraries. The
pilot will give the University a great deal of important data on a
possible future large-scale digitization program for most of the books
in the Harvard collections. The pilot is a small but extremely
significant first step that can ultimately provide both the Harvard
community and the larger public with a revolutionary new information
location tool to find materials available in libraries.
The pilot project will be done in collaboration with Google. The
project will link Harvard’s library collections with Google’s resources
and its cutting-edge technology. The pilot project, which will be
announced officially tomorrow, is the result of more than a year of
careful consultation at many levels of the University. We could not
have achieved a meaningful pilot project without the efforts of the
Harvard Corporation; the President, Provost, Chief Information Officer,
and Office of General Counsel; the University Library Council; and
senior managers within the College Library and the University Library.
A full description of the pilot program follows here, with further
materials available on the Harvard home page tomorrow.
With best regards,
Sidney Verba
Carl H. Pforzheimer University Professor and Director of the University Library
Project Description:
Harvard’s Pilot Project with Google
Harvard University is embarking on a collaboration with Google that could
harness Google’s search technology to provide to both the Harvard community and
the larger public a revolutionary new information location tool to find
materials available in libraries. In the coming months, Google will collaborate
with Harvard’s libraries on a pilot project to digitize a substantial number of
the 15 million volumes held in the University’s extensive library system.
Google will provide online access to the full text of those works that are in
the public domain. In related agreements, Google will launch similar projects
with Oxford, Stanford, the University of Michigan, and the New York Public
Library. As of 9 am on December 14, an FAQ detailing the Harvard pilot program
with Google will be available at http://hul.harvard.edu.
The Harvard pilot will provide the information and experience on which the
University can base a decision to launch a large-scale digitization program.
Any such decision will reflect the fact that Harvard’s library holdings are
among the University’s core assets, that the magnitude of those holdings is
unique among university libraries anywhere in the world, and that the
stewardship of these holdings is of paramount importance. If the pilot is
deemed successful, Harvard will explore a long-term program with Google through
which the vast majority of the University’s library books would be digitized and
included in Google’s searchable database. Google will bear the direct costs of
digitization in the pilot project.
By combining the skills and library collections of Harvard University with the
innovative search skills and capacity of Google, a long-term program has the
potential to create an important public good. According to Harvard President
Lawrence H. Summers, “Harvard has the
greatest university library in the world. If this experiment is successful, we
have the potential to provide the world’s greatest system for dissemination as
well.”
In addition, there would be special benefits to the Harvard community.
Plans call for the eventual development of a link allowing Google users
at Harvard to connect directly to the online HOLLIS (Harvard Online
Library Information System) catalog (http://holliscatalog.harvard.edu)
for information on the location and availability at Harvard of works
identified through a Google search. This would merge the search
capacity of the Internet with the deep research collections at Harvard
into one seamless resource, a development especially important for
undergraduates who often see the library and the Internet as
alternative and perhaps rival sources of information.
Eventually, Harvard users would benefit from far better access to the 5
million books located at the Harvard Depository (HD). If the University
undertakes the long-term program, Harvard users would gain online
access to the full text of out-of-copyright books stored at HD. For
books still in copyright, Harvard users could gain the ability to
search for small snippets of text and, possibly, to view tables of
contents. In short, the Harvard student or faculty member would gain
some of the advantages of browsing that remote storage of books at HD
cannot currently provide.
According to Sidney Verba, Carl H. Pforzheimer University Professor and
Director of the University Library, “The possibility of a large-scale
digitization of Harvard’s library books does not in any way diminish
the University’s commitment to the collection and preservation of books
as physical objects. The digital copy will not be a substitute for the
books themselves. We will continue actively to acquire materials in all
formats and we will continue to conserve them. In fact, as part of the
pilot we are developing criteria for identifying books that are too
fragile for digitizing and for selecting them out of the project.
“It is clear,” Verba continued, “that the new century presents
unparalleled challenges and opportunities to Harvard’s libraries. Our
pilot program with Google can prove to be a vital and revealing first
step in a lengthy and rewarding process that will benefit generations
of scholars and others.”
Here’s a link to the University of Michigan announcement that went to Michigan faculty – more details and a formal announcement promised tomorrow a.m.:
http://vielmetti.typepad.com/vacuum/2004/12/google_search_t.html
thanks, Ed
Very interesting project.
Thanks also for your useful searchlinks
I wonder if you have ever visited the old (web-relatively speaking) http://www.searchlores.org site, which I personally find quite a treasure for search-related matters.
There you have a similar project: “considering ways to make all kind of internet based collections accessible to anyone around the world…”
E.K
Back in the late 1960s and early 1970s, my company Input Services, Dayton, Ohio converted library catalog cards into a computer-readable format. Notably Ohio State Univ., Michigan State Univ., University of Calif. at Berkeley and I participated in a three-year pilot program for the Library of Congress, using optical scanning equipment.
Now, semi retired, perhaps I could be of service in your project with Google.
regards,
Ken Benson
2929 NE 49th St, #6
Fort Lauderdale, FL 33308
Ph: 954 295 2530