The NYT now reports on Google’s program to digitize some of the world’s most important libraries, and it is truly an amazing project. Google was founded at Stanford in partial association with that university’s digital library effort, so this must be a pretty proud day for Stanford, which is a participant, as well as the original Googlers. John Markoff spoke to Larry Page:
Mr. Page said yesterday that the project traced to the roots of Google, which he and Mr. Brin founded in 1998 after taking a leave from a graduate computer science program at Stanford where they worked on a “digital libraries” project. “What we first discussed at Stanford is now becoming practical,” Mr. Page said.
The details: Google is working with Stanford, the University of Michigan, Harvard, Oxford, and the New York Public Library to make millions of books available in its index. For now the project is in pilot phase, but there are hopes and expectations this will go big in the next few years. A source told me the project was originally named Google Library, but for now it will exist under the Google Print moniker. An example of Google Print is here. The screenshot at left is what I was provided by Google for today’s launch.
The implications here are significant. First, the idea that the world’s knowledge, as held through books and libraries, is opening up to all via a web browser cannot be overstated. It’s one thing to have an original copy of The Origin of Species on the shelves, where students and interested parties have to travel to find it. It’s another to have it available to everyone via a search index and your web browser. Second, this move clearly puts Google in the category of innovator when it comes to adding information to its index. But it also raises significant business model questions, ones that are both exciting and unanswered. I brought them up in an earlier post:
A very interesting case will be Google Print. As that program expands, and it’s rumored that it will, dramatically, a number of questions arise. How will Google monetize out-of-copyright books? If it indeed does bring tens of thousands of out-of-print books onto the web and into its index, will it allow others to access and index that new treasure trove, or will it act more like a traditional media company, which would “own” that resource for itself? How will it choose what it brings into the index – those that might sell? Those that somehow are the most “in demand” by some measurable standard? With regard to books that are in print, will it limit itself to being solely an organizational tool supported by AdWords, or will it start to take a vig for books that are sold via the Google Print service (in fact, maybe it does already and I’m simply unaware of it – any publishers out there, let me know!)? And will the print model scale to television and movies or music?
Google Print already monetizes a selection of in-copyright books via advertising, and shares some of those revenues with the publishers. But it’s a very short distance between that and, say, an affiliate link to Amazon or any other bookseller for a cut of an in-copyright sale. It’s also a very short route to the on-demand publishing of an out-of-print and out-of-copyright book with a company that is set up to do such a deal, and I am aware of at least one that is about to launch that will provide just such a service. Of course, if you want an ebook, that can be arranged as well. For out-of-copyright books, the tail is extraordinarily long, and quite possibly very profitable. In other words, this could well be a step toward diversifying Google’s revenue streams away from advertising and into direct sales and/or subscriptions – i.e., the content business. As one source who is familiar with the industry tells me, Google is not doing this only out of the kindness of its heart – there is a lot of money to be made in selling books, in particular books with no copyright.
I did ask Adam Smith, a manager of the Print program at Google, how Google will decide which books get scanned first. He said quite forthrightly that he did not have a good answer for me on that yet. I’ve heard from others that for now it’s pretty random, but the question is important. As to whether Google will allow anyone else to index the books it scans, I am pretty sure the answer is no. After all, Amazon is also scanning books, and I am sure it isn’t letting others in on its hard work. I’ll repost if that turns out to be inaccurate. And of course there are other efforts, including Project Gutenberg and the Internet Archive. But now we have a commercial giant that has both a mission-based reason (organize the world’s information and make it accessible) and a commercially viable reason to bring this information to the world. As David Hayes, a copyright lawyer at Fenwick who worked on this deal and whom I’ve known from my own work with his firm, put it: “This will create a revolutionary new information location tool that should be a benefit to the whole world.” I for one applaud the effort – it’s an example of enlightened capitalism, and I hope it thrives.
Update: I originally posted the wrong image; a new image is to come.