Tuesday, June 09, 2009

"Google Books Mutilates the Printed Past"

by J J Cohen

From the Chronicle Review, an essay by Ronald G. Musto that argues Google's massive project to digitize the archives of several prestigious libraries does more harm than good. Musto provides compelling evidence that the scanning has proceeded (in many cases) quite poorly, with pages incompletely reproduced or blurry or otherwise demonstrating some sign that a fatiguable human being brings book to glass. He concludes:
If we acknowledge that Google Books is serving up to us only a mutilated, good-enough version of our already vicarious understanding of the past, what value does that hold for us? What dangers lie in wait for generations of students and scholars for whom the digital — and Google's version of it — will become the only reality? Must a whole new generation begin to reassemble the mutilations produced by Google Books to create authoritative and reliable digital texts? Must 2009 repeat the efforts of 1509 in reassembling, collating, editing, and republishing the scattered fragments of the manuscript past, which the age of print finally made uniform and authoritative? That would be absurd, precisely because it is so unnecessary. But should Google Books prevail, and the resources of the scholarly community be made irrelevant by Google's sheer scale and force, the future of our past will be in great doubt.
Though the execution could certainly be better, I have a difficult time seeing why scanning these books and making them available is not good. The library copy doesn't vanish after digitization ... and to have the book's text now included in Google searches seems to me a great benefit.

Am I missing the evil Google commits through this project?

8 comments:

Rick Godden said...

I'm also not inclined to see the google digitization as evil. Flawed at times, sure. But so is the implementation of any new, or even continuing, technology. Yes there's a fatiguable human doing the work, who might make errors, but that's part of the dissemination of text overall. Whether it's an editor making the choice for us given certain manuscript evidence, or misreading a manuscript altogether, or whether it's a scribe making an error when copying, this slippage in the integrity of written text is probably unavoidable. The pristine, inviolate textual past may be a myth rather than something we can apprehend.

For myself, I ought to have thanked Google Books on the acknowledgements page of my dissertation when I turned it in 2 months ago. I can safely say that, particularly for a disabled person who has trouble reaching books, Google Books probably saved me additional weeks if not more.

prehensel said...

I'm not sure. Certainly the scanning of books is not a bad thing, and I can't imagine Google Books just killing library copies outright. It's the long-term consequences that give me pause.

In teaching freshmen, I've found that it's difficult to get them to actually go to the library already; if they can get copies of books online, it makes it even less likely they'll make the effort to go to the physical library. That invites two outcomes to my mind:

1) Whatever has not yet been digitized isn't read or cited nearly as much. The works digitized survive and grow in popularity; the physical works not digitized lag behind because they're a (relative) pain in the ass to get. (Which sounds like the sort of argument we get into about "fine art"--what was important was preserved and what wasn't important didn't survive, but in this case it wouldn't be a Panglossian cache of works; it would be a collection of books that were part of the public domain or whose publishers played ball with Google).

2) Like Musto argues, the digitized copy becomes the copy. I can envision a future in which physical books are relegated to the place we now reserve for those books still categorized by the Dewey-Decimal system (at least in UO's library). That is to say all but forgotten. It's pretty easy not to worry about losing or destroying a physical copy when you have an electronic one, but that would seem to lead to a general loss by attrition of other physical copies of books. Because they're only worth something if they're digitized, losing physical copies (even those not digitized) doesn't amount to much. It just seems that it would make it less likely that physical copies will be protected or cared for. I'm not saying that digital copies will necessarily put us in the same position we're in with the Finnesburh Fragment, but I think it makes things more likely.

(Plus, when the zombie attack comes and the power grid goes down, I'd still like to be able to curl up with a shotgun and a copy of Grapes of Wrath.)

Rick Godden said...

prehensel, despite my grateful comments above, I definitely share your concern about the digital being the (only) copy. I teach a researched writing course, and it's very difficult to get my students to understand that there are more sources available than what they can find on Academic Search Premier. I make them physically get a bound volume of a journal to peruse, but of course that can't fully address the larger concerns you raise.

Regarding the Zombie Apocalypse, well, I'd be out of luck anyway...

dan remein said...

In the scanning and making available, jeffrey, no. But I do fear what google's search algorithms (determined in unknown ways at least in part influenced by corporate interests on large searches, which minoritizes the minority search (the first thing that comes up in most google searches is a wikipedia entry)) will do to limit the possibilities and alternatives of imagination. I think google has fundamentally changed how we think about asking questions and research (not by questions, but by single word assertions). I also think that what google provides produces a new kind of consciousness or imagination, one whose options, while seemingly endless, are actually quite limited because we think we've got it all in front of us on that first search screen when really there is much much more to the picture.

kvond said...

I remember reading a 17th century manuscript of a treatise on Optics on-line (not done by Google), a treatise I had not hoped would ever be available to me as a Non-Affiliated thinker, and finding of all things a human hand in the picture of one page. Does one think, "Damn that lazy student/worker" or "thank God I have access to this text"?

My entire study of the optics of Spinoza (apparently the first of its kind anywhere in the world) would not have been possible without Google Books and other newly on-line material. It's a joke to decry the quality of this access, when the access itself has changed the direction from which scholarship itself can come, so very radically.

To give a sense of just how committed the quoted author is...

"If we acknowledge that Google Books is serving up to us only a mutilated, good-enough version of our already vicarious understanding of the past, what value does that hold for us?"

Only a MUTILATED Copy? This guy has to read some Latour, methinks.

"The Migration of the Aura – Exploring the Original through Its Facsimiles", thoughts about it here, if any are interested:

http://kvond.wordpress.com/2009/03/08/the-flatness-of-latours-concept-of-origin-and-holbeins-the-ambassadors/

Holly Crocker said...

The google digitization project is not evil, but authors should keep apprised of the way the "settlement" will affect access to texts in the future (not because books are improperly scanned, but because there are issues of intellectual property and copyright protection that are still being worked out).

http://www.intellectualpropertylawblog.com/archives/copyrights-the-google-book-digitization-settlement-the-fair-use-question-remains.html

and

http://news.cnet.com/8301-13578_3-10229637-38.html

As the second (and more recent) article points out, there are worries about the information megalith google will become--these fears may be unwarranted, but we should all think about what it will mean for one company to have such a database (either for teaching, as prehensel points out, or for publication, since texts included in the database will certainly circulate more widely).

cheers, h

DGPitard said...

I've been downloading, reviewing, and bookmarking a lot of Google's scans for the Lollard Society Bibliographies, and I've got to say that he has one good point, after having gone through a couple of hundred books page by page: some are so poorly done I've had to collate several incomplete copies to make one, or scan pages out of a hard copy to fill a gap. Some scans miss pages, some duplicate pages, some have ridiculous smudges, sometimes a hand covers the text. I don't know that they've fixed it, but go look at their copy of Wyclif's Tractatus de ecclesia. I don't know if that's "evil," but the process sure isn't perfect.

Karl Steel said...

I've also found google books invaluable for the 19th-century editions that are precisely the kinds of books I find difficult to get at Brooklyn College. It's made my life easier and my scholarship much deeper. The benefits to others are, as Rick reminds us, still larger. And certainly the fixation on the physical object sounds like nothing more than any typical reaction to a new communications technology.

That said, I want to echo Holly and thank her for her links. The problem with this particular technology may be its connection to a particular megacorporation. I can also direct folks to this article, "The Long Goodbye," by Elisabeth Sifton, an article that--confessedly--annoys me in tens of ways, but there's still this, which I found valuable...particularly its final refusal of nostalgia, contra everything we have come to expect by this point in the otherwise ephebiphobic screed:

"It's a colossal irony to have the guys and gals of Amazon, Google and their ilk lusting for free book "content" as premium material on which to stake their enlarged claims to commercial riches. For these clever mathematicians and engineers who are shaping the electronic business of our time and the archives of the future, these baby-faced young entrepreneurs, have risen to their mercantile eminence without encountering books, and don't think they need to. I enjoyed the fatuous surprise of Google's Sergey Brin discovering that "There is fantastic information in books. Often when I do a search, what is in a book is miles ahead of what I find on a Web site." Translating this backhanded recognition of value into his own debased lingo, he understands that books make for "viable information-retrieval systems," information being the only cultural signifier he recognizes, evidently. His company's amazing presumption that book people should simply hand over the keys to their priceless kingdom shows how completely he and his colleagues misunderstand what is at stake.

"But these Internet people don't care. For billionaires like Brin, accessing the giant river of infinite book "content" onto which they can glue paid advertising is simply a giant new way to make more money, and they are single-minded about that. The giveaway is not only in their ignorance but in their reluctance to share the wealth. For its Look Inside program, Amazon demands that publishers give it, gratis, electronic files of the books, along with blurbs and cover art, arguing that in return the publishers will have increased sales. How might you prove or disprove that? (Publishers might recognize Amazon's argument, since it resembles the pathetically phony one about composition costs that they themselves used against writers years ago.) The (not yet settled) settlement between Google Book Search and the publishers who sued it for copyright infringement proposes to give a breathtakingly audacious near-monopoly to Google and mingy terms to writers. We publishers seem to have forgotten that Google's and Amazon's profit margins are triple or quintuple ours, and we haven't always checked our contracts with the authors.

"It is a confused, confusing and very fluid situation, and no one can predict how books and readers will survive. Changed reading habits have already transformed and diminished them both. I, for one, don't trust the book trade to see us through this. Wariness is in order. Three centuries ago, John Locke agreed that we shouldn't base our freedom to read books on the proclaimed good offices of the business itself. "Books seem to me to be pestilent things," he wrote in 1704, "and infect all that trade in them...with something very perverse and brutal. Printers, binders, sellers, and others that make a trade and gain out of them have universally so odd a turn and corruption of mind, that they have a way of dealing peculiar to themselves, and not conformed to the good of society, and that general fairness that cements mankind.""