All Classification Schemes Have Bias
As David Weinberger notes. In particular, the Dewey Decimal System has inherent religious biases. I've done some research on Mr. Dewey as part of my book, and he was quite the bigot, it appears.
I wonder, 100 years from now, when folks are writing the history of indexes like Google and Yahoo, what biases will emerge?




Comments
Where did Weinberger say Dewey was a bigot? He wrote:
"But it's not because Melvil Dewey (1851-1931) was a racist...Dewey himself was a progressive on social issues. For example, he hired seven women at the Columbia University library —a radical idea —and even set up a library school that admitted women."
I don't have a particular stake in all this, but to me it seems Weinberger is suggesting Dewey's arrangement of religious categories was altruistic - unless another source can be provided that suggests otherwise.
Speaking of Bigoted Giants of Communication, Samuel Morse, the brilliant inventor of the telegraph, donated a chunk of his earnings from that device to anti-Irish-immigrant groups like the "Order of United Americans" (aka the "Know-Nothings"). Describing the Irish as a teeming sea of ignorant, dirty, overly fertile drunks, he ran for office on a platform of keeping America 'pure.'
I think of Morse sometimes when Irish-American Pat Buchanan uses similar language to rail against today's immigrants - and when a couple of my older Irish-American relatives agree with him. My relatives, Mr. Buchanan, and myself should all be grateful that Morse, the Pat Buchanan of the 1840s, did not get his wish.
I have not sought evidence of anti-Irish bias in old telegrams, but I feel a ghostly twinge when I pass Morse's statue in Central Park.
Future people will most likely recognize the anti-male bias that's not yet widely understood today. It's there in all media, though the internet isn't quite as bad as TV and newspapers.
Not a defender of Buchanan, but to be fair I believe his issue with immigrants has to do with illegals. As for bigotry in the later 1800s, it would not be surprising to find that the majority of adult males back then held various biased beliefs.
In my travels and observations I find that little has changed in this way.
If I were reading a description of the Dewey Decimal system in a book, I would not want references to his value system; I would expect that he had flaws. If the book were about Dewey and how the system was developed, I would be interested in knowing if his core beliefs were reflected in his work.
DDC can easily be expanded at the local level to create new classification numbers. DDC is what's called a synthetic classification scheme. Take different parts of the scheme and merge them together as needed. Finally, DDC is always being updated.
http://www.oclc.org/dewey/updates/default.htm
Bottom line: yes, Dewey had many issues and, like you, John, I have read many accounts. However, this does not mean that other religions cannot be handled by the scheme. Weinberger is correct that a revamp would cause many problems and cost many $$$. Wicca, Jews for Jesus and other groups CAN be handled but with longer numbers.
This is in contrast to the other major library classification scheme, LCC (Library of Congress Classification).
This scheme is not as easy to expand. It's considered an enumerative classification (it's all spelled out).
Classification does provide underlying subject access, but in this country, at most libraries, it's primarily used to "mark and park" materials. In other words, most libraries use classification only so they can place materials on the shelves (call numbers) and bring like things together. Don't get me wrong, browsing and serendipity are KEY!!! Of course, most OPACs can be browsed virtually by call/classification number (just as if you were physically walking along the stacks), but most people don't search this way.
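To make the "virtual browsing" idea concrete, here is a minimal sketch in Python. It assumes a simplified call number made of a Dewey class number, a space, and a cutter; real call numbers carry more parts, and the function names and sample numbers are made up for illustration.

```python
from decimal import Decimal


def shelf_key(call_number):
    """Sort key for a simplified 'class cutter' call number."""
    class_part, _, cutter = call_number.partition(" ")
    # Compare the class number numerically, then the cutter as text.
    return (Decimal(class_part), cutter)


def shelf_order(call_numbers):
    """Return the call numbers in the order they would sit on the shelf."""
    return sorted(call_numbers, key=shelf_key)


def browse(call_numbers, start, width=1):
    """Return the items shelved next to `start`, like glancing left and right."""
    shelf = shelf_order(call_numbers)
    i = shelf.index(start)
    return shelf[max(0, i - width): i + width + 1]


# Hypothetical call numbers, not taken from any real catalog.
SAMPLE = ["516.3 B45", "025.431 D51", "516 A12", "200 S64", "025.4 W42"]
print(shelf_order(SAMPLE))
print(browse(SAMPLE, "200 S64"))
```

Sorting is what brings like things together: the two 025.4 items end up side by side, as do the two 516 items.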
What things are called, AND how most people search for material (when searching by subject), comes down to Library of Congress Subject Headings (LCSH). These headings are edited and voted upon by a group of librarians at LC.
The biggest problem is that these subject headings often take time to come into official usage. Literary warrant is what drives most of this, and the headings often don't reflect how people speak. Of course, any library can use or modify these headings.
If you would like to see some recent lists of LCSH terms being added, dropped, or modified this page has links to recent bulletins.
Go to http://www.loc.gov/catdir/cpso/
and scroll to:
Library of Congress Subject Headings
Select a bulletin by date.
You'll also see a link to read recent additions to the LCC scheme.
Prior to online catalogs, most catalogers limited records to only three subject headings. Why? Creating, filing, and maintaining catalog cards was a time-consuming and expensive process. Now, with computerized catalogs, it's much easier to add more. Computers have also made working with cross references easier.
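As a rough illustration of why cross references are easier on a computer, here is a small Python sketch. The headings, variant terms, and record names are invented for the example and are not taken from LCSH; the point is only that a "see" reference becomes a simple lookup from the word a searcher types to the authorized heading.

```python
# Hypothetical "see" references: variant term -> authorized heading.
SEE_REFERENCES = {
    "cars": "Automobiles",
    "autos": "Automobiles",
    "movies": "Motion pictures",
    "films": "Motion pictures",
}

# Hypothetical catalog: authorized heading -> records filed under it.
CATALOG = {
    "Automobiles": ["Record A", "Record B"],
    "Motion pictures": ["Record C"],
}


def resolve_heading(term):
    """Return the authorized heading for a search term, or None."""
    normalized = term.strip().lower()
    for heading in CATALOG:
        if heading.lower() == normalized:
            return heading  # the searcher already used the official form
    return SEE_REFERENCES.get(normalized)


def subject_search(term):
    """Follow any cross reference, then return the matching records."""
    heading = resolve_heading(term)
    return CATALOG.get(heading, []) if heading else []


print(subject_search("Movies"))
```

In a card catalog each of those references was a separate card to type and file; here, adding one is a single new entry in the table.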
Other controlled vocabs and lists of subject headings exist. For example,
***ProQuest has its own thesaurus of terms as do many other databases that handle content from serials. LCSH and Dewey are not traditionally used for this type of material. That's right, all of these schemes are used just for monographs, maps, and other types of material.
***Other classification, subject oriented, schemes exist. For example, ERIC not only offers subject access but also classified access to entries.
Finally, Weinberger should have been a bit clearer (for the non-librarian audience) in his post. When he's talking about school libraries, he means K-12. Most larger libraries (including most university libraries) use the LCC/LCSH schemes.
oops. I forgot to add that I agree with one of Weinberger's conclusions, "Taxonomies are tools, so there's no such thing as the One Right Taxonomy, just as can-openers aren't more right than asphalt spreaders. By building in sufficient metadata — no easy task — diverse groups now and forever can build taxonomies that suit their needs."
That's right. Trying to be all things to all people is very difficult.
Of course, one of the biggest issues when it comes to general web search tools like Google and Yahoo is the use of controlled vocabs and classification.
It's going to be very difficult to get all sites that are crawled by these engines to agree on a controlled list of terms and then to apply them properly. Of course, simply creating a scheme would be a huge and expensive task. Spam and other issues would also need to be considered.
Resources like Vivisimo are doing a good job of creating technology that will do some of this dynamically, but it's still far from what you get when a professional cataloger/librarian does the same thing.
This is why I believe that, for many types of searching, it makes sense to keep content in smaller subject databases (verticals), where applying the proper metadata (what we've called cataloging in the library world for many years (-: ) is easier.
Some databases are free, others would be available free via a local library***, while others would be fee-based (available via a micropayment or subscription). Advertisers might sponsor free access to a database if you view a certain number of ads, make a purchase, etc.
Then, at the time of search, the user can use federated search technology to merge material from disparate databases (that he or she has access to) and dedupe results.
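The merge-and-dedupe step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "databases" are in-memory lists standing in for real search targets, the titles are made up, and two records count as duplicates when their normalized title and year match, which is a much cruder test than real products use.

```python
def normalize(title):
    """Lowercase a title and drop punctuation so near-identical titles match."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in title)
    return " ".join(cleaned.split())


def search_source(records, query):
    """Stand-in for querying one database: simple substring match on title."""
    q = query.lower()
    return [r for r in records if q in r["title"].lower()]


def federated_search(query, sources):
    """Query every source, merge the hits, and collapse duplicates."""
    merged = {}
    for source_name, records in sources.items():
        for record in search_source(records, query):
            key = (normalize(record["title"]), record["year"])
            if key not in merged:
                merged[key] = {**record, "sources": [source_name]}
            else:
                # Same item seen again: just note where else it lives.
                merged[key]["sources"].append(source_name)
    return list(merged.values())


# Hypothetical databases and records, for illustration only.
SOURCES = {
    "db_a": [
        {"title": "Dewey and Bias", "year": 2004},
        {"title": "Cataloging Maps", "year": 2001},
    ],
    "db_b": [
        {"title": "Dewey and bias.", "year": 2004},
        {"title": "Bias in Indexes", "year": 2003},
    ],
}

for hit in federated_search("bias", SOURCES):
    print(hit["title"], hit["sources"])
```

The searcher sees one list with one entry per item, plus a note of which databases hold it, instead of running the same query in each interface.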
Another plus is that instead of having to search many databases, each with a different interface, a searcher will only have to utilize one. As a librarian, one of the most common things I hear is that people know many databases exist that might be useful; they just don't know where to start. This would help alleviate this problem.
If this sounds like metasearching, it is, to a degree. For the database types out there, it also sounds like I'm talking about Z39.50 kind of stuff. It is. However, in the past few years the technology has improved many times over. Many "federated" search technologies will now work with any database type, independent of protocol.
NISO is doing a great deal of work in this area.
http://www.lib.ncsu.edu/niso-mi/index.php/Main_Page
This paper (about a year old) talks about many of the companies/technology in the meta/federated search space.
http://www.natlib.govt.nz/files/CUI_Report_Final.pdf
*** Btw, you already have FREE access to many databases via your local public library. No need to visit the building. All you need is a library card. Every library offers different databases. Here's an example of what the San Francisco Public Library offers for free, 24x7, from any computer with web access. WOW!
http://www.sfpl.org/sfplonline/dbcategories.htm
And yes, it's not a misprint, SFPL offers FREE full text, full image (delivered via PDF) access to every article from the NY Times back to Vol. 1, No. 1. Even the ads have been given subject descriptors.
For a deeper look at classification systems and cultural bias, try "Sorting Things Out: Classification and Its Consequences," by Bowker and Star.
Google frequently defines its algorithm as "objective", e.g. http://www.google.com/technology/, so there's no bias there. I haven't heard Yahoo! make this claim, so one must more closely watch their results for bias. Nutch punts altogether and says that objectivity is impossible, that transparent bias is preferable. How lame!