Amazon gets book smart
Amazon adds an Online Reader for search inside books. John says, “If this is what I think it is, this signals that Amazon is getting into all forms of readable content online, a shift in biz model strategy.”
Resource Shelf summarizes the features:
– Search for words or phrases in the book (you can also search the entire Amazon.com database or A9).
– View single pages or continuous pages by scrolling
– Zoom in or Zoom out (very useful)…
…and notes that this is all part of Amazon’s step further into the “upgrade program where you can read purchased books online, print pages, add notes, bookmark pages, etc.” which “is similar to what you can already do with books accessible (for free).”
Technorati & AP team up
Technorati and the Associated Press begin sharing a dynamic feed of the most blogged about AP articles at the AP's 400+ member sites. The Technorati announcement: Increasingly, what the blogosphere says about a news story becomes part of a more complete story, lending diverse perspectives and often expert commentary… When readers visit an AP member Web site that uses AP Hosted Custom News, they will see a module featuring the “Top Five Most Blogged About” AP articles right next to the article text, dynamically powered by Technorati. Additionally, when readers click on an AP article, Technorati will deliver “Who’s Blogging About” that article.
This follows similar service partnerships Technorati shares with the WashPost and other publications. Bloggers cheer.
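For the curious, a “Top Five Most Blogged About” module boils down to counting which articles the most blogs link to. Here is a minimal sketch of that idea in Python. It is not Technorati's actual implementation; the function name `top_blogged` and the sample (blog, article) pairs are made up for illustration.

```python
from collections import Counter

def top_blogged(links, n=5):
    """Rank article URLs by how many distinct blogs link to them."""
    # Deduplicate (blog, article) pairs so one blog linking twice counts once.
    unique_pairs = set(links)
    counts = Counter(article for _blog, article in unique_pairs)
    # Sort by count descending, then by URL, so ties come out in a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]

# Hypothetical sample data: (blog, article) pairs.
sample_links = [
    ("blog-a", "ap/story-1"),
    ("blog-b", "ap/story-1"),
    ("blog-a", "ap/story-1"),  # repeat link from the same blog
    ("blog-c", "ap/story-2"),
    ("blog-a", "ap/story-2"),
    ("blog-b", "ap/story-3"),
]

print(top_blogged(sample_links, n=2))
```

A real feed would presumably also weight by recency, since the module is meant to update dynamically, but the counting step is the heart of it.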
Free eBook fair
Celebrate the 35th anniversary of free eBooks and Project Gutenberg. For its birthday month of July, Gutenberg plans to offer free, permanent download access to more than a third of a million books. The PDF-file books are available with support from The World eBook Library, which Resource Shelf says normally charges $8.95 a pop for a permanent download. (SearchBlog recently looked forward to scanning through a good eBook at the neighborhood universal library.)
Data mining the blogosphere
A new paper maps out what the blogosphere offers in research potential and challenges. Written by Gilad Mishne at the Intelligent Systems Lab, University of Amsterdam, “Information Access Challenges in the Blogspace” is available in PDF.
First, Mishne describes the blogosphere: in time, as highly dynamic and tied to current events; in structure, as primarily a network of individuals in one-to-one relationships; in language, by informality and subjectivity. Mishne foresees a huge development in the still-infant specialized blog-search services and tools, such as Technorati and Blogpulse. According to the paper, the blogosphere grows at a rate of 750,000 new posts per day, with a steady readership of 20% of internet users. The paper flags a couple of speed bumps for both data analysis and retrieval: spam, and frequent misspellings that skew keyword tracking. Mishne concludes the blogspace ultimately lends itself to future research in sentiment analysis, tapping the vox populi for the genesis and evolution of trends, profiling individual bloggers and communities, and enhancing search quality.
Instant, dynamic, spelling-flexible search
A series of search engines developed by the German company Exorbyte provides instant, dynamic, orthographically-adaptive suggestions and results. Co-founder Franz Guenthner, a professor of computational linguistics at the University of Munich (previously at AltaVista, All The Web), says that “contrary to the Google type of suggest in use elsewhere e.g. Snap.com – [Exorbyte] finds all the records in the index even when the query is orthographically defective” (that is, spelled wrong).
Here are a few demos applying their search engines:
– A tri-lingual (French, German, English) engine that supplies Wikipedia entries: Exorbyte -Wiki
– A German job search engine, launched last week: Job a Nova
– A German shopping site: Billinger
Guenthner says Exorbyte engines can “search tens of millions of records in an ‘approximate mode’ at under 10 milliseconds.”
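To make “orthographically defective” queries concrete, here is a minimal sketch of approximate matching using Levenshtein edit distance. This is a textbook technique and an assumption on my part; Exorbyte has not said how its engines work, and a linear scan like this one would come nowhere near the quoted speeds on tens of millions of records. The names `edit_distance` and `approximate_search`, the `max_distance` threshold, and the sample records are all hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum inserts, deletes, substitutions to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]

def approximate_search(query, records, max_distance=2):
    """Return records within max_distance edits of the query, closest first."""
    query = query.lower()
    hits = []
    for record in records:
        distance = edit_distance(query, record.lower())
        if distance <= max_distance:
            hits.append((distance, record))
    hits.sort()
    return [record for _distance, record in hits]

# Hypothetical index; the query is missing the "e" in "Wikipedia".
records = ["Wikipedia", "Wikimedia", "Encyclopedia", "Munich"]
print(approximate_search("wikipdia", records))  # ['Wikipedia', 'Wikimedia']
```

An exact-match lookup would return nothing for “wikipdia”; the tolerance for a couple of edits is what lets the misspelled query still find its record.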
CQ Web is released in beta
Today, Q-Phrase releases its contextual web search tool CQ Web in beta. CQ Web breaks down a search query into an index of keywords, ‘keypairs’, and keyphrases, each with corresponding focused results. Q-Phrase says CQ Web identifies relationships not only between the original search terms, but also among keywords extracted from the results pages.
Aside from the obvious search giants, CQ Web accesses several Web 2.0 content sites, including MySpace and del.icio.us, bringing the number of search engine options to eleven. One note: the press release says CQ Web automatically visits “the most relevant search results” of the major search engines “to discover significant keywords and topics relating to the original search query.” That sounds like CQ Web analyzes only the first few pages (or however many) of results from whichever search engine you select for it to piggyback on.
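As a rough illustration of what pulling keywords and ‘keypairs’ out of results pages could look like, here is a minimal Python sketch that counts single words and adjacent word pairs across a handful of result snippets. It is a guess at the general idea, not Q-Phrase's method; the stopword list, the function names, and the sample snippets are all made up.

```python
import re
from collections import Counter

# A tiny, hypothetical stopword list; a real tool would use a much longer one.
STOPWORDS = {"a", "an", "and", "for", "in", "is", "of", "on", "the", "to", "with"}

def tokenize(text):
    """Lowercase the text and keep alphanumeric words that are not stopwords."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]

def extract_terms(snippets, top_n=3):
    """Count keywords and adjacent-word 'keypairs' across result snippets."""
    keywords = Counter()
    keypairs = Counter()
    for snippet in snippets:
        words = tokenize(snippet)
        keywords.update(words)
        # Pairs are formed after stopword removal, so words separated only
        # by a stopword are treated as adjacent.
        keypairs.update(zip(words, words[1:]))
    return keywords.most_common(top_n), keypairs.most_common(top_n)

# Hypothetical snippets standing in for the top results of a query.
snippets = [
    "Search blog about search engines",
    "A blog on search engines and the search business",
    "Search engines index the web",
]
keywords, keypairs = extract_terms(snippets)
print(keywords[0])  # ('search', 5)
print(keypairs[0])  # (('search', 'engines'), 3)
```

Because the counts come only from the snippets fed in, whatever the first pages of results happen to contain, spam included, ends up shaping the keyword index.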
After playing with the beta (downloadable for PC and Mac OS X) and running a search on “Searchblog” using Google, here is a mini review of CQ Web. I’m not so sure the “interface circumvents the ‘hit or miss’ nature and trial-and-error link clicking” as promised, but then that’s a big promise. CQ Web can help an initial search query be more robust by delivering more contextualized results, but it looks like the beta still needs refining. (The query on Searchblog produced “online poker” as a ‘keypair’ and delivered at least one spam page.) Though it delves deeper, it does so at the tip of the proverbial iceberg, so users should be careful to target their keywords, because instead of the desired URL displaying on the n-th page of results, in CQ Web it might not display at all without proper initial focus. How focused? A search for “Searchblog” produced only 111 main topics and 65 total results. The keyword index is an added benefit, even if it is not always complete and does not always surface “the most meaningful” keywords.