<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: More On Yahoo, Google, Index, Size</title>
	<atom:link href="http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php/feed" rel="self" type="application/rss+xml" />
	<link>http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=more_on_yahoo_google_index_size</link>
	<description>Thoughts on the intersection of search, media, technology, and more.</description>
	<lastBuildDate>Wed, 22 May 2013 20:35:00 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.4.1</generator>
	<item>
		<title>By: lorraine</title>
		<link>http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php#comment-20398</link>
		<dc:creator>lorraine</dc:creator>
		<pubDate>Sat, 10 Jun 2006 13:28:15 +0000</pubDate>
		<guid isPermaLink="false">http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php#comment-20398</guid>
		<description>&lt;p&gt;Can&#039;t believe you said it either, but I&#039;m still laughing.  I was referring to font size, and I can barely see it.  Don&#039;t laugh, I&#039;m serious.  Can you give me an answer?  Thanks, L&lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>Can&#8217;t believe you said it either, but I&#8217;m still laughing.  I was referring to font size, and I can barely see it.  Don&#8217;t laugh, I&#8217;m serious.  Can you give me an answer?  Thanks, L</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Brien</title>
		<link>http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php#comment-20397</link>
		<dc:creator>Brien</dc:creator>
		<pubDate>Fri, 09 Sep 2005 02:34:01 +0000</pubDate>
		<guid isPermaLink="false">http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php#comment-20397</guid>
		<description>&lt;p&gt;I know this is an old post, but I&#039;ve seen all kinds of increases in my search engine saturation for my sites, first from Yahoo right before their big index claim, and now Google looks like it&#039;s ready to fire back.  The number of pages available for my sites has tripled in the Google index in less than a few weeks.  I&#039;d bet they&#039;re going to do something astronomical to their number soon.  In some cases it appears that they are crawling more, and in other cases it looks like they might have loosened their de-duplication rules.  Either they ordered even more hardware, or maybe every developer&#039;s workstation is crawling in its idle time now!&lt;/p&gt;

&lt;p&gt;Anyone else hearing bits in this space?&lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>I know this is an old post, but I&#8217;ve seen all kinds of increases in my search engine saturation for my sites, first from Yahoo right before their big index claim, and now Google looks like it&#8217;s ready to fire back.  The number of pages available for my sites has tripled in the Google index in less than a few weeks.  I&#8217;d bet they&#8217;re going to do something astronomical to their number soon.  In some cases it appears that they are crawling more, and in other cases it looks like they might have loosened their de-duplication rules.  Either they ordered even more hardware, or maybe every developer&#8217;s workstation is crawling in its idle time now!</p>
<p>Anyone else hearing bits in this space?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jean Véronis</title>
		<link>http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php#comment-20396</link>
		<dc:creator>Jean Véronis</dc:creator>
		<pubDate>Fri, 19 Aug 2005 10:49:37 +0000</pubDate>
		<guid isPermaLink="false">http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php#comment-20396</guid>
		<description>&lt;p&gt;Jack&gt; Thanks for citing my post in your comment above.&lt;/p&gt;

&lt;p&gt;I have added a detailed analysis of the NCSA study, which shows that it is totally flawed:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://aixtal.blogspot.com/2005/08/yahoo-pages-manquantes-2.html&quot; rel=&quot;nofollow&quot;&gt;http://aixtal.blogspot.com/2005/08/yahoo-pages-manquantes-2.html&lt;/a&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;--jv&lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>Jack> Thanks for citing my post in your comment above.</p>
<p>I have added a detailed analysis of the NCSA study, which shows that it is totally flawed:</p>
<p><a href="http://aixtal.blogspot.com/2005/08/yahoo-pages-manquantes-2.html" rel="nofollow">http://aixtal.blogspot.com/2005/08/yahoo-pages-manquantes-2.html</a>
</p>
<p>&#8211;jv</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kendall Willets</title>
		<link>http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php#comment-20395</link>
		<dc:creator>Kendall Willets</dc:creator>
		<pubDate>Sun, 14 Aug 2005 21:59:06 +0000</pubDate>
		<guid isPermaLink="false">http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php#comment-20395</guid>
		<description>&lt;p&gt;Thanks for the study, Matt.  One problem I noticed is that spam is biased towards random word combinations, so queries like &quot;ensiform teleprompter&quot; return a lot of spam and porn sites.  Apparently the web spammers have adopted the same tactic used by email spammers, of loading a page with random terms to dilute the significance of the spam terms.&lt;/p&gt;

&lt;p&gt;Have you made any attempt to separate spam from the results?&lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>Thanks for the study, Matt.  One problem I noticed is that spam is biased towards random word combinations, so queries like &#8220;ensiform teleprompter&#8221; return a lot of spam and porn sites.  Apparently the web spammers have adopted the same tactic used by email spammers, of loading a page with random terms to dilute the significance of the spam terms.</p>
<p>Have you made any attempt to separate spam from the results?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Burton Floyd</title>
		<link>http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php#comment-20394</link>
		<dc:creator>Burton Floyd</dc:creator>
		<pubDate>Sun, 14 Aug 2005 20:37:29 +0000</pubDate>
		<guid isPermaLink="false">http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php#comment-20394</guid>
		<description>&lt;p&gt;Size might not matter if you are &quot;Waving the right Wand&quot;. We believe we have found that wand, making zipcode searches fast obsolete. &lt;/p&gt;

&lt;p&gt;Burton&lt;br /&gt;
&lt;a href=&quot;http://roadjunctions.blogspot.com&quot; rel=&quot;nofollow&quot;&gt;http://roadjunctions.blogspot.com&lt;/a&gt; &lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>Size might not matter if you are &#8220;Waving the right Wand&#8221;. We believe we have found that wand, making zipcode searches fast obsolete. </p>
<p>Burton<br />
<a href="http://roadjunctions.blogspot.com" rel="nofollow">http://roadjunctions.blogspot.com</a> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: zzz</title>
		<link>http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php#comment-20393</link>
		<dc:creator>zzz</dc:creator>
		<pubDate>Sun, 14 Aug 2005 20:26:22 +0000</pubDate>
		<guid isPermaLink="false">http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php#comment-20393</guid>
		<description>&lt;p&gt;&quot;any claims that any one company can accurately estimate another&#039;s index are simply not defensible&quot;&lt;/p&gt;

&lt;p&gt;Implying they can say whatever they want because nobody can prove them wrong anyway.&lt;/p&gt;

&lt;p&gt;Whatever the size of their index may be (I can&#039;t believe I just wrote that), this story just reeks of PR (as they admit as well). Ultimately, however, people don&#039;t care for &quot;index size&quot;; they just care for better search results.&lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>&#8220;any claims that any one company can accurately estimate another&#8217;s index are simply not defensible&#8221;</p>
<p>Implying they can say whatever they want because nobody can prove them wrong anyway.</p>
<p>Whatever the size of their index may be (I can&#8217;t believe I just wrote that), this story just reeks of PR (as they admit as well). Ultimately, however, people don&#8217;t care for &#8220;index size&#8221;; they just care for better search results.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matt Cheney</title>
		<link>http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php#comment-20392</link>
		<dc:creator>Matt Cheney</dc:creator>
		<pubDate>Sun, 14 Aug 2005 20:03:16 +0000</pubDate>
		<guid isPermaLink="false">http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php#comment-20392</guid>
		<description>&lt;p&gt;Several researchers (myself included) at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign ran a fairly extensive study about this issue (about 10,000 queries) and found that Google returns more results than Yahoo in almost every single case. We found that Google returned well over 150% more total results and gave more results in about 97% of our queries.&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;&lt;br /&gt;
Our full study and test code is available online at: &lt;a href=&quot;http://vburton.ncsa.uiuc.edu/indexsize.html&quot; rel=&quot;nofollow&quot;&gt;http://vburton.ncsa.uiuc.edu/indexsize.html&lt;/a&gt; &lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>Several researchers (myself included) at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign ran a fairly extensive study about this issue (about 10,000 queries) and found that Google returns more results than Yahoo in almost every single case. We found that Google returned well over 150% more total results and gave more results in about 97% of our queries.
</p>
<p>
Our full study and test code is available online at: <a href="http://vburton.ncsa.uiuc.edu/indexsize.html" rel="nofollow">http://vburton.ncsa.uiuc.edu/indexsize.html</a> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jack DeNeut</title>
		<link>http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php#comment-20391</link>
		<dc:creator>Jack DeNeut</dc:creator>
		<pubDate>Sat, 13 Aug 2005 01:16:34 +0000</pubDate>
		<guid isPermaLink="false">http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php#comment-20391</guid>
		<description>&lt;p&gt;Jean Véronis has an interesting post about Google and Yahoo index size on his blog at: &lt;a href=&quot;http://aixtal.blogspot.com/2005/08/yahoo-19-billion-pages.html&quot; rel=&quot;nofollow&quot;&gt;http://aixtal.blogspot.com/2005/08/yahoo-19-billion-pages.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Most of what he writes is in French, but he translates some interesting articles (like this one) into English. &lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>Jean Véronis has an interesting post about Google and Yahoo index size on his blog at: <a href="http://aixtal.blogspot.com/2005/08/yahoo-19-billion-pages.html" rel="nofollow">http://aixtal.blogspot.com/2005/08/yahoo-19-billion-pages.html</a></p>
<p>Most of what he writes is in French, but he translates some interesting articles (like this one) into English. </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kendall Willets</title>
		<link>http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php#comment-20390</link>
		<dc:creator>Kendall Willets</dc:creator>
		<pubDate>Fri, 12 Aug 2005 23:17:11 +0000</pubDate>
		<guid isPermaLink="false">http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php#comment-20390</guid>
		<description>&lt;p&gt;I don&#039;t know how many have sat through Peter Norvig&#039;s &quot;size matters&quot; presentation on machine learning algorithms, but his central thesis is that the worst algorithm outperforms the best ones, given a large enough sample size.&lt;/p&gt;

&lt;p&gt;He also said that about 30% of their crawl is duplicates, so I can see where there&#039;s room for debate about net index sizes.&lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>I don&#8217;t know how many have sat through Peter Norvig&#8217;s &#8220;size matters&#8221; presentation on machine learning algorithms, but his central thesis is that the worst algorithm outperforms the best ones, given a large enough sample size.</p>
<p>He also said that about 30% of their crawl is duplicates, so I can see where there&#8217;s room for debate about net index sizes.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Aki</title>
		<link>http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php#comment-20389</link>
		<dc:creator>Aki</dc:creator>
		<pubDate>Fri, 12 Aug 2005 22:14:51 +0000</pubDate>
		<guid isPermaLink="false">http://battellemedia.com/archives/2005/08/more_on_yahoo_google_index_size.php#comment-20389</guid>
		<description>&lt;p&gt;John, I also did a quick study and found some interesting results.  Specifically, while Yahoo estimates more results on their first results page, they usually only deliver 30% of that estimate.  Also, with respect to unique and even total results, Google seems to give me 50-70% more hits.  You can look at my queries and results in my entry at: &lt;a href=&quot;http://blog.akashjain.org/2005/08/12/is-yahoos-index-really-bigger-methinks-not-really-googles-index-seems-50-larger/&quot; rel=&quot;nofollow&quot;&gt;http://blog.akashjain.org/2005/08/12/is-yahoos-index-really-bigger-methinks-not-really-googles-index-seems-50-larger/&lt;/a&gt; &lt;br /&gt;
- definitely interested in anyone else&#039;s thoughts.&lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>John, I also did a quick study and found some interesting results.  Specifically, while Yahoo estimates more results on their first results page, they usually only deliver 30% of that estimate.  Also, with respect to unique and even total results, Google seems to give me 50-70% more hits.  You can look at my queries and results in my entry at: <a href="http://blog.akashjain.org/2005/08/12/is-yahoos-index-really-bigger-methinks-not-really-googles-index-seems-50-larger/" rel="nofollow">http://blog.akashjain.org/2005/08/12/is-yahoos-index-really-bigger-methinks-not-really-googles-index-seems-50-larger/</a> <br />
- definitely interested in anyone else&#8217;s thoughts.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
