Matt is the man the SEO/SEM world looks to for answers on most things Google-related. Over the past month Melanie and I have been having a wide-ranging email exchange with him on spam, the role of humans at Google, and other things. Here’s the result:
Let’s say you decide to leave Google and are asked to write an exact job description for a replacement to do exactly what you do now. What does it say? (We told Matt to be honest, or his options will not vest!)
My official job is to direct the webspam team at Google. Webspam is essentially when someone tries to trick a search engine into ranking a site higher than it should. A few people will try almost anything, up to and including the mythical GooglePray meta tag, to rank higher. Our team attempts to help high-quality sites while preventing deceptive techniques from working.
As a result of working on webspam, I started talking to a lot of webmasters on forums, blogs, and at conferences. So I’ve backed into handling a fair chunk of webmaster communication for Google. Last year I started my own blog so that I could answer common questions and debunk stuff that isn’t true (e.g., inserting a GooglePray meta tag doesn’t make a whit of difference). These days when I see unusual posts in the blogosphere, I’ll try to get a bug report to the right person, or to clarify things if someone is confused.
As you pointed out, you’ve become the human voice between Google and webmasters/SEOs. We’ve heard Google needs to manually remove spam sometimes. And even the algorithm-based feed for Google News requires an editorial gatekeeper for selecting sites. Do you think there is a growing role for human presence in Google’s online technologies?
Bear in mind that this is just my personal opinion, but I think that Google should be open to almost any signal that improves search quality. Let’s hop up to the 50,000-foot view. When savvy people think about Google, they think about algorithms, and algorithms are an important part of Google. But algorithms aren’t magic; they don’t leap fully formed from computers like Athena bursting from the head of Zeus. Algorithms are written by people. People have to decide the starting points and inputs to algorithms. And quite often, those inputs are based on human contributions in some way.
The simplest example is that hyperlinks on the web are created by people. A passable rule of thumb is that for every page on the web, there are 10 hyperlinks, and all those billions of links are part of the material that modern search engines use to measure reputation. As you mention, Google News ranks based on which stories human editors around the web choose to highlight. Most of the successful web companies benefit from human input, from eBay’s trust ratings to Amazon’s product reviews and usage data. Or take Netflix’s star ratings. This past week I watched Brick and Boondock Saints, and I’m pretty sure that L4yer cake and Hotel Rwanda are going to be good, because all those DVDs have 4+ stars. Those star ratings are done by people, and they converge to pretty trustworthy values after only a few votes.
So I think too many people get hung up on “Google having algorithms.” They miss the larger picture, which (to me) is to pursue approaches that are scalable and robust, even if that implies a human side. There’s nothing inherently wrong with using contributions from people; you just have to bear in mind the limitations of that data. For example, the three companies I mentioned above have to consider the malicious effect that money can have on their human systems. Netflix doesn’t have to worry much (who wants to spam a DVD rating?), while eBay probably spends a lot more time thinking about how to make its trust ratings accurate and fair.
Google recently added user tagging to photos. It’s an interesting way to sort search, adding a personal and human dimension yet opening up a can of worms for syntax and keyword variation. Is this social training of human input going to be applied to other dimensions of search at Google? Requiring labels to gain a critical mass before they become official is a clever step, but of course it’s not immune to automated spamming. From your perspective on quality control, is this going to open up doors for more abuse of Google as a platform?
I personally would love to see more human input into search at Google. But the flip side is that someone has to pay attention to potential abuse by bad actors. Maybe it’s cynical of me, but any time people are involved, I tend to think about how someone could abuse the system. We saw the whole tagging idea back in Web 1.0, when tags were called meta tags, and some people abused them so badly with deceptive words that to this day, most search engines give little or no scoring weight to keywords in meta tags.
Google took a cautious approach on this image tagging: the large pool of participants and their random pairing makes it harder to conspire, and two people have to agree on a tag. Users doing really weird things would look unusual, and image tagging is easy for people but much harder for a bot. As tagging goes, it’s on the safer end of the spectrum.
I think Google should be open to improving search quality in any way it can, but it should also be mindful of potential abuse with any change.
The W3C is listing its supporters’ websites on PageRank 9 and PR 7 pages in exchange for donations of $1,000 a pop, in cash or trade (http://www.w3.org/Consortium/sup). Speculation on this is buzzing because, though the W3C is a well-respected educational resource, many SEO blackhats endorse similar tactics. Does Google consider link selling a type of webspam against Google’s TOS? And if so, should we expect to see some kind of censure of the W3C? Or how does it differ from what Google considers webspam?
I’ve said this before in a few places, but I’m happy to clarify. Google does consider it a violation of our quality guidelines to sell links that affect search engines. If someone wanted to sell links purely for visitors, there are myriad ways to do it that don’t affect search engines. You could have paid links go through an internal redirect, and then block that redirecting page in robots.txt. You could add the rel=”nofollow” attribute to a link, which tells search engines that you can’t or don’t want to vouch for the destination of a link. The W3C decided to add an “INDEX, NOFOLLOW” meta tag to their sponsor page, which has the benefits that the sponsor page can show up in search engines and that users receive nice static links that they can click on, but search engines are not affected by the outlinks on that page. All of these approaches are perfectly fine ways to sell links, and are within our quality guidelines.
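For readers who want to implement one of the options Matt describes, the markup involved is small. Here is a minimal sketch; the /out/ path and the sponsor.example.com domain are hypothetical placeholders, not taken from the W3C’s actual pages:

```html
<!-- Option 1: add rel="nofollow" so the link passes no search-engine weight -->
<a href="https://sponsor.example.com/" rel="nofollow">Example Sponsor</a>

<!-- Option 2: route the paid link through an internal redirect page... -->
<a href="/out/sponsor">Example Sponsor</a>
<!-- ...and block that redirecting path in robots.txt:
     User-agent: *
     Disallow: /out/
-->

<!-- Option 3 (the W3C's approach): a page-level robots meta tag.
     INDEX lets the sponsor page itself appear in search results;
     NOFOLLOW tells engines not to follow any links on the page. -->
<meta name="robots" content="INDEX, NOFOLLOW">
```

Note that the meta tag in option 3 applies to every link on the page, while rel=”nofollow” lets you flag individual links.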
Did the W3C decide to add the meta tag on their own, or was that the result of talks between you and the W3C?
We were happy to talk to people at the W3C to answer questions and to give background info, but I believe they made the decision to add the meta tag themselves.
Thanks for the considered responses, Matt!