The following guest post was submitted by Alister Cameron. Read his blog at Alister Cameron – Blogologist.
I was reading through the Google Blogsearch patent application today. It was filed back in September 2005 and never mentions Blogsearch by name, of course. In reading through all the convoluted legalese, I discovered what I think are some rather intriguing statements that give us a tantalizing insight into how the architects of Google’s “secret recipe” think…
Now, this post is probably not going to answer as many questions of yours as it may raise new ones. But the point I want to make here is this: Google determines a quality score for every blog post you write based on more factors than most of us have ever really understood. Indeed, the range of measurements contributing to Google’s quality score applied to your blog posts is nothing short of amazing.
Truthfully, I found myself thinking of Big Brother as I tried to grasp the magnitude of Google’s data-gathering capabilities. You’ll see what I mean as we dig deep into the bowels of this patent application. Along the way, I will consider some ramifications for how you blog and how you approach the marketing of your blog.
We need to dive into the text of the patent application here, specifically a section titled Determining a Quality Score for a Blog Document, starting with a summary of what Google will take into consideration when looking at the “positive indicators” of a blog post (we will not look at the negative indicators today):
[0037] Positive indicators as to the quality of the blog document may be identified (act 620). Such indicators may include a popularity of the blog document, an implied popularity of the blog document, the existence of the blog document in blogrolls, the existence of the blog document in a high quality blogroll, tagging of the blog document, references to the blog document by other sources, and a pagerank of the blog document. It will be appreciated that other indicators may also be used.
Each of the “indicators” listed above is now detailed in turn, and it’s here that Google starts to be more revealing about its methods and intentions…
A Key Indicator: Feed Readership
[0038] The popularity of the blog document may be a positive indication of the quality of that blog document. A number of news aggregator sites (commonly called “news readers” or “feed readers”) exist where individuals can subscribe to a blog document (through its feed). Such aggregators store information describing how many individuals have subscribed to given blog documents. A blog document having a high number of subscriptions implies a higher quality for the blog document. Also, subscriptions can be validated against “subscriptions spam” (where spammers subscribe to their own blog documents in an attempt to make them “more popular”) by validating unique users who subscribed, or by filtering unique Internet Protocol (IP) addresses of the subscribers.
I’m intrigued that in this section Google seems to suggest that their main way of determining the popularity of a blog is the number of feed subscribers. This is an acknowledgement that a subscription to a blog feed is a much clearer indicator of a reader’s commitment to your blog than inbound traffic, which can be manipulated in all sorts of (e.g. black hat) ways.
However, the really important question to ask here is: How does Google know how many people are subscribed to your feed? Answer: Google Reader, the most popular feedreader at the moment, with a (loosely) estimated 30%+ share of the newsreader market. With this kind of market share, Google has 100% accurate data on what feeds 30% of people are subscribed to, and can make a very accurate “educated guess” on the other 70%. It’s an enviable and powerful position to hold.
So for starters — thanks to Reader — Google has a very accurate read on how many people are subscribed to your blog feed. But beyond just subscriber numbers, Google is using Reader to analyze the clicking and reading behaviour of feed subscribers. Google doesn’t just know if I’m subscribed to your blog feed; Google knows how often I actually show up to read your posts, when, where and how often I click through to your site, and so forth. And again, Google can make accurate extrapolations from how people interact with your feed on Reader, to how users of other newsreaders are doing the same.
And all this to derive a popularity rating of your blog and individual posts!
It makes me wonder: if Google sees themselves first and foremost as a search company, then Google Reader may exist, in their minds, first and foremost to measure feed readership accurately and thus derive an accurate quality score for Blogsearch… not (just) to give you and me another funky web-based newsreader to use. Think about it.
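To make the subscription idea concrete, here is a minimal sketch of the two steps described above: validating subscriptions by unique user and unique IP address (as paragraph [0038] suggests), and scaling an observed count up by an assumed market share. The function names, the tuple layout and the 30% figure are my own illustrative assumptions, not anything the patent application specifies.

```python
from collections import defaultdict


def count_valid_subscriptions(subscriptions):
    """Count subscriptions per feed, one per unique user and unique IP.

    `subscriptions` is an iterable of (feed_url, user_id, ip_address)
    tuples. This layout is hypothetical, chosen only for illustration.
    """
    seen_users = defaultdict(set)
    seen_ips = defaultdict(set)
    counts = defaultdict(int)
    for feed_url, user_id, ip_address in subscriptions:
        # Skip repeat subscriptions from the same user or the same IP.
        if user_id in seen_users[feed_url] or ip_address in seen_ips[feed_url]:
            continue
        seen_users[feed_url].add(user_id)
        seen_ips[feed_url].add(ip_address)
        counts[feed_url] += 1
    return dict(counts)


def estimate_total_subscribers(observed, market_share):
    """Extrapolate a total from the share of readers actually observed."""
    if not 0 < market_share <= 1:
        raise ValueError("market_share must be in the range (0, 1]")
    return round(observed / market_share)
```

Note that filtering strictly by IP would also discard genuine readers who share an address (an office network, say), so a real system would presumably be more forgiving than this sketch.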
Clickety Click Click
[0039] An implied popularity may be identified for the blog document. This implied popularity may be identified by, for example, examining the click stream of search results. For example, if a certain blog document is clicked more than other blog documents when the blog document appears in result sets, this may be an indication that the blog document is popular and, thus, a positive indicator of the quality of the blog document.
Another indicator of the popularity of your blog (and of individual posts) is gleaned from a study of how people click on Google’s search results. For sure, Google records these outbound clicks, and somehow rewards the blogs that get the most clicks. Google will be looking for a lower-ranked blog post on a given page of search results that keeps getting more clicks than a higher-ranked listing. To Google, this would suggest that they need to honour the lower-ranked blog with a higher position in the search results, given that it’s getting more of the clicks.
Of course, the important question to ask here is: What can I do to increase the likelihood that a searcher will click on my link when it appears on a page of Google search results? The answer has been well covered, and has to do with search engine optimization (SEO) techniques related to your blog post’s title and the snippet Google displays under the title of your listing in the search results.
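Here is a rough sketch of the kind of click-stream comparison I am imagining: for one page of results, flag any listing whose click-through rate beats that of at least one listing ranked above it. The tuple layout and function names are hypothetical, and nothing in the patent application says Google computes it this way.

```python
def click_through_rate(clicks, impressions):
    """Fraction of impressions that led to a click (0.0 if never shown)."""
    return clicks / impressions if impressions else 0.0


def outperforming_results(results):
    """Return URLs that out-click at least one higher-ranked listing.

    `results` is a list of (url, position, impressions, clicks) tuples,
    where position 1 is the top of the page. Hypothetical layout.
    """
    ordered = sorted(results, key=lambda r: r[1])
    flagged = []
    lowest_ctr_above = None
    for url, _position, impressions, clicks in ordered:
        ctr = click_through_rate(clicks, impressions)
        # A listing is flagged when it beats the weakest listing above it.
        if lowest_ctr_above is not None and ctr > lowest_ctr_above:
            flagged.append(url)
        if lowest_ctr_above is None or ctr < lowest_ctr_above:
            lowest_ctr_above = ctr
    return flagged
```

Comparing rates rather than raw click counts matters here, because a higher-ranked listing is simply seen more often.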
Google’s Verdict: Blogrolls Matter
The Google Blogsearch patent application dedicates a full three paragraphs to blogrolls and the significance Google ascribes to them in determining the quality of a blog. So we’d better not miss this:
[0040] The existence of the blog document in blogrolls may be a positive indication of the quality of the blog document. It will be appreciated that blog documents often contain not only recent entries (i.e., posts), but also “blogrolls,” which are a dense collection of links to external sites (usually other blogs) in which the author/blogger is interested. A blogroll link to a blog document is an indication of popularity of that blog document, so aggregated blogroll links to a blog document can be counted and used to infer magnitude of popularity for the blog document.
Google counts how many other blogs’ blogrolls include a link to your blog, and assigns your blog a score accordingly. This implies that Google knows about blogrolls and respects the fact that bloggers use them to indicate respect/trust/honour for/to other blogs. Blogrolls matter.
[0041] The existence of the blog document in a high quality blogroll may be a positive indication of the quality of the blog document. A high quality blogroll is a blogroll that links to well-known or trusted bloggers. Therefore, a high quality blogroll that also links to the blog document is a positive indicator of the quality of the blog document.
If your humble C-list blog is listed on someone else’s blogroll that otherwise contains only A-list blog links… you’re going to get lots of lovin’ from Google! That’s the point of this paragraph, anyway: Google looks at the kind of company your link keeps on blogrolls.
[0042] Similarly, the existence of the blog document in a blogroll of a well-known or trusted blogger may also be a positive indication of the quality of the blog document. In this situation, it is assumed that the well-known or trusted blogger would not link to a spamming blogger.
If you’re on Scoble’s blogroll, Google assumes you’re a) not a spam site and b) immediately worthy of respect… Scoble said so.
The critical question here is obvious: How can I get a link to my blog on more and better blogrolls? And again, the answers are many and varied but (I think) all come down to one key point: you have to earn a place on someone’s blogroll… with good content and consistency. Sure, you’ll get a few blogroll links from buddies and work associates perhaps, but if you want an ever growing number of blogroll links, you need to endear yourself to people you’ve never met, who have grown into committed readers over time, and will (perhaps) reward you with a blogroll link, just coz they want to.
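The three blogroll paragraphs can be sketched as a simple weighted count. In this illustration every blogroll link is worth one point, with a bonus for the share of trusted blogs on the same blogroll ([0041]) and a further bonus when the blogroll’s owner is trusted ([0042]). The weights, the `trusted` seed list and the function name are all assumptions of mine; the patent application gives no formula.

```python
def blogroll_scores(blogrolls, trusted):
    """Score each blog by the blogrolls that link to it.

    `blogrolls` maps a blog URL to the set of URLs on its blogroll.
    `trusted` is a hypothetical seed set of well-known blogs.
    """
    scores = {}
    for owner, links in blogrolls.items():
        if not links:
            continue
        # "High quality blogroll": the fraction of its links that are trusted.
        quality = len(links & trusted) / len(links)
        # Extra weight when the blogroll belongs to a trusted blogger.
        weight = 1.0 + quality + (1.0 if owner in trusted else 0.0)
        for target in links:
            scores[target] = scores.get(target, 0.0) + weight
    return scores
```

With this weighting, one link from a trusted blogger’s mostly A-list blogroll counts for more than a couple of links from unknown blogs, which is the “company your link keeps” idea in miniature.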
Social Bookmarking is Good for Your Blog Rank!
[0043] Tagging of the blog document may be a positive indication of the quality of the blog document. Some existing sites allow users to add “tags” to (i.e., to “categorize”) a blog document. These custom categorizations are an indicator that an individual has evaluated the content of the blog document and determined that one or more categories appropriately describe its content, and as such are a positive indicator of the quality of the blog document.
The shift away from local/desktop URL bookmarking to online services like Technorati, del.icio.us, Stumbleupon, ma.gnolia, reddit (and a ton of others) has created an entirely new “social” experience of shared bookmarks, affinity/interest groups and voting systems (like Digg). And for Google and other search engines, this has meant the ability to compare on-page and link-text keyword analysis with a new third factor: tagging.
So now there are three ways Google can do the keyword analysis to work out what your page is about (and rank you accordingly): a) on-page factors, b) inbound link-text, and c) tagging/categorization on social bookmarking sites. And Google respects tagging because it reflects people’s idea of what your blog (or post) is about.
Further, I’m guessing Google is very sophisticated in how they analyze the content of social bookmarking sites. (I bet they have a bot and analytical apparatus purpose-built for this purpose.) Rest assured they factor in the number of times a given post of yours has been bookmarked, and how frequently a given tag is used.
Here, the important question for fellow bloggers is: How can I get my blog posts properly and extensively tagged across the various social bookmarking sites?
The term Social Media Optimization (SMO) has been coined to, in part, encompass the various answers to this question. Rohit Bhargava is credited with coining this term and it was he who first suggested a number of rules or goals for SMO. I suggest you start there.
My personal challenge to you would be to see this SMO thing as an exercise in establishing and maintaining mutually beneficial relationships. It’s in the mutuality of these online friendships that people bookmark, tag, vote for and in other ways express support of each other’s blogging efforts. But that subject deserves a post of its own.
(Note: it would be remiss of me not to make one more point on tagging: tag your post content properly. That’s the love Technorati, in particular, is looking for. When people bookmark your site to, say, del.icio.us, they tag as they see fit. When you tag your own post content, you get the chance to cover all the bases you want covered. So get it right!)
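As a small illustration of the two tagging measures I guessed at above (how many times a post has been bookmarked, and how often each tag is applied to it), here is a sketch that tallies both from a list of bookmarks. The input layout is hypothetical; real bookmarking services each expose their data differently.

```python
from collections import Counter, defaultdict


def summarize_tags(bookmarks):
    """Tally bookmarks per URL and tag frequency per URL.

    `bookmarks` is an iterable of (url, user, tags) tuples, where `tags`
    is a list of strings. Hypothetical layout, for illustration only.
    """
    bookmark_counts = Counter()
    tag_counts = defaultdict(Counter)
    for url, _user, tags in bookmarks:
        bookmark_counts[url] += 1
        # Normalize case and count each tag once per bookmark.
        tag_counts[url].update({tag.strip().lower() for tag in tags})
    return bookmark_counts, tag_counts
```

Calling `tag_counts[url].most_common(3)` on the result would give the three tags people most often apply to a post, which is a fair proxy for what readers think it is about.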
Is Google Reading Your Mail?!
Read this carefully:
[0044] References to the blog document by other sources may be a positive indication of the quality of the blog document. For example, content of emails or chat transcripts can contain URLs of blog documents. Email or chat discussions that include references to the blog document is a positive indicator of the quality of the blog document.
Are you thinking what I’m thinking?! Google has a massively popular hosted email service – GMail. They also have Google Talk, a chat service. You probably knew that. But did you know Google has intentions of crawling the content of your GMail emails and Google Talk chat sessions?! Now, I don’t know if they actually do that or not, and I haven’t gone hunting thru their terms of service seeking clarity, but their stated aim is clear: to find URLs in two key forms of personal online communications (email and chats), and to use these discoveries to further rank blogs and blog posts.
I have to say it makes perfect sense. Why? Because Google is looking to build a more and more accurate profile of your and my blog. And to do this Google wants to see corroborating evidence of popularity across as many different “media” as possible: web pages, blog posts, search results click patterns, blogrolls, social bookmarking services, and now email and chat session content. Wow… that’s called being thorough.
(Note: Google will no doubt also be analyzing chat content from other services where the “transcripts” are indexed. Twitter immediately comes to mind, here.)
So what’s the big question to be asking? I think it’s this: What can I do as a blog author to ensure that my posts are being linked to, in email and chat conversations? And my answer remains the same: consistently write compelling (sometimes controversial) content that people will want to point others to. You just can’t get past this one… you need your own high-quality content pumped out on a regular basis.
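Technically, the step described in paragraph [0044] is not hard. The sketch below pulls URLs out of plain message text with a deliberately simple regular expression and counts how many point at a given set of blog hosts. The pattern, the function name and the host list are my own assumptions, and this says nothing about whether or how Google actually processes email or chat.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Deliberately simple: "http(s)://" followed by non-whitespace characters.
URL_PATTERN = re.compile(r"https?://\S+")


def count_blog_references(messages, blog_hosts):
    """Count URLs in `messages` whose host is in `blog_hosts`.

    `messages` is an iterable of plain-text strings; `blog_hosts` is a
    set of lowercase hostnames without a leading "www.".
    """
    counts = Counter()
    for text in messages:
        for raw in URL_PATTERN.findall(text):
            # Trim punctuation that commonly trails a URL in a sentence.
            url = raw.rstrip(".,;:!?)")
            host = (urlsplit(url).hostname or "").lower()
            if host.startswith("www."):
                host = host[4:]
            if host in blog_hosts:
                counts[host] += 1
    return counts
```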
Some Concluding Questions
I’ve quoted just a few paragraphs from a much longer (and largely boring) patent application for a product that ended up being called Google Blogsearch. Reading through the bits of the application that made any sense to my non-legal mind, and comparing that to what I know of Google Blogsearch, I was left with a few questions I thought I’d bring to the ProBlogger community:
- Does it make sense to have Blogsearch separate from the main Google search engine? I’m not sure about that, but in dedicating a unique search service to blog content, Google is telling us that in some sense it’s a different kind of content with a different indexing algorithm applied to it.
- As Google Blogsearch gains in popularity, will new (or adjusted) SEO strategies emerge along with it? Is Blogsearch different enough to warrant different strategies? How different are the search results compared to the same query in the main Google search engine?
- How many people actually use Google Blogsearch (http://blogsearch.google.com)? I haven’t seen any data out there on the popularity of that service.
- Do you use it? Why? What do you like about it? Anything you don’t like about it?
- How do you find Google Blogsearch compares with Technorati? What are the major differences you have observed in their search results? Do you have a preference?
[…] Rowse has posted an excellent analysis on the inner workings of the Google Blogsearch made available through their patent application in September 2005. As Darren mentions, the […]
[…] classifies blogs for searches performed on Blogsearch. I now leave here a reference to this article by Alister Cameron, posted on Problogger, which reveals in more detail the Google patent on searching […]
— How many people actually use Google Blogsearch?
Sometimes I use Google Blogsearch to see what bloggers are talking about at the moment; also, some of my visitors come from Google Blogsearch. :-)
[…] Cameron has a very interesting post on ProBlogger about how Google Blogsearch indexes your blog […]
[…] How Google Blogsearch ranks your Posts… In their own words! freaky (tags: security privacy) […]
Alister, this was really a great post. Excellent breakdown. I think Google is putting all the pieces together (conversation monitoring, etc) before they construct what will probably be the definitive “primary” social network that will tie all of this together. This analysis of their blogsearch goes a long way to show how sensitive they are to social media metrics.
Does it make sense to have Blogsearch separate from the main Google search engine?
I don’t think so. Blogware is becoming a standard content management system these days. Almost every site keeping up with the times has a feed, too. And these methods sound like effective methods to improve the search, why not use them on regular sites?
As Google Blogsearch gains in popularity, will new (or adjusted) SEO strategies emerge along with it?
Of course. SEO strategies will always adjust to match the search and ranking systems.
Is Blogsearch different enough to warrant different strategies?
Maybe a few different strategies, but none of these things in particular are things that can’t be used to search other sites.
How different are the search results compared to the same query in the main Google search engine?
Different. I like being able to narrow the results for different time periods. For example, right now if you search “menu foods” you will find page after page on the pet food recall. That’s valid, that’s great. But what if I want to find out about something Menu Foods did a few years ago? Or what if I’m just looking for stuff on menus and foods for my restaurant?
Narrowing by date is SOOOO effective to help find these types of things. I’m continually frustrated with Google regular search when it pops up pages that are four years old and haven’t been updated or when Technorati blog search can’t find anything over a minute old. When I want to search, I want to search. I want to dig, to hunt and to find.
How many people actually use Google Blogsearch (http://blogsearch.google.com)?
I wonder. I use it.
Do you use it? Why?
Yup. Because I like to try new things and because I think it’s better than Technorati. Technorati excludes blogs for weird reasons. And they freely admit they’re a media company, not a tech company, so they really don’t care if their algorithm sucks.
What do you like about it?
Advanced search features. I want even more of them.
Anything you don’t like about it?
I don’t really like the way it works. And I don’t like how they take a certain amount of characters regardless of the post size. I think it should be a percentage of the post, not a specific number of characters.
How do you find Google Blogsearch compares with Technorati?
I like Google better. But I use both depending on what I’m looking for.
er… isn’t that what google adwords do – they read the words in our e-mail and post relevant ads next to them. From that to just scanning for URLs is a small step so on the privacy bit I wouldn’t be that surprised!
Interesting.
The point regarding click counts = popularity. Perhaps google are seeing the future where their upcoming competitors (the wiki foundation with their new search engine based on users input) might become a threat. A pure mathematical algorithm is no longer enough. The results must now adjust with the users browsing habits and trends.
I personally use blogsearch with great results. Let’s see what the future holds for it.
Great article, thanks for the education.
Well, Darren, in my case, I don’t always use Google Blog Search or in a harsher tone, I don’t use it at all. One of the reasons is that when I’m looking for some content, I don’t want to get my results restricted to blogs.
And, Google has gone wrong with separating it from the main search engine. On to your second question, I think such a thing called BEO(Blog Engine Optimization) would evolve.
About Technorati and Google Blog Search, it all goes to what I said first. I never use these services to search for content because I don’t want to get my results restricted to just blogs.
Thank you for the post. This is the first time I have used Google Blogsearch. The results are different. As I see it, it gives results in two ways. Related blogs give nearly the same results you find in Google, but the resulting posts are really different from what you can get from Google. It seems to be good. In Google I could search only for relevant blogs. Because mainly the two- or three-year-old posts had the highest PageRank, I would normally find an old post instead of the recent ones. Now Google Blogsearch seems to be better: you can find recent posts in the first places.
But I do not know whether it will be easier to drive great traffic to a blog. I think not. I believe that great writing skills and great effort will give the desired results!
[…] On Rowse’s site, a guest author named Alister Cameron has written a post. In the post he examined Google’s blog search engine patent and which of Google’s […]
[…] article on Problogger describes a recent Google blogsearch patent filing: Is Google Reading Your […]
[…] Determining a Quality Score for a Blog Document – the main way: feed subscribers a popularity (implied or not) of the blog document, existence of the document in blogrolls (high quality or not), tagging of the blog document, references to the blog document by other sources, and a pagerank of the blog document (tags: Blogging Popularity PageRank Ranking Google Google_Reader) […]
[…] But until now I had not heard that this indexing of email content could be used to help shape the ranking of Google’s search results. And it was this possibility that was raised in a post yesterday (18/04) on Problogger.net, titled how Google Blogsearch scores your post… in its own words. […]
[…] to use my sidebar for this. Darren Rowse at ProBlogger has some very good advice on his post for Google’s Verdict: Blogrolls Matter. I’ve never been steered wrong by his advice yet; he uses only white hat SEO and […]
[…] useless data: Comes from the I told you so […]
[…] How Google Blogsearch ranks your Posts… In their own words! – ProBlogger […]
[…] * From Alister Cameron via problogger, here is another excellent algorithm type post on how Google Blogsearch ranks your drivel. […]
[…] How Google Blogsearch ranks your Posts… In their own words! […]
Awesome find. Makes sense to scan through Google’s patent notes. Thanks for sharing. I’m curious what Google already uses and what they’re never going to use, and most curious about what they already use but didn’t write down in this patent.
You for sure restarted the “tell a friend by email about this” hype.
[…] Cameron wrote a very good guest post on Problogger about how Google Blogsearch ranks your posts. This may be a very boring post because it is lengthy but it did give me some useful […]
It’s also important how your subpages are named. So it’s best to use subpage names like netvance-software-entwicklung instead of netvance_software_entwicklung
This gives a much higher ranking in the result lists.
greetings from Wien (http://www.netvance.at)
[…] is a great article from ProBlogger about some of the numerous things Google analyzes when indexing your blog. Aspects they might be […]
I use Google Blogsearch nearly every day. I am always searching for keywords on blogs that might be similar to mine so I can go and comment on them. Interesting to hear how it works.
Wow these are really crazy and the crazy part is most of these can be manipulated.
[…] Well, I really do have to link to this post from blogger pro […]
[…] Google, even though it keeps personal information about us and even though it reads our email in order to give search results…. But do those same high ethical standards still hold […]
[…] blog search ranking algorithm explained, I mean speculated, by Alister Cameron at […]
Currently the website is just a convenient place to post and collaborate on models used in fundamental equity research. Right now you can find versions of a LVLT model I am developing in the “Project” folder of that website.
[…] haven’t had a blogroll on my site in years. Turns out that’s a mistake. I’ll be working on remedying that in the next few […]
Interesting article Darren. I tried various blog searching tools today – Icerocket, bloglines, Feedster, Google blog search and bloglines. As per my analysis, google blog search provided maximum number of results, many of them were quite relevant to my query.
It seems that when it comes to searching – google has upper hand over its competitors. This is also confirmed by the huge number of patents Google is filing in the searching area.
:-)
I thought blogrolls were pointless. I really thought Google would consider such lists to simply be a form of link farming.
Now I see I was mistaken.
I’ll admit to being a little concerned about their mining of end user email, but then I never had the false sense of privacy I’m sure many have.
Very interesting post,
Fantastic post! Raises some really interesting questions, I guess only time will tell.
Three thoughts on this.
Google’s ranking criteria have always struck me as being similar to the popularity contests that go on in high school lunch rooms — who’s in, who’s out, who’s cool, who’s not. It’s as useful and fair as a gossip mill. That type of juvenile ranking is a tough enough struggle even when you’re at the top. Now, everyone’s on this treadmill trying to appease the google-gods instead of writing good articles.
The idea of scouring people’s emails to find out what they’re saying just seems prying and nosy. I realize email isn’t private, but I really don’t want to be part of their little plan to be the keepers of all the secrets of exactly what everyone is thinking. Who died and left them in charge, anyway?
What Google is doing is sorting out the unpopular opinions and driving them to the obscurity of the bottom of the list. That’s NOT diversity.
[…] Cameron, in a post for ProBlogger, asked, “How does Google know how many people are subscribed to your feed?” (See under the subtitle “A Key Indicator: Feed Readership.”) His answer was […]
Makes sense, as the 800 pound gorilla of search gets better and smarter, we will see more and more pure and relevant search as a result.. driven by only one factor – Is it good for the visitor? So if this post is being crawled – Thanks g**gle !
I’m publishing a comprehensive review and summary of blogsearch, including my own analysis, and summary of this and Bill Slawski’s analysis (http://www.seobythesea.com/?p=541) over at SEO ROI. Thought you might be interested in the wrap up.
Interesting read. we’ll see what effects these techniques have on blogsearch
Interesting read. Thx for digging the info out and sharing it.
I use blogsearch it is good.
Never used blogsearch before as I wasn’t able to figure out why things pop out in the results there, I’ll try a few of your tips hoping it would work. Thanks.
I guess, blog search can not give you the kind of traffic you want. I use it mainly to find related blogs in a niche.
How can google go through gmail like this, isn’t that invasion of your privacy in some way? I would not want them sifting through my gmail emails… that’s personal stuff.
In my experience.
Blog search will help you increase traffic. If your content is fresh, it’ll show in the top results of blog search. Not sure whether it may help SERPs too.
Hmmm.
Read every word with great attention.
I suppose it’s good news in a way because it means that spammers don’t get a look in. I say this because all the things you mentioned i.e high quality blogrolls, number of feed subscribers, etc seem to depend on having quality content on your blogs
Bad news is there are loads of blogs with quality content in any niche so how does one raise their heads above the crowd