This guest post is by Mark Collier of www.DropMining.com.
For the last year I’ve been spending my free time after school conducting Red Bull-fuelled coding sessions in the pursuit of one single goal: to bring more science to the SEO industry.
As an industry we are still only taking our first few baby steps into the world of maths, stats, and data-driven decisions. For a seemingly data-dependent industry, SEO professionals are influenced to a surprising degree by rumour, anecdotal evidence, and unscientific tests.
SEOMoz were the true visionaries in conducting correlation studies to analyse search engine algorithms. With my project, I hoped to take it that little bit further and analyse more factors on a larger dataset.
After analysing the top 100 search results for over 10,000 keywords I had gathered 180,000,000 (one hundred and eighty million) data points on 186 potential factors in the Google algorithm. This has led to the most comprehensive published research into Google’s algorithm, and some pretty incredible findings.
With so much data and so many findings it would be impossible to go through them all here on Problogger.net, so in this post, I have hand-picked all the most important findings for bloggers.
Background: correlations explained
Correlations are a useful but imperfect indicator of the relationship between two pieces of data, in this case search engine ranking and the factor being tested. They range from -1 to 1. A negative number means the factor correlates with a negative impact on ranking, and a positive number means ranking and the magnitude of the factor move in the same direction.
How close the correlation is to -1 or 1 is an indicator of its importance/strength. A 0.7 correlation is very strong, whereas a 0.05 correlation implies almost no relationship between the two variables.
For example, correlation studies have been used to link income to education. As we all know the more education we have, on average the more income we earn, but where did that statistic come from and why do most people believe it?
Correlations are used to measure relationships between two pieces of data, in this case amount of education and income, and to figure out how strong that relationship is, by putting numbers behind the logic.
Here’s a little example (these are made up figures [credit]):
In this sample, the correlation is +0.79. Just from looking at the data, you can see that the more time the study’s participants spent in education, the more income they earned.
This is verified by the correlation which is a positive number (when education increases, income increases) and is very close to 1.0. This demonstrates that the relationship between education and income is a strong one.
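To make the calculation concrete, here is a minimal Python sketch of a Pearson correlation coefficient. The post does not say which correlation formula the study used, so Pearson is an assumption. The education and income figures are hypothetical, invented only for this illustration (they are not the figures from the sample above, so the result will differ from 0.79), and the function name pearson is just a label chosen here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables move together, and how much each varies alone.
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return co_movement / sqrt(spread_x * spread_y)


# Hypothetical figures: years in education and annual income (in thousands).
years_in_education = [10, 12, 12, 14, 16, 16, 18, 20]
income_thousands = [22, 30, 25, 34, 48, 40, 45, 60]

r = pearson(years_in_education, income_thousands)
print(f"correlation: {r:+.2f}")
```

A result near +1 means the two columns rise together, a result near -1 means one falls as the other rises, and a result near 0 means there is little linear relationship between them.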
My research findings
Now that you understand correlations, let’s look at what my research revealed about SEO.
Finding 1. SEO plugins are not the answer
I’m sure you are aware there are a whole host of WordPress SEO plugins available for your blog. These plugins tend to deal primarily with on-page SEO, for example, placing keywords in the URL, title, meta description, etc.
While some plugins deal with the indexing side of SEO, which may provide some small SEO benefit, the majority tend to focus their efforts on these on-page factors.
The truth is that, contrary to all the rhetoric of SEOs and industry “experts” over the last ten years, according to my research these simplistic factors have almost no bearing on a page’s rank in Google.
That’s not to say that Google hasn’t developed more advanced algorithms to analyse content on a page, but certainly the traditional factors such as keywords being in title tags, h1/h2/h3 tags, etc. can be ignored when writing blog posts.
The main learning here for bloggers is that instead of worrying about search engines when you write your next blog post, you should focus 100% on the user.
Here’s the proof. Each ranking factor below is correlated to search engine ranking.
Check out all these articles in Darren’s resource on how to write a great post. Guess what? None of them talk about how users love a title tag stuffed with keywords or heading tags that are meaningless space-fillers designed solely for search engine spiders.
Finding 2. You gotta love link building
Links were the only set of factors in which every signal tested showed a significant positive correlation.
Without a doubt, the single most important factor in gaining search engine ranking is building links to your blog.
Page Authority, an SEOMoz metric that models the PageRank for a given URL, was by far the most influential factor in the study. What this means is that it is not only important to build links to your homepage, but also to the posts you want to rank well in Google.
When was the last time you wrote a guest post or created a viral infographic? How much time do you spend doing keyword research or doing repetitive, mundane tasks like manually optimizing posts for keyword density?
If there is one piece of action everybody who reads this post should take, it is without a doubt to create a link-building strategy for your blog.
Finding 3. Domains still matter
There’s been a lot of scaremongering about exact match domains of late, but the fact is that Google still highly values EMDs that have high quality content on them.
That’s the key. If you have a blog and you plan to publish great content that users will love, then an EMD can be a massive help in getting you to #1 for that big keyword.
After seeing this significant positive correlation between EMDs and ranking #1 in Google, I looked a little deeper at the domain name market and learnt that there were over 200,000 domains expiring every day.
Bloggers can catch dropping domains before they go back onto the market and create incredible sites with them, which will have a natural advantage over the competition. Matt Green wrote a great post about this tactic right here on Problogger.net.
Research in summary
I think the key learnings from all this data for bloggers can be summarised in one sentence: “write for your audience, not the search engines, build links to your great content, and develop your blog on a great domain with incredible domain authority.”
Do you focus on SEO? What works to push your site up the search rankings? Share your thoughts on my research in the comments.
This guest post is by Mark Collier from www.DropMining.com and www.TheOpenAlgorithm.com
That’s what I was thinking. I prefer SEOMoz; besides, they get the info to you faster, more accurately, and with actual examples that have been executed and proven to work in search engine rankings.
The quality content still matters a lot and one should forget about SEO and just keep on providing unique content that can grab people’s attention.
I do agree with you on exact match domain names. Google won’t hit EMD websites that have unique, quality content on them. One example is BloggingTips(.)com
I visit a lot of sites, but the articles on this site are very nice because the content is very good and the site is Google-friendly. Thanks for the good articles.
So you are saying that high-quality content is still the essential part of SEO?
And I really love link building.
Thanks for the incredible insights into the world of SEO. As a newbie blogger trying to attract and build membership in my Community of heart disease patients and caregivers, I’m very interested in the topic.
Write for readers, not for search engines. That’s a good idea. Real SEO.
Yes, content quality is the king.
Thanks for clarifying about Exact Match Domains.
In my mind I was quite sure Google was fine with them, but those that added very poor quality content to them got affected with the update geared towards them.
Make sure to add very good content to that domain.
Nice analysis of SEO factors!
Link building is still one of those tough topics to discuss. It can be the most abused part of SEO.
Otherwise, thanks for the excellent article Mark!
As an SEO myself, I think this post makes some good points. However, I think there are a couple of areas where it’s important for people to understand more fully some of the context to what is being said, and to know that they have to consider the bigger picture.
Primarily, link building. Whilst it’s true that a site with good authority is an attractive place to have a link, that is far from the only consideration. Given that Page Rank itself is based on links, a site which has plenty of them will (mostly) fare better in that respect. There are plenty of sites out there which have great authority but which I would never want to get a link from – because it’s fairly easy to build artificially.
Some of the other types of things to look at when considering whether a link is a good idea:
1) Relevance – is the content of the site relevant to yours? It doesn’t have to be exact but some level of connection is important. For example, a website about weddings and one about cake making can be relevant to each other. Cake making and body building? Probably not so much.
2) Content – look at the content already being placed on that site. Is it good quality or does it look as though it has been written with a keyword in mind, for example? Does it link out to other sites which are not relevant (see above)? Would you be happy for your doubtless excellent content to be associated with it?
3) Social – is the site also active on social media, and does it have a level of interaction on the site from users (real ones, not spam comments with links in)? Social itself is probably not that big of a factor directly, but it is a big signal to search engines – if there’s a void where social media should be, the content is probably either not very popular or – much worse – not intended for people to see and share. Over time it will only become more important as they better figure out how to measure this stuff accurately.
4) Location – this can be tricky to measure, but location of the site can play a part. Is the domain name clearly from a foreign country? E.g., a .com or .net can be ambiguous but .ru or .cn, to give a couple of obvious examples, are not.
In isolation, the above might not cause a problem but remember that Google assesses all of the links to a site and will notice the patterns and footprints. As Samuel says, link building has been hugely abused by SEOs in the past, and Google is vigilant in cracking down on anything it sees as “spammy” as a result.
There are no hard and fast rules as to how many links it takes to get great rankings or where they should appear – everything in SEO is relative.
If in doubt, read Google’s Webmaster Guidelines and adopt a safety first attitude – focus on building relationships with others in your niche and find ways to accumulate those links without having to build them yourself.
With any SEO advice, there are two things to remember. First, correlation is not causality. Just because something appears to indicate a relationship might not mean there definitely is one. Measuring 186 factors is exhaustive work but probably only scratches the surface of the total number of factors Google uses.
Second, just because you were given great advice yesterday might not mean it is still good advice today. Google is constantly tinkering with its algorithms and rankings constantly fluctuate while they test user behaviour to find out which results they find most useful on a given keyword. Then there are the regular updates, like Panda, which will continue to evolve in the future. The same applies to the EMD update mentioned in the post.
SEO can be complicated at the best of times but it is made harder by those who are determined to take shortcuts or try to game the rankings. As with most things, if you’ve been told that Technique X is guaranteed to propel your site to the top of the rankings, it’s likely too good to be true.
All of this means that what any blogger or SEO should be doing is the following:
1) Figure out what your audience want/need
2) Build it
3) Help them to find it
This was new but really great info for a newbie like me. This is exactly the time when all blog posts should be written for users, not for search engines. Kudos.
I have the sinking feeling that blogs are going the way of cars. Used to be any old person could pop out their own broken timing chain and replace it themselves. Now it’s a timing belt, sometimes hidden under a major portion of the car’s guts, and only a licensed, expensive mechanic can fix it. Only blogs are worse. You have to build the car yourself.
Some of us are just writers, not mechanics and certainly not manufacturers. We just want to drive and we want to bring our friends along.
Be cautious with your correlation charts! Every graphic in this article uses a different X-axis scale, which can lead you to inconsistent conclusions. It also looks like the X-axis on each chart is non-linear, which can fool the senses too.
“Page Authority as measured by SEOmoz” just shows they have good factors that they tested built into their algorithm (which will be my next look) – or that some method of page rank from SEOmoz might have been used in that metric (and the two measures are naturally correlated because a factor in one determines the other metric – “internally correlated”).
That inbound links take all the top spots is expected, since that is what Google advertises, and links are generally ‘difficult’ to obtain in the short term based on written content alone. A blog is more important than a website because all the comments inside an article can offer a lot of inbound and outbound links.
I suggest to anyone using correlation that they look at any key data finding (such as “# of IPs linking” vs “Rank”) set in an X-Y dot plot and then draw a trend line through that data. If the data spread like a snowball smashed against a cement wall, you know not to trust the correlation factor even if it’s a number close to “+/-1”. If the dots line up along/near the trend, you can start to believe it. A 0.75 or 0.80 or better correlation starts to visually appear predictable; below that you see snowballs. It’s a quick general view but it protects your analysis.
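The trend-line check described above can be approximated numerically when no plotting tool is at hand. The following is a rough Python sketch, not the method used in the study: it fits an ordinary least-squares line and reports how far each point sits from it. The data are hypothetical values invented for illustration, and the names trend_line and residuals are labels chosen here.

```python
def trend_line(xs, ys):
    """Least-squares line y = slope * x + intercept through the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    if spread_x == 0:
        raise ValueError("all x values are identical; no line can be fitted")
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = co_movement / spread_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(xs, ys, slope, intercept):
    """Vertical distance of each point from the trend line."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Hypothetical data: number of linking IPs and position in the results.
linking_ips = [5, 12, 20, 33, 41, 58, 70, 90]
rank = [48, 40, 37, 25, 27, 14, 11, 3]

slope, intercept = trend_line(linking_ips, rank)
scatter = residuals(linking_ips, rank, slope, intercept)

print(f"trend line: rank = {slope:.2f} * ips + {intercept:.2f}")
print(f"largest distance from the line: {max(abs(e) for e in scatter):.2f}")
```

Small distances relative to the range of the data suggest the dots hug the line. Large ones are the numeric version of the smashed snowball, and a real X-Y plot is still the better check.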
I love the use of real math in data analysis so please keep it up!
(I worked for some years as a six sigma black belt in industrial corporations using statistics to solve production and quality problems).
Link building for me is free and easy to do. Being a hobbyist photographer and blogger, I use spare time to discover and learn new things about the subjects on the web, and while doing so I try to leave comments (links) behind!
Nice post, applicable to all forms of bloggers!
To me, on-page SEO is not about keyword stuffing, but clearly labeling everything and making it easily accessible to the search engines. Obviously the content should also be well written and completely original.
Users will find your content if you follow these basic rules.
GOOD ADVICE, THANK YOU VERY MUCH
A very good analytical post, Mark.
As a statistician, my contribution will be that anyone reading this should also note the following data health check:
Correlation (in isolation) is not conclusive and may not imply causation even when the ‘proof’ is statistically significant.
Correlation can be coincidental, which may lead to the fallacy of ‘Post hoc ergo propter hoc’ (Latin), which means “after this, therefore because of this”.
In other words, concluding that ‘X’ caused ‘Y’ merely because ‘X’ occurred before ‘Y’ is a logical fallacy.
This is not to say correlation should be ignored. For example a third factor (W) may be at play where search engine ranking is concerned. The third factor may have caused the search engines to take note of X and therefore Y. The question is, ‘what is that factor?’
So correlation may be taken as a ‘pointer’ or ‘hint’ and used to carry out further research before a conclusion can be arrived at.
So for me, these are very useful hints, thank you Mark.
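The third-factor point above can be illustrated with a small simulation. This is a sketch under stated assumptions, not an analysis of the study’s data: W, X and Y are synthetic random numbers, X and Y are each built from W plus independent noise, and the helper names pearson and partial_correlation are chosen here for illustration. X and Y never influence each other, yet they correlate until W is accounted for.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_movement / sqrt(spread_x * spread_y)


def partial_correlation(xs, ys, ws):
    """Correlation of xs and ys after removing the linear effect of ws."""
    r_xy = pearson(xs, ys)
    r_xw = pearson(xs, ws)
    r_yw = pearson(ys, ws)
    return (r_xy - r_xw * r_yw) / sqrt((1 - r_xw ** 2) * (1 - r_yw ** 2))


rng = random.Random(42)  # fixed seed so the run is repeatable

# Synthetic data: W drives both X and Y; X and Y never touch each other.
w = [rng.gauss(0, 1) for _ in range(2000)]
x = [wi + rng.gauss(0, 1) for wi in w]
y = [wi + rng.gauss(0, 1) for wi in w]

r_xy = pearson(x, y)
r_xy_given_w = partial_correlation(x, y, w)

print(f"X vs Y:                   {r_xy:+.2f}")
print(f"X vs Y, controlling for W: {r_xy_given_w:+.2f}")
```

The first figure comes out clearly positive while the second sits close to zero, which is the pattern a hidden common cause produces. In real ranking data the third factor is usually unknown, so this check is only possible once a candidate for W has been measured.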
Thanks for paying attention to one of our posts and referring to it. With all due respect to your thought process, I beg to differ with your views on EMDs. It’s not only Google that puts your whole site at risk; it’s the owner himself who could potentially kill his own site by selecting an EMD (exact match domain).
Let me explain. In most cases, bloggers choose a domain that directly relates to a product or company. There are plenty of such sites already available on the net. The whole exercise is a result of desperation, an attempt to attract Google’s robots to crawl the site and rate it higher.
However, the sad side of this selection is the ‘legal implication’ one should understand. There are many case studies on the net that clearly state how big companies suddenly kill such sites. Since most EMDs contain a company name or brand name, they are always under threat of a copyright infringement or violation claim that carries the risk of losing your domain.
Does anyone really want to put his heart and soul into a site – by selecting an EMD – which is always risking its existence?
Great insights on what SEO practitioners nowadays should focus on. I don’t believe that keyword stuffing, using tools to get links, etc. still works (maybe in a very non-competitive industry yes? I dunno) but great content will ALWAYS work.
Some other factors I look for in a properly optimized site:
1. Outreach to possible linkers via email
2. Social Signals
3. Internal linking (the proper way of doing content to content on-site)
4. Well-made content that focuses on what the people who visit the site need
Trust factor is also good to note, since Google authorship can be a remarkable “tool” to get more links via content marketing.
Great article with detailed analytical data. Thanks for such an informative post.
Quality and unique content still ranks high. Mashable and Problogger are good examples.
AWESOME SAUCE!! I haven’t paid any attention to correlation before. However, because of the article, you made me realize the importance of correlation on the SEO side. Thanks for sharing.
Great post – While SEO is mechanical in nature it’s obvious that along with Google it’s getting much more human and I think your point about link building relates back to the importance of relationship building. Much like it – link building is a sign of a relationship, trust and a personal recommendation and validity, which is the number one purchasing influencer and why relationships are top of mind for businesses and brands to establish. I was wondering if you came across any information about blog directories or blog commenting in your research and how they relate to SEO?
No I didn’t have access to data on the type or source of links and even if I had a list of links (even a sample list) for each result, the computing power required to determine link sources is fairly massive.
I hate recommending advice not based on data, but here surely logic and some clear thinking are likely to prevail. Do you think a link from a blog directory or a comment is a serious and editorially relevant recommendation? Probably not. Is Google’s algorithm that specific? Who knows.
Very interesting article about SEO. Nowadays business marketing via the internet is booming. For this reason SEO is more powerful.