Weekend Project: Correct Content Mistakes that are Damaging Your SEO

Posted by Guest Blogger on 22nd of April 2012 in Search Engine Optimization

This guest post is by Sophie Lee of IBS Tales.

In February 2011 my website lost 50% of its traffic overnight, and a further 20% disappeared two months later. I was a victim of Google’s infamous Panda update, and like many other webmasters, my first reaction was to assume that Google had messed up—my site contains nothing but high quality, deathless prose, and I’m sure yours does too.

As time went on, though, I began to realize that my site had been penalized because it deserved to be. I hadn’t deliberately set out to produce thin content, or put duplicate URLs in the index, or make other amateur SEO mistakes, but that’s what I had been doing, regardless of my good intentions.

I set about fixing aspects of my site that should never have been broken in the first place, and one year on, I believe that my site has markedly improved. I need to be honest and say that I haven’t recovered from Panda, and so I can’t promise that this article will help you recover your rankings if you’re a fellow Panda victim.

However, I can tell you that Panda has been a massive wake-up call for me, and opened my eyes to some horrible mistakes that I was making as a webmaster. Are you making the same mistakes? Are you sure?

Mistake 1: Thin or shallow content

Panda quickly became known as the update that targeted thin or shallow content. I checked my site and found that around 10% of my pages had fewer than 100 words on them. Now, word count alone may not mean a huge amount, but what, exactly, can you say in fewer than 100 words? I had intended to develop these pages as I went along, but I’d never got round to it. They had to go, so I removed them completely and left them to 404.

I also looked at pages that might be useful to my visitors or to me, but could easily be flagged as thin content by an algorithm. For example, I had a page named blank.htm that I used as a template page. It was, of course, blank, and it shouldn’t have been on the server. I had an entire page that showed my search box and nothing else. Another page showed my mailing list sign-up box and nothing else. If I worked at Google, I’d have penalized these pages too.

Mistake 2: Duplicate URLs and pesky parameters

One issue that I had neglected almost completely was the way in which Google was indexing my content. Panda woke me up. A site: search for my domain on Google came up with over 800 indexed URLs. I had roughly 400 pages of content on my site, so what was going on?

Firstly, for reasons lost in the mists of time, I had used dropdown lists in some of my navigation links. These links were being indexed by Google with [?menu] parameters in the URLs, resulting in duplicate URLs for a whole bunch of pages. I replaced the dropdowns with simple [a href] links and put canonical tags on all of my pages to indicate that I wanted the plain URLs with no [menu] parameter to be the “correct” URLs.
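If you haven’t used canonical tags before, the markup is just a single link element in the head of each page (the URL below is a placeholder rather than my actual site):

<link rel="canonical" href="http://www.example.com/thispage.htm" />

With that in place, Google treats thispage.htm?menu=1 and any other parameter variation as copies of the plain URL rather than as separate pages.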

I also realized that I had the syntax [Disallow: /*?] in my robots.txt file, put there because it’s part of the robots.txt file that WordPress recommends in its Codex. This command meant that Google couldn’t see the content on any page with a question mark in the URL, and that meant that it couldn’t see the new canonical settings in any of the duplicate URLs. I removed that line from my robots.txt file, and a couple of months later, the duplicate URLs had disappeared from the index.
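For reference, the relevant part of the robots.txt file looked like this; the wildcard rule tells crawlers to skip any URL that contains a question mark:

User-agent: *
Disallow: /*?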

Secondly, my WordPress blog was producing duplicate content on category, tag, and monthly archive pages. Previously, I had believed the Google guidelines that said you shouldn’t worry about duplicate content that is legitimate: “If your site suffers from duplicate content issues … we do a good job of choosing a version of the content to show in our search results.”

However, the prevailing view of the SEO blogs I read was that noindexing these duplicate pages was the best way forward, because that would leave no room for doubt as to which URLs should be returned in searches.

I found that the Meta Robots plugin from Yoast enabled me to easily noindex all of the dupes, and they were gone from the index in a month or so. I did find that some URLs tended to get stuck in the index, presumably because they were simply crawled less often, and in those cases I used Webmaster Tools to get the URLs crawled more quickly.
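If you would rather add the tag by hand than install a plugin, what ends up in the head of each archive page is a standard robots meta tag along these lines:

<meta name="robots" content="noindex,follow" />

The noindex part keeps the archive page itself out of the search results, while follow lets Google carry on crawling through its links to the individual posts.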

If I found a URL that just wasn’t shifting, I used “fetch as googlebot” to fetch the URL and then, once it had been fetched successfully, clicked on “submit to index.” This tells Google that the page has changed and needs crawling again, and it got the stubborn URLs recrawled and dropped from the index within a few days, on average.

Mistake 3: Not using breadcrumb navigation

Almost every site I visit these days uses breadcrumbs—those links at the top of the page that say “Home > Cameras > Nikon cameras” or similar, to let you know at a glance where you are on the site.

They stop your site visitors from getting lost, they help PageRank to flow, and they look good. I should have added them years ago.
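If your theme or CMS won’t generate them for you, the markup can be as simple as a short row of links near the top of the template (the URLs here are placeholders):

<p class="breadcrumbs">
<a href="/">Home</a> &gt; <a href="/cameras/">Cameras</a> &gt; Nikon cameras
</p>

The current page sits at the end as plain text, and every level above it links back up the hierarchy.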

Mistake 4: Not displaying social buttons

I know, I know—you can’t believe I didn’t have social buttons coming out of my ears already. I just don’t like the fact that I have to register with Twitter and Facebook and Google+ to run my own website. But I do. So I have.

Mistake 5: Ignoring blog speed and server location

I got a shock when a search at whois.domaintools.com told me that my server was in Canada. I checked with my host and they said that all their servers were in Canada, which I had been completely unaware of—I had blindly assumed that they were all in the USA.

I won’t make that mistake again. It may not make a huge difference to rankings, but Matt Cutts has confirmed that server location is used as a signal by Google, so it seems crazy to host your site anywhere other than the main country you’re targeting.

I switched from the dirt-cheap host I had been with to a HostGator business package. I stuck with a shared server, although I did ask for a dedicated IP address to isolate my site from any potentially spammy neighbors.

I also took a look at the speed of my site using tools like webpagetest.org. The tests showed that although my site was fairly quick, I was missing some easy gains, the most obvious being that some of my images were 40KB or 50KB when they could easily be compressed to 10KB. I also turned on mod_deflate/mod_gzip in Apache, which sounds impressively technical but involved checking one box under the Optimize Website section in the HostGator cPanel. That setting meant that all my content would be compressed before it was sent to a browser.
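If your host doesn’t give you a tick-box for compression, the same effect can usually be achieved with a few lines in your .htaccess file, assuming mod_deflate is available on the server:

<IfModule mod_deflate.c>
# Compress text-based responses (HTML, CSS, JavaScript) before they leave the server
AddOutputFilterByType DEFLATE text/html text/css text/plain application/javascript
</IfModule>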

Finally, I made sure I was using asynchronous code for those dastardly social media buttons, making them load in the background rather than holding up the display of my main content.
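Each network publishes its own official asynchronous snippet, but they all follow the same basic pattern: create the script element from JavaScript and mark it async, so the browser keeps rendering the page while the widget downloads. A generic sketch, with a placeholder script URL, looks like this:

<script>
(function() {
// Load the (placeholder) widget script in the background
var js = document.createElement('script');
js.src = 'http://platform.example.com/widgets.js';
js.async = true;
document.getElementsByTagName('head')[0].appendChild(js);
})();
</script>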

Mistake 6: Misusing h1 headings

I found that, for some inexplicable reason, I had set up many of my pages with two h1 tags—one in the main content, and one in the left-hand navigation bar. I got rid of the left-side h1s so that the main heading for each page reflected the main subject for that page.

Meanwhile, I realized that my blog theme put the overall title of my blog into h1 tags rather than the titles of the individual blog posts themselves, so every single page on my blog had the same h1 title. I switched to a different blog theme (Coraline) and the problem was solved.
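What I was aiming for in both cases is one h1 per page that matches that particular page’s content, with everything else demoted to lower-level headings. Roughly this shape, with placeholder titles:

<h1>Title of this particular post or page</h1>
...
<h2>Navigation</h2> <!-- sidebar heading, no longer an h1 -->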

Mistake 7: Ignoring Google authorship

I had been seeing little headshots in my Google results for months, often clicking on them because they stood out, without ever asking myself why they were there or whether I could get them for my content too.

What I know now is that they’re called rich snippets, they’re part of Google’s authorship program, and you need to link your site to a Google+ profile with special markup code to get one. I found the Google instructions for this process confusing, but this post from Yoast was much clearer.
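In practice the markup boils down to a link from your page to your Google+ profile with rel="author" attached (the profile ID below is a placeholder), plus a matching link back to your site from the “Contributor to” section of that profile:

<a rel="author" href="https://plus.google.com/112345678901234567890">Find me on Google+</a>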

I then used the Google rich snippets tool to check that I had set things up correctly, and filled in this form to let Google know that I was interested in using rich snippets for my site.

Once I had submitted the form, it took around a week for my photo to start showing up in the SERPs.

Mistake 8: Running sister sites

I was actually running two websites on the same topic when Panda hit, and the update crushed them both. The main reason that I had chosen to run two websites was to protect myself against a drop in search rankings. That obviously worked out great.

I began to wonder whether Google frowned upon two domains on the same topic. Obviously, ten domains on the same topic, all targeting the same keywords, would be spam … so could two domains be spam too, or at the very least ill-advised?

The more I thought about it, the more I realized that splitting my website into two had been a mistake. Surely one brandable, strong website was better than two weaker sites? One site with 1000 backlinks was going to be more powerful than two sites with 500 each. The consensus within the SEO world was that running multiple domains on the same topic was simply a bad idea, Panda or no Panda.

I decided to merge the two sites together, and so I had to choose which domain to keep. One domain was much newer than the other, contained a couple of dashes separating exact match keywords, and had a really, really, really silly extension. The other domain was at least two years older, had more backlinks, was a dotcom, had no dashes, and was brandable. It didn’t take a genius to figure out which domain I should be using.

I 301-redirected the newer domain to the old one on a page-by-page basis, so www.newsite.com/thispage.htm redirected to www.oldsite.com/thispage.htm. This is the code I used for this, placed in the .htaccess file of the new site:

RewriteEngine on
# Send every request on the new domain to the same path on the old domain
RewriteRule (.*) http://www.oldsite.com/$1 [R=301,L]

I checked that the redirects were working using the Webmaster Tools “fetch as googlebot” feature. It took around a month for all of the main pages of the newer site to be removed from Google’s index, and about another month for the entire site to go. I then went on a hunt for anyone who’d linked to my newer domain, finding backlinks through the link: mysite.com operator at Blekko and opensiteexplorer.org, and asked the site owners to link to the older domain instead.

Now what?

If these changes haven’t returned my blog to its old position in the SERPs after a year, what’s the point? Why don’t I just give up?

The point is that I’m proud of my website. It’s suffering right now, but I believe in it. And that’s the greatest advantage that a webmaster can ever have. If you believe in your website, you should fight for it. Sooner or later, it will get what it deserves.

Sophie Lee runs the irritable bowel syndrome support site IBS Tales and is the author of Sophie’s Story: My 20-Year Battle with Irritable Bowel Syndrome.
