Today your task in the 31 Day Project is something that most bloggers who have been blogging for a while could probably benefit from doing – go on a dead link hunt.
Blogging is built on the ‘link’. One blog links to another blog, which links to another, which comments on another. This is a wonderful thing – but what happens when one of the blogs that you’re linking to is retired, is deleted, changes its link structure, moves etc? The link becomes a dead one (a problem also known as Link Rot) and can cost your blog on two fronts:
Readability – clicking a dead link lands your readers on an error page or redirects them to irrelevant content instead of the page they expected. This can frustrate readers and give the impression that your blog is old and/or out of touch.
SEO – I’m not sure of the technicalities of it or what the latest research shows but from what I can tell a dead link is not looked upon favorably by search engines and you run the risk of penalties.
So how do you detect dead links on your blog?
The most obvious ‘solution’ is to surf every page on your blog and manually check all the links. This is something that might be achievable on a new blog – but on older blogs with hundreds or thousands of posts it’s just not feasible.
There are many link checking tools available but to be honest I’m yet to find one that I’m really happy with. I do hear that Xenu’s Link Sleuth is a good option for those using Microsoft Windows 95/98/ME/NT/2000/XP. I’ve also used the free version of Dead-Links.com (which only checks to a reasonably shallow depth) – but I’d be keen to hear from readers on their suggestions of other options.
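For readers comfortable with a little scripting, the idea behind any link checker can be sketched in a few lines of Python. This is only a rough illustration and not how Xenu or Dead-Links.com work: `fetch_status`, `classify_status` and `check_links` are names made up for this example, and the "dead"/"unreachable" labels are assumptions of the sketch.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the final HTTP status code for url, or None if it cannot be reached."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # e.g. 404 or 410
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout...


def classify_status(status):
    """Turn a status code into a rough verdict (the labels are this sketch's own)."""
    if status is None:
        return "unreachable"
    if status >= 400:
        return "dead"
    return "ok"


def check_links(urls, fetch=fetch_status):
    """Map each URL to a verdict; fetch can be swapped out, e.g. for a fake in tests."""
    return {url: classify_status(fetch(url)) for url in urls}
```

Calling `check_links(["http://example.com/"])` would make a real request. Some servers reject HEAD requests, so anything this sketch reports as "dead" is worth a manual click before you remove the link.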
I never thought about this. Good post!
-Terra
http://www.BetterForBusiness.com
So true, especially for bloggers that have been around for a while. Yesterday I was reading Lee Dodd’s blog and I was looking in the archives. I think he moved his blog from another domain and kept the links, because he had a ton of links being mentioned in articles that were going to 404 pages.
It’s really annoying to see people linking to an article that interests you and then hitting a brick wall because the link changed.
What about Google Webmaster Tools? I think you can see broken links in the Unreachable URLs category, or am I wrong?
For those of you who do 301 redirects such as http://michaelmartine.com to http://www.michaelmartine.com, you may find that the link checker reports the link to your redirected URL as broken.
The World Wide Web Consortium, which is the governing standards organization for HTML, etc., has a link checker as well that’s a little bit more nerdy.
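If you script your own checks, one way to avoid that kind of false alarm is to compare the address you asked for with the address you ended up at. The sketch below is an assumption about how one might do it, not a feature of any particular link checker; `is_www_only_redirect` is a made-up name.

```python
from urllib.parse import urlsplit


def _bare_host(url):
    """Lower-case hostname with any leading 'www.' removed."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_www_only_redirect(requested_url, final_url):
    """True when the two URLs differ only by a 'www.' prefix on the host."""
    requested, final = urlsplit(requested_url), urlsplit(final_url)
    same_path = (requested.path or "/") == (final.path or "/")
    same_query = requested.query == final.query
    return (
        requested_url != final_url
        and _bare_host(requested_url) == _bare_host(final_url)
        and same_path
        and same_query
    )
```

A redirect that passes this test can be treated as healthy; any other redirect still deserves a look, since it may point to unrelated content.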
Hey Darren,
I’ve used Xenu, but the biggest problem I run into is false positives. Especially intra-links.
Still and all, it is a very valuable tool, there is no way I can go through over 3000 links manually.
I wrote a PHP regular expression to load up all the links in about 600 posts on one of my blogs in a list. It’s pretty quick (less than half an hour) to click on them all and make sure they go to the right place.
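The commenter's PHP isn't shown, so here is a rough Python equivalent of the same idea. Note the swap: it uses the standard library's HTML parser rather than a regular expression, and `LinkCollector` and `collect_links` are names invented for this sketch. It assumes you can get each post's HTML as a string.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects absolute http(s) links from <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.startswith(("http://", "https://")):
                self.links.append(value)


def collect_links(posts):
    """Return the unique absolute links found in an iterable of HTML strings, in order."""
    seen = []
    for html in posts:
        parser = LinkCollector()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:
                seen.append(link)
    return seen
```

Printing the result one link per line gives a list to click through by hand.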
Dead links are bad from an SEO point of view but the worst thing is if the site you are linking to has been redirected to a p0rn site or similar.
When I converted to a new layout I changed the URL style as well, but I found that with just a few lines of regex code to check for the old URL style, directing visitors to the updated page was not too hard at all.
And it will pay off so much, especially if you’re already indexed in search engines and such.
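As an illustration of that kind of rule, here is a small Python sketch. Both URL styles are hypothetical (an old `/archives/YYYY/MM/slug.html` pattern mapped to a new `/YYYY/MM/slug/` pattern), and `new_location` is a made-up helper; the commenter's actual patterns aren't given.

```python
import re

# Hypothetical old permalink style: /archives/2007/08/my-post.html
OLD_STYLE = re.compile(r"^/archives/(\d{4})/(\d{2})/([a-z0-9-]+)\.html$")


def new_location(path):
    """Return the new-style path to 301-redirect to, or None if path is not old-style."""
    match = OLD_STYLE.match(path)
    if match is None:
        return None
    year, month, slug = match.groups()
    return f"/{year}/{month}/{slug}/"
```

On a real blog the same pattern would more likely live in the web server's rewrite rules or a redirect plugin; the function form just makes the mapping easy to test before going live.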
Google webmaster tools is amazingly good for this. They will report dead links, 404’s, etc. You can even limit by date, so if you make changes to your site, you can look at dead links in your site that happen only after a certain date. It’s fantastic and incredibly useful.
It’s especially useful if you are making changes to a new design, site structure, etc. It’s very easy to have navigation links point to the wrong url, especially if you use Search Engine Friendly URL’s. Sometimes using SEF URL’s will cause relative url’s to link in the wrong directory. This can be easy to overlook and will cause major usability problems unless you catch them.
There’s a WordPress 404 Notifier plugin which is useful – it sends you an email whenever your blog displays a 404 or page not found message, but lots of comment spam will trigger a 404 so you need to set up some email filters. The advantage of this plugin is that it will find dead links coming from other sites too.
Darren,
I use Dead-Links. Like you said, it only checks to a certain depth. I actually never realized how many links I had.
Any recommendation on frequency as to how often you should check for dead links?
In my six months of blogging, I’ve only run across one “dead-ish” link where I linked to a newspaper article that had been moved into archive. So a 404 doesn’t show up in that situation. Any suggestions for the best way to deal with those?
Nice tool! (I’m using it right now and discovering all the data my blog has).
Quite impressive and very useful: thank you very much for showing it to us today :-)
Regards from Spain :-)
Paquito.
http://paquito4ever.blogspot.com
Darren you write some good posts. What I don’t understand is how John Chow is more popular than you
My question is: What do you do with the links that are dead? The Wayback machine is no good unless the page had been there ten months ago or so. Sure, sometimes you can find a replacement, but more often it’s just gone.
Do you unlink them and leave a note that the link is dead and was removed? Delete the post? (In some cases, the post makes no sense without the link.)
What?
Al – I generally rewrite the post to make it make sense. Sometimes (if the post centered around the link) that means making a note of it – sometimes it simply means rewriting the post slightly and removing the link.
Alex – John and I have very very different approaches and as a result attract quite different readers.
Sheila – I try to go on a dead link hunt every few months – but it’s getting harder and harder as my archives grow. I tend to do a month at a time.
For me, it is easy to do. So far I only have 20 or so links to friends.
But these tools are useful for checking all of my blogs, thanks for the advice.
I too use Google webmaster to check for dead links. But I do not know what to do with the dead links.
I have quite a few dead links because I shifted to a new domain and changed the permalink structure.
I know Google webmaster has a url removal tool, but it can only remove one link at a time.
Anyone had any ideas or plugin to bulk remove these deadlinks?
Hi Darren,
You didn’t install the subscribe to comments plugin? Can’t find the check box…
For those on linux, klinkstatus seems to be a good option to check for broken links.
http://klinkstatus.kdewebdev.org/
Alex,
John’s blog is popular cos it feeds on hype and little substance. Darren’s is all substance and no hype – typical aussie.
If you have Dreamweaver you can select the Results tab, then click on “Link Checker”. It will run through and tell you about broken links, orphaned files and external links. Pretty slick feature.
Hi Darren I have been a regular problogger reader… and have gathered loads of my knowledge from here… you have been doing a great job… thx for pointing out the dead link concept… m just getting started with blogging and will certainly keep track of your postings… great going :)
Would recommend Xenu, used it at work several times and always works…
blogmunch – took the subscribe to comments button off a while ago as I had some bugs with it. Hopefully will bring it back soon!
I use Google Webmaster Tools. But you can’t check your links “on the fly” with it. So thanks for the advice! :)
As someone mentioned already, W3C provides a link checker. I’ve also used the one at http://www.iwebtool.com/broken_link_checker, which works on a page-by-page basis.
The beta version of wget 1.11 can also be used for this, with the --spider command-line option, but it’s not so user-friendly, being a command-line tool.
This is interesting. I used to author an application for this (called Java ALiVe!), but abandoned it a few years ago. I just started thinking about bringing it back yesterday.
Maybe now is a good time… :-)
Take care with ads and affiliate programmes when checking links! Google is not amused if it gets false hits like these from your link checker, so you NEED to exclude them somehow.
Shameless plug: http://blog.oncode.info/2007/08/25/tote-links-in-einem-blog-oder-einer-website-finden/ describes how to do that with the commandline app “linkchecker”.
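For anyone scripting their own checks, the exclusion can be as simple as filtering the list of URLs before any request is made. A minimal Python sketch follows; the host names in `EXCLUDED_HOSTS` are placeholders, and `should_check` and `filter_links` are made-up helper names, not options of the “linkchecker” app.

```python
from urllib.parse import urlsplit

# Hypothetical list: replace with the ad and affiliate hosts your own blog uses.
EXCLUDED_HOSTS = {"ads.example.com", "affiliate.example.net"}


def should_check(url, excluded_hosts=EXCLUDED_HOSTS):
    """False for URLs on an excluded host or any of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return not any(host == h or host.endswith("." + h) for h in excluded_hosts)


def filter_links(urls, excluded_hosts=EXCLUDED_HOSTS):
    """Keep only the URLs that are safe to request automatically."""
    return [url for url in urls if should_check(url, excluded_hosts)]
```

Matching on the host name rather than the whole URL means tracking parameters and click IDs don't need to be listed one by one.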
Last week or so, I got to know about Xenu from John of V7N. It’s an excellent tool. It even gives a report on missed images.
Google ads won’t be hit by them unless it also pulls in javascript, so that is fairly safe.
Please pardon the plug, but WebLight http://www.illumit.com/weblight is one of the few link checkers that is practical to use on large or complicated sites.
It’s not free, but it also identifies non-standard markup. It can save you a lot of time for $25.
There is an extraordinary Plugin for WordPress! Just amazing!
http://w-shadow.com/blog/2007/08/05/broken-link-checker-for-wordpress/
@Better Blogging with Michael Martine: Thank you! This was actually what I was looking for when I arrived here via Google.
-=-=-=-
Another good link checker to try is KLinkStatus if you are on Linux.
Very informative post; I had never thought of this. Thanks for sharing it with us.