Post Penguin – bring back the lost link love


In the post-Penguin world, it seems Google is going to actively penalise sites with poor backlink profiles. It’s more important than ever to carefully cultivate and curate your backlinks. Raven Tools stepped up to the plate with some great upgrades to the Link Manager built into their SEO toolset, but a major source of lost link juice comes from incoming links to content you have moved or deleted. I’m going to show you an easy way to reclaim these potentially powerful backlinks with just Google Webmaster Tools and Google Docs.

The problem with a mature site is that over time, URLs change and pages come and go. Unfortunately, when a URL changes you lose the value of incoming links to that page, and with it some of the value they passed to your wider domain. On small sites you can probably keep on top of this. For larger sites that have been active for some time, this is a tricky task. You may well have hundreds or even thousands of great incoming links that now point to missing pages!

But all is not lost! You just need to know what URL these great incoming links are pointing to, and set up a 301 redirect to a live page on your site that is as close to the now missing page as possible.


Big sites are going to have hundreds of such “lost” links, so we need to automate the process. Here’s how:

Google Webmaster Tools provides the list of lost links that we are going to redirect. You get this from “Health > Crawl Errors > Not Found”, which will present you with a possibly formidable list of 404s.


For each 404, it will actually tell you (some of) the external sites that link to it, so you can get an idea of the lost link love.


Download the entire list and import it into a Google Docs spreadsheet. Next, for each 404 URL, we need to create a 301 redirect which will be added to our server’s .htaccess file like this:

Redirect 301 /old_url.html /new_url.html

Doing hundreds of these would be hard work, but Google Docs can help with the heavy lifting. We just need a way to identify a good new URL to redirect each old URL to. Luckily many sites have just this functionality built in, via a custom 404 page or search function that suggests similar pages. The one I use is created by sh404SEF, an extension for Joomla, but many content management systems can do this out of the box or via a plugin or extension.


So what we can do is get Google Docs to fetch each of the old URLs, parse the resulting 404 page to pull out the top suggested new URL, and then create a 301 redirect we can copy and paste straight into our .htaccess file.



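Column B contains a formula along these lines, with the old URL in column A:

=ImportXML(A101,"//ul[@class='results']//li[1]/a/@href")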


This fetches the old URL with the ImportXML function, which returns our custom 404 page. The XPath expression “//ul[@class=’results’]//li[1]/a/@href” gets the link in the first <li> list item from an unordered list <ul> that has the class “results”. This is the list of suggested new URLs. You would modify this for your own situation, to grab the first link from your list. I actually edited the 404 page to add the class=”results” attribute to make it possible to target the correct <ul> unordered list with a simple XPath expression.
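To see what that selector picks out, here is a minimal Python sketch that runs the same query against a hand-written stand-in for a 404 page. The sample markup, the URLs in it and the function name first_suggestion are all made up for illustration. It also assumes well-formed markup, because the standard library's ElementTree only supports a subset of XPath and will not parse sloppy real-world HTML.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a custom 404 page.
SAMPLE_404 = """
<html>
  <body>
    <ul class="menu">
      <li><a href="/home.html">Home</a></li>
    </ul>
    <ul class="results">
      <li><a href="/new_url.html">Best match</a></li>
      <li><a href="/other.html">Second match</a></li>
    </ul>
  </body>
</html>
"""


def first_suggestion(page_source):
    """Return the href of the first link in <ul class="results">, or None."""
    root = ET.fromstring(page_source)
    # ElementTree cannot select attributes with /@href, so select the
    # <a> element and read its href attribute afterwards.
    link = root.find(".//ul[@class='results']//li[1]/a")
    return None if link is None else link.get("href")


print(first_suggestion(SAMPLE_404))  # /new_url.html
```

The navigation list with class "menu" is ignored, which is the reason for adding a dedicated class to the suggestions list in the first place.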

Column C simply sticks the data together to make a perfect 301 redirect:

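=CONCATENATE("redirect 301 ",SUBSTITUTE(A101,"http://www.example.com","")," ",B101)

Here http://www.example.com is a placeholder for your own domain. SUBSTITUTE strips it from the old URL in column A so that only the path is left, which is what the redirect needs.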
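The same string-building step can be sketched outside the spreadsheet. This is only an illustration of what the formula does: the function name redirect_line and the example.com URLs are placeholders, and it uses urlsplit to drop the domain where the spreadsheet removes it with SUBSTITUTE.

```python
from urllib.parse import urlsplit


def redirect_line(old_url, new_url):
    """Build one 'Redirect 301' line from a full old URL and its target."""
    # Keep only the path of the old URL; the domain is dropped,
    # just as the SUBSTITUTE call does in the spreadsheet.
    old_path = urlsplit(old_url).path or "/"
    return f"Redirect 301 {old_path} {new_url}"


print(redirect_line("http://www.example.com/old_url.html", "/new_url.html"))
# Redirect 301 /old_url.html /new_url.html
```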

Google Docs will then oblige and create redirects for you to copy and paste into your .htaccess file. You can only do 50 at a time due to a limit set by Google Docs, but this is still way faster than doing them by hand – about 100 redirects a minute!
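If you prepare the list of old URLs in a script before pasting it into the spreadsheet, splitting it into groups of 50 is straightforward. The sketch below is one way to do it. The helper name batches and the sample URLs are hypothetical, and the default size of 50 simply mirrors the limit mentioned above.

```python
def batches(items, size=50):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Made-up list of 120 old URLs, split into spreadsheet-sized groups.
old_urls = [f"/old_{n}.html" for n in range(120)]
print([len(batch) for batch in batches(old_urls)])  # [50, 50, 20]
```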

Reclaim some link love with Google Webmaster Tools and Google Docs.



Jeremy Webb

Chief & Adventurer
