Proper Forwarding
Since I’ve been through a handful of site migration situations, I thought I’d share a few tricks for seamless forwarding.
The Nuclear Option: .htaccess
A file named .htaccess can be placed in any directory that Apache serves, and it provides the fastest and easiest option for a redirect:
Redirect permanent / http://newsite.com/
What does it do? It sends every request for this site over to the new site, preserving the path. So if a visitor hits http://oldsite.com/foo/bar, they’ll be redirected to http://newsite.com/foo/bar.
Sometimes you’ll have a situation where parts of a link change, and parts stay the same. For example, in a recent move, I switched a MediaWiki installation from using the default URLs to using pretty URLs. The redirect rule looked like this:
Redirect permanent /wiki/index.php/ http://example.com/wiki/
So then the old links, like
http://example.com/wiki/index.php/Main_Page
were forwarded automatically to
http://example.com/wiki/Main_Page
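To picture what the prefix rule does, here is a rough PHP sketch of the mapping. The function name prefix_redirect is made up for illustration, and it is a simplification of Apache's own matching, not a copy of it: if the path starts with the prefix, the remainder is tacked onto the target.

```php
<?php
// Illustrative only: mimics how a prefix-based Redirect rule maps a request path.
// Returns the redirect target, or null if the rule does not apply.
function prefix_redirect(string $path, string $prefix, string $target): ?string
{
    // The rule applies only when the request path begins with the prefix.
    if (strncmp($path, $prefix, strlen($prefix)) !== 0) {
        return null;
    }
    // Whatever follows the prefix is appended to the target unchanged.
    return $target . substr($path, strlen($prefix));
}

// The two rules from this post, applied to the example URLs.
echo prefix_redirect('/wiki/index.php/Main_Page', '/wiki/index.php/', 'http://example.com/wiki/'), "\n";
echo prefix_redirect('/foo/bar', '/', 'http://newsite.com/'), "\n";
```

The first call yields the pretty wiki URL and the second yields http://newsite.com/foo/bar, matching the examples above.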
Redirects of this kind are extremely powerful, but also very broad. There’s little opportunity to fine-tune the redirect; it’s simply all or nothing.
To make a conditional redirect, or a redirect with some processing, we’ll need a little PHP.
Header Redirects
Consider a situation, similar to the wiki example above, where we’re moving from one URL style to another:
http://oldsite.com/index.php?article=34253
to
http://newsite.com/articles/my-article
It’s clear that the second is vastly superior, both for human use and search-engine readability. But would an .htaccess rule know how to redirect this? No: the slug appears nowhere in the old URL, so it has to be looked up from the article ID.
What needs to happen is an edit on the old site’s index.php:
<?php
// look_up_slug() stands in for whatever maps an article ID to its slug.
$slug = look_up_slug($_GET['article']);
header('HTTP/1.1 301 Moved Permanently');
header('Location: http://newsite.com/articles/' . $slug);
exit;
?>
Before anything else goes to the browser, there’s an opportunity to send some headers. These can set cookies, for example, or tell the browser some information about the content being sent.
One of the possible headers is Location, which simply means “go here instead.”
So why do we have the exit on there? That’s to stop PHP executing once the redirect is sent. Otherwise PHP doesn’t know any better and would happily start spewing output to the browser, potentially causing all manner of confusion.
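The snippet above leans on look_up_slug() without showing it. Here is one way it might look, as a sketch only: the lookup table is hypothetical (a real site would probably query its database), and redirect_headers() is a helper name I've invented to keep the decision separate from the sending. It also covers the case the short version skips, an unknown or malformed article ID.

```php
<?php
// Hypothetical lookup table; a real site would probably query its database.
const ARTICLE_SLUGS = [
    34253 => 'my-article',
];

// Returns the slug for an article ID, or null when the ID is unknown.
function look_up_slug($id): ?string
{
    // Treat anything that is not a plain run of digits as unknown.
    if (!is_string($id) || preg_match('/^[0-9]+\z/', $id) !== 1) {
        return null;
    }
    return ARTICLE_SLUGS[(int) $id] ?? null;
}

// Builds the header lines to send for a given article ID.
function redirect_headers($id): array
{
    $slug = look_up_slug($id);
    if ($slug === null) {
        // No match: answer 404 rather than redirecting to a broken URL.
        return ['HTTP/1.1 404 Not Found'];
    }
    return [
        'HTTP/1.1 301 Moved Permanently',
        'Location: http://newsite.com/articles/' . rawurlencode($slug),
    ];
}
```

In index.php you would then loop over redirect_headers($_GET['article'] ?? null), pass each line to header(), and exit as before.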
PHP Redirects
The first solution allowed sweeping redirects. The second one allowed specific redirects, but only from a single file.
What’s really needed is a union of these two approaches. What we’ll do is set up a special rule in .htaccess, which funnels all incoming requests into a single file. It’s really simple:
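RewriteEngine On
# Skip redirect.php itself, so the rule is not re-applied to its own target
RewriteCond %{REQUEST_URI} !/redirect\.php$
RewriteRule ^(.*)$ redirect.php?request=$1 [L,QSA]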
Then, inside the redirect.php script, you have all of PHP’s powerful string processing functions (and DB access) with which to dissect the old URL and assemble a proper redirect out of it.
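As a rough idea of what redirect.php could contain, here is a minimal sketch. The rules and the blog/ section are assumptions invented for illustration, not the rules from my actual script. The request parameter is the path captured by the RewriteRule.

```php
<?php
// Sketch of redirect.php. All paths and rules below are illustrative assumptions.
const NEW_SITE = 'http://newsite.com';

// Maps the path captured by the RewriteRule to a URL on the new site.
function resolve_redirect(string $request): string
{
    // Normalise: drop surrounding whitespace and any leading slash.
    $path = ltrim(trim($request), '/');

    // Rule 1: old wiki URLs lose their index.php segment.
    if (preg_match('#^wiki/index\.php/(.+)$#', $path, $m) === 1) {
        return NEW_SITE . '/wiki/' . $m[1];
    }

    // Rule 2: a hypothetical section that was renamed during the move.
    if (strpos($path, 'blog/') === 0) {
        return NEW_SITE . '/articles/' . substr($path, strlen('blog/'));
    }

    // Fallback: keep the path as-is on the new host.
    return NEW_SITE . '/' . $path;
}

// Only redirect when the rewrite rule actually supplied a request path.
if (isset($_GET['request']) && is_string($_GET['request'])) {
    header('Location: ' . resolve_redirect($_GET['request']), true, 301);
    exit;
}
```

Keeping the mapping in a plain function that returns a string means it can be checked without a web server in the way.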
This is my favourite solution. And let me give you an example of its power.
Cleaning Up For Others
The website I created for my engineering class has lived at three different addresses over the course of its lifetime, and all three of them forward to the present location.
But there was a brief period when there was just the forum, and no homepage. During this period, the inbound link on the page here was created. It’s a link to an outdated, arbitrary forum topic. And yet that’s an important and relevant page at the University, and the logs indicated that people were following that link to our site.
Rather than hassle the webmaster to correct the error, I instead put a special rule in the redirect script. It detects that exact inbound link, and rather than forwarding to the forum, it forwards to the home page.
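The special rule amounts to an exact-match check that runs before the general forwarding. A sketch, with the caveat that the topic URL and both destinations are placeholders I've made up here, not the real ones:

```php
<?php
// Hypothetical values: placeholders for the real forum and home page URLs.
const HOME_PAGE  = 'http://newsite.com/';
const FORUM_BASE = 'http://newsite.com/forum/';

// Maps an old request URI (path plus query string) to its new location.
function resolve_forum_redirect(string $requestUri): string
{
    // Exact old requests that deserve a hand-picked destination.
    $overrides = [
        '/viewtopic.php?t=42' => HOME_PAGE, // the stale inbound link
    ];

    // An exact match wins over the general rule.
    if (array_key_exists($requestUri, $overrides)) {
        return $overrides[$requestUri];
    }

    // General rule: everything else carries over to the forum.
    return FORUM_BASE . ltrim($requestUri, '/');
}
```

Fed with $_SERVER['REQUEST_URI'], only that one stale link lands on the home page. Every other forum URL forwards as usual.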
Classy, eh?
“We’ve Moved”
The one other thing I should mention is Moved pages.
Moved pages are a waste of a mouse-click. 99% of visitors don’t care if you’ve moved. They’ll update their bookmarks at their leisure, and if they don’t, it shouldn’t matter: your redirect scheme should forward them to the relevant content anyway.
But there’s a further reason to not use a Moved page: Google doesn’t know what it means. When you perform a proper redirect, that tells spiders and search engines that the content has moved permanently, which means they’ll update their caches accordingly. (And, I understand, pass on PageRank…)
The reason given for using a Moved page is that it warns users about the redirect, rather than just lurching them to the new location. This, however, is silly, since it’s easy to just let them know once they’re on the target page.
When redirecting, it’s important to end up at an easy, “bookmarkable” URL. So sending the user to http://newsite.com/?new is probably not ideal. However, for the more ambitious, you can clear the ?new out of there. Simply set a cookie and then redirect again. When the cookie is detected, display the extra welcome notice… and clear the cookie.
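Here is a sketch of that cookie trick for the new site's home page. The cookie name and both function names are arbitrary choices for this example. The decision is kept in a plain function so it's easy to reason about, and a second function carries it out.

```php
<?php
// Cookie name is an arbitrary choice for this sketch.
const NOTICE_COOKIE = 'moved_notice';

// Decides how the new site's home page should respond to a request.
function plan_welcome(array $query, array $cookies): string
{
    // Arrived via the old site (?new): remember that, then tidy the URL.
    if (array_key_exists('new', $query)) {
        return 'set_cookie_and_redirect';
    }
    // Clean URL, but the cookie says this visitor was just forwarded.
    if (isset($cookies[NOTICE_COOKIE])) {
        return 'show_notice_and_clear_cookie';
    }
    return 'show_page';
}

// Carries out the plan; returns true when the welcome notice should be shown.
function apply_welcome(string $plan): bool
{
    switch ($plan) {
        case 'set_cookie_and_redirect':
            setcookie(NOTICE_COOKIE, '1', 0, '/'); // session cookie
            header('Location: http://newsite.com/', true, 302);
            exit;
        case 'show_notice_and_clear_cookie':
            setcookie(NOTICE_COOKIE, '', time() - 3600, '/'); // expire it
            return true;
        default:
            return false;
    }
}
```

At the top of the page, before any output, something like $showNotice = apply_welcome(plan_welcome($_GET, $_COOKIE)); would do it. The visitor ends up on the plain, bookmarkable URL and sees the notice once.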
Moving at all is non-ideal. But moves happen, and it’s best when it can be as streamlined as possible for the users.
Mike
