Redirecting with Apache’s .htaccess

When you migrate a website and its URLs change, you don’t want to lose visitors who still use the old links. If your ‘old’ website runs on Apache, you can use mod_alias (Redirect, RedirectMatch) and mod_rewrite (RewriteRule) to redirect automatically to the new URLs. You do this by adding redirect rules to the .htaccess file in the folder the old URLs start from. Some examples:

Generic structure of the .htaccess redirects

Redirect permanent /(old url) (new url)
Redirect ... (add all your one-to-one redirects here)
RedirectMatch permanent ^/old_stuff/.*html$ http://www.example.com/
RedirectMatch ... (add your catch-all redirects here)

RewriteEngine on
RewriteBase /blog/
RewriteRule ^([regex])$ http://blog.example.com/$1 [R,L]
RewriteRule ... (add all your variable redirects here)

EXAMPLE: old Blogger site (on your own server) to new WordPress site
I migrated a blog that was published by Blogger (via FTP) on my own webspace to a blog run by WordPress. I used the following rules to handle the redirections.
* HOMEPAGE:
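redirect /index.html and / to your new blog URL (use RedirectMatch with ^/$ for /, because a plain Redirect / matches every URL that starts with / and would swallow all the rules below)
RedirectMatch permanent ^/$ http://blog.example.com/
Redirect permanent /index.html http://blog.example.com/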

* FEED:
redirect e.g. /atom.xml to your Feedburner feed
Redirect permanent /atom.xml http://feeds.feedburner.com/(exampleblog)

* ARCHIVES:
redirect e.g. /archive/2005_03_posts.html to the new WordPress archives
RedirectMatch permanent /archive/([0-9][0-9][0-9][0-9])_([0-9][0-9])_.*$ http://blog.example.com/$1/$2/

* POST PAGES:
This is tricky, because Blogger and WordPress do not use exactly the same rules for constructing the text-like URL (the ‘post slug’). E.g. a post called how-to-podcast-with-blogger-and.html on my old Blogger site became how-to-podcast-with-blogger-and-smartcast/ on the new WordPress one. So what I did consisted of two types of rules:
a) redirecting individual pages (the old path must start with a slash)
Redirect permanent /2004/10/how-to-podcast-with-blogger-and.html http://blog.example.com/2004/10/how-to-podcast-with-blogger-and-smartcast/
b) a generic rule for the others (this uses RewriteRule instead of RedirectMatch!): each page is redirected to a search on the WordPress blog, within the correct month, for the first two words of the title:
RewriteRule ^([0-9][0-9][0-9][0-9])/([0-9][0-9])/([a-z0-9]*)-([a-z0-9]*).*$ http://blog.example.com/$1/$2/?s=$3+$4 [R,L]
This method is far from perfect, but it will bring visitors a lot closer to the right page. If you use pretty distinctive words in your titles (e.g. “Myspace: bulletin and other spam”), chances are the right page shows up first. If you start all your posts with “The ten best ways to …” then you will need a more sophisticated rule, e.g. one using the 6th and 7th words:
RewriteRule ^([0-9][0-9][0-9][0-9])/([0-9][0-9])/[a-z0-9]*-[a-z0-9]*-[a-z0-9]*-[a-z0-9]*-[a-z0-9]*-([a-z0-9]*)-([a-z0-9]*).*$ http://blog.example.com/$1/$2/?s=$3+$4 [R,L]
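The individual and generic post rules overlap, so it matters which one Apache applies first. As far as I know, when Redirect and RewriteRule are mixed in one .htaccess file, that order is decided by how the two modules are hooked in, not by the order in the file. Here is a minimal sketch of one way around this: write both kinds of rule as RewriteRule, which are evaluated top to bottom. It assumes the .htaccess file sits in the folder that the old 2004/10/... paths start from, and the R=301 (permanent) flag is my assumption; try it with a plain R first.

```apache
RewriteEngine on

# Specific post whose slug changed between Blogger and WordPress.
# Listed first, so it is tried before the generic fallback below.
RewriteRule ^2004/10/how-to-podcast-with-blogger-and\.html$ http://blog.example.com/2004/10/how-to-podcast-with-blogger-and-smartcast/ [R=301,L]

# Generic fallback for all other posts: search the month archive
# for the first two words of the old slug.
RewriteRule ^([0-9][0-9][0-9][0-9])/([0-9][0-9])/([a-z0-9]*)-([a-z0-9]*).*$ http://blog.example.com/$1/$2/?s=$3+$4 [R=301,L]
```

Because of the L flag, a request that matches the specific rule never reaches the fallback.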

Not losing the querystring
Redirect and RedirectMatch match only on the URL path, so they cannot select or rewrite URLs based on their querystring (e.g. newpage.php?param1=val1&param2=val2). When you need control over the querystring, use RewriteRule. An example: redirect all links like test.asp?param=value on the old domain to the new domain while keeping all querystring parameters:
RewriteRule ^tools/test\.asp$ http://web.example.com/tools/test.asp [L,QSA]
The pattern covers only the path, because RewriteRule never sees the querystring itself. QSA (query string append) keeps the existing querystring, and L (last rule) stops looking further for rule matches.
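If the redirect should depend on what is in the querystring, a RewriteCond on %{QUERY_STRING} can inspect it before the rule fires. The following is only a sketch: the parameter name param, the target script test.php and the R=301 flag are hypothetical choices for illustration, not part of the migration described above.

```apache
RewriteEngine on

# Only act when the request carries a "param" query parameter.
# The second group captures its value, available below as %2.
RewriteCond %{QUERY_STRING} (^|&)param=([^&]+)

# test.php is a hypothetical new location. Because the target has its
# own querystring and QSA is not set, the old querystring is replaced.
RewriteRule ^tools/test\.asp$ http://web.example.com/tools/test.php?param=%2 [R=301,L]
```

Requests for tools/test.asp without a param value do not satisfy the condition and fall through to whatever rules come next.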

2 thoughts on “Redirecting with Apache’s .htaccess”

  1. After some time you end up with a long .htaccess file. I wonder whether routing all traffic to one php script and doing the referring/translation there is more performant. Many content management systems, like Drupal (also CakePHP), do this.
