Sunday, May 26, 2013

.htaccess rules


Redirecting to or from WWW



Part 1 - How do I redirect all links for www.example.com to example.com?

Create a 301 redirect forcing all http requests to use either www.example.com or example.com:

  • Example 1 - Redirect example.com to www.example.com:
    RewriteEngine On
    RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
    RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]


  • Example 2 - Redirect www.example.com to example.com:
    RewriteEngine On
    RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
    RewriteRule ^(.*)$ http://example.com/$1 [L,R=301]



Explanation of this .htaccess 301 redirect:

Let's have a look at Example 1 - Redirect example.com to www.example.com. The first line tells Apache to enable the rewrite engine. The next line:
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]

specifies that the next rule only fires when the http host (that is, the domain of the requested url) is not www.example.com. The "not" is specified with the "!".

The ^ and the $ anchor the pattern to the start and the end of the host name, so the pattern on its own matches exactly www.example.com. Combined with the negating "!", the result is that every host that is not www.example.com will be redirected to this domain.

The [NC] specifies that the http host is matched case-insensitively. The \ escapes the "." because the dot is a special character (normally, the dot (.) matches any single character).

The final line describes the action that should be executed:
RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]

The ^(.*)$ is a little magic trick. Remember the meaning of the dot? It matches any single character. The * means "zero or more of the previous item", so .* matches any number of characters. This is what we need: in a .htaccess file, ^(.*)$ captures the requested path, without the domain and without the leading slash.

The next part http://www.example.com/$1 describes the target of the rewrite rule. This is our "final" used domain name, where $1 contains the content of the (.*).

The next part is also important, since it does the 301 redirect for us automatically: [L,R=301]. L means this is the last rule in this run. After this rewrite the webserver will return a result. The R=301 means that the webserver returns a 301 moved permanently to the requesting browser or search engine.
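
The same idea can be written without hard-coding the domain. The following is only a sketch of the opposite direction (www to non-www), assuming the site is served over plain http as in the examples above. It relies on %1, which mod_rewrite fills with the first parenthesised group of the last matched RewriteCond:

```apache
RewriteEngine On
# Capture everything after the leading "www." of whatever host was requested
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
# %1 is the host captured by the RewriteCond, $1 is the path captured by the rule
RewriteRule ^(.*)$ http://%1/$1 [L,R=301]
```

Because the host is taken from the request, this version can be reused on several domains without editing it, at the cost of being a little harder to read.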

Redirect to example.com/index.php



You have a website with the name example.com and you want to redirect all incoming requests for example.com/ to example.com/index.php:
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$
RewriteRule ^$ http://example.com/index.php [L,R=301]

Explanation of this .htaccess 301 redirect:

What does the code above do? The first line enables the rewrite engine. The RewriteCond line specifies that the next rule only fires when the http host (that is, the domain of the requested url) is example.com. The ^ and the $ anchor the pattern to the start and the end of the host name, so hosts such as www.example.com do not match.

The final line describes the action that should be executed:
RewriteRule ^$ http://example.com/index.php [L,R=301]

The pattern ^$ matches only an empty path, which is what the rule sees when the home page example.com/ is requested. Requests for any other url, including example.com/index.php itself, do not match, so the redirect cannot loop.

The next part, http://example.com/index.php, is the target of the rewrite rule. No $1 is needed here because nothing is captured from the request.

The last part does the 301 redirect for us: [L,R=301]. L means this is the last rule in this run. The R=301 means that the webserver returns a 301 moved permanently to the requesting browser or search engine.

Redirect visitors to a new site



You have an old website that is accessible under oldexample.com and you have a new website that is accessible under newexample.com. Copying the content of the old website to the new website is the first step - but what comes after that? You should do a 301 moved permanently redirect from the old domain to the new domain - which is easy and has some advantages:

  • Users will automatically be redirected to the new domain - you do not have to inform them.

  • Search engines will be redirected to the new domain and all related information will be moved to the new domain (but this might take some time).

  • Google's PageRank™ will be transferred to the new domain, as well as other internal information that is used to set the position of pages in the search engine result pages (SERPs), such as TrustRank.


Create a 301 redirect for all http requests that are going to the old domain.


  • Example 1 - Redirect from oldexample.com to www.newexample.com:
    RewriteEngine On
    RewriteCond %{HTTP_HOST} oldexample\.com$ [NC]
    RewriteRule ^(.*)$ http://www.newexample.com/$1 [L,R=301]


      This is useful when you use www.newexample.com as your new domain name (see also the section on redirecting to or from www above). If not, use the code of Example 2.



  • Example 2 - Redirect from oldexample.com to newexample.com:
    RewriteEngine On
    RewriteBase /
    RewriteCond %{HTTP_HOST} oldexample\.com$ [NC]
    RewriteRule ^(.*)$ http://newexample.com/$1 [L,R=301]
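
If the old and the new domain are served from the same document root, the condition has to name the old host explicitly, otherwise requests for the new domain would be redirected as well and the browser would loop. A minimal sketch, assuming the old site answered on both oldexample.com and www.oldexample.com:

```apache
RewriteEngine On
# Fire only for the two old host names; [OR] joins the conditions with "or"
RewriteCond %{HTTP_HOST} ^oldexample\.com$ [NC,OR]
RewriteCond %{HTTP_HOST} ^www\.oldexample\.com$ [NC]
# Keep the requested path so deep links land on the matching new page
RewriteRule ^(.*)$ http://newexample.com/$1 [L,R=301]
```

Without the [OR] flag, consecutive RewriteCond lines are combined with "and", which no single host name could satisfy here.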




How to add a trailing slash



Some search engines remove the trailing slash from urls that look like directories - e.g. Yahoo does it. However, this can result in duplicate content problems when the same page content is accessible under different urls. Apache gives some more information in the Apache Server FAQ.

Let's have a look at an example: example.com/google/ is indexed in Yahoo as example.com/google - which would result in two urls with the same content.

The solution is to create a .htaccess rewrite rule that adds the trailing slashes to these urls. Example - redirect all urls that do not have a trailing slash to urls with a trailing slash:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !example.php
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://example.com/$1/ [L,R=301]

Explanation of the add trailing slash .htaccess rewrite rule:

The first line enables the rewrite engine of Apache's mod_rewrite module. The next line is the interesting part:
RewriteCond %{REQUEST_FILENAME} !-f

It makes sure that existing files will not get a slash added. You shouldn't do the same with directories, since this would exclude existing directories from the rewrite. The line
RewriteCond %{REQUEST_URI} !example.php

excludes a sample url that should not be rewritten. This is just an example. If you do not have a file or url that should not be rewritten, remove this line. The condition:
RewriteCond %{REQUEST_URI} !(.*)/$

finally fires when a url does not end with a trailing slash. Now we need to redirect the urls without the trailing slash:
RewriteRule ^(.*)$ http://example.com/$1/ [L,R=301]

does the 301 redirect to the url, with the trailing slash appended. You should replace example.com with your domain.
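
The "does not end with a slash" test can also be folded into the rule pattern itself. This is only a sketch of an equivalent, shorter variant, assuming no url needs to be excluded by name:

```apache
RewriteEngine On
# Leave real files such as style.css or logo.png alone
RewriteCond %{REQUEST_FILENAME} !-f
# [^/] requires the last character of the path to be something other than "/",
# so paths that already end in a slash (and the empty home page path) never match
RewriteRule ^(.*[^/])$ http://example.com/$1/ [L,R=301]
```

As before, example.com stands in for your own domain.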

==========================================================

Create a plain text .htaccess file, or open your existing .htaccess file.
Add the lines from the appropriate example to the top of the file. Note that you should replace the example text with your own information: replace example.com with your own domain, folder1 with your own folder name, file.html with your own file name, etc. Save your changes.
Use FTP to upload the file to the document root of the appropriate domain. If your domain is example.com, you should upload the file to:
domains/example.com/html/
That's it! Once you've uploaded the file, the rewrite rule should take effect immediately.

Some Content Management Systems (CMSs), like WordPress for example, overwrite .htaccess files with their own settings. In that case, you may need to figure out a way to do your rewrite from within the CMS.
http://example.com/folder1/ to http://example.com/folder2/

http://example.com/folder1/ becomes http://example.com/folder2/ or just http://example.com/.

domains/example.com/html/folder2/ must exist and have content in it for this to work.
.htaccess

This .htaccess file will redirect http://example.com/folder1/ to http://example.com/folder2/. Choose this version if you don't have the same file structure in both directories:

Filename: .htaccess

Options +FollowSymLinks
RewriteEngine On
RewriteRule ^folder1.*$ http://example.com/folder2/ [R=301,L]
This .htaccess file will redirect http://example.com/folder1/ to plain http://example.com/. Choose this version if you want people redirected to your home page, not whatever individual page in the old folder they originally requested:
Filename: .htaccess

Options +FollowSymLinks
RewriteEngine On
RewriteRule ^folder1.*$ http://example.com/ [R=301,L]
This .htaccess file will redirect http://example.com/folder1/file.html to http://example.com/folder2/file.html. Choose this version if your content is duplicated in both directories:
Filename: .htaccess

Options +FollowSymLinks
RewriteEngine On
RewriteRule ^folder1/(.*)$ http://example.com/folder2/$1 [R=301,L]
Test

Upload this file to folder2 (if you followed the first or third example) or your html folder (if you followed the second example) with FTP:

Filename: index.html

<html>
<body>
Mod_rewrite is working!
</body>
</html>
Then, if you followed the first or second example, visit http://example.com/folder1/ in your browser. You should see the URL change to http://example.com/folder2/ or http://example.com/ and the test page content.

If you followed the third example, visit http://example.com/folder1/index.html. You should be redirected to http://example.com/folder2/index.html and see the test page content.

Code explanation

Options +FollowSymLinks is an Apache directive, prerequisite for mod_rewrite.
RewriteEngine On enables mod_rewrite.
RewriteRule defines a particular rule.
The first string of characters after RewriteRule defines what the original URL looks like. There's a more detailed explanation of the special characters at the end of this article.
The second string after RewriteRule defines the new URL. This is in relation to the document root (html) directory. / means the html directory itself, and subfolders can also be specified.
$1 at the end matches the part in parentheses () from the first string. Basically, this makes sure that sub-pages get redirected to the same sub-page and not the main page. Leave it out to redirect to the main page. (It is left out in the first two examples for this reason. If you don't have the same content in the new directory that you had in the old directory, leave this out.)
[R=301,L] - this performs a 301 redirect and also stops any later rewrite rules from affecting this URL (a good idea to add after the last rule). It's on the same line as RewriteRule, at the end.
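
One thing to watch with the pattern ^folder1.*$ is that it also matches unrelated names that merely start with the same letters, such as folder10 or folder1-old. A possible tightening, shown here only as a sketch for the case where both directories share the same file structure:

```apache
Options +FollowSymLinks
RewriteEngine On
# Matches "folder1", "folder1/" and "folder1/anything",
# but not "folder10" or "folder1-old".
# $2 is the part after the slash; it is empty when only the folder was requested.
RewriteRule ^folder1(/(.*))?$ http://example.com/folder2/$2 [R=301,L]
```
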
http://example.com/file.html to http://example.com/folder1/file.html

http://example.com/file.html becomes http://example.com/folder1/file.html.

Note: The directory folder1 must be unique in the URL. It won't work for http://example.com/folder1/folder1.html. The directory folder1 must exist and have content in it.

.htaccess

This .htaccess file will redirect http://example.com/file.html to http://example.com/folder1/file.html:
Filename: .htaccess

Options +FollowSymLinks
RewriteEngine On
RewriteCond %{HTTP_HOST} example\.com$ [NC]
RewriteCond %{REQUEST_URI} !folder1
RewriteRule ^(.*)$ http://example.com/folder1/$1 [R=301,L]
Test

Upload this file to folder1 with FTP:

Filename: index.html

<html>
<body>
Mod_rewrite is working!
</body>
</html>
Then, visit http://example.com/ in your browser. You should see the URL change to http://example.com/folder1/ and the test page content.

Code explanation

Options +FollowSymLinks is an Apache directive, prerequisite for mod_rewrite.
RewriteEngine On enables mod_rewrite.
RewriteCond shows which URLs we do and don't want to run through the rewrite.
In this case, the first condition matches the host example.com.
! means "not." The second condition skips any URL that already includes folder1, because otherwise it would keep getting folder1 added, and it would become an infinitely long URL.
[NC] matches both upper- and lower-case versions of the URL.
RewriteRule defines a particular rule.
The first string of characters after RewriteRule defines what the original URL looks like. There's a more detailed explanation of the special characters at the end of this article.
The second string after RewriteRule defines the new URL. This is in relation to the document root (html) directory. / means the html directory itself, and subfolders can also be specified.
$1 at the end matches the part in parentheses () from the first string. Basically, this makes sure that sub-pages get redirected to the same sub-page and not the main page. Leave it out to redirect to the main page of the subdirectory.
[R=301,L] - this performs a 301 redirect and also stops any later rewrite rules from affecting this URL (a good idea to add after the last rule). It's on the same line as RewriteRule, at the end.
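
The loop guard can be made stricter by anchoring it to the start of the requested path, so that only URLs that really live under folder1 are skipped. A minimal sketch, assuming the .htaccess file sits in the document root:

```apache
Options +FollowSymLinks
RewriteEngine On
RewriteCond %{HTTP_HOST} example\.com$ [NC]
# REQUEST_URI starts with a slash; skip "/folder1" itself and anything below it
RewriteCond %{REQUEST_URI} !^/folder1(/|$)
RewriteRule ^(.*)$ http://example.com/folder1/$1 [R=301,L]
```

With this guard a file in the document root whose name merely contains folder1 is still redirected once, and the redirected URL is then left alone.
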
Add www or https

http://example.com becomes http://www.example.com. Or, http://example.com becomes https://example.com.

.htaccess

This .htaccess file will redirect http://example.com/ to http://www.example.com/. It will also work if an individual file is requested, such as http://example.com/file.html:
Filename: .htaccess

Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
This .htaccess file will redirect http://example.com/ to https://example.com/. It will also work if an individual file is requested, such as http://example.com/file.html:
Filename: .htaccess

RewriteEngine On
RewriteCond %{SERVER_PORT} ^80$
RewriteRule ^(.*)$ https://example.com/$1 [R=301,L]
Test

Visit http://example.com in your browser. You should see that the same page is displayed, but the URL has changed to http://www.example.com (first example) or https://example.com (second example).

Also, http://example.com/file.html will become http://www.example.com/file.html or https://example.com/file.html.

Code explanation

Options +FollowSymLinks is an Apache directive, prerequisite for mod_rewrite.
RewriteEngine On enables mod_rewrite.
RewriteCond %{HTTP_HOST} shows which URLs we do and don't want to run through the rewrite.
In this case, we want to match anything that starts with example.com.
[NC] matches both upper- and lower-case versions of the URL.
RewriteRule defines a particular rule.
The first string of characters after RewriteRule defines what the original URL looks like. There's a more detailed explanation of the special characters at the end of this article.
The second string after RewriteRule defines the new URL. This is in relation to the document root (html) directory. / means the html directory itself, and subfolders can also be specified.
$1 at the end matches the part in parentheses () from the first string. Basically, this makes sure that sub-pages get redirected to the same sub-page and not the main page.
[R=301,L] - this performs a 301 redirect and also stops any later rewrite rules from affecting this URL (a good idea to add after the last rule). It's on the same line as RewriteRule, at the end.
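
Instead of checking the port number, the https redirect can test mod_rewrite's HTTPS variable, which is "on" for SSL/TLS connections and "off" otherwise. This is a sketch of that variant, assuming Apache itself terminates the secure connection (behind a proxy or load balancer that does so, the variable may always read "off"):

```apache
RewriteEngine On
# Only redirect requests that did not arrive over SSL/TLS
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://example.com/$1 [R=301,L]
```
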
Regular expressions

Rewrite rules often contain symbols that make a regular expression (regex). This is how the server knows exactly how you want your URL changed. However, regular expressions can be tricky to decipher at first glance. Here are some common elements you will see in your rewrite rules, along with some specific examples.

^ anchors the match at the start of the string.
$ anchors the match at the end of the string.
So, ^folder1$ matches folder1 exactly.
. stands for any single character (example: a, B, 3).
* means that the previous character can be matched zero or more times.
So, ^uploads.*$ matches uploads2009, uploads2010, etc.
^.*$ means "match anything and everything." This is useful if you don't know what your users might type for the URL.
() designates which portion to preserve for use again in the $1 variable in the second string. This is useful for handling requests for particular files that should be the same in the old and new versions of the URL.
See more regular expressions at perl.org.
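
To see these elements side by side, here is a small illustrative set of rules. The file name new.html and the archive folder are made up for the example and are not part of the redirects described above:

```apache
RewriteEngine On

# Unescaped dot: would also match "fileXhtml" or "file-html"
# RewriteRule ^file.html$ http://example.com/new.html [R=301,L]

# Escaped dot: matches only "file.html"
RewriteRule ^file\.html$ http://example.com/new.html [R=301,L]

# Two groups: $1 is the four-digit year, $2 is the rest of the path,
# so "uploads2009/a.jpg" is redirected to "/archive/2009/a.jpg"
RewriteRule ^uploads([0-9]{4})/(.*)$ http://example.com/archive/$1/$2 [R=301,L]
```

Groups are numbered from left to right by their opening parenthesis, which is why the year ends up in $1 and the remaining path in $2.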

Troubleshooting

404 Not Found

Examine the new URL in your browser closely. Does it match a file that exists on the server in the new location specified by the rewrite rule? You may have to make your rewrite rule more broad (you may be able to remove the $1 from the second string). This will direct rewrites to the main index page given in the second string. Or, you may need to copy files from your old location to the new location.

If the URL is just plain wrong (like http://example.com/folder1//file.html - note the two /s) you will need to re-examine your syntax. (mt) Media Temple does not support syntax troubleshooting.

Infinite URL, timeout, redirect loop

If you notice that your URL is ridiculously long, that your page never loads, or that your browser gives you an error message about redirecting, you likely have conflicting redirects in place.

You should check your entire .htaccess file for rewrite rules that might match other rewrite rules. You may also need to check .htaccess files in subdirectories. Note that FTP will not show .htaccess files unless you have enabled the option to view hidden files and folders. See our .htaccess article for details.

Also, it's possible to include redirects inside HTML and PHP pages. Check the page you were testing for its own redirects.

Adding [L] after a rewrite rule can help in some cases, because that tells the server to stop trying to rewrite a URL after it has applied that rule.
