404 errors in Google webmaster tools

Posted on April 30, 2010

Since I redid the whole website, I’ve been getting lots of 404 errors, which is expected. What I did not expect were 404 errors in Google Webmaster Tools for the new URLs!

It looks like Google requests the URLs that are folders with an “index.html” appended at the end.

For example, even though http://www.remiphilippe.fr/2010/04/29/bgp-send-label-in-inter-as-scenarios/ is in my sitemap, Google will try to get http://www.remiphilippe.fr/2010/04/29/bgp-send-label-in-inter-as-scenarios/index.html

The way I found to work around this issue is to use the .htaccess file to issue a 301 (permanent redirect) each time index.html is requested in a folder. So each time a request is made to http://www.remiphilippe.fr/2010/04/29/bgp-send-label-in-inter-as-scenarios/index.html, the server replies with a 301 pointing to http://www.remiphilippe.fr/2010/04/29/bgp-send-label-in-inter-as-scenarios/
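To check that the redirect behaves as described, something like the following Python sketch could be used. It is only an illustration: the function names (expected_target, fetch_status_and_location, redirect_is_correct) are made up for this example, and it assumes the site answers plain HTTP on port 80. http.client does not follow redirects, so the 301 and its Location header can be inspected directly.

```python
import http.client


def expected_target(path):
    """Return the folder URL path we expect the 301 to point to.

    Hypothetical helper: strips a trailing "index.html" from the path.
    """
    suffix = "index.html"
    if path.endswith("/" + suffix):
        return path[:-len(suffix)]
    return path


def fetch_status_and_location(host, path, timeout=10):
    """Send a HEAD request and return (status code, Location header)."""
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def redirect_is_correct(host, path):
    """True if the server answers 301 and points at the folder URL."""
    status, location = fetch_status_and_location(host, path)
    return (
        status == 301
        and location is not None
        and location.endswith(expected_target(path))
    )
```

Calling redirect_is_correct("www.remiphilippe.fr", "/2010/04/29/bgp-send-label-in-inter-as-scenarios/index.html") should return True once the rule is in place, assuming the server is reachable.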

This is an easy workaround requiring only two lines in the .htaccess file (the RewriteCond/RewriteRule pair right after RewriteBase):

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /

RewriteCond %{THE_REQUEST} ^.*/index.html
RewriteRule ^(.*)index.html /$1 [R=301,L]

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /wordpress/index.php [L]
</IfModule>
# END WordPress

A bit of explanation:

RewriteCond %{THE_REQUEST} ^.*/index.html

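%{THE_REQUEST} is the full request line sent by the client (for example “GET /2010/04/29/bgp-send-label-in-inter-as-scenarios/index.html HTTP/1.1”), not just the URL without the domain name. The condition matches when that line starts with anything (.*) followed by /index.html. Since it tests the original request line, internal rewrites do not affect it.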

RewriteRule ^(.*)index.html /$1 [R=301,L]

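This redirects the client to the folder with the code 301. $1 is whatever (.*) captured, that is, the path in front of index.html. Note that the dot in index.html is unescaped, so it matches any character; writing index\.html would be stricter.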
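To see what these two patterns do to a given request, here is a rough simulation in Python. It is a sketch, not how Apache works internally: the simulate function is invented for this example, it assumes the rule lives in the .htaccess at the document root (where mod_rewrite matches the path without its leading slash), and it ignores query strings.

```python
import re

# The two patterns, copied as-is from the .htaccess lines above.
COND = re.compile(r"^.*/index.html")   # tested against %{THE_REQUEST}
RULE = re.compile(r"^(.*)index.html")  # tested against the relative path


def simulate(request_line):
    """Return the redirect target for a request line, or None.

    Hypothetical helper that mimics the RewriteCond/RewriteRule pair.
    """
    # RewriteCond: the raw request line must contain "/index.html".
    if not COND.search(request_line):
        return None
    # Request line looks like: "GET /some/path HTTP/1.1"
    _method, uri, _protocol = request_line.split(" ", 2)
    # Drop the query string and the leading slash (per-directory context).
    path = uri.split("?", 1)[0].lstrip("/")
    match = RULE.search(path)
    if match is None:
        return None
    # Substitution "/$1": a slash followed by the captured folder path.
    return "/" + match.group(1)


line = "GET /2010/04/29/bgp-send-label-in-inter-as-scenarios/index.html HTTP/1.1"
print(simulate(line))
# The folder URL itself does not trigger the rule.
print(simulate("GET /2010/04/29/bgp-send-label-in-inter-as-scenarios/ HTTP/1.1"))
```

The first call prints the folder path and the second prints None. Because of the unescaped dot, a request line such as "GET /a/indexXhtml HTTP/1.1" would also be redirected (to /a/) in this simulation.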
