Blocking Images from One Domain
I’ve got a fancy 404 page (displayed when you try to access a page that doesn’t exist, like this one) on this blog, built on the principles outlined in the great A List Apart article The Perfect 404. It’s a great piece of logic that Ian Lloyd came up with, and I’ve now implemented it twice: once in Java taglibs at Radiant Core and again in PHP for this blog. The gist is that the page changes depending on where you came from in order to display a ‘smart’ result, and it emails me when something is broken so that I can fix it. If there’s enough interest, I could turn this into a WordPress plugin; leave a comment if you’d like to see that.
One of the advantages of the 404 page emailing me when something can’t be found is that I find out when someone else has linked directly to an image that used to be on this site but isn’t anymore. I had a last.fm sidebar for a while showing the music I’d listened to, but removed it after I figured out that most of my listening was on machines or devices that weren’t logging. In the short time it was on the site, youdao.com latched on to one of the images and linked to it directly from a page on their site. This isn’t a particularly bad practice, but since the image doesn’t exist here anymore they aren’t going to see it on their end. Given that we obviously don’t speak the same language, I thought it might be easier to just answer their requests for images with a 403 (Forbidden) error. mod_rewrite to the rescue!
If you’re running an Apache web server with mod_rewrite enabled, and it’s configured to allow .htaccess files to override your httpd.conf, you can achieve the same simple result by adding two lines to your .htaccess (after a RewriteEngine On line, if there isn’t one already):
RewriteCond %{HTTP_REFERER} ^http://www\.requestingsite\.com/ [NC]
RewriteRule .*\.(jpg|gif|png)$ - [F]
The first line matches the Referer header against the domain of the linking site, with [NC] specified to tell mod_rewrite to ignore case. The second line says that requests for anything ending in any of the three major image formats are forbidden (hence the [F]). How do you know it’s working? Try running the linking page through something like Google Translate, which pops up a JavaScript error when it gets the 403 back. And remember, as always, check your site after making changes to your .htaccess to make sure you haven’t broken anything!
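For anyone who wants something a bit more defensive, here’s a rough sketch of the same rule with a few extras. Everything beyond the original two lines is my own assumption about what you might want: the IfModule wrapper (so the site doesn’t break if mod_rewrite is missing), the explicit RewriteEngine On, matching https and the bare domain as well as www, and [NC] on the file extension. requestingsite.com is still just a placeholder for the domain you want to block.

```apache
# Sketch only: requestingsite.com is a placeholder domain, not a real target.
<IfModule mod_rewrite.c>
    # Rewrite rules are ignored unless the engine is switched on.
    RewriteEngine On

    # Match the Referer header against the linking domain, over http or
    # https, with or without "www." ([NC] = case-insensitive).
    RewriteCond %{HTTP_REFERER} ^https?://(www\.)?requestingsite\.com/ [NC]

    # Answer any image request that met the condition with 403 Forbidden.
    # "-" means no substitution; [NC] also catches .JPG, .PNG and so on.
    RewriteRule \.(jpg|gif|png)$ - [F,NC]
</IfModule>
```

The condition only applies to the rule that immediately follows it, so requests from everyone else (including visitors with no Referer at all) still get their images as usual.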
