Auto-Notification of Broken Links, Fantastic Pre-Written Anti-Spam .htaccess File!


This post was originally published in 2009
It may contain stale & outdated information. Or it may have grown more awesome with age, like the author.

Recently I’ve been learning a bit about .htaccess, mainly so I can redirect visitors who arrive via now-broken links.

To help with this, I inserted a little mailer script into my error 404 page that sends me an email containing the referring URI, the URI the visitor was trying to reach and various other tidbits of information. It works well, allowing me to quickly fix broken links that I would otherwise be unaware of.

It also writes the beginning of a rewrite rule, as if I wasn’t lazy enough.

Here it is:

<?php
	// Mail me about this error...
	$to = "mike(@)pagesofinterest(.)net"; // obfuscated here; use a real address
	$base = "http://pagesofinterest.net";

	// Browsers and bots may omit these headers, so fall back to a placeholder.
	$referrer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '(none)';
	$requested_page = $_SERVER['REQUEST_URI'];
	$agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '(none)';
	$ip = $_SERVER['REMOTE_ADDR'];
	$subject = "Error 404 - $requested_page";
	$body = "The page: $requested_page, referred to by $referrer does not exist.  Fix this or add a rewrite.\n\nAgent: $agent\n\n";
	$body .= "IP: $ip \n\n";
	mail($to, $subject, $body);

	// Append the start of a redirect rule to a file.
	// The target defaults to the home page; edit it by hand later.
	$redirect = "Redirect 301 $requested_page $base/\n";
	$filename = "redirects.txt";
	$fh = fopen($filename, 'a');
	if ($fh) {
		fwrite($fh, $redirect);
		fclose($fh);
	}
?>

To get a smaller file containing only the unique broken links, run this in a terminal:

sort redirects.txt | uniq > unique_redirects.txt
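
Since the same broken link tends to turn up many times, it can also help to see which ones are hit most often. Here is a minimal sketch, assuming redirects.txt holds one Redirect line per 404 hit, as written by the script above. The function name count_redirects is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct line of a file with the number
# of times it occurs, most frequent first, as "<count><TAB><line>".
count_redirects() {
    # awk tallies identical lines; sort -rn orders by the leading count.
    awk '{ seen[$0]++ } END { for (line in seen) print seen[line] "\t" line }' "$1" |
        sort -rn
}
```

Running `count_redirects redirects.txt | head` would then list the broken links most worth fixing first. Lines with equal counts come out in no particular order that matters.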

One problem was the sheer volume of messages I was getting. I did some research, and it seems this site has become a target for spammers. After a bit of googling on how to discourage this, I came across the best thing a newb could hope for: a perfect, commented example. Aaron Logan has graciously made his .htaccess file available to the world. It handles bad user agents, known bad IPs, and keywords in referrer URIs. It is gold.

You can get your own copy of it here: best anti spam .htaccess file.

Thanks Aaron!
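
For anyone who hasn't seen this kind of file, here is a rough idea of what blocking by user agent, referrer keyword and IP can look like. This is only a minimal sketch, not an excerpt from Aaron's file. It assumes Apache with mod_rewrite enabled, and the bot names, keywords and IP address are made-up placeholders.

```apache
# Hypothetical sketch only; the patterns below are placeholders.
<IfModule mod_rewrite.c>
    RewriteEngine On

    # Match a User-Agent containing a known-bad string (case-insensitive)...
    RewriteCond %{HTTP_USER_AGENT} (BadBot|EvilScraper) [NC,OR]
    # ...or a referrer containing a spam keyword...
    RewriteCond %{HTTP_REFERER} (spamword1|spamword2) [NC,OR]
    # ...or a specific address (203.0.113.7 is a documentation-only IP).
    RewriteCond %{REMOTE_ADDR} ^203\.0\.113\.7$
    # Refuse any matching request with 403 Forbidden.
    RewriteRule ^ - [F]
</IfModule>
```

Requests refused this way get a 403 rather than a 404, so they should never reach the 404 mailer script, which is what cuts down the email volume.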
