SpikeTrap traps spambots that crawl the web looking for email addresses and feeds them random invalid addresses to pollute their lists. By presenting dozens of links to its own slow-loading pages, it catches spambots in a loop, often wasting several hours of their time.

Download

SpikeTrap 1.22
2.84 KB
27 January 2007

Requirements

SpikeTrap requires a web server running PHP 4 or higher. Because SpikeTrap uses PHP's PATH_INFO variable (the part of the URL that follows the script name) to create the illusion of different URLs, it will not work if PHP is running as a CGI. If you are unable to run SpikeTrap on your site, you can link to one of our local copies instead.
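To illustrate the PATH_INFO dependency mentioned above, here is a minimal sketch of how a PHP script can read the extra path that follows its own name. The function name trapPathInfo is hypothetical and is not taken from spiketrap.php; the actual script may read the variable differently.

```php
<?php
// Hypothetical helper for illustration; not taken from spiketrap.php.
// Returns the part of the request URL that follows the script name,
// or an empty string when the server does not provide PATH_INFO.
function trapPathInfo($server)
{
    return isset($server['PATH_INFO']) ? $server['PATH_INFO'] : '';
}

// For a request to /spiketrap.php/abc/def.html, a server that populates
// PATH_INFO would typically supply "/abc/def.html" here.
echo trapPathInfo($_SERVER), "\n";
```

If this prints an empty line for a URL with a trailing path, PATH_INFO is probably not available in that setup.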

Installation

After downloading SpikeTrap, extract spiketrap.php and upload it to your web server. If you don't have a robots.txt file in your site's root directory, create one now. Put this in robots.txt:

User-agent: *
Disallow: /spiketrap.php/

This keeps web crawlers that obey robots.txt from becoming trapped. Most spambots don't honor robots.txt, so they will follow the link anyway. Next, put a hidden link to spiketrap.php on one of your web pages:

<a href="/spiketrap.php/" style="display: none;"> </a>

Spambots that parse the page's HTML will find this link, but browsers will not display it to regular users.
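For crawlers that do honor robots.txt, the Disallow rule shown earlier works by path prefix. The sketch below assumes simple prefix matching, as in the original robots exclusion convention, and ignores wildcard extensions. The function isDisallowed is a hypothetical helper and is not part of SpikeTrap.

```php
<?php
// Hypothetical helper illustrating simple prefix matching; not part of SpikeTrap.
function isDisallowed($path, $disallowRules)
{
    foreach ($disallowRules as $rule) {
        // An empty Disallow value blocks nothing.
        if ($rule !== '' && strncmp($path, $rule, strlen($rule)) === 0) {
            return true;
        }
    }
    return false;
}

$rules = array('/spiketrap.php/');
var_dump(isDisallowed('/spiketrap.php/abc/def.html', $rules)); // bool(true)
var_dump(isDisallowed('/index.html', $rules));                 // bool(false)
```

Under this matching, every generated URL below /spiketrap.php/ is off limits to well-behaved crawlers, while the rest of the site stays crawlable. A path such as /spiketrap/abc.html would not be covered by this rule.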

Configuration

SpikeTrap can be run as-is without any modifications. However, certain variables in spiketrap.php can be changed to alter the script's output.

$config['minLength'] and $config['maxLength']
The minimum and maximum length of randomly generated strings like "triami" and "hyughexzwhve".
$config['minEmails'] and $config['maxEmails']
The minimum and maximum number of email addresses to display.
$config['minNames'] and $config['maxNames']
The minimum and maximum number of name/email pairs to display.
$config['minLinks'] and $config['maxLinks']
The minimum and maximum number of SpikeTrap links to display.
$config['chars']
An array of characters to be used in randomly generated strings.
$config['TLDs']
An array of top-level domains to be used in email addresses.
$config['extensions']
An array of file extensions to be used in SpikeTrap links, such as "php" and "html".
$config['sleep']
The time, in seconds, to wait while loading a page. This prevents spambots from overwhelming the server.
$config['logging']
Set this to 1 to enable access logging.
$config['logFile']
The absolute path to SpikeTrap's access log. This must be a writable file if access logging is enabled.
$templates
An array of heredocs containing the HTML for SpikeTrap pages. They use the template bits %title%, %emails%, %emails-nolink%, %names%, %links%, %text%, and %text2%, which are replaced by relevant content when the script runs.
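As a rough illustration of how the settings above could fit together, the sketch below builds random strings and addresses from a $config array and fills template bits in a page. The values shown are examples only and may differ from the defaults shipped in spiketrap.php. The helpers randomString, randomEmail, and fillTemplate are hypothetical names, and the real script may generate its output differently.

```php
<?php
// Illustrative values only; the defaults shipped in spiketrap.php may differ.
$config = array(
    'minLength' => 4,
    'maxLength' => 12,
    'chars'     => array('a', 'e', 'h', 'i', 'm', 'r', 't', 'u', 'w', 'x', 'y', 'z'),
    'TLDs'      => array('com', 'net', 'org'),
);

// Hypothetical helper: build a random string from $config['chars'],
// with a length between minLength and maxLength inclusive.
function randomString($config)
{
    $length = mt_rand($config['minLength'], $config['maxLength']);
    $last = count($config['chars']) - 1;
    $out = '';
    for ($i = 0; $i < $length; $i++) {
        $out .= $config['chars'][mt_rand(0, $last)];
    }
    return $out;
}

// Hypothetical helper: combine two random strings and a random TLD.
function randomEmail($config)
{
    $tld = $config['TLDs'][mt_rand(0, count($config['TLDs']) - 1)];
    return randomString($config) . '@' . randomString($config) . '.' . $tld;
}

// Hypothetical helper: replace %name% template bits with generated content.
function fillTemplate($template, $values)
{
    $map = array();
    foreach ($values as $name => $value) {
        $map['%' . $name . '%'] = $value;
    }
    return strtr($template, $map);
}

$page = fillTemplate(
    "<title>%title%</title>\n<p>%emails%</p>",
    array('title' => randomString($config), 'emails' => randomEmail($config))
);
echo $page, "\n";
```

Using strtr with a replacement array substitutes all template bits in one pass, so generated content that happens to contain a percent sign is not replaced a second time.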

If you're running Apache, you can enable MultiViews and access spiketrap.php via /spiketrap/, adding to the illusion of a directory full of files rather than a single script. Add this line to .htaccess:

Options +MultiViews

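Additionally, we recommend changing the filename of the script to prevent spambots from recognizing and avoiding it. If you rename the script or link to it via /spiketrap/, update the Disallow line in robots.txt and the hidden link to match the new path.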

Field tests

Does SpikeTrap really work? Absolutely. When an email address harvesting program was pointed at a site running SpikeTrap, it gathered over 2,500 invalid addresses from a few dozen pages. Numerous crawlers from elsewhere on the internet have also been caught in our copies of SpikeTrap. 204.249.11.19 was stuck for over an hour and accessed SpikeTrap nearly 500 times. 212.227.76.195 was trapped for almost five hours and accessed SpikeTrap more than 5,000 times; it retrieved about 650,000 invalid email addresses while using only 21 MB of bandwidth. 80.62.113.42 was caught for nearly seven hours and gathered about 815,000 invalid addresses.
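The figures for 212.227.76.195 can be turned into rough per-request averages. The totals quoted above are approximate ("more than", "almost", "about"), so the results below are estimates only. The sketch also assumes that 21 MB means 21,000,000 bytes, and perRequest is a hypothetical helper.

```php
<?php
// Rough per-request averages from the figures quoted above.
// The source gives approximate totals, so these are estimates only.
function perRequest($total, $requests)
{
    return $total / $requests;
}

$requests = 5000;                                   // "more than 5,000" accesses
echo perRequest(650000, $requests), "\n";           // addresses per request: 130
echo perRequest(21 * 1000 * 1000, $requests), "\n"; // bytes per request: 4200
echo perRequest(5 * 3600, $requests), "\n";         // seconds per request: 3.6
```

Because the access count was somewhat above 5,000 and the duration somewhat under five hours, the true averages would be slightly lower than these values.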

License

All past, present and future versions of SpikeTrap are released into the public domain. Everyone may freely use, reproduce, publish, distribute, and modify SpikeTrap for any purpose, commercial or non-commercial, with or without attribution, to any extent and in any way. All past, present and future rights over SpikeTrap granted by copyright law, even moral rights, are relinquished.

Links

Wpoison
A similar Perl script that inspired SpikeTrap.
KLOTH.NET - Trap bad bots in a bot trap
A script to detect and block spiders that don't obey robots.txt.
Stopping Spambots: A Spambot Trap
A method of blocking spambots using Linux, Apache, mod_perl, Perl, MySQL, ipchains and Embperl.

Contact

Comments, questions, suggestions and corrections for SpikeTrap may be sent to rmuser@emptv.com. I welcome any ideas or improvements.