They're sneaky. And stealthy. They're quiet and mostly unobtrusive, but once you've been visited by them, you'll know it. Because you'll be inundated with a seemingly never-ending stream of spam-mails.
They're email harvesting robots, and chances are you've been visited by one.
What these insidious creatures do is crawl your site, much like the search engine spiders do, and collect any and all email addresses they find there. Many of them crawl your entire site, following every link, gathering email addresses from your guestbook, your message boards, databases, and everywhere else they can get to.
What happens next is so sinister, so unthinkable, I can barely say it. They put your email addresses on CD-ROM and sell them as opt-in lists. You've seen them: "20,000 targeted email addresses for only $29.95!", or my personal favorite, "Send 10 Bazillion emails - WITHOUT SPAMMING!!". What you didn't know was that it was YOUR email address they were selling.
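To find out if your site has been visited by an email harvester, you only need to look at your logs. If your web host provides you with your stats, you can look in the Browser report for harvester agents such as EmailSiphon, EmailWolf, ExtractorPro, CherryPicker, Crescent, WebBandit, NICErsPRO, Telesoft, and EmailCollector. (The same agents appear in the .htaccess rules later in this article.)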
If you don't have a stats program, you can examine your logs for visits from these agents. The easiest way to do this is to download them and open them in a program with a search function (like Wordpad). Then you can search for the names listed above.
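If you would rather not search by hand, the same check can be scripted. What follows is only a minimal sketch in Python: the function name find_harvester_hits is made up for illustration, the agent names are the ones from the .htaccess rules in this article, and the sample lines assume the common "combined" log layout, which may not match what your host produces.

```python
import re

# Harvester names taken from the .htaccess rules in this article.
HARVESTER_NAMES = [
    "EmailSiphon", "EmailWolf", "ExtractorPro", "CherryPicker",
    "WebBandit", "NICErsPRO", "Telesoft", "EmailCollector",
]

# One case-insensitive pattern that matches any of the names.
HARVESTER_RE = re.compile(
    "|".join(re.escape(name) for name in HARVESTER_NAMES), re.IGNORECASE
)

def find_harvester_hits(lines):
    """Return (line_number, matched_name) pairs for lines naming a harvester."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = HARVESTER_RE.search(line)
        if match:
            hits.append((number, match.group(0)))
    return hits

# Two made-up log lines (assumed "combined" format) to try it on.
sample = [
    '10.0.0.1 - - [01/Jan/2001:00:00:01 +0000] "GET / HTTP/1.0" 200 512 "-" "Mozilla/4.0 (compatible)"',
    '10.0.0.2 - - [01/Jan/2001:00:00:02 +0000] "GET /guestbook.html HTTP/1.0" 200 2048 "-" "EmailSiphon"',
]
print(find_harvester_hits(sample))
```

To run it against a real log, pass the open file in place of sample. A file object yields one line at a time, so the function works unchanged.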
So, what can you do to protect your site from these evil robots? Unfortunately, there's no single magic solution. There are, however, steps you can take to discourage them.
The first thing you can do is create a Robots Exclusion file. This is simply a text file named robots.txt that you place in your root directory. It tells robots where they can and cannot go, as well as which robots can and cannot visit your site. The drawback of using this file to combat email harvesting robots is that robots.txt relies on a sort of robot honor system. That is to say, you are assuming that any robot that visits will ask for the file and comply with the directives you put there. Unfortunately, harvesting robots are typically ill-mannered robots that ignore this file. For more information on Robot Exclusion, visit the Robots Exclusion Standard.
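To make the honor system concrete, here is a small Python sketch of the question a polite robot asks before it fetches a page. The robots.txt text is a made-up example that bans EmailSiphon and allows everyone else, and FriendlyBot is an invented agent name. The parser is the one in Python's standard library.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ban one harvester by name, allow everyone else.
ROBOTS_TXT = """\
User-agent: EmailSiphon
Disallow: /

User-agent: *
Disallow:
"""

def allowed(user_agent, path):
    """Answer what a well-behaved robot checks before requesting `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)

print(allowed("EmailSiphon", "/guestbook.html"))
print(allowed("FriendlyBot", "/guestbook.html"))
```

Notice that the check runs inside the robot, not on your server. A harvester that never asks the question is not slowed down at all, which is exactly the weakness described above.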
A really fun solution is to use a CGI script that punishes bad robots. These scripts direct the robot to a page full of fake email addresses, lots and lots of them. So, what the spammer gets is a whole lot of bounced email messages, which will discourage them from visiting you again. The downside of this method is that the robots still collect the valid email addresses as well. Also, most scripts of this type have a little disclaimer attached stating that the authors won't be held responsible for any legal issues that arise from the use of their script, and that has to make you wonder.
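The idea behind such a decoy page can be sketched in a few lines of Python. This is an illustration only, not one of the scripts mentioned above: the function names are invented, and the addresses deliberately use the reserved .invalid top-level domain (set aside by RFC 2606) so that no real mailbox or mail server can ever be hit.

```python
import random
import string

def fake_addresses(count, seed=None):
    """Build `count` made-up addresses under the reserved .invalid domain."""
    rng = random.Random(seed)
    addresses = []
    for _ in range(count):
        user = "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
        host = "".join(rng.choice(string.ascii_lowercase) for _ in range(6))
        addresses.append(f"{user}@{host}.invalid")
    return addresses

def decoy_page(count=100, seed=None):
    """Return an HTML page whose body is nothing but mailto: links."""
    links = "\n".join(
        f'<a href="mailto:{addr}">{addr}</a><br>'
        for addr in fake_addresses(count, seed)
    )
    return f"<html><body>\n{links}\n</body></html>"

print(decoy_page(count=3, seed=1))
```

Passing a seed makes the page repeatable, which is handy for testing. Leave it out to get fresh addresses on every request.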
There are other scripts that hide your email address from the robots, but not from your site visitors. This is a great solution for smaller sites that don't have more than one or two addresses listed. You can find both types of scripts at the CGI Resource Index.
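One common way to hide an address, though not necessarily the one those scripts use, is to write every character as an HTML numeric entity. Browsers decode the entities and show a normal link, while a robot that scans the raw page for text containing "@" finds nothing. The Python sketch below uses invented function names and a placeholder address.

```python
import html

def obfuscate(text):
    """Encode every character of `text` as an HTML numeric entity."""
    return "".join(f"&#{ord(ch)};" for ch in text)

def mailto_link(address, label="Email me"):
    """Build a mailto: link whose source never contains the plain address."""
    # The label is escaped so a stray < or & cannot break the markup.
    href = obfuscate("mailto:" + address)
    return f'<a href="{href}">{html.escape(label)}</a>'

link = mailto_link("webmaster@example.com")
print(link)
# A browser reverses the encoding; html.unescape does the same here.
print(html.unescape(obfuscate("webmaster@example.com")))
```

This only raises the bar. A harvester that decodes entities before searching will still find the address.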
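Another handy script is one that checks whether a robot is friendly and, if not, puts it to sleep for, say, 10,000 seconds (Perl's sleep counts in seconds, not minutes). This will cause the robot to terminate the request and move on to another victim.
use Socket qw(AF_INET);
$number = $ENV{REMOTE_ADDR} || '';
# Split the dotted-quad address; the dot must be escaped in the pattern.
($a,$b,$c,$d) = split(/\./, $number);
$ipadr = pack("C4", $a, $b, $c, $d);
# Reverse lookup of the visitor's host name.
($name,$aliases,$addrtype,$length,@addrs) = gethostbyaddr($ipadr, AF_INET);
# Stall the request if the host or the user agent looks like a harvester.
if (($name || '') =~ /foo\.com/i ||
    ($ENV{HTTP_USER_AGENT} || '') =~ /emailsiphon/i) {
    $access_denied++;
    sleep(10000);   # roughly 2 hours 47 minutes
}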
The last option is, in my humble opinion, the best option. If you have the ability to modify your .htaccess file, you can use Apache's mod_rewrite module to specify certain user agents that are not allowed to visit your site. This effectively blocks the offending robots from ever touching your site. You should definitely check with your hosting provider to see whether or not you can make such a modification. Most hosts will be more than happy to make the modification for you.
For those of you willing and able to make the changes yourself, just add the following to your .htaccess file:
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
RewriteCond %{HTTP_USER_AGENT} ^Telesoft [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/3.Mozilla/2.01 [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector
RewriteRule ^.*$ /badspammer.html [L]
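Before you upload rules like these, it can help to try the patterns against a few user agent strings. The Python sketch below copies the RewriteCond patterns verbatim and tests them with Python's re module. It assumes Python and Apache interpret these simple patterns the same way, and the sample agent strings are invented for illustration.

```python
import re

# The RewriteCond patterns from the rules above, copied verbatim.
PATTERNS = [
    r"^EmailSiphon", r"^EmailWolf", r"^ExtractorPro", r"^Mozilla.*NEWT",
    r"^Crescent", r"^CherryPicker", r"^[Ww]eb[Bb]andit",
    r"^WebEMailExtrac.*", r"^NICErsPRO", r"^Telesoft", r"^Zeus.*Webster",
    r"^Microsoft.URL", r"^Mozilla/3.Mozilla/2.01", r"^EmailCollector",
]

def is_blocked(user_agent):
    """True if any pattern matches, mirroring the [OR]-chained conditions."""
    return any(re.search(pattern, user_agent) for pattern in PATTERNS)

for agent in [
    "EmailSiphon",
    "webBandit/3.5",
    "Mozilla/4.0 (compatible; MSIE 5.5)",
    "Mozilla/4.0 (compatible; EmailSiphon)",
]:
    print(agent, "->", "blocked" if is_blocked(agent) else "allowed")
```

The last sample shows a limit of the leading ^ anchor. A robot that puts its name anywhere but the very start of the string slips through, so you may want to drop the anchor for names that never occur in ordinary browsers.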
While these are all effective measures to fight the Email Snatchers, there are new robots evolving every day. It's important to stay informed with the latest tools that the spammers are using. Some excellent sources of information can be found at:
Search Engine World
http://searchengineworld.com/engine/denied.htm
Apache Today
"Restricting Access by Host"
SpiderHunter.com
http://www.spiderhunter.com/
--------------------------------
© Copyright 2001 Sharon Davis. When she is not waging war on spammers, she is the owner of 2Work-At-Home.Com, Work At Home Articles.net and the Editor of the site's monthly ezine, America's Home. In her spare time she reminisces about what it was like to have spare time.