Spambot Beware

Background and Information

(This is a part of the Spambot Beware site)

This section gives some background and general information about spam, spammers, and spambots, as well as some other things you will need to know to use the information on the Spambot Beware site.

Common terms used throughout this site can be found in the spam glossary. (All glossary terms are also hyperlinked for quick lookup.)

What is spam?

Spam is another term for unsolicited commercial email (UCE). Most people with an account on the Internet are familiar with spam - it usually advertises "spamware" (software for spammers), pornography, shady MLM (multi-level marketing) deals, and other scams. The term was originally used to describe unwanted, off-topic, excessive posting on usenet, but has come to include email as well. For the purposes of this site, spam is email that is sent to people who did not request it. Note that this does not have to be a commercial message - the content of the spam is irrelevant. Here are some other sites that explain the spam problem in much more detail:

What is a spammer?

A spammer, simply, is a person who sends spam. Usually, these are people who think that they are going to get rich on the Internet by flooding it with messages and hoping for a response. They often do get a response. However, the response is from outraged people who receive the spam and complain to the spammer's ISP, which usually gets the spammer's dial-in accounts, email addresses, and/or web pages cancelled. Instead of deleting your spam, become one of those who fight spam. Just try fighting one spam per week, at first. Here are some excellent guides on how to fight spam:

How do spammers get email addresses?

Spammers generally gather email addresses in the following ways:
Using spambots to scour web pages
This is the main focus of this page. Spambots basically follow links and grab email addresses from "mailto" links, storing them as they go along. See the section on spambots below.

Using spambots that scour usenet
If you've been on usenet (aka newsgroups) you know the deal: you have to hide your email address or you will be swamped with spam. Not only do you have to disguise it in the body of your post, but in your newsreader client settings as well. Spambots love to grab those email addresses. Some of the techniques described here can be used for, or adapted to, usenet posting.

Specialized spambots
Some spambots are designed to scour specific places, such as local bulletin boards, chatrooms on AOL, etc. These will not be discussed here, as they are too specialized to worry about. Usually it is up to the company running a service, not the users, to discourage or prevent their use.

Buying lists from other spammers or companies
You may have seen the spam - "Over 1 million email addresses on a CD!" The lists are offered not just on CDs but on ftp sites, web pages, etc. Once your email address is harvested, it may get copied around for years. The only good news is that the sellers want to be paid for their hard work, so it usually costs other spammers some money to buy the addresses. This site will help prevent your email address from ever getting on a CD in the first place.

From a mailing list
This is a particularly despicable way. Spammers join a mailing list, then gather the email addresses of the members, either from a list of the members provided by the mailing list software, or from people as they post. It's hard to avoid this, short of not joining the list. On some mailing lists, you can "lurk", that is, hide your existence so that nobody knows that you are on the list. Until you decide to post, that is. :)

By people themselves
Commonly seen as part of a spam message: "To stop any future mailings, just reply to this message with a subject of REMOVE". Yeah, right. If you reply to the spammer, you accomplish three things:
  1. You verify an email address for the spammer as valid.
  2. You verify to the spammer that you actually read the mail, and took the time to reply to it.
  3. You demonstrate your lack of anti-spam knowledge to the spammer, by falling for this trick.

All of this means that replying makes you more likely to receive more spam. This scheme is also known as an opt-out mailing list and is a terrible alternative to opt-in.

Other ways
There are probably some other ways, but this list covers most of the common ones. Web pages and usenet are the main ways. The Center for Democracy & Technology has written a very good report entitled "Unsolicited Commercial E-mail Research Six Month Report".

What is a spambot?

A spambot is a piece of software, a program that someone has written. Which language it was written in does not matter, but most are probably written in C for speed and portability reasons. A spambot should not be confused with regular robots, also known as spiders or web-crawlers.

A spambot starts out on a web page. It scans the page for two things: hyperlinks and email addresses. It stores the email addresses to use as targets for spam, and follows each hyperlink to a new page, starting the process all over. Spambots also usually do not follow the guidelines in the robots.txt file, as civilized robots are supposed to. Most spambots are a part of a larger program, allowing them to send out the spam to email addresses as they find them. Others merely store the email addresses for later use.
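For contrast, here is a minimal sketch of the check a civilized robot makes before fetching a page. It uses Python's standard urllib.robotparser module. The robots.txt content, the "ExampleBot" name, and the example.com URLs are made up for illustration, and a real robot would download the file from the site rather than hard-code it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, hard-coded for illustration.
# A real robot would fetch this file from the root of the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if the given robots.txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleBot" is an invented robot name.
print(may_fetch(ROBOTS_TXT, "ExampleBot", "http://www.example.com/index.html"))
print(may_fetch(ROBOTS_TXT, "ExampleBot", "http://www.example.com/private/staff.html"))
```

A polite robot skips any URL for which this check returns False. A spambot typically never asks the question at all.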

Spambots vary in their intelligence and sophistication, but even the smartest can be fairly easily fooled by the tricks on this site. The simplest spambot just finds mailto links and follows each hyperlink as it comes up, until it reaches a dead end. The smartest ones can recognize email addresses in many forms, recognize dead links, avoid certain types of email addresses (such as *.edu and *.gov) and track many pages at once.
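To see what the simplest kind of spambot would find on one of your own pages, the scanning step can be sketched roughly as follows. This is only an illustration, not the code of any actual spambot: the PageScanner class, the scan_page function, and the sample page are all invented here, and the sketch reads a string of HTML rather than fetching anything from the network.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urljoin


class PageScanner(HTMLParser):
    """Collect mailto addresses and ordinary hyperlinks from one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.emails = set()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        if href.lower().startswith("mailto:"):
            # Drop the "mailto:" scheme and any "?subject=..." suffix.
            address = unquote(href[len("mailto:"):].split("?", 1)[0])
            if address:
                self.emails.add(address)
        else:
            # Resolve relative links against the address of the page.
            self.links.add(urljoin(self.base_url, href))


# Made-up sample page: one mailto link, one disguised address, two hyperlinks.
SAMPLE_PAGE = """
<html><body>
<p>Contact <a href="mailto:webmaster@example.com?subject=Hello">the webmaster</a>.</p>
<p>Or write to sales "at" example.com.</p>
<a href="/about.html">About</a>
<a href="http://www.example.org/">Elsewhere</a>
</body></html>
"""


def scan_page(html, base_url):
    """Return (sorted email addresses, sorted absolute links) found in html."""
    scanner = PageScanner(base_url)
    scanner.feed(html)
    scanner.close()
    return sorted(scanner.emails), sorted(scanner.links)


emails, links = scan_page(SAMPLE_PAGE, "http://www.example.com/index.html")
print(emails)  # ['webmaster@example.com']
print(links)   # ['http://www.example.com/about.html', 'http://www.example.org/']
```

Note that the address written as sales "at" example.com is not picked up, because this sketch only looks at mailto links. A smarter spambot might recognize that form, so a check like this only tells you what the least sophisticated harvesters would see.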



Written by Greg Sabino Mullane (greg "at" turnstep.com). Last update March 30, 2003.
