For most of the last decade it's been obvious that adding a mailto: link to a Web site is an invitation to be spammed. While such links exist to make it easy for people to contact us, they also make the spammers' job easier. Email addresses and mailto: links are so easily identified that automated address-harvesting programs traverse the Web, adding whatever addresses they find to spammers' lists. The result: what began as a way to encourage contact has become a liability. Not only is the simple mailto: link long dead; these days, simply posting an email address publicly is guaranteed to draw down the deluge.
Junk email is an annoyance in both public and private spheres. It wastes public resources (the capacity required to store and transmit the message across the Internet) and private resources (mailbox space, the time required to download it, and your attention).
More important and insidious, the pollution of public space discourages citizens from participating. The knowledge that a third party could disclose your email address has a chilling effect on public speech. For example, some Web archives of mailing lists do not remove the email addresses of contributors. What was disclosed to a private group has now become public (and thus harvestable). This discourages some people from participating, simply to protect their email addresses from pollution. Thus the chilling effect extends to private discussions, and to participation in any forum that requires disclosure of an email address. Worse, this chilling effect disproportionately affects those who care about issues of privacy and security -- precisely those who have informed opinions on the subject.
How can we prevent this from being a problem? In general, strategies for preserving email addresses fall into two categories: keeping an address secret, and increasing the cost of sending email. Of the nine techniques to be discussed, the most successful all raise the cost of email in one way or another.
Use HTML's numeric entities to build a mailto: link. That is, replace 'm' with '&#109;', 'a' with '&#97;', etc.
Advantages: Simple to code (there are Web sites that translate plain text into numeric entities).
Drawbacks: Trivial for address harvesters to crack.
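To make both the technique and its weakness concrete, here is a minimal Python sketch. The function name and the address foo@bar.com are placeholders chosen for illustration, not part of any particular tool.

```python
import html


def to_numeric_entities(text: str) -> str:
    """Encode every character as a decimal HTML numeric entity."""
    return "".join(f"&#{ord(ch)};" for ch in text)


# Placeholder address, used only for illustration.
encoded = to_numeric_entities("mailto:foo@bar.com")
link = f'<a href="{encoded}">contact</a>'

# A harvester needs just one standard-library call to undo the encoding.
decoded = html.unescape(encoded)
```

Because any HTML parser decodes entities as a matter of course, the encoded link costs a harvester essentially nothing extra to read.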
Rather than providing an email address in text, provide an image of the text.
Advantages: Simple and effective.
Drawbacks: Annoys correspondents; doesn't work if the browser doesn't render images; vulnerable to man-in-the-middle type attacks.
Address mangling. One can provide the pieces of an address for visitors to assemble (e.g., foo @ bar.com), or add junk that must be removed.
Drawbacks: Annoys correspondents.
Rely on correspondents' common sense. If your address is obvious, don't publish it. A visitor who knows your user name and your domain (bar.com, say) can reasonably be expected to try the address formed from the two.
Drawbacks: Annoys correspondents.
CGI. Correspondents submit a message via a form on a Web page. A CGI script mails the message to the recipient.
Advantages: Secure; prevents disclosure of email addresses to the public.
Drawbacks: Doesn't work offline (not a big problem); requires the ability to add and execute a script on a Web server, a privilege usually restricted to Webmasters; incompatibility with standard email clients makes prolonged correspondence difficult (for example, you can't easily search old messages).
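The heart of such a script is small. The sketch below, in Python, shows only the step that turns submitted form fields into a mail message; the addresses and the function name are hypothetical, and the Web-server plumbing and the actual sending are left out.

```python
from email.message import EmailMessage

# Hypothetical addresses. The recipient lives only on the server,
# so it never appears in the page sent to the browser.
RECIPIENT = "owner@example.org"
FORM_SENDER = "webform@example.org"


def build_message(sender_name: str, reply_to: str, body: str) -> EmailMessage:
    """Turn submitted form fields into a mail message for the recipient."""
    # Collapse all whitespace, including newlines, so that form input
    # cannot smuggle extra header lines into the message.
    clean_name = " ".join(sender_name.split())
    clean_reply = " ".join(reply_to.split())

    msg = EmailMessage()
    msg["To"] = RECIPIENT
    msg["From"] = FORM_SENDER
    msg["Reply-To"] = clean_reply
    msg["Subject"] = f"Web form message from {clean_name}"
    msg.set_content(body)
    return msg


msg = build_message("Alice\nBcc: victim@example.com", "alice@example.net", "Hello!")
```

A real deployment would hand the result to a mail server, for instance with smtplib.SMTP("localhost").send_message(msg), and would also want rate limiting so the form itself does not become a spam channel.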
Address-hiding methods have a glaring weakness: if they allow a visitor to determine an address, then it is only a matter of time before that address escapes into the public and falls victim to address harvesters. The only address-hiding technique that avoids this problem is the use of a CGI script, but most people on the Internet can't use this technique.
Increasing the cost of email only works if the likelihood of getting through to a randomly chosen address is decreased. This raises the cost of unsolicited email while decreasing its success rate.
Pass phrases. Reject mail that does not include a special keyword or pass phrase.
Advantages: Simple; easy to change if the pass phrase becomes poisoned.
Drawbacks: Annoys correspondents; doesn't discourage spam, just automates ignoring it.
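A pass-phrase filter can be as simple as the following Python sketch. The phrase and the function name are invented for illustration; a real filter would run inside a mail client or delivery agent.

```python
PASS_PHRASE = "purple aardvark"  # hypothetical phrase, published alongside the address


def accept(subject: str, body: str, pass_phrase: str = PASS_PHRASE) -> bool:
    """Accept a message only if the pass phrase appears in its subject or body."""
    needle = pass_phrase.casefold()
    return needle in subject.casefold() or needle in body.casefold()
```

Changing a poisoned phrase means editing one constant and republishing it, which is why the technique is easy to maintain. The spam still arrives, though; it is merely discarded unseen.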
Challenge/response protocols. Mail from an unknown sender generates an automatic reply asking the sender to perform some task: responding with some key phrase, filling out a Web form, et cetera. Completion of the task adds the sender's address to the recipient's whitelist.
Advantages: Practically eliminates spam.
Drawbacks: Annoys correspondents; doesn't discourage spam, just automates ignoring it; unusable by those with primitive email clients (for example, Web mail); vulnerable to man-in-the-middle type attacks.
Requiring a pass phrase collapses the challenge/response protocol into one step.
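The bookkeeping behind a challenge/response scheme might look like the Python sketch below. The class and method names are hypothetical, and sending the challenge message itself is left to the surrounding mail system.

```python
import secrets


class ChallengeResponseFilter:
    """Hold mail from unknown senders until they answer a challenge."""

    def __init__(self):
        self.whitelist = set()
        self.pending = {}  # token -> (sender, held message)

    def receive(self, sender, message):
        """Return ('deliver', message) or ('challenge', token)."""
        sender = sender.lower()
        if sender in self.whitelist:
            return ("deliver", message)
        token = secrets.token_hex(8)
        self.pending[token] = (sender, message)
        return ("challenge", token)

    def respond(self, sender, token):
        """Complete a challenge; return the released message, or None."""
        entry = self.pending.get(token)
        if entry is None or entry[0] != sender.lower():
            return None
        del self.pending[token]
        self.whitelist.add(entry[0])
        return entry[1]
```

Once a sender has answered a single challenge, later messages from that address are delivered directly. Note that the sketch trusts the sender address as given, which is one reason forged addresses remain a weakness of this approach.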
Contact tokens. Jakob Nielsen discusses these on p. 369 of Designing Web Usability. A contact token is a bit of executable code that regulates access to an individual. Each token has its own cost, rules, and lifetime. For example, a token keyed to family members could have zero cost and no expiration date, another for the general public could cost a small amount and expire after a month, while one for immediate access could have a significant cost.
Advantages: Flexibility. Each person can set his or her own rate for email access.
Drawbacks: Contact tokens won't be adopted without widespread support in email clients, but it is very difficult to encourage software companies to support a new technology that isn't ubiquitous. It's a classic chicken-and-egg problem.
Another drawback is that most users do not want to actively manage email access. (This is part of what has kept PGP and its descendants in their tiny niche.)
(Note that a pass phrase is a primitive contact token.)
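As a rough illustration of the idea (not Nielsen's design, which the book describes only in general terms), the three example tokens could be modelled in Python as follows. The class, its field names, the cost units, and the dates are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class ContactToken:
    label: str
    cost: float                  # arbitrary units; the currency is left unspecified
    expires: Optional[datetime]  # None means the token never expires

    def admits(self, payment: float, now: datetime) -> bool:
        """True if the token is still valid and the payment covers its cost."""
        if self.expires is not None and now >= self.expires:
            return False
        return payment >= self.cost


issued = datetime(2024, 1, 1)  # arbitrary issue date for the example
family = ContactToken("family", 0.0, None)
public = ContactToken("public", 0.05, issued + timedelta(days=30))
urgent = ContactToken("immediate", 5.0, None)
```

A pass phrase corresponds to a token with zero cost and no expiry whose only rule is "know the phrase".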
Hashcash. Adam Back created hashcash, which attaches a variable cost to each email message, paid by the sender. The cost isn't money, but rather CPU time. For example, you could set the cost of messages from senders you don't know to be five seconds of the sender's CPU time, while messages from friends are free. The price of the message is a small inconvenience to the sender.
Implementing hashcash is a fundamental change to the nature of email. Assigning a small CPU cost to messages wouldn't burden most people, while simultaneously making spam not only ineffective but unprofitable -- the surest way to kill the junk email industry.
Advantages: Flexibility; transfers cost of email to sender.
Drawbacks: Same as for contact tokens.
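The CPU cost comes from a proof-of-work puzzle: the sender must find a stamp whose hash begins with a given number of zero bits. The Python sketch below is a simplified illustration of that idea and not the real hashcash stamp format; it uses SHA-256, omits the date field and the double-spend check, and the function names are invented here.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a hash digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def mint(recipient: str, bits: int) -> str:
    """Sender's side: search for a stamp with enough leading zero bits."""
    for counter in count():
        stamp = f"{recipient}:{counter}"
        if leading_zero_bits(hashlib.sha256(stamp.encode()).digest()) >= bits:
            return stamp


def verify(stamp: str, recipient: str, bits: int) -> bool:
    """Recipient's side: a single hash confirms the sender did the work."""
    addr, _, counter = stamp.rpartition(":")
    if addr != recipient or not counter.isdigit():
        return False
    return leading_zero_bits(hashlib.sha256(stamp.encode()).digest()) >= bits


stamp = mint("foo@bar.com", 12)  # placeholder address; about 4,000 hashes on average
```

Each extra bit doubles the sender's expected work while verification stays at one hash, so the recipient can tune the price without paying it.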
Abandon email. Radical, but some have done it (e.g., Donald Knuth). Raises the bar of contact to the cost of creating and mailing a letter via the postal service.
Advantages: Completely effective.
Making email contact more expensive raises a number of problems. For example, many of the techniques on this page assume the existence of another means of transmitting clues to one's email address. If you don't have a Web page, many of these techniques are useless.
A more serious problem is that many of the techniques outlined above attack only part of the problem. Once your email address becomes public, it will be a target for spam. Even if your email filters are perfect, and you never see a single unwanted message, spam sent to your address still wastes Internet resources. Indeed, the cost of spamming is so low that decreasing success rates might encourage spammers to increase their efforts. The Internet could become a place where personal messages can't get through mail servers overwhelmed by ever-increasing mountains of spam. This is where John Gilmore's open-relay/smart-email-programs approach fails.
What's most frustrating is that the technology exists for putting spammers out of business while preserving the best features of the current email system, yet it hasn't been put in place. Am I going to have to start writing hashcash plugins for popular email clients?