Spam on Agora

Spam is an epidemic of sewage polluting the entire Internet. Words cannot do justice to my contempt for the slimeballs doing their level best to make the Internet unusable. Here are some graphs showing how I'm managing the filth (and I don't mean porn!).

My primary tool has been SpamAssassin. Installing it was a major breath of fresh air. I was getting 100-150 spams a day, and it seemed like I had to check my mail every hour or two just to delete the crap. Real email was in danger of getting lost. Once I installed SpamAssassin, that dropped to 5-10. It took a little while to realize why I suddenly had more time! But over the next year or two, the volume built back up. The following graph shows roughly linear growth in the spam filtered by SpamAssassin (just to me!) over the first half of 2003, with a huge peak around 3000 (!) messages/day, and running 1000+ a day through May and June. Agora was melting down under the load.

That sudden cliff just before day 200 occurred when I discovered greylisting. Installing it to block spam before it even got into the system made Agora usable again. This graph shows the amount of mail that greylisting has blocked or delayed since I installed it. Unfortunately, there's no easy way to count only the mail that never comes back for a retry, so the count includes legitimate mail delayed on its first attempt too. I believe it is still good to within an order of magnitude as a measure of the spam blocked.
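
For readers who haven't met greylisting, here is a minimal sketch of the general idea in Python. It is not the software running on Agora: the class name, the 300-second delay, and the (client IP, sender, recipient) key are all assumptions made for illustration. The first delivery attempt for an unseen triplet gets a temporary rejection, and a retry after the delay is accepted. Note that the counter has the same limitation described above: it counts every deferral, including first attempts from legitimate senders.

```python
import time


class Greylist:
    """Toy greylist keyed on (client IP, sender, recipient). Illustrative only."""

    def __init__(self, min_delay=300, now=time.time):
        self.min_delay = min_delay  # seconds a sender must wait before a retry is accepted (assumed value)
        self.now = now              # injectable clock, so the logic can be exercised without waiting
        self.first_seen = {}        # triplet -> time of its first delivery attempt
        self.deferred = 0           # every deferral, legitimate first attempts included

    def check(self, client_ip, sender, recipient):
        """Return "defer" (temporary rejection) or "accept" for one delivery attempt."""
        triplet = (client_ip, sender, recipient)
        t = self.now()
        first = self.first_seen.get(triplet)
        if first is None:
            # Never seen before: remember it and ask the sender to try again later.
            self.first_seen[triplet] = t
            self.deferred += 1
            return "defer"
        if t - first < self.min_delay:
            # Retried too quickly: keep deferring.
            self.deferred += 1
            return "defer"
        # A patient retry, which is what a real mail server would do.
        return "accept"
```

A real mail server queues the message and retries later, so it gets through on a second attempt; spam software that fires once and moves on never does. Spammers who do bother to retry get through as well, which is the weakness of the scheme.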

A few months ago, I decided to start tracking how much was leaking through as well. It wasn't too bad, but quite a bit was still getting through, and when I looked at the SpamAssassin reports on those messages, it turned out that most got through because the Bayesian rules scored them as unlikely to be spam. I realized that all that random crap spammers are putting into their messages is working: it renders the Bayesian rules helpless. The rules were just consuming vast quantities of disk space. So I turned them off, just before day 100 in this graph of spam getting through to my inbox.
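
To see why padding a message with random text can fool a Bayesian filter, here is a toy naive Bayes scorer in Python. It is only a sketch: SpamAssassin's real Bayes rules combine token probabilities differently, and every word and probability in the table below is invented for illustration. The point is that each innocent-looking word drags the combined score toward "not spam", so enough of them can outvote the handful of words that give the message away.

```python
import math

# Hypothetical per-token probabilities that a message containing the token is spam.
TOKEN_SPAM_PROB = {
    "viagra": 0.99, "free": 0.90, "click": 0.85,   # words learned from spam
    "meeting": 0.10, "garden": 0.15,               # words learned from real mail
    "thursday": 0.10, "report": 0.12,
}


def spam_probability(tokens, table=TOKEN_SPAM_PROB, unknown=0.5):
    """Combine per-token probabilities with naive Bayes, assuming equal priors."""
    log_spam = 0.0
    log_ham = 0.0
    for tok in tokens:
        p = table.get(tok, unknown)   # tokens the filter has never seen are neutral
        log_spam += math.log(p)
        log_ham += math.log(1.0 - p)
    # P(spam) = e^log_spam / (e^log_spam + e^log_ham), rearranged for stability
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))


plain = ["viagra", "free", "click"]
padded = plain + ["meeting", "garden", "thursday", "report"] * 2

print(spam_probability(plain))    # close to 1: clearly spam
print(spam_probability(padded))   # close to 0: the padding wins
```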

So far, I'm holding in the 10-20 leak range, which is at least tolerable, if still annoying. We'll see where the next battleground will be... I do notice that in the last month there's been an upward trend in spam getting through the greylist, so apparently spammers are bringing on more resources to do the retries...