Paul Wouters's (paul@xtdnet.nl) personal spam statistics 1997-2004

Total amount of spams received in my life as of January 1st, 2005: 141.329 spams

Most of the current graphs are made with Mail::Graph, which can be downloaded through CPAN. Some are made manually with gnuplot.


Eight years of spam

I started putting up these graphs to show people why exactly spam (or UCE) is such a very bad thing. Sadly, I was right. I only wish people had realised this a few years ago. The result is now very disheartening:

All spams available

All the spam I have received is available verbatim in my spam archive. A Unix mbox file of all my spam is available upon request, with proper motivation.

Note that some spam was addressed to me directly, while the rest reached me because I'm the "postmaster" for various domains.

Also, during 1999 I enabled the RBL on top of the previous anti-spam measures. In 2000 we switched off the RBL, after many RBLs with weird policies appeared, and moved to a tagging-only system. We are currently using SpamAssassin to mark potential spam as such, which uses various blacklists through DNS. We also use a virus scanner to remove viri.
2003 marks the year where I had to give up checking my spam folder for false positives; it just became undoable.

If someone knows what happened in September 2004 that explains the decrease in spam, I would be very interested. Perhaps this is related to a software upgrade (e.g. SpamAssassin), but if so, I cannot remember it.

2005 marks the year where I have given up collecting spam. It has become impossible to tell a virus, a bounce, an error and a piece of spam apart, and I think by now my original point has been made clear.

  • my old xs4all/hacktic spam
  • 1997 (541 spams)
  • 1998 (661 spams)
  • 1999 (598 spams)
  • 2000 (1230 spams)
  • 2001 (1629 spams)
  • 2002 (7796 spams)
  • 2003 (33469 spams)
  • 2004 (95405 spams) <--- Clicking this link will likely kill your browser!
  • Total amount of spams received in my life as of 01-01-2005 00:00 is 141.329 spams
(Discrepancies in numbers are due to the difference in processing email between Mail::Graph and Hypermail. Hypermail drops messages with the same Message-ID, and apparently some spam runs happen multiple times with the same Message-ID. Therefore Hypermail sees 95386 spams in 2004 while Mail::Graph sees 96882 spams.)
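
The effect described in the note above can be reproduced on any mbox file. The following is only a minimal sketch using Python's standard mailbox module, not the way Mail::Graph or Hypermail actually work internally. The function name count_messages and the path spam-2004.mbox are made up for illustration. It counts every message once, and then counts again while skipping messages whose Message-ID was already seen.

```python
import mailbox


def count_messages(mbox_path):
    """Return (total, unique) message counts for a Unix mbox file.

    'total' counts every message; 'unique' skips messages whose
    Message-ID header was already seen (an assumption about how a
    de-duplicating archiver would behave, not Hypermail's real code).
    """
    box = mailbox.mbox(mbox_path)
    total = 0
    unique = 0
    seen_ids = set()
    try:
        for msg in box:
            total += 1
            msg_id = msg.get("Message-ID")
            if msg_id is None:
                # No Message-ID at all: it cannot collide, so keep it.
                unique += 1
                continue
            msg_id = str(msg_id).strip()
            if msg_id not in seen_ids:
                seen_ids.add(msg_id)
                unique += 1
    finally:
        box.close()
    return total, unique


# Hypothetical usage:
# total, unique = count_messages("spam-2004.mbox")
# print(total, unique, total - unique)
```

The difference between the two counts is the number of messages that reused a Message-ID, which is the kind of gap the note describes.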


Effect of Verisign Wildcard and SoBig virus

The graph below is a close-up of the last few months, in which a few major things happened: the closure of some blacklist DNS services, the SoBig virus, and the Verisign Wildcard issue. The big peak on August 26 is the result of the SoBig virus. An especially interesting tidbit is that we never return to pre-SoBig levels! Therefore one has to conclude that the last few months of insane spam levels are mostly due to SoBig-infected machines still running and spamming. I do not see any noticeable spam increase as a result of the Verisign Wildcard. That does NOT mean I think the wildcard is a good idea! I am strongly opposed to Verisign's stupid wildcard idea to spam its ad-driven search engine at people who make a typo!

Viri statistics

NOTE: I have two DVDs full of viri collected over the years until 31-12-2004, but they still need to be processed. Any volunteers? Since the SoBig virus went beyond any previous levels, it flattened out the virus statistics completely. Below is the graph, which is capped at 200 viri/day.
Here is the uncapped graph, including the first days of MyDoom/Novarg.

SpamAssassin


I am receiving regular comments, feedback and threats. Some come from people who have now seen the light about their past mistake and apologised, some from people who asked me how to improve their search engine hits to get more visitors (duh!), and the most amusing ones are the legal threats people have sent me. Some even try to use copyright infringement as the basis of their threat :) These people invariably never contact me again after I inform them that all my email (including their complaint) and any public and/or legal documents arising out of a lawsuit are going to be public and, if needed, put on this same page as well. Of course, if you appear on these pages and this is a mistake or otherwise misleading information about you or your company, feel free to contact me with more information, and if appropriate, your spam will be anonymised.

The UEFF

2003 was also the year of the attack of the "United Email Freedom Front" (I bet these guys never saw Life of Brian, or they wouldn't have picked that name). During April they threatened to DDoS me off the net if I didn't remove my spam archive. Of course I didn't comply, and went through a few DDoS attacks. The archive survived, and is still online. Though the motivation of this group/person was not clear, there are a few hints that it might be related to the MegaMania 'pump and dump' stock-fraud operation. But it gave me some nice "spammer statistics". As a result of those spam runs forged in my name, we received: Now let's assume that 25% of the addresses used in the spam runs were valid email addresses. That would mean that the spam run size would be 10.000. Of those, 400 people WOULD HAVE GIVEN their address to the spammers in some way. That would be 4 percent! We also had to block our info@ address for a while, and blocked over 55.000 attempts to mail us. Where our website normally serves about 10.000 hits/day, it was doing between 180.000 and 300.000 hits/day during the attack.
If anyone has any real statistics on the validity of addresses in an average run, I would be interested in those. Rejo Zengers of Spamvrij, a Dutch anti-spam organisation, analysed a spam CD which he received from a spammer.
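
The 4 percent above is simply the 400 responders divided by the assumed run size of 10.000. A small sketch of that arithmetic follows. The function name response_rate is made up for illustration. The second calculation, against valid addresses only, is not in the text above and assumes that the 25% validity guess applies to the whole run.

```python
def response_rate(responders, recipients):
    """Return responders as a percentage of recipients."""
    if recipients <= 0:
        raise ValueError("recipients must be positive")
    return 100 * responders / recipients


run_size = 10_000       # assumed spam run size, from the text
responders = 400        # people who would have given their address
valid_fraction = 0.25   # the text's guess for valid addresses

# Share of the whole run that responded: 4.0 percent.
rate_of_run = response_rate(responders, run_size)

# Assumption, not from the text: measure only against the
# addresses that were valid in the first place.
rate_of_valid = response_rate(responders, run_size * valid_fraction)

print(rate_of_run, rate_of_valid)
```

If the 25% guess is too low or too high, the run size and both percentages shift with it, which is why real numbers on address validity would be welcome.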

See also: The UEFF page


Spamming is profitable

And for all those people who wonder why spammers do it, the reason is obviously very simple:

Anonymous pornsite webserver hits since they started spamvertising

Please remember, make spam unprofitable. NEVER buy anything that has been spamvertised. If you do, you are PART OF THE PROBLEM.

Conclusion

The only conclusion I can reach, after interpreting the spam-related events that have happened to me in the last year, is that spam has become a matter of organised crime. It should be cracked down on. It also makes me look in fear at VoIP, especially if combined with ENUM. If we don't design it to be spam-proof, we will not only lose email, we will also lose the usefulness of our mobile phones!


Most popular topics by spam

To end on a happier note, let me provide you with the most popular search engine phrases I have received as a result of publishing all my spams verbatim on the web, Google indexing them, and people using Google to find them.

One month stood out in which one search word ruled the statistics: Xanax in August. For entries which occurred in the top 20 in at least 5 months, the (absolute) total becomes:
rank  search term                             # of hits
1     (free) naughty cards                    8061
2     r                                       3167
3     tonya harding (wedding) sex tape/video  2722
4     el*mono*mario                           2370
5     nikole kidman                           2271
6     (free) ps2 games (download)             2174
7     nip slips                               1793
8     amandacam                               1763
9     princess diana pictures                 1511
10    disney x                                1316
11    milf                                    1164
12    see through clothes                     945
13    rüyatabirler                            810
14    carol shelby                            698
15    angelina jolie                          630
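
For anyone wanting to build a similar ranking from their own web server statistics, the selection rule used for the table (only terms that reached the monthly top 20 in at least 5 months, ranked by total hits) can be sketched as below. This is not the script that produced the table. The function name rank_recurring_terms and the input layout (one dictionary of term to hits per month) are assumptions, and the sketch only sums hits from the months in which a term made that month's list.

```python
from collections import defaultdict


def rank_recurring_terms(monthly_top, min_months=5):
    """Rank search terms that appear in enough monthly top lists.

    monthly_top maps a month label to a {term: hits} dictionary
    holding that month's top entries (hypothetical layout).
    Returns (term, total_hits) pairs, highest total first.
    """
    totals = defaultdict(int)
    months_seen = defaultdict(int)
    for hits_by_term in monthly_top.values():
        for term, hits in hits_by_term.items():
            totals[term] += hits
            months_seen[term] += 1
    recurring = [
        (term, total)
        for term, total in totals.items()
        if months_seen[term] >= min_months
    ]
    # Sort by total hits, largest first.
    return sorted(recurring, key=lambda pair: pair[1], reverse=True)


# Hypothetical usage with placeholder data (not real statistics):
# monthly_top = {"2004-01": {"term a": 10}, "2004-02": {"term a": 20}}
# print(rank_recurring_terms(monthly_top, min_months=2))
```

A one-month spike such as Xanax in August drops out of such a ranking, because it fails the minimum number of months.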


Paul Wouters