At the time of this writing (December 5th, 2006) a significant portion of spam comes from machines running Windows operating systems. This has been linked to the growing number of virus-infected Windows machines that are connected to the internet and remotely controlled by spammers. By greylisting only email coming from Windows operating systems we significantly lower the impact on the system and reduce customer complaint levels.
This implementation of greylisting records the IP address of the sender and the to and from addresses. We store this information as a filename in a hashed directory structure, using the timestamp on the file as the time of the initial connection attempt. The first attempt to send an email always results in a deferral message from the qmail machine (while the information is recorded). If the sender is a real email server it will keep retrying the email for up to a week. Most spammer software will not attempt to resend the email, or will only wait a few seconds or a minute. By default we picked a wait time of 110 seconds, or just under two minutes. In testing we saw Hotmail, which sends from Windows machines, try once, then a second time 60 seconds later, then a third time 2 minutes later, then again 10 minutes after that. With a 110 second wait time Hotmail will get through within two minutes.
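The defer-then-accept decision described above can be sketched in a few lines of Python. This is only an illustration, not the qmail patch itself: the function name, the in-memory dict standing in for the on-disk records, and the example addresses and retry times are all assumptions made for the sketch.

```python
import time

MIN_WAIT_SECONDS = 110  # default wait time mentioned in the text


def check_greylist(first_seen, remote_ip, mail_from, rcpt_to, now=None):
    """Return "defer" or "accept" for one delivery attempt.

    first_seen is a dict mapping (ip, from, to) to the time of the first
    attempt. It is a hypothetical stand-in for the on-disk records.
    """
    now = time.time() if now is None else now
    key = (remote_ip, mail_from, rcpt_to)
    started = first_seen.get(key)
    if started is None:
        # First attempt: remember when we saw it and defer.
        first_seen[key] = now
        return "defer"
    # The clock runs from the initial attempt and is not reset by retries.
    if now - started < MIN_WAIT_SECONDS:
        return "defer"
    return "accept"


if __name__ == "__main__":
    seen = {}
    # Illustrative retry times in seconds after the first attempt.
    for offset in (0, 60, 120):
        verdict = check_greylist(seen, "192.0.2.1", "a@example.com",
                                 "b@example.net", now=offset)
        print(offset, verdict)
```

A sender that never retries, or that retries only within the first 110 seconds, is never accepted in this sketch, which is the behaviour the text attributes to most spammer software.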
Passive Fingerprinting
Passive fingerprinting is a nifty technique that can determine the operating system of a remote machine by examining the bits, or fingerprint, of its network traffic. Since it does not require sending any network traffic to probe the remote system it is called passive. By limiting the problem space to only Windows machines, which send the bulk of spam traffic, we narrow the impact of delayed sending problems caused by broken Windows mail servers.
A cron job, run every 2 hours or so, parses the smtp log files looking for Windows server IPs that successfully sent email through the greylist system. We learn these IPs and build the file /home/vpopmail/etc/whitelist, then concatenate the contents of /home/vpopmail/etc/tcp.smtp.perm with this newly generated whitelist file to create /home/vpopmail/etc/tcp.smtp. That file is compiled into tcp.smtp.cdb with vpopmail's clearopensmtp program when vpopmail is configured with the roaming users feature; otherwise we use the regular /usr/local/bin/tcprules program.
The tcp.smtp whitelisting sets an environment variable, similar to how we set the RELAYCLIENT variable. We add the string WHITELIST="" to the IP's allow line. When the machine connects again the tcpserver program will set the WHITELIST environment variable. Setting environment variables is an elegant way to pass settings to child programs. When qmail-smtpd starts up it checks whether the WHITELIST variable is set. If it is, qmail-smtpd skips Windows greylisting and accepts the email immediately.
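As a rough sketch of the two halves of this mechanism, the Python below formats one allow line per learned IP, appends those lines to the permanent rules, and shows the kind of environment check a child program can make. The function names and example addresses are hypothetical, the list of learned IPs is taken as given (log parsing is not shown because the log format is not described here), and the real check lives in qmail-smtpd, not in Python.

```python
import os


def whitelist_lines(ips):
    """Format one tcprules allow line per IP, sorted and de-duplicated."""
    return [f'{ip}:allow,WHITELIST=""' for ip in sorted(set(ips))]


def build_tcp_smtp(perm_text, ips):
    """Return the text of tcp.smtp: permanent rules, then whitelist rules."""
    parts = [perm_text.rstrip("\n")] if perm_text.strip() else []
    parts.extend(whitelist_lines(ips))
    return "\n".join(parts) + "\n"


def is_whitelisted(environ=None):
    """True when WHITELIST is set at all, even to the empty string."""
    environ = os.environ if environ is None else environ
    return "WHITELIST" in environ


if __name__ == "__main__":
    perm = '127.:allow,RELAYCLIENT=""\n:allow\n'
    print(build_tcp_smtp(perm, ["203.0.113.9", "198.51.100.7"]), end="")
    print(is_whitelisted({"WHITELIST": ""}))
```

The generated text would then be compiled with clearopensmtp or tcprules as described above. Note that the check tests for the presence of the variable, not its value, since the rule sets it to an empty string.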
Automatic whitelisting needs no maintenance, since the smtp log files are managed by multilog, which enforces a maximum log file usage, meaning old log files are deleted automatically. If a whitelisted machine never sends email again its IP will fall off the end of the logs and will not be included in the whitelist file the next time the cron job runs. Active whitelisted Windows sites that send email every day will remain whitelisted, since their IP keeps being added to the log each time they send email.
Impact on Spam Filters
Since Windows Greylisting drastically reduces the number of emails scanned, it also reduces the need for highly accurate spam filtering. It also reduces the system resources consumed, especially by SpamAssassin. Dspam, being more efficient, does not consume as many resources, and the lower number of spam emails also makes it simpler for dspam to develop new rules.
Following the old adage to try the simplest solution first, we tried deferring 80% of all Windows connections. This had a great impact, as 80% of all spam was instantly removed from the system, freeing up much needed and previously unavailable system resources. Unfortunately, as they know in Vegas, with this type of statistical method it is possible for a valid email to be refused on every attempt.
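To put a number on that risk, here is a small calculation. It assumes each delivery attempt is deferred independently with probability 0.8, which is a simplification made for illustration and not something measured on the system; the function name is likewise made up for the sketch.

```python
def chance_all_deferred(defer_rate, attempts):
    """Probability that every one of `attempts` tries is deferred,
    assuming each try is deferred independently at `defer_rate`."""
    return defer_rate ** attempts


if __name__ == "__main__":
    for attempts in (1, 3, 5, 10):
        print(attempts, round(chance_all_deferred(0.8, attempts), 3))
```

Under that assumption a legitimate sender still has roughly a 51% chance of being refused three times in a row, and about an 11% chance of being refused ten times in a row.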
So instead we switched to a greylisting method. We chose to key off the remote_ip, mail_from and rcpt_to information in the envelope header and store it in a hashed directory tree. A tree one level deep proved to be too small, so we added a second level. That proved to be fine for a small server, but a busy ISP server was eating the disk alive, so we switched to a MySQL backend. The MySQL backend was faster on the busy server. We then tried it successfully on a busy 6 machine ISP cluster sharing a centralized MySQL server. Watching the smtp logs we regularly saw valid email from Hotmail or Exchange servers being accepted after being greylisted. On average 85% of all email is blocked, while 100% of customer email and all valid email from remote sites is accepted.
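For the file-based variant, a two-level hashed directory tree might look something like the sketch below. The choice of SHA-256, the two-character directory names and the function names are all assumptions for illustration; the text does not say how the actual implementation hashes its keys. The file's modification time stands in for the time of the initial connection attempt, and 110 seconds is the default wait time.

```python
import hashlib
import os
import time

MIN_WAIT_SECONDS = 110  # default wait time from the text


def record_path(base_dir, remote_ip, mail_from, rcpt_to):
    """Map the (ip, from, to) triple to base_dir/xx/yy/<digest>."""
    key = f"{remote_ip}\0{mail_from}\0{rcpt_to}".encode()
    digest = hashlib.sha256(key).hexdigest()
    return os.path.join(base_dir, digest[:2], digest[2:4], digest)


def check_greylist_on_disk(base_dir, remote_ip, mail_from, rcpt_to, now=None):
    """Return "defer" or "accept", using the file mtime as the first-seen time."""
    now = time.time() if now is None else now
    path = record_path(base_dir, remote_ip, mail_from, rcpt_to)
    try:
        first_seen = os.stat(path).st_mtime
    except FileNotFoundError:
        # First attempt: create an empty record file and stamp it.
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w"):
            pass
        os.utime(path, (now, now))
        return "defer"
    return "accept" if now - first_seen >= MIN_WAIT_SECONDS else "defer"
```

Every new triple costs a file and possibly two directories, which gives some feel for why a busy server would put heavy load on the disk with this layout.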
Windows Greylisting is currently included in The Qmail Engineered Project