[ BitWizard B.V. ]

Greenlist

Greenlist implements a greylist for qmail. The name is a pun on "grey" in "greylist", which in turn is a pun on "black" from "blacklist".

Why?

I was told implementing greylisting would make a difference in the amount of spam I'd get. So I got "qgreylist" and installed it. This worked fine.... for a while.

After a while I found that things were slowing down. It turned out that qgreylist was putting empty files in one big directory, and using the two user-accessible timestamps for nefarious purposes. As my system receives quite a lot of mail, this directory had grown to enormous proportions. Accessing the directory became slow, slowing down qgreylist and the whole system.

Apparently the expiry was not working. But even if it had been, the directory was too large to be a "quick direct access to the info for a host". Things might have improved by upgrading to a 2.6 kernel and turning on directory indexing, but large directories are always a pain.

Currently, every hour I'm expiring about 500 to one thousand greylisted hosts that haven't retried within 25 hours. So apparently this saves my server having to process over 15 thousand spams a day.

How it works

At first I decided simply to clone the external behaviour of qgreylist. This includes keeping state only for whole class C networks (ehhm, the internet has gone classless; I mean /24 networks). All hosts within a class C are treated equally.

This leads to the realization that there are only 2^24, or about 16 million, class C networks. A timestamp is 4 bytes, so all 16M timestamps can be stored in a single direct-access 64 Mbyte file.

Adding a second timestamp increases the disk-space requirement to 128 Mbytes. This is quite acceptable in modern times (in contrast to about 20 years ago, in 1987, when my first harddrive was 20 Mbytes).
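
The arithmetic above can be checked from a shell. This is only a sketch: the name slot_offset is made up for this example, and it assumes the slot index is simply the first three octets of the address read as a 24-bit number, which the description implies but does not spell out.

```shell
# Hypothetical helper, not part of greenlist: byte offset of the 4-byte
# timestamp slot for the /24 network containing the given IPv4 address.
# Assumes slot index = first three octets read as a 24-bit number.
slot_offset() {
    a=${1%%.*}; rest=${1#*.}      # first octet
    b=${rest%%.*}; rest=${rest#*.} # second octet
    c=${rest%%.*}                  # third octet; the fourth is ignored
    echo $(( ((a * 256 + b) * 256 + c) * 4 ))
}

# 2^24 slots of 4 bytes each: the size of one timestamp file.
file_size=$(( (1 << 24) * 4 ))

echo "file size: $file_size bytes"
slot_offset 192.0.2.17
```

The file size comes out at 67108864 bytes, which is the 64 Mbytes mentioned above, and every host in 192.0.2.0/24 maps to the same offset.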

Greenlist mmaps the file and simply accesses the timestamp it needs. Any changes are written to the shared mapping, ensuring immediate ripple-through to other greenlist processes.
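
Greenlist itself goes through the shared mapping, but the same kind of direct access can be imitated from a shell for a quick look at a single slot. A rough sketch, assuming the timestamps are stored as unsigned 32-bit integers in the machine's native byte order; read_slot is a name invented for this example.

```shell
# Hypothetical helper, not part of greenlist: print the 4-byte value stored
# in slot number $2 of timestamp file $1.
# Assumes unsigned 32-bit values in native byte order.
read_slot() {
    dd if="$1" bs=4 skip="$2" count=1 2>/dev/null | od -An -tu4 | tr -d ' \n'
    echo
}

# Example, assuming slots are indexed by the first three octets of the
# address as a 24-bit number (192.0.2.0/24 would then be slot 12582914):
# read_slot /var/qmail/greenlist/firstseen 12582914
```

Because every slot has a fixed size, dd can seek straight to it without scanning the file, which is the same property the mmap approach relies on.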

In contrast to qgreylist, the expiry is not run from the smtp-invoked process, but from cron. In my case, the expiry was taking too long, causing nagios to occasionally flag my mailserver as being down.... To prevent this, expiry should be run from cron. Scanning all 16 million class C networks takes about 7 seconds on my outdated machine.

Installing and configuration

There is no utility yet to create the firstseen and lastseen files. These need to be present to allow greenlist to run.
    cd /var/qmail
    mkdir greenlist
    cd greenlist
    dd if=/dev/zero of=firstseen bs=1M count=64
    dd if=/dev/zero of=lastseen bs=1M count=64
    chown -R qmaild . 
    
It's best to do this before inserting the call to the binary of course. You can test if it works by calling print_greenlist.
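
A quick size check can also catch a truncated or mis-created file before greenlist is wired in. This is a sketch only; check_size is a made-up name, and the expected size is just the 2^24 slots of 4 bytes described earlier.

```shell
# Hypothetical check, not part of greenlist: succeed only if file $1 is
# exactly 2^24 slots of 4 bytes (67108864 bytes) long.
check_size() {
    [ "$(wc -c < "$1" | tr -d ' ')" -eq $(( (1 << 24) * 4 )) ]
}

# Example:
# check_size /var/qmail/greenlist/firstseen && echo "firstseen size ok"
```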

To install greenlist, you need to get it in between the SMTP listener (tcpserver) and qmail-smtpd. This is done by adding the path to greenlist just before "qmail-smtpd" in the call to tcpserver. Greenlist does not honour the PATH variable, so you might have to expand the call to the qmail-smtpd binary by prepending its path.
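
As an illustration, a tcpserver call with greenlist inserted might look like the following. The tcpserver options and the QMAILDUID and NOFILESGID variables follow a common qmail setup rather than anything greenlist requires, and /var/qmail/bin/greenlist is an assumed install location; adjust all of them to match your own run script.

```shell
# Sketch of a qmail-smtpd run script; paths and variables are assumptions.
# Before: ... 0 smtp /var/qmail/bin/qmail-smtpd
# After:  greenlist sits directly in front of qmail-smtpd, both given with
#         full paths because greenlist does not search PATH.
exec tcpserver -v -R \
    -u "$QMAILDUID" -g "$NOFILESGID" 0 smtp \
    /var/qmail/bin/greenlist \
    /var/qmail/bin/qmail-smtpd 2>&1
```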

Runtime parameters go one at a time into configuration files. For example, "smtptimeout", which defaults to 300 seconds, lives in a file called "smtptimeout" (in /var/qmail/greenlist).
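
Setting a parameter then amounts to writing one small file. A minimal sketch, assuming each file holds nothing but the value; set_param and the GREENLIST_DIR variable are invented for this example and are not part of greenlist.

```shell
# Hypothetical helper, not part of greenlist: write one parameter to its own
# file. Assumes the file contains just the value.
# GREENLIST_DIR is a made-up override for the configuration directory.
set_param() {
    printf '%s\n' "$2" > "${GREENLIST_DIR:-/var/qmail/greenlist}/$1"
}

# Example: raise the SMTP timeout from the default 300 seconds to 600.
# set_param smtptimeout 600
```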

The default for "maxageonce" is set to 25 hours. This gives hosts configured to retry every 24 hours a fair chance. The default for "maxagegood" is set to 32 days. This means that greenlist will not bother (after the first time) hosts sending a mailing-list reminder every month.

Room for improvement

Patches welcome! If you pick something to work on, please email me beforehand. Maybe things lower on the list are not a good idea after all. If you pick something high up the list, I might already have done the work....

Download


Rogier Wolff
Last modified: Wed Mar 5 17:32:04 CET 2008