Rspamd Spam Filter

Rspamd (Rapid spam filtering system) is a fast, free and open-source spam filtering system.

Topology 

In this example we run 3 mail servers.

           +-------------------+
           |  Charlotte (MAS)  |
           |                   |
           |    Postfix        |
           |    Dovecot        |
           |    Rspamd         |
           +-------------------+
              |             |
             VPN           VPN
              |             |
+----------------+       +----------------+
|  Dolores (MX)  |       |   Maeve (MX)   |
|                |       |                |
|    Postfix     |--VPN--|    Postfix     |
|    Rspamd      |       |    Rspamd      |
+----------------+       +----------------+

Mail Access Server (MAS)

Charlotte, in our home network, is where our mailboxes are stored and where mail clients connect.

Rspamd on Charlotte scans outgoing mail for viruses, increases user privacy (by removing the user agent and client IP address from the mail headers) and signs outgoing mail with DKIM and possibly ARC.

Users can also train the filters from within their mail accounts on Charlotte by moving mail into “ham” and “spam” folders, where Rspamd picks the messages up, analyzes their contents and uses them to improve its accuracy in classifying mail as spam or ham in the future.

Mail Exchangers (MX)

Dolores and Maeve, running on cheap cloud hosting providers outside of our home network, are the MX hosts responsible for receiving incoming mails for our domains. With Rspamd installed there, incoming mail is analyzed at the edge and can be rejected if it is unsolicited, unwanted or dangerous before entering our network.

Host       Role  IPv4 VPN Address  IPv6 VPN Address
Charlotte  MAS   10.195.171.241    fdc1:d89e:b128:6a04::29ab
Dolores    MX    10.195.171.142    fdc1:d89e:b128:6a04::7de4
Maeve      MX    10.195.171.47     fdc1:d89e:b128:6a04::961

The 3 instances of Rspamd share their knowledge and experience about good and bad messages, senders, hosts and networks by storing nearly every bit of data they gather in the Redis key-value database, which is then replicated between them over encrypted VPN connections.

They also provide fail-over services for each other in the event of an outage.

Prerequisites 

DNS Resolver

Virtual Private Network

Redis

Installation 

Rspamd is not available in the Ubuntu package repository, but the Rspamd project provides its own software package repository for various Linux and BSD UNIX distributions:

$ wget -O- https://rspamd.com/apt-stable/gpg.key | sudo apt-key add -
$ echo "deb [arch=amd64] https://rspamd.com/apt-stable/ $(lsb_release -c -s) main" \
    | sudo tee /etc/apt/sources.list.d/rspamd.list
$ echo "deb-src [arch=amd64] https://rspamd.com/apt-stable/ $(lsb_release -c -s) main" \
    | sudo tee -a /etc/apt/sources.list.d/rspamd.list
$ sudo apt-get update
$ sudo apt-get --no-install-recommends install rspamd

Configuration 

After the installation, a lot of configuration files are found in /etc/rspamd.

These configuration files use the Universal Configuration Language (UCL). The syntax is somewhat similar to Nginx and JSON.

The installed configuration files are not intended to be changed by the user. Instead, they should be extended with your own files in the directory /etc/rspamd/local.d or overridden with files in the directory /etc/rspamd/override.d. This structure ensures seamless upgrades if default values change or new options are introduced by new versions of Rspamd.
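The layering of packaged defaults, local.d and override.d can be pictured with a small Python sketch. This is a simplified model and not Rspamd's actual UCL loader; the section contents (option1 to option4) are made up for illustration, and the assumed precedence is defaults first, then local.d, then override.d.

```python
# Simplified model of the configuration layering (not the real UCL loader).
def effective_config(defaults, local, override):
    """Return the settings of one section after applying local.d and override.d."""
    merged = dict(defaults)   # packaged defaults, left untouched on disk
    merged.update(local)      # local.d adds or replaces individual keys
    merged.update(override)   # override.d is applied last and wins
    return merged

# Hypothetical section spread over three files.
defaults = {"option1": "value", "option2": True}
local = {"option2": False, "option3": 1.0}
override = {"option3": 2.0, "option4": ["something"]}

result = effective_config(defaults, local, override)
print(result)
```

Because the packaged files are never edited, a package upgrade can replace them without touching our own settings.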

Rspamd consists of multiple workers and modules, each of which can be enabled or disabled and configured according to one's needs and topology.

Options 

/etc/rspamd/local.d/options.inc:

#
# Charlotte IP Addresses
# Mails coming from these addresses originate from this machine.
#
local_addrs [
    "127.0.0.0/8",
    "::1",
    "172.27.88.10",
    "fdc1:d89e:b128:13a6::10",
    "2001:db8:3414:6b1d::10",
    "10.195.171.241",
    "fdc1:d89e:b128:6a04::29ab"
]

#
# example.net Subnets
# Mails from these addresses originate from our network.
#
local_networks [
    "203.0.113.54/32",
    "198.51.100.7/32",
    "2001:db8:48d1::/64",
    "2001:db8:2d07:5b57::/128",
    "172.27.0.0/16",
    "2001:db8:3414::/48",
    "fdc1:d89e:b128::/48",
    "10.195.171.0/24",
]

#
# Domain Name Resolvers
#
dns {
    nameserver = "master-slave:172.27.88.10:53:40, 172.27.88.1:53:30, 10.195.171.142:53:20, 10.195.171.47:53:10";
}

#
# Rspamd WebUI server access
#
neighbours {
    Charlotte {
        host = "https://mail.example.net:443";
        path = "/rspamd/charlotte";
    }
    Dolores {
        host = "https://mail.urown.net:443";
        path = "/rspamd/dolores";
    }
    Maeve {
        host = "https://mail.urown.net:443";
        path = "/rspamd/maeve";
    }
}
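To get a feeling for which senders fall under the local_networks list, the following sketch tests addresses against the same subnets using Python's standard ipaddress module. It only illustrates subnet matching and does not reproduce how Rspamd evaluates the option internally; the function name is_local is our own.

```python
import ipaddress

# Subnets copied from the local_networks list in options.inc.
LOCAL_NETWORKS = [ipaddress.ip_network(n) for n in (
    "203.0.113.54/32", "198.51.100.7/32", "2001:db8:48d1::/64",
    "2001:db8:2d07:5b57::/128", "172.27.0.0/16", "2001:db8:3414::/48",
    "fdc1:d89e:b128::/48", "10.195.171.0/24",
)]

def is_local(address):
    """Return True if the address lies inside one of the listed subnets."""
    addr = ipaddress.ip_address(address)
    return any(addr.version == net.version and addr in net
               for net in LOCAL_NETWORKS)

# Dolores' VPN address, an address in our IPv6 prefix, and an outside address.
for candidate in ("10.195.171.142", "2001:db8:3414:6b1d::10", "192.0.2.1"):
    print(candidate, is_local(candidate))
```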

Workers 

Proxy Worker 

The proxy worker interacts with the MTA (Postfix) via the milter protocol.

As a new message arrives, the MTA hands the mail over to the Rspamd proxy for scanning. The Rspamd proxy passes the message on to an Rspamd scanner (normal worker). Multiple scanners can be defined for fail-over or load-balancing purposes.

We define the locally running Rspamd scanner as the default scanner and the scanners running on the other servers as remote fail-overs.

The proxy itself listens on localhost only, since the local Postfix MTA is the only client connecting to it.

The proxy worker listens on TCP port 11332.

Create or modify the file /etc/rspamd/local.d/worker-proxy.inc:

#
# Rspamd proxy worker on Charlotte
#
# Interacts with MTA using Milter protocol, forwards messages to scanning
# layer.
# https://rspamd.com/doc/workers/rspamd_proxy.html
#

# localhost
bind_socket = "127.0.0.1:11332";
bind_socket = "[::1]:11332";

milter = yes;
timeout = 120s;

# Available scanners
upstream "scan" {
    default = yes;
    hosts = "master-slave:localhost:11333:30,dolores.vpn.example.net:11333:20,maeve.vpn.example.net:11333:20";
}
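The hosts line packs a rotation method and several host:port:priority entries into one string. The sketch below takes such a string apart and shows which scanner would be chosen, assuming that master-slave means "use the alive upstream with the highest priority". How Rspamd breaks ties between equal priorities is not covered here; the sketch simply takes the first one listed. The function names and the set of rotation prefixes are our own assumptions, and only host names and IPv4 addresses are handled.

```python
ROTATIONS = {"master-slave", "round-robin", "random"}  # prefixes assumed for this sketch

def parse_upstreams(spec):
    """Split '[rotation:]host:port:priority,...' into (rotation, [(host, port, priority)])."""
    rotation = None
    head, _, rest = spec.partition(":")
    if head in ROTATIONS:
        rotation, spec = head, rest
    upstreams = []
    for entry in spec.split(","):
        host, port, priority = entry.strip().split(":")
        upstreams.append((host, int(port), int(priority)))
    return rotation, upstreams

def pick(upstreams, failed=()):
    """Return the alive upstream with the highest priority, or None."""
    alive = [u for u in upstreams if u[0] not in failed]
    return max(alive, key=lambda u: u[2]) if alive else None

HOSTS = ("master-slave:localhost:11333:30,"
         "dolores.vpn.example.net:11333:20,maeve.vpn.example.net:11333:20")

rotation, upstreams = parse_upstreams(HOSTS)
print(rotation, pick(upstreams))                # local scanner while it is up
print(pick(upstreams, failed={"localhost"}))    # remote fail-over otherwise
```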

Normal Worker 

The normal worker is the daemon process that does the scanning. Its configuration is rather simple.

Apart from “localhost”, this worker also listens on the WireGuard VPN interface for incoming connections. This allows the Rspamd proxy workers on the other mail servers to use this scanner as a fail-over.

The normal worker listens on TCP port 11333.

Create or modify the file /etc/rspamd/local.d/worker-normal.inc:

#
# Rspamd normal worker on Charlotte
#
# Scans messages for spam.
# https://rspamd.com/doc/workers/normal.html
#

# Listen on localhost
bind_socket = "127.0.0.1:11333";
bind_socket = "[::1]:11333";

# Listen on Wireguard VPN
bind_socket = "10.195.171.241:11333";
bind_socket = "[fdc1:d89e:b128:6a04::29ab]:11333";

Controller Worker 

The controller worker is used to manage Rspamd statistics, to train Rspamd and to serve the WebUI.

The controller worker listens on TCP port 11334.

Some operations available on the WebUI might change configuration values or data of the Rspamd environment. Password protection is therefore recommended. To generate a password use the following command:

$ pwgen --secure 32 1
86lucpetQ6V8B4dYbsIERC5T0owOvUZ7
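If pwgen is not installed, a comparable random password can be produced with Python's standard secrets module. This is only an alternative sketch; the function name generate_password and the choice of letters and digits as alphabet are our own.

```python
import secrets
import string

def generate_password(length=32):
    """Return a random password made of ASCII letters and digits."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))

password = generate_password()
print(password)
```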

The password is stored as a cryptographic hash in the configuration file. Rspamd provides a utility to safely generate the hash:

$ rspamadm pw
Enter passphrase: ********************************
$2$ozqwiewyd5uym7cdbr7jo6xxg8yuqsee$sk4h4y6geqmqzqo15d6zfti8q5x8cxrnjbbngsrfqd999je95ddy

Create or modify the file /etc/rspamd/local.d/worker-controller.inc:

#
# Controller worker on Charlotte
#
# Manages Rspamd stats, learns spam and ham messages and serves the WebUI.
# https://rspamd.com/doc/workers/controller.html
#

# Listen on localhost
bind_socket = "127.0.0.1:11334";
bind_socket = "[::1]:11334";

# Listen on WireGuard VPN interface
bind_socket = "10.195.171.241:11334";
bind_socket = "[fdc1:d89e:b128:6a04::29ab]:11334";

# Unix Socket
bind_socket = "/tmp/rspamd-controller.sock mode=0666 owner=_rspamd";

# Allow passwordless access from this host
secure_ip = "127.0.0.1";
secure_ip = "::1";

password = "$2$ozqwiewyd5uym7cdbr7jo6xxg8yuqsee$sk4h4y6geqmqzqo15d6zfti8q5x8cxrnjbbngsrfqd999je95ddy";
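Once Rspamd has been restarted, a quick way to see whether the proxy, normal and controller workers accept TCP connections on their ports is a small connection test. The sketch below uses only Python's standard socket module; the helper names are our own, and it merely checks that something is listening, not that it is Rspamd.

```python
import socket

# TCP ports of the workers configured above.
WORKER_PORTS = {"proxy": 11332, "normal": 11333, "controller": 11334}

def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def report(host="127.0.0.1"):
    """Map each worker name to its listening state on the given host."""
    return {name: is_listening(host, port) for name, port in WORKER_PORTS.items()}

if __name__ == "__main__":
    for name, up in report().items():
        print(f"{name:<10} {'listening' if up else 'not reachable'}")
```

Passing a VPN address such as 10.195.171.241 to report() checks the sockets bound to the WireGuard interface, where the configuration binds one.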

Fuzzy Storage Workers 

Rspamd uses fuzzy hashes to find similar messages. Unlike normal hashes, these structures are designed to tolerate small differences between text patterns, which makes it possible to find near-identical messages quickly.

Rspamd uses the fuzzy storage worker to store these hashes, which makes it possible to block mass-mailed spam based on user feedback about a message's reputation.
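The difference between a normal hash and a fuzzy comparison can be illustrated with a few lines of Python. This is a deliberately simplified sketch based on overlapping three-word windows ("shingles") and set overlap; it is not Rspamd's actual fuzzy hashing algorithm, and the two sample messages are made up.

```python
import hashlib

def shingles(text, size=3):
    """Return the set of overlapping word windows of the given size."""
    words = text.lower().split()
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}

def similarity(a, b):
    """Share of shingles two texts have in common (0.0 to 1.0)."""
    sa, sb = shingles(a), shingles(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

original = "Buy cheap watches now at our online store today"
variant = "Buy cheap watches now at our online shop today"

# A cryptographic hash changes completely when a single word changes ...
exact_match = (hashlib.sha256(original.encode()).digest()
               == hashlib.sha256(variant.encode()).digest())
print(exact_match)                              # False

# ... while most shingles still match (5 of 9 distinct shingles are shared).
print(round(similarity(original, variant), 2))  # 0.56
```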

We will store the data for this worker in Redis.

The fuzzy storage worker listens on UDP port 11335.

The Fuzzy Storage Worker follows a Master/Slave model. The data is created and updated by the users on Charlotte while training the spam filter.

Later on this data is used by Dolores and Maeve for analysis when new messages arrive there.

Unlike the other workers, the Fuzzy Storage Worker is disabled by default and has no default configuration. Therefore its configuration file is kept in the /etc/rspamd/override.d directory and not in /etc/rspamd/local.d.

Master:

Create or modify the file /etc/rspamd/override.d/worker-fuzzy.inc:

#
# Fuzzy Storage Master Worker on Charlotte
#
# Stores fuzzy hashes of messages.
# https://rspamd.com/doc/workers/fuzzy_storage.html
#

# number of worker instances to run
count = 1; # Disabled by default

# Listen on localhost
bind_socket = "127.0.0.1:11335";
bind_socket = "[::1]:11335";

# Listen on WireGuard VPN
bind_socket = "10.195.171.241:11335";
bind_socket = "[fdc1:d89e:b128:6a04::29ab]:11335";

# Store the hashes in Redis.
backend = 'redis';
write_servers = "localhost:6383";
read_servers = "master-slave:localhost:6383:30,dolores.vpn.example.net:6383:20,maeve.vpn.example.net:6383:20";
timeout = 1s;
db = 0;
password = 'opHequ75iJgKnc7AyNJp995jhbzTKOSr';

# Expiration time of stored fuzzy hashes (default: 2 days)
expire = 90d;

# List of slaves (Dolores and Maeve)
slaves = "10.195.171.142, fdc1:d89e:b128:6a04::7de4, 10.195.171.47, fdc1:d89e:b128:6a04::961";

# Allow Charlotte to perform changes to fuzzy storage.
allow_update = "127.0.0.1, ::1, 10.195.171.241, fdc1:d89e:b128:6a04::29ab";

Slaves:

#
# Fuzzy Storage Worker (slave)
#
# Stores fuzzy hashes of messages.
#

# number of worker instances to run
count = 1; # Disabled by default

# Localhost
bind_socket = "127.0.0.1:11335";
bind_socket = "[::1]:11335";

# WireGuard (Dolores shown here; use Maeve's VPN addresses on Maeve)
bind_socket = "10.195.171.142:11335";
bind_socket = "[fdc1:d89e:b128:6a04::7de4]:11335";

# Store the hashes in Redis
backend = 'redis';

# Allow master/slave updates from the following IP addresses
masters = "";

# Hosts that are allowed to perform changes to fuzzy storage
allow_update = "";

Postfix Integration 

Dovecot Integration 

Nginx Integration 

Razor and Pyzor Integration 

Create the file /etc/rspamd/local.d/external_services.conf:

# default pyzor settings
pyzor {
    servers = "127.0.0.1:5953"
}

# default razor settings
razor {
    servers = "127.0.0.1:11342"
}

Configuration Syntax Check 

$ rspamadm configtest
syntax OK

Systemd Service Dependencies 

Since our Rspamd server uses several Redis databases to store its data, we want the Redis instances to be up and running before the Rspamd service starts. To achieve this, we make rspamd.service depend on redis-server@rspamd.service, redis-server@rspamd-bayes.service and redis-server@rspamd-fuzzy.service.

You can create a Systemd override file easily with the help of the systemctl command:

$ sudo systemctl edit rspamd.service

This will start your editor with an empty file, where you can add your own custom Systemd service configuration options.

[Unit]
After=unbound.service
After=redis-server@rspamd.service redis-server@rspamd-bayes.service redis-server@rspamd-fuzzy.service
BindsTo=redis-server@rspamd.service redis-server@rspamd-bayes.service redis-server@rspamd-fuzzy.service

After you save and exit the editor, the file will be saved as /etc/systemd/system/rspamd.service.d/override.conf and Systemd will reload its configuration.

Reference 

Rspamd Documentation