
I installed, configured, and trained SpamAssassin, and everything seemed to work fine. But when I tried to deploy it via spamc, I got only partial results.

Why is this happening?

I like spamc because I can get it to output just the report, but it seems to be missing some checks: SPF, DKIM, and Bayes.

I have not managed to figure it out or find any similar reports online. This has been going on for days now and I am out of ideas.

spamassassin works:

# spamassassin -t < /path/to/spam.eml

Content analysis details:   (3.3 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.0 FSL_HELO_NON_FQDN_1    FSL_HELO_NON_FQDN_1
 0.7 SPF_SOFTFAIL           SPF: sender does not match SPF record (softfail)
 0.8 BAYES_50               BODY: Bayes spam probability is 40 to 60%
                            [score: 0.5000]
 0.5 MISSING_MID            Missing Message-Id: header
 0.0 HELO_NO_DOMAIN         Relay reports its domain incorrectly
 1.4 MISSING_DATE           Missing Date: header

spamc gives only partial results:

# spamc -R  < /path/to/spam.eml

Content analysis details:   (1.5 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.0 FSL_HELO_NON_FQDN_1    FSL_HELO_NON_FQDN_1
 0.1 MISSING_MID            Missing Message-Id: header
 0.0 HELO_NO_DOMAIN         Relay reports its domain incorrectly
 1.4 MISSING_DATE           Missing Date: header
transilvlad

2 Answers


I ran into the same problem.

The Bayes databases are saved in the home directory of the user that runs SpamAssassin:

bayes_path /path/filename   (default: ~/.spamassassin/bayes)
This is the directory and filename for Bayes databases. Several databases will be created, with this as the base directory and filename, with _toks, _seen, etc. appended to the base. The default setting results in files called ~/.spamassassin/bayes_seen, ~/.spamassassin/bayes_toks, etc.

By default, each user has their own in their ~/.spamassassin directory with mode 0700/0600. For system-wide SpamAssassin use, you may want to reduce disk space usage by sharing this across all users. However, Bayes appears to be more effective with individual user databases.

And here is the solution that worked for me:

Following this wiki page: http://wiki.apache.org/spamassassin/SiteWideBayesSetup, I added the following two lines to /etc/mail/spamassassin/local.cf:

bayes_path /var/spamassassin/bayes_db/bayes
bayes_file_mode 0777

and I created the needed directory: /var/spamassassin/bayes_db/

Please note that the last "bayes" in the path is the prefix for the database files (bayes_journal, bayes_seen, etc.)

OK, after I restarted SpamAssassin, nothing happened. Still no Bayes test. Hmm...

So I copied the existing databases from /root/.spamassassin/* to /var/spamassassin/bayes_db/

Update: It seems I had to change the permissions on the four bayes_* files to 0666; otherwise the auto-learner cannot save new data. I don't like 0666 permissions, and I hope to find a better solution soon.
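The copy-and-chmod steps above could be scripted roughly as follows. This is only a sketch: `share_bayes` is a made-up helper name, not a SpamAssassin command, and it copies only the bayes_* files rather than everything in the source directory. The file mode defaults to the 0666 used above but can be overridden if a tighter mode turns out to work in your setup.

```shell
#!/bin/sh
# Sketch: copy an existing per-user Bayes database into a shared directory.
# share_bayes is a hypothetical helper, not part of SpamAssassin.
# Usage: share_bayes SRC_DIR DEST_DIR [FILE_MODE]
share_bayes() {
    src=$1
    dest=$2
    mode=${3:-0666}   # 0666 is the mode the answer ended up using

    # Create the shared directory (and any missing parents).
    mkdir -p "$dest" || return 1

    # Copy every bayes_* file (bayes_toks, bayes_seen, ...) and set its mode.
    for f in "$src"/bayes_*; do
        [ -f "$f" ] || continue
        cp "$f" "$dest"/ || return 1
        chmod "$mode" "$dest/$(basename "$f")" || return 1
    done
}

# Example call with the paths from this answer (run as root):
# share_bayes /root/.spamassassin /var/spamassassin/bayes_db
```

After copying, bayes_path in local.cf still has to point at /var/spamassassin/bayes_db/bayes, as shown earlier.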

I ran another test with spamc and... I got the Bayes result! :)

Results for spamassassin:

# spamassassin -t -D spf,dkim < /path/to/spam.eml

Content analysis details:   (8.2 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 3.5 BAYES_99               BODY: Bayes spam probability is 99 to 100%
                            [score: 1.0000]
 1.3 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in bl.spamcop.net
                [Blocked - see <http://www.spamcop.net/bl.shtml?141.146.5.61>]
 1.0 DATE_IN_PAST_12_24     Date: is 12 to 24 hours before Received: date
-0.0 SPF_PASS               SPF: sender matches SPF record
 1.3 TRACKER_ID             BODY: Incorporates a tracking ID number
 0.2 BAYES_999              BODY: Bayes spam probability is 99.9 to 100%
                            [score: 1.0000]
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.8 RDNS_NONE              Delivered to internal network by a host with no rDNS

Results for spamc:

# spamc -R  < /path/to/spam.eml

Content analysis details:   (8.2 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.3 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in bl.spamcop.net
                [Blocked - see <http://www.spamcop.net/bl.shtml?141.146.5.61>]
 3.5 BAYES_99               BODY: Bayes spam probability is 99 to 100%
                            [score: 1.0000]
 1.0 DATE_IN_PAST_12_24     Date: is 12 to 24 hours before Received: date
-0.0 SPF_PASS               SPF: sender matches SPF record
 1.3 TRACKER_ID             BODY: Incorporates a tracking ID number
 0.2 BAYES_999              BODY: Bayes spam probability is 99.9 to 100%
                            [score: 1.0000]
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.8 RDNS_NONE              Delivered to internal network by a host with no rDNS

Cristian Ciocău

If spamd is running under a dedicated user account, it will use the preferences found for that user, and you may also run into access-rights issues (e.g. that user not being allowed to read a site-wide Bayes database).

Options passed to spamd can also affect its behaviour (e.g. -L, which disables DNS and network tests).

How are you running spamd? You can also run spamd with -D and see if anything interesting pops up.
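Because the debug output is long, one way to compare the two runs is to pull out only the Bayes database paths each one mentions. A minimal sketch, assuming a grep that supports `-o` and that the paths contain no spaces; `bayes_db_paths` and `spamd-debug.log` are made-up names for illustration:

```shell
#!/bin/sh
# Sketch: read debug output on stdin and print each distinct bayes_* file path
# that appears on a "bayes:" line. bayes_db_paths is a hypothetical helper.
bayes_db_paths() {
    grep 'bayes:' | grep -o '/[^ ]*bayes_[a-z]*' | sort -u
}

# Possible usage (debug output is written to stderr, hence the redirection):
# spamassassin -D bayes -t < /path/to/spam.eml 2>&1 >/dev/null | bayes_db_paths
# bayes_db_paths < spamd-debug.log   # spamd debug output saved to a file beforehand
```

If the two commands print different directories, spamd is looking for the Bayes database somewhere other than where spamassassin found it.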

krisku
  • I checked the config files and local mode is not enabled. Is there any particular place (config) I should look? – transilvlad May 22 '14 at 09:02
  • Running "spamd -D" should tell you exactly what is happening while you process a message with spamc. Compare that to what "spamassassin -D" is telling you (try searching for anything Bayes-related). – krisku May 22 '14 at 10:39
  • bayes: no dbs present, cannot tie DB R/O: /tmp/spamd-19125-init/.spamassassin/bayes_toks :: I cannot figure out why it does not read /root/.spamassassin like spamassassin does directly. – transilvlad May 22 '14 at 20:41
  • I guess @ceakki has an adequate solution if you really wish to have a site-wide bayes database. Otherwise spamd first starts in a temporary directory, but when you run spamc it should detect which user spamc is run as and switch to the configuration and bayes database for that particular user, i.e. what does the debug output of spamd say when you run your spamc command? – krisku May 23 '14 at 09:41
  • The above. And I ran everything (spamassassin, spamd and spamc) as root. So I am very curious why the hell spamd and spamc did not load the Bayes database. I have a CentOS server that only runs a Python-based service under root that uses SpamAssassin to scan emails. Why would there be any difference when it calls spamassassin and spamc? I like spamc because it can return just the report without the message. I just need the report. ceakki's solution made spamc work, but the question of why this happens is still unanswered. – transilvlad May 23 '14 at 10:24
  • Can you provide the full output from spamd -D that is generated while you run spamc (not when starting spamd)? When I start spamd as root it indeed first tries to open a Bayes database under a tmpdir, but when I run spamc as root I can see that spamd indeed opens the /root/.spamassassin/bayes* databases. You could also try running "spamc -u root" to force it to run as user root. – krisku May 23 '14 at 12:39
  • Here's the last log I got before fixing it. http://pastebin.com/raw.php?i=iDqK0r2K – transilvlad May 23 '14 at 12:45
  • Unfortunately that log is only from the startup phase of spamd, so it doesn't say what happens when you actually scan a message. – krisku May 26 '14 at 07:52