Training Amavis

@lbutlr

2016-02-01 04:07:51 UTC

I get daily mails from WordPress verifying backups and these are all tagged as spam (at a very high score, in the 7-13 range).

How do I train amavis? Do I just run normal sa-learn as root? As the user? As the scan user?

--
'The only reason we're still alive now is that we're more fun alive than
dead,' said Granny's voice behind her. --Lords and Ladies

Olivier Nicole

2016-02-01 04:42:44 UTC

Post by @lbutlr
I get daily mails from WordPress verifying backups and these are all tagged as spam (at a very high score, in the 7-13 range).
How do I train amavis? Do I just run normal sa-learn as root? As the
user? As the scan user?

You could simply whitelist the sender.

Olivier

--

l***@bitrate.net

2016-02-01 04:49:51 UTC

you don't train amavis. you train spamassassin. they are two different pieces of software, which work well together.

while training spamassassin is good to do whether or not you are having a problem, blindly training it to solve a specific problem is not a sensible approach. instead, look at the *actual* scoring the message was given [X-Spam-Status header], and see which rule[s] are the ones which significantly contributed to the score. then you can determine the right way to solve the problem.

i'd recommend setting $sa_tag_level_deflt = undef; in the amavis config so that this detail will always be present.

-ben
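
As a rough illustration of the advice above, the rule names can be pulled out of a saved message's X-Spam-Status header with a few standard tools. This is only a sketch: spam_tests is a hypothetical helper name, and it assumes the message is stored as a plain RFC 5322 file with the header folded onto whitespace-led continuation lines.

```shell
# Hypothetical helper: print each rule named in the X-Spam-Status header
# of a single message read on stdin, one rule per line.
spam_tests() {
    awk '
        # A blank line ends the header section; END still runs after exit.
        /^$/ { exit }
        # Start collecting when the header we want begins.
        tolower($0) ~ /^x-spam-status:/ { in_hdr = 1; hdr = $0; next }
        # Folded continuation lines start with whitespace. Join them back,
        # without a space when the previous line ended in a comma.
        in_hdr && /^[ \t]/ {
            sub(/^[ \t]+/, "")
            hdr = hdr (hdr ~ /,$/ ? "" : " ") $0
            next
        }
        # Any other header line ends the one being collected.
        in_hdr { in_hdr = 0 }
        END { print hdr }
    ' | sed -n 's/.*tests=\([^ ]*\).*/\1/p' | tr ',' '\n'
}
```

Usage would be something like `spam_tests < /path/to/message`. The header only names the rules; to see how many points each rule contributed, running the same message through `spamassassin -t` should append a per-rule report, provided it is run against the same configuration and Bayes database that amavis uses.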

@lbutlr

2016-02-01 07:32:30 UTC

Post by l***@bitrate.net

I ma not blindling trainmen it. i wam training false positives as ham.

What I need to know is what user to train them as so that amavis will use the Bayes database that I am training.

They all hit BAYES_99 and BAYES_999, some hit other rules as well.

X-Spam-Status: Yes, score=10.2 required=5.0 tests=BAYES_99,BAYES_999,
    HEADER_FROM_DIFFERENT_DOMAINS,NO_RELAYS,TVD_SPACE_RATIO,TVD_SPACE_RATIO_MINFP
    autolearn=no autolearn_force=no version=3.4.1

Post by l***@bitrate.net
instead, look at the *actual* scoring the message was given [X-Spam-Status header], and see which rule[s] are the ones which significantly contributed to the score.

Yes, that’s what I’ve done.

Post by l***@bitrate.net
then you can determine the right way to solve the problem.

Training falsely classified mail is *always* a good idea.

The question still remains, do I train SA as root, as the user (which is a problem for most of the users since they are virtual users in a database) or as the vscan user?

That is to say:

sa-learn -u *WHAT* --ham /path/to/ham

--
Stone circles were common enough everywhere in the mountains. Druids
built them as weather computers, and since it was always cheaper to
build a new 33-Megalith circle than to upgrade an old slow one, there
were generally plenty of ancient ones around --Lords and Ladies

@lbutlr

2016-02-01 07:50:16 UTC

Post by @lbutlr
I ma not blindling trainmen it. i wam training false positives as ham.

Wow. I have no idea how that happened.

I am not blindly training it, I am training false positives as ham.

--
"We're philosophers. We think, therefore we am."

btb

2016-02-01 13:43:36 UTC

Post by @lbutlr

The question still remains, do I train SA as root, as the user (which
is a problem for most of the users since they are virtual users in a
database) or as the vscan user?
sa-learn -u *WHAT* --ham /path/to/ham

you must train the database that is used during message evaluation.
that is to say, the one named by the spamassassin bayes_path setting of
whatever user is running amavis. this may be undefined, in which case it
is the default of ~/.spamassassin/bayes. it may be defined in the global
spamassassin config, or it may be defined in the amavis user's
spamassassin config [e.g. ~/.spamassassin/user_prefs].

see
https://spamassassin.apache.org/full/3.4.x/doc/Mail_SpamAssassin_Conf.html
for further detail on bayes_path
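
One way to check where bayes_path is set is to search the candidate config locations for active (non-commented) bayes_path lines. This is a minimal sketch: find_bayes_path is a hypothetical helper name, and the directories to search are assumptions that differ between systems.

```shell
# Hypothetical helper: list every active bayes_path line under the given
# files or directories, prefixed with file name and line number.
# Commented-out settings are skipped because the pattern is anchored
# to the start of the line (allowing only leading whitespace).
find_bayes_path() {
    grep -rHnE '^[[:space:]]*bayes_path[[:space:]]' "$@" 2>/dev/null
}
```

For example, `find_bayes_path /etc/spamassassin ~vscan/.spamassassin/user_prefs`, where both the config directory and the vscan account name are guesses to be replaced with the local ones. If nothing is printed, the setting is probably undefined and the default under the amavis user's home directory applies.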

once you have identified this detail, the simplest way may be to just
run sa-learn as the user running amavis - but as with anything, there
are numerous methods, and the one which best fits your conditions can
vary greatly. all that matters is that the database files which are
worked on are the ones amavis uses when running spamassassin.
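
A sketch of the "run sa-learn as the user running amavis" approach, assuming sudo is available. The account name is an assumption (the thread mentions a vscan user; other installs use a different name), and train_ham and bayes_counts are hypothetical helper names. Setting DRY_RUN=1 only prints the command instead of running it.

```shell
# Assumption: the account that runs amavis. Override via the environment.
AMAVIS_USER="${AMAVIS_USER:-vscan}"

# Learn the given files or directories as ham in that user's Bayes database.
# -H points HOME at the target user's home so ~/.spamassassin resolves there.
train_ham() {
    ${DRY_RUN:+echo} sudo -H -u "$AMAVIS_USER" sa-learn --ham "$@"
}

# Show the database counters (nspam, nham, ...) to confirm that training
# changed the database belonging to that user.
bayes_counts() {
    ${DRY_RUN:+echo} sudo -H -u "$AMAVIS_USER" sa-learn --dump magic
}
```

For example, `train_ham /path/to/ham` followed by `bayes_counts`. The messages have to be readable by that account, and the ham count should go up if the right database was trained.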

for reference, i use the following setting for spamassassin, which i find
helpful in keeping clear which files make up the db and keeping them
organized/separated from other spamassassin files:

/etc/spamassassin/99_local-config.cf:
# note: the value specified here is *not* a directory. it is
# a directory plus a prefix used in the names of the various
# files that comprise the entirety of the bayes database
bayes_path ~/.spamassassin/bayes_db/bayes

@lbutlr

2016-02-01 14:53:52 UTC

Post by btb
you must train the database that is used during message evaluation. that is to say, whatever user is running amavis

Thank you.

--
<http://en.wikipedia.org/wiki/TOFU>