Rspamd bayes filter oddity

Discussion in 'Installation/Configuration' started by teves, Sep 24, 2019.

  1. teves

    teves Member

    Hello,

    I have another question regarding the new rspamd integration in ISPConfig.

    About a week ago we migrated our ISPConfig webserver from spamassassin to rspamd by this howto: https://www.howtoforge.com/replacing-amavisd-with-rspamd-in-ispconfig/

    Apart from a small issue (see https://www.howtoforge.com/community/threads/minor-problem-with-rspamd.82847/) it is working fine.

    Now I noticed a curious thing: sometimes mails from the same source with (nearly) the same content are handled differently by the spam filter. See this log output:
    [IMG]
    The mails all contain backup information and come from the same account and the same IP.
    When I look into the log details, I find that the only difference is the Bayes filter value (same order as above):
    [IMG]

    Can someone tell me what I am missing here? Is this a bug or a feature? It certainly looks like a bug...

    Kind regards, Tom

    PS:
    System is:
    - Debian GNU/Linux 9.9 (stretch)
    - ISPConfig 3.1.15
    - Dovecot 2.2.27
     
  2. till

    till Super Moderator Staff Member ISPConfig Developer

    Take a look at the rspamd log; it contains detailed information about the scan and scoring process. According to your screenshot, the difference in scoring comes from the Bayes filter. I doubt that there is a bug, as what you posted does not indicate one. The emails must contain different words that triggered different Bayes levels, and because the emails were not scored at exactly the same point in time, even the Bayes rule set must have been different for each of them. Bayes is a self-learning filter, so which words trigger which levels is specific to your own server and to the emails that it has processed. In any case, if you still think that Rspamd has a bug, contact the Rspamd developer, as ISPConfig is not involved in the scoring process of Rspamd at all. Rspamd is connected to Postfix via milter.
     
  3. till

    till Super Moderator Staff Member ISPConfig Developer

    And if your Bayes filter learned something wrong, then you can probably reset it. I guess you can find more details in the Rspamd docs or use a search engine of your choice to find instructions on how to reset the Bayes filter of Rspamd.
     
  4. teves

    teves Member

    I probably should have expressed myself better: I do not think rspamd has a bug.
    Nevertheless, this is an unexpected behaviour and I would like to know how I can improve it.

    We already uploaded some of these mails as ham to train the filter, but I suppose Bayes needs a certain quantity of 'ham mails' before this has an effect.
     
  5. till

    till Super Moderator Staff Member ISPConfig Developer

    Whether it is unexpected or not depends on your point of view. You have a self-learning filter there, so if an email is scored as spam due to other problems such as wrong headers, this mail is learned as spam by the Bayes filter, and such mails must then get higher spam scores every day, exactly as seen on your server. Find out which other rules are causing a spam score in these mails and try to solve the reasons for the scoring, or whitelist them. Then either clear the Bayes filter or try to feed the mails as ham.
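
The feedback loop described above can be illustrated with a small toy model. This is only a sketch of a generic word-count Bayes classifier with Laplace smoothing; it is not Rspamd's actual algorithm or token format, and the class name `ToyBayes` and all sample texts are made up for illustration. It shows that the very same message gets a different spam probability depending on what the filter has learned in the meantime.

```python
import math
from collections import Counter


class ToyBayes:
    """Toy word-count Bayes classifier (illustrative only, not Rspamd)."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.msgs = {"spam": 0, "ham": 0}

    def learn(self, label, text):
        # Count every word of the message under the given class.
        self.words[label].update(text.lower().split())
        self.msgs[label] += 1

    def spam_probability(self, text):
        vocab = set(self.words["spam"]) | set(self.words["ham"])
        total_msgs = self.msgs["spam"] + self.msgs["ham"]
        log_score = {}
        for label in ("spam", "ham"):
            # Smoothed class prior.
            score = math.log((self.msgs[label] + 1) / (total_msgs + 2))
            n_words = sum(self.words[label].values())
            for word in text.lower().split():
                # Laplace-smoothed word likelihood; unseen words get a small value.
                count = self.words[label][word]
                score += math.log((count + 1) / (n_words + len(vocab) + 1))
            log_score[label] = score
        # Normalize the two log scores into a probability for "spam".
        top = max(log_score.values())
        exp = {k: math.exp(v - top) for k, v in log_score.items()}
        return exp["spam"] / (exp["spam"] + exp["ham"])


bayes = ToyBayes()
bayes.learn("spam", "cheap pills offer")
bayes.learn("ham", "meeting agenda attached")

msg = "backup report completed successfully"
before = bayes.spam_probability(msg)

# A similar mail is learned as spam (for example because another rule
# pushed its score up), so the same message now looks more spammy.
bayes.learn("spam", "backup report completed with warnings")
after = bayes.spam_probability(msg)

# Feeding the message as ham pulls the probability back down.
bayes.learn("ham", msg)
later = bayes.spam_probability(msg)

print(before, after, later)
```

In this toy model the probability rises after the similar mail is learned as spam and falls again once the message is fed as ham, which is the behaviour one would expect from any self-learning filter of this kind.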
     
  6. teves

    teves Member

    Thank you, that might be the crucial clue.
    These mails had not been scored as spam so far, but they all got 3.5 points due to an SPF issue. I had already fixed this earlier, but I have not received any mails from this source since then. So hopefully Bayes will calm down in the future.

    Anyway, thank you very much for your help!
     
  7. elmacus

    elmacus Active Member

    Last edited: Sep 25, 2019
