SPAM / HAM address support for spam and good email

Discussion in 'Feature Requests' started by fbarcenas, Apr 26, 2016.

  1. fbarcenas

    fbarcenas Member

    My old control panel (DTC by GPLHost) used to automatically create a spam address and a ham address (the actual addresses were obscured when this page was saved), so that a user who wanted to flag a message as spam would just forward it to the spam address, and if a good email had been classified as SPAM you could forward it to the ham address to tell the server that it is not spam.

    I miss that feature, as users constantly ask how to flag SPAM or HAM (good mail) from the webmail. Currently there is no way for them to update spam settings without being the domain admin, going into the control panel and setting filters.
     
  2. Jesse Norell

    Jesse Norell Well-Known Member Staff Member Howtoforge Staff

    I'm new to ISPConfig (coming from DTC as well), so certainly don't want to preempt any better answers from anyone else on what fits ISPConfig setups, but what comes to mind is using HAM/SPAM mail folders, so (some?) users specifically move/copy a message to one folder or the other to be trained.

    In addition to the DTC-created ham/spam email addresses (which almost nobody uses, in my experience), we've been using dovecot-antispam to automatically train when people move mail into or out of a SPAM folder. It's simple, and is probably better than nothing, but our bayes/training does not work nearly as well as it could/I would like, so I'd like to consider a better setup.

    Right now all training we have is one-time, and messages aren't kept. It seems most of the better functioning (albeit probably also higher maintenance) mail systems I see described keep a HAM and a SPAM message corpus, which can be used to rebuild bayes when needed (eg. change/improve some message clean-up scripts, and rebuild from scratch learning all messages), and could be reviewed/moderated to correct any training mistakes. (Also the HAM corpus could be used for clamav-unofficial-sigs' ham_dir to help clamav false positives.)
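The corpus-based rebuild described above could be sketched roughly as follows. This is only a minimal illustration, not an existing ISPConfig feature: the corpus paths and the helper names `rebuild_commands` and `rebuild` are hypothetical, and it assumes `sa-learn` is on the PATH and that each corpus directory holds one message per file.

```python
import subprocess


def rebuild_commands(ham_dir, spam_dir, sa_learn="sa-learn"):
    """Return the sa-learn invocations for a from-scratch bayes rebuild.

    ham_dir and spam_dir are hypothetical corpus directories.
    """
    return [
        [sa_learn, "--clear"],                       # wipe the existing bayes database
        [sa_learn, "--no-sync", "--ham", ham_dir],   # learn every retained ham message
        [sa_learn, "--no-sync", "--spam", spam_dir], # learn every retained spam message
        [sa_learn, "--sync"],                        # sync journal and database once at the end
    ]


def rebuild(ham_dir, spam_dir, dry_run=True):
    """Print the commands, or actually run them when dry_run is False."""
    for argv in rebuild_commands(ham_dir, spam_dir):
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical corpus locations; adjust for the real system.
    rebuild("/var/spool/corpus/ham", "/var/spool/corpus/spam")
```

Keeping the commands as a plain list makes it easy to review them in a dry run before touching the live bayes database.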

    End users often mark messages they simply don't want as SPAM, even for e.g. lists they signed up for, or other legitimate mail where they don't remember the sender. A setup that automatically trains on that (e.g. dovecot-antispam), and/or using bayes autolearn, can end up working against you.

    Allowing only a few users who you trust to correctly categorize mail as HAM/SPAM (which admittedly is hard to do correctly at times) might work for some systems/hosts, but probably isn't an option for all. Some smaller systems could probably have the sysadmin perform that task. It'd be nice to be able to assign specific email accounts as trusted to perform that, though of course there's not currently any flag in ISPConfig for that; perhaps there is existing software/scripts one could adapt to run in the server back-end (I'm not aware of any specifically, but there sure could be).

    That's what I'm considering, and I'd love to hear others' experiences with this or alternatives. I'm not at all opposed to contributing code/improvements to ISPConfig if better out-of-the-box functionality could be obtained here. Or maybe just a HOWTO, if the best solution is all done in the backend (nothing for ISPConfig to integrate with).
     
  3. till

    till Super Moderator Staff Member ISPConfig Developer

    In my opinion, having a spam / ham email address very easily leads to false Bayes training. The problem is that users mostly use the wrong forwarding function for spam mails. While most email programs have a function to forward an email with all headers intact, most users will use the "standard" forward function instead, so the forwarded mail carries their own address headers rather than those of the original spam email. They end up telling the Bayes filter that their own address is bad, not the spammer's. The IMAP folder method that Jesse suggested works around that issue and is to be preferred in my opinion.
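To illustrate the header problem, here is a small sketch using Python's standard `email` package. It assumes that a header-preserving forward arrives as a `message/rfc822` attachment, while a "standard" inline forward only carries the forwarding user's own headers; the helper name `extract_original` and the addresses are made up for the example.

```python
from email.message import EmailMessage


def extract_original(forwarded):
    """Return the attached original message, or None for an inline forward.

    Only a message/rfc822 attachment still carries the original
    sender's headers, which is what a trainer would need to see.
    """
    for part in forwarded.walk():
        if part.get_content_type() == "message/rfc822":
            return part.get_payload(0)  # the embedded original message
    return None


if __name__ == "__main__":
    # Hypothetical spam message and a user forwarding it as an attachment.
    spam = EmailMessage()
    spam["From"] = "spammer@example.net"
    spam["Subject"] = "Cheap pills"
    spam.set_content("buy now")

    fwd = EmailMessage()
    fwd["From"] = "user@example.org"
    fwd["Subject"] = "Fwd: Cheap pills"
    fwd.set_content("this is spam")
    fwd.add_attachment(spam)

    original = extract_original(fwd)
    print(original["From"] if original is not None else "no original headers")
```

A trainer that receives forwarded mail could skip anything where this returns `None`, rather than learning the forwarding user's own headers as spam.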

    Having a spam training system based on imap folders together with trusted accounts is a good idea in my opinion.
     
  4. fbarcenas

    fbarcenas Member

    That problem can be solved by whitelisting your own domain names. So stupid people won't influence the spam filter at all.
    Does it work straight out of the box? Do you just run apt-get install dovecot-antispam, or is some configuration required? If so, please add it to the ISPConfig FAQ/wiki.
     
  5. Jesse Norell

    Jesse Norell Well-Known Member Staff Member Howtoforge Staff

    It does take some configuration; IIRC you have to configure the possible folder names (e.g. trash, spam, ham...) and the commands to run to train the spam scanner. It wouldn't be hard to steal the scripts from DTC to handle the message processing and just use an identical setup, though it'd be more efficient to just call sa-learn directly. I've not done either on an ISPConfig box offhand; I'll try to post some howto info when I have a spam training setup I'm happy with (as mentioned, it will probably retain a spam/ham corpus for retraining purposes, and I'm guessing won't involve dovecot-antispam at all, but we'll see).
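As a rough illustration of the kind of decision such a setup has to make, the sketch below maps a folder move to an `sa-learn` flag. It is not dovecot-antispam's actual code or configuration: the function name `training_action`, the default folder names, and the exact policy (moves from a spam folder into trash train nothing) are assumptions for the example.

```python
def training_action(src, dst, spam_folders=("Spam", "Junk"), trash_folders=("Trash",)):
    """Decide how a message move should train the filter.

    Returns "--spam", "--ham" or None (no training). Folder names are
    compared case-insensitively. The policy is illustrative only.
    """
    spam = {name.lower() for name in spam_folders}
    trash = {name.lower() for name in trash_folders}
    src_is_spam = src.lower() in spam
    dst_is_spam = dst.lower() in spam

    if dst_is_spam and not src_is_spam:
        return "--spam"   # moved into a spam folder: train as spam
    if src_is_spam and not dst_is_spam and dst.lower() not in trash:
        return "--ham"    # rescued from a spam folder: train as ham
    return None           # anything else, including spam -> trash: ignore


if __name__ == "__main__":
    for move in [("INBOX", "Spam"), ("Spam", "INBOX"), ("Spam", "Trash")]:
        print(move, "->", training_action(*move))
```

The returned flag could then be passed to `sa-learn` together with the message, which is roughly the "call sa-learn directly" option mentioned above.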
     
  6. Jesse Norell

    Jesse Norell Well-Known Member Staff Member Howtoforge Staff

    I'm getting closer to having the aforementioned spam training via folders in place. I'll probably publish a full howto on it eventually, and maybe create a GitHub project for it, but is anyone interested in testing early versions and reporting feedback? I'm particularly interested in how it will scale to many thousands of users, but I'm certainly open to any level of testing.
     
  7. florian030

    florian030 Well-Known Member HowtoForge Supporter

  8. jnsc

    jnsc rotaredoM Moderator

    Just out of curiosity, Florian: are you writing these rules manually, or do you have a mail parser that generates SpamAssassin rules? I always wanted to write a parser, but I never found the time.
     
  9. Jesse Norell

    Jesse Norell Well-Known Member Staff Member Howtoforge Staff

    I'll get that sent "soon"; there are a few things to finish, and some things to rework so that it hopefully scales better.
     
  10. florian030

    florian030 Well-Known Member HowtoForge Supporter

    Both. We have some scripts to create rules from spam-mails but we always check the created rulesets and make some changes (if needed).
     
  11. vk3heg

    vk3heg Member

    I've been using Scrollout F1 for some of my domains (i.e. mine and a couple of client domains that should never get any email), and it uses an IMAP spam collector setup. An email account is created on the ISPConfig server, and then an IMAP account is created in the email program, along with GOOD and BAD folders. Any email that comes in is marked, and the operator just has to move the email to the correct folder. Every few minutes, Scrollout F1 then scans those two folders and does its thing.

    https://sourceforge.net/projects/scrollout/
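The folder-collector idea can be sketched in a few lines. This is not how Scrollout F1 is implemented (it collects over IMAP); as a stand-in, the sketch reads a local Maildir++ mailbox with Python's standard `mailbox` module and moves everything from one training folder into a corpus directory. The function name `collect_folder` and all paths are hypothetical.

```python
import mailbox
from pathlib import Path


def collect_folder(maildir_path, folder, corpus_dir):
    """Move all messages from a Maildir++ subfolder into corpus_dir.

    Returns the number of messages collected. A periodic job (e.g. cron)
    could call this once for the GOOD folder and once for the BAD folder.
    """
    box = mailbox.Maildir(maildir_path, create=False)
    if folder not in box.list_folders():
        return 0
    sub = box.get_folder(folder)
    out = Path(corpus_dir)
    out.mkdir(parents=True, exist_ok=True)
    moved = 0
    for key in list(sub.keys()):
        # Store the raw message under its Maildir key, then drop it from the folder.
        (out / key).write_bytes(sub.get_bytes(key))
        sub.remove(key)
        moved += 1
    return moved


if __name__ == "__main__":
    # Hypothetical locations for the collector account and the spam corpus.
    count = collect_folder("/var/vmail/example.org/collector/Maildir", "BAD", "/var/spool/corpus/spam")
    print("collected", count, "messages")
```

The collected directories could afterwards be fed to whatever trainer is in use, for example `sa-learn --spam` on the BAD corpus.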
     
  12. Jesse Norell

    Jesse Norell Well-Known Member Staff Member Howtoforge Staff

    This training script is coming along fairly well. There's a GitHub project at https://github.com/jnorell/train-spam-scanner and hopefully a full tutorial soon on exactly how to configure ISPConfig/dovecot to use it. Total training time is still under 15 minutes with >6400 training messages, but I'd be interested in results from systems with a lot of users and a much larger training corpus.
     
