So I'm trying to help eliminate spam. However, every time I run "sa_learn" I get the following error:

/usr/bin/sa-learn: Argument list too long

Here are the contents of my /bin/sa_learn file:

Code:
#!/bin/bash
/usr/bin/sa-learn --spam /var/vmail/*/*/*/.Junk/*/*
/usr/bin/sa-learn --ham /var/vmail/*/*/*/cur

Seems like I have too many e-mails in the directory. Is there any way around this? Otherwise spamassassin will only learn spam e-mails and become biased.
I guess the problem is that you are trying to learn all mailboxes at once (wildcards on several directory levels). You could, for example, add a loop to your shell script and feed the maildirs to sa-learn one by one.
Nope... I tried a couple of mailboxes individually and got the same error. Just one of the mailboxes is too much. Is there any way to modify the search capacity of this script?
That's quite specific; I guess you might have to ask on the SpamAssassin mailing list. Or you can feed the e-mails to the script one by one. Basically, run the find command on the maildir:

find /var/vmail/domain.tld/user/Maildir/.Junk

and pipe the output to the sa-learn command.
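The find-and-pipe idea above could be sketched roughly like this. It is only a sketch: learn_batched and the SA_LEARN variable are names made up for illustration, and it assumes GNU xargs (the -r option, which skips the run when find produces nothing, is a GNU extension).

```shell
#!/bin/bash
# Sketch: hand messages to sa-learn in batches so that no single command
# line exceeds the system's argument-size limit.
# SA_LEARN and learn_batched are illustrative names, not part of SpamAssassin.
SA_LEARN="${SA_LEARN:-/usr/bin/sa-learn}"

learn_batched() {
    local mode="$1"   # --spam or --ham
    local dir="$2"    # folder to scan, e.g. a .Junk maildir folder

    # find prints every message file NUL-terminated; xargs -0 splits that
    # list into as many sa-learn invocations as it takes to stay under the
    # limit, so the shell never has to expand a huge wildcard.
    find "$dir" -type f -print0 | xargs -0 -r "$SA_LEARN" "$mode"
}
```

It would be called as, for example, learn_batched --spam /var/vmail/domain.tld/user/Maildir/.Junk. Because the file names travel through a pipe instead of the command line, the number of messages in the folder should no longer matter.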
I was having the same issue, and the resolution is SO SIMPLE that it evaded me for a while. I found your post while looking for an answer and figured that I would help out. The answer is just a matter of using quotes, so that the shell passes each pattern to sa-learn as a single argument instead of expanding it into one argument per file:

#!/bin/bash
/usr/bin/sa-learn --spam "/var/vmail/*/*/*/.Junk/*/*"
/usr/bin/sa-learn --ham "/var/vmail/*/*/*/cur"

I also have to give credit to "mikeserv" for the less direct answer to another question that gave me this resolution. Apparently I can't link to it on this site, so maybe I can outwit their filter: unix.stackexchange.com/questions/215530/argument-list-too-long-in-for-loop
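To see what the quotes change, here is a small throwaway demonstration. count_args is a hypothetical helper that just reports how many arguments it received; nothing here touches sa-learn or real mail. It only shows the shell side. That sa-learn then resolves the wildcard on its own is an assumption based on the quoted version working as described in this thread.

```shell
#!/bin/bash
# count_args is a hypothetical helper: it prints the number of arguments
# it was called with.
count_args() { echo "$#"; }

# Build a scratch directory with three fake message files.
demo=$(mktemp -d)
touch "$demo/msg1" "$demo/msg2" "$demo/msg3"

(
    cd "$demo" || exit 1
    count_args *      # unquoted: the shell expands the glob -> prints 3
    count_args "*"    # quoted: one literal argument        -> prints 1
)

rm -rf "$demo"
```

With thousands of messages the unquoted form turns into thousands of arguments, which is what triggers "Argument list too long"; the quoted form always stays at one argument per pattern.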
I also wanted to expand further on this, as I have subfolders and my email client auto-sorts by rules to help my mail stay tidy. Here is my whole script and a command line that will run the entire process and email you the results! I just have a small server with a few users, so this may not work so well on a larger email system. You may also not want to email the results, because the ham learning echoes a result for each directory scanned. That may be changeable, but I haven't put the time into it yet.

<EmailSA>
/home/USER/LearnSpam > output ; mail -s "SA-Learn Output" "USER@DOMAIN" < output

<LearnSpam>
#!/bin/bash
echo "Forcing Expire..."
/usr/bin/sa-learn --force-expire
echo "Learning from Junk folders..."
# Quoted so the shell passes the pattern through as a single argument.
/usr/bin/sa-learn --spam "/var/vmail/*/*/*/.Junk/*/*"
echo "Cleaning Junk Folders..."
# Note: this wildcard is still expanded by the shell, so a very full Junk
# folder could hit "Argument list too long" here as well.
/bin/rm -rf /var/vmail/*/*/*/.Junk/cur/*
echo "Learning from Inbox folders..."
# Run sa-learn --ham once per remaining folder directly under each Maildir.
find /var/vmail/*/*/Maildir/ -maxdepth 1 -type d -not -name .Junk -not -name .Spam -not -name .Trash -not -name tmp -not -name "." -not -name .Sent -not -name .Drafts -not -name new -not -name Maildir -exec /usr/bin/sa-learn --ham {} \;
echo "Current Bayes Info..."
/usr/bin/sa-learn --dump magic
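One caveat on the cleanup step: the rm line in the script above still lets the shell expand a wildcard over every junk message, so on a busy server it could run into the same "Argument list too long" error. A possible alternative, sketched under the assumption that your find supports -delete (GNU and BSD find do, but it is not in POSIX), is to let find remove the files itself. clean_junk is a made-up name for illustration.

```shell
#!/bin/bash
# Sketch: delete learned junk messages without building a long argument list.
# clean_junk is an illustrative name; -delete is a GNU/BSD find extension.
clean_junk() {
    local root="$1"   # top of the mail store, e.g. /var/vmail

    # Match only regular files that sit inside a .Junk/cur directory and
    # remove them one by one; the directories themselves are left in place.
    find "$root" -path '*/.Junk/cur/*' -type f -delete
}
```

In the script it would stand in for the rm line as clean_junk /var/vmail. Try it on a copy of a mailbox first, since deleted messages are not recoverable.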