In an effort to combat ever-recurring spam, I'm trying to add a header check in Postfix that triggers on multiple keywords in the subject. The subjects always contain a combination of recurring words (or parts of words). When I test my rule on several online regex test sites, they all give the result I'm looking for, but in Postfix the rule seems to do nothing. Code: /^Subject: .*(word-A|word-B|word-C|word-D).*(word-E|word-F).*/ DISCARD I would expect "This is a subject containing word-A and word-F" to be discarded, and likewise "This is a subject containing word-Cword-E". But neither is discarded. Could it have something to do with characters within the words, like "ü"?
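Did you look at the raw email source in your mail client? How does the subject line look there? The subject is often encoded (as an RFC 2047 encoded-word, for example in Base64), especially when it contains non-ASCII characters, so a regex written for the 'clear text' won't match.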
Looks like you've hit the jackpot, @till. This is what I see in Outlook: Code: Subject: =?UTF-8?B?U3BhcmVuIFNpZSBqZXR6dDogTGVkZXJiw7xyb3N0w7xobGUgaW4gNyBGYXJiZW4gZsO8ciBudXIgMTA5LC0gc3RhdHQgMzk5LC0=?= Is there any way to use a regex when the subject is encoded? Or is there any other way?
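To confirm the diagnosis offline, the raw header can be decoded and tested outside Postfix. The following is only a minimal Python sketch using the standard library's `email.header` module. The keywords in `PATTERN` are hypothetical stand-ins for word-A, word-E and so on, and nothing here runs inside Postfix itself.

```python
import re
from email.header import decode_header, make_header

# Raw Subject value exactly as it appears in the message source.
RAW_SUBJECT = (
    "=?UTF-8?B?U3BhcmVuIFNpZSBqZXR6dDogTGVkZXJiw7xyb3N0w7xobGUgaW4gNyBG"
    "YXJiZW4gZsO8ciBudXIgMTA5LC0gc3RhdHQgMzk5LC0=?="
)

# Hypothetical keyword groups, standing in for (word-A|...) and (word-E|...).
PATTERN = re.compile(r"(Leder|Büro).*(109|399)", re.IGNORECASE)


def decode_subject(raw):
    """Decode RFC 2047 encoded-words into a plain Unicode string."""
    return str(make_header(decode_header(raw)))


if __name__ == "__main__":
    decoded = decode_subject(RAW_SUBJECT)
    print("decoded subject:", decoded)
    # The pattern only has a chance against the decoded text.
    print("matches raw header:    ", bool(PATTERN.search(RAW_SUBJECT)))
    print("matches decoded header:", bool(PATTERN.search(decoded)))
```

Running this against the header above shows the keyword pattern matching the decoded text but not the raw Base64 form, which is consistent with the header check never firing.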
I don't think so. You can of course create a regex for the encoded string, but that will likely not help, since you want to search for individual words in the subject. Maybe something is available for Sieve filtering, but a first quick web search didn't turn anything up. It might still be worth searching for Sieve in connection with filtering encoded subjects; maybe a solution or Sieve plugin exists for that. Another possibility might be to filter via Rspamd, for example by creating a custom rule, as I would guess that a spam filter like Rspamd can work on decoded headers. I have never tried that or looked into the topic, though, so I can only post it here as an idea for further research.
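One reason a regex on the encoded string is unlikely to help: Base64 encodes bytes in groups of three, so the same word produces different encoded text depending on where it starts in the subject. A small Python sketch of that effect follows. The word "Leder" and the "x" filler characters are purely illustrative assumptions.

```python
import base64


def b64(text):
    """Base64-encode a string's UTF-8 bytes and return ASCII text."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def encodings_by_offset(word):
    """Encode the word preceded by 0, 1 and 2 filler characters.

    Each offset shifts the word relative to Base64's 3-byte groups,
    so the encoded representation of the word itself changes.
    """
    return [b64("x" * offset + word) for offset in range(3)]


if __name__ == "__main__":
    for offset, encoded in enumerate(encodings_by_offset("Leder")):
        print(offset, encoded)
```

A pattern for the encoded form would therefore need several variants per keyword, and it would still miss subjects sent with a different charset or with quoted-printable encoding.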