In my opinion, SpamAssassin has had a lot of problems lately. I love the framework for building a scoring engine/model for our type of spam, but I am less happy with the rules. First, there were no rule updates in April/May (the exact dates and duration escape me now) because of a new physical server build. Then, after a few weeks, the new rules stopped coming because the scores were incorrect, and I believe they had to hand-edit them to get something working again. They are still not sure why, so automatic updates have been stopped for the past 30+ days. I dug deeper on our Zimbra systems and found that a lot of the plugins have not been updated to handle the modern HTML techniques spammers use to get around the rules. If that parsing doesn't work, the rules don't work well, the scoring is off, and/or there are more false positives. I currently have patches for these:
mail:~/Zimbra/SpamAssassinAttatchPlugin:46> ls patches/*.pm
patches/HTMLEval.pm patches/HTTPSMismatch.pm patches/PerMsgStatus.pm
patches/HTML.pm patches/MIMEEval.pm
I also noticed that spammers were finding new obfuscation methods because of the above problems, which made Bayes poisoning more successful, so a lot of my focus went into that. Finally, Zimbra's SpamAssassin includes a lot of whitelist RBLs, which produced odd results: RCVD_IN_MSPIKE_H2 was scoring -5 or something crazy since RCVD_IN_MSPIKE_H3 wasn't as high. As a result, I had to adjust some of the whitelist scoring. Nothing fancy, just neutralizing the big negative scores that were causing us problems.
#whitelists
score RCVD_IN_MSPIKE_H2 0.1
score RCVD_IN_RP_CERTIFIED 0.1
score RCVD_IN_RP_SAFE 0.1
score RCVD_IN_IADB_OPTIN 0.1
score RCVD_IN_IADB_VOUCHED 0.1
score RCVD_IN_IADB_DOPTIN 0.1
score USER_IN_DEF_DKIM_WL 0.1
I don't know the answer, but in the end I wrote a few plugins and modified a few sendmail milters, which run on the front end on a different machine, to add some headers so SpamAssassin could score better. Previously I had just a few rules, but now I find myself in full development mode. I recently started marking incoming connections from foreign countries (anything other than US/Canada/UK/IE) to help us make additional choices about certain types of tracking spam for one customer whose email mix is US-based. I began following both the SpamAssassin developers and users mailing lists, but none of what I am seeing appears to be a priority there, or else we are the only ones with this type of problem... I wanted more reputation checks based on the structure, envelope, etc. Our clients don't want false positives and they want all their email, so it's been a battle for the past few months. Something changed for us. I would love to get some of these patches into Zimbra, but spam is a very specialized thing these days, so what works for one site isn't best for everyone. My initial thought was to get these patches into the main trunk of SpamAssassin, but that is proving difficult.
I tried to submit my HTML.pm patches to the SpamAssassin developers group, but I believe they see my changes as the start of opening a can of worms, and they have not confirmed my bug. They obviously have a deeper understanding of their system than I do. We found that once we got in there, the more we changed, the more problems we uncovered. For us, HTML_FONT_LOW_CONTRAST needed to work better or Bayes was at risk. That is what started the journey. It's a mess and it's a lot of work, but it's rewarding work to increase accuracy for your users: deliver email and tag spam. I would like to think we are in the business of delivering email, not rejecting it, so simply adding more blacklists when spam seems to be winning isn't as helpful. We did add a milter that adds headers for various BLs, and we then score the message on the aggregate along with additional envelope and other meta checks. Crazy what we have been doing.
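For anyone wanting to try the "headers from a milter, scored on aggregate" idea, here is a minimal local.cf sketch. It assumes the milter writes one header per blacklist hit. The header names (X-BL-Zen, X-BL-Sorbs, X-BL-Other), the rule names, and the scores are all invented for illustration and are not taken from the setup described above.

```
# ASSUMPTION: hypothetical headers added by a front-end milter, one per blacklist hit.
# Sub-rules starting with __ carry no score of their own.
header   __LOCAL_BL_ZEN     exists:X-BL-Zen
header   __LOCAL_BL_SORBS   exists:X-BL-Sorbs
header   __LOCAL_BL_OTHER   exists:X-BL-Other

# Score the aggregate: one listing adds nothing, two or more add a modest amount.
meta     LOCAL_BL_MULTI     ((__LOCAL_BL_ZEN + __LOCAL_BL_SORBS + __LOCAL_BL_OTHER) >= 2)
describe LOCAL_BL_MULTI     Connecting IP is on two or more blacklists (per milter headers)
score    LOCAL_BL_MULTI     1.5
```

Keeping the per-list rules unscored means a single noisy blacklist cannot push a message over the threshold by itself; only agreement between lists counts.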
I would be interested if others have found spam to be a bigger problem in the past 2-3 months.
One thing that has come out of this is that I believe the SpamAssassin community expects you to customize for your own mix of spam. Previously I was content with just a few custom rules, but the current rule set let in too much spam and created too many false positives. My milter work is on sendmail, and I have modified both blackmilter and dnsbl-milter. Blackmilter's lookups are very fast compared to DNS lookups, so I went with that after rewriting the CIDRs into /8, /16, and /24 blocks. I chose the negative case: I list the countries I am interested in, and anything else gets a header we use later. Again, these are only parts of a larger scoring engine, so country lookups count for very little unless the connection is also on 3 or 4 other blacklists. That was my change to dnsbl-milter: it lets us see how many blacklists some of these incoming connections are on. Again, that is just a small fraction of the score, since blacklists like zen and sorbs can at times have gmail listed.
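A rough sketch of how the "country only matters alongside several blacklists" logic might look in local.cf. It assumes the milters add two headers: a flag for connections from outside the listed countries and a count of blacklist hits. Both header names (X-Unlisted-Country, X-Blacklist-Count), the rule names, and the scores are hypothetical.

```
# ASSUMPTION: hypothetical header set when the connecting IP is outside the listed countries.
# On its own it is worth almost nothing.
header   LOCAL_UNLISTED_CC   exists:X-Unlisted-Country
describe LOCAL_UNLISTED_CC   Connection from a country outside the listed set
score    LOCAL_UNLISTED_CC   0.1

# ASSUMPTION: hypothetical header carrying the number of blacklists listing the connecting IP.
# Matches a count of 3 or more.
header   __LOCAL_BL_3PLUS    X-Blacklist-Count =~ /^\s*(?:[3-9]|\d{2,})\s*$/

# The country flag only carries real weight together with 3+ blacklist hits.
meta     LOCAL_CC_AND_BL     (LOCAL_UNLISTED_CC && __LOCAL_BL_3PLUS)
describe LOCAL_CC_AND_BL     Unlisted country and on three or more blacklists
score    LOCAL_CC_AND_BL     1.0
```

The small standalone score keeps the country flag visible in reports without letting it tip a message by itself, which fits the point that zen and sorbs can occasionally list legitimate senders.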
I saw that Bill posted something about rspamd, which I hadn't seen before. I was looking at MailScanner, given that a few of the SpamAssassin developers like to use it to feed SpamAssassin more data. No shortage of solutions, for sure.