On Saturday, I upgraded my SpamAssassin to the latest release (3.2.1). For the past few weeks, more and more bad emails had not been properly tagged as “spam”.
The upgrade went smoothly as usual but, a few hours later, my /var file system was filled up by the MTA log files. They contained thousands of errors like:
Jul 2 10:00:08 boogey spamd[6426]: Malformed UTF-8 character (unexpected \ non-continuation byte 0x00, immediately after start byte 0xce) in pattern match \ (m//) at /etc/mail/spamassassin/drugads.cf, rule LOCAL_OBFU_EPHEDRA, line 1.
It seems to be a known bug! As recommended, I also upgraded my HTML::Parser Perl module. Still the same problem!
In fact, I had to patch my Perl (my system has 5.8.6). The patch was provided in the SpamAssassin bug report:
# cat perl-5.8.6-utf8.patch
--- utf8.c.orig 2005-12-15 08:06:59.000000000 -0600
+++ utf8.c 2005-12-15 08:06:32.000000000 -0600
@@ -1976,7 +1976,7 @@
          if (u1)
               to_utf8_fold(p1, foldbuf1, &foldlen1);
          else {
-              natbuf[0] = *p1;
+              uvuni_to_utf8(natbuf, (UV) NATIVE_TO_UNI(((UV)*p1)));
               to_utf8_fold(natbuf, foldbuf1, &foldlen1);
          }
          q1 = foldbuf1;
@@ -1986,7 +1986,7 @@
          if (u2)
               to_utf8_fold(p2, foldbuf2, &foldlen2);
          else {
-              natbuf[0] = *p2;
+              uvuni_to_utf8(natbuf, (UV) NATIVE_TO_UNI(((UV)*p2)));
               to_utf8_fold(natbuf, foldbuf2, &foldlen2);
          }
          q2 = foldbuf2;
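As far as I can tell, the patch matters because the old code copied a raw native byte into natbuf and then handed it to a function that expects UTF-8. A byte such as 0xce looks like the start of a two-byte UTF-8 sequence, so whatever follows it (here 0x00) gets reported as an "unexpected non-continuation byte". The small C sketch below illustrates the idea only. It is not Perl's code: latin1_to_utf8 and is_continuation are hypothetical helpers I made up, and I assume the native character set is Latin-1.

```c
#include <stddef.h>

/* Hypothetical helper (not part of Perl): a UTF-8 continuation byte
 * always has the bit pattern 10xxxxxx. */
static int is_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* Hypothetical helper (not part of Perl): encode a single Latin-1 byte
 * (code point below 0x100) as UTF-8. Returns the number of bytes written
 * to out, which must have room for at least 2 bytes. */
static size_t latin1_to_utf8(unsigned char c, unsigned char *out)
{
    if (c < 0x80) {
        /* ASCII is identical in Latin-1 and UTF-8. */
        out[0] = c;
        return 1;
    }
    /* Code points 0x80..0xFF need two bytes: 110xxxxx 10xxxxxx. */
    out[0] = (unsigned char)(0xC0 | (c >> 6));
    out[1] = (unsigned char)(0x80 | (c & 0x3F));
    return 2;
}
```

With the raw copy, the buffer holds 0xce followed by 0x00, and 0x00 is not a continuation byte. With a proper conversion, 0xce becomes the two bytes 0xc3 0x8e, which is well-formed UTF-8.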
Since I restarted spamd with the new Perl, the errors are gone!