[#5843] spam-filtering for maildrop
Summary spam-filtering for maildrop
Queue Ingo
Queue Version HEAD
Type Enhancement
State Resolved
Priority 2. Medium
Owners Chuck Hagenbuch <chuck (at) horde (dot) org>
Requester steinkel (at) ctinetworks (dot) com
Created 10/30/2007 (194 days ago)
Due
Updated 11/20/2007 (173 days ago)
Assigned 10/31/2007 (193 days ago)
Resolved 11/20/2007 (173 days ago)
Attachments ingo.maildrop.php.patch Download
maildrop.diff Download
Milestone
Patch

History
11/20/2007 Chuck Hagenbuch Comment #11
State ⇒ Resolved
Reply to this comment
Looks fine. Committed, thanks!
11/20/2007 horde (at) nospam (dot) obeliks (dot) de Comment #10
New Attachment: maildrop.diff Download
Reply to this comment
>> Thanks for catching that. I've taken another shot at it based on your
>> example. Can you please check:
>> http://cvs.horde.org/diff.php/ingo/lib/Script/maildrop.php?r1=1.17&r2=1.18&ty=u
>>
>> Thanks!
>
> Looks much better, but it still won't work, if the conditions are
> joined by OR. I'll take a shot at it these days, some of the
> refactoring I've planned to do sometime... Hang on ;) If it's urgent,
> I'm sure we can find a fix without the restructuring.

I'm sorry, I just learned that && has a higher precedence than ||, so your code should work fine even if conditions are joined by OR.

However the 'not equal' test should use != as operator, instead of
! /Header:.../ && $MATCH1 == condition,
because that would indeed not work. Furthermore, 'not equal' must then be excluded from the negation above (line 557 in HEAD). Patch attached.

PS: The patch also removes some duplicated code by defining the numerical relation operators in an instance variable. Remove this if you don't find it appropriate.
11/19/2007 horde (at) nospam (dot) obeliks (dot) de Comment #9 Reply to this comment
> Thanks for catching that. I've taken another shot at it based on your
> example. Can you please check:
> http://cvs.horde.org/diff.php/ingo/lib/Script/maildrop.php?r1=1.17&r2=1.18&ty=u
>
> Thanks!

Looks much better, but it still won't work, if the conditions are joined by OR. I'll take a shot at it these days, some of the refactoring I've planned to do sometime... Hang on ;) If it's urgent, I'm sure we can find a fix without the restructuring.
11/19/2007 Chuck Hagenbuch Comment #8
State ⇒ Feedback
Reply to this comment
Thanks for catching that. I've taken another shot at it based on your example. Can you please check:
http://cvs.horde.org/diff.php/ingo/lib/Script/maildrop.php?r1=1.17&r2=1.18&ty=u

Thanks!
11/18/2007 horde (at) nospam (dot) obeliks (dot) de Comment #7 Reply to this comment
I'm sorry, I was mislead by the commit log for the revision that referenced this bug...
I was referring to the numerical comparisons added in rev 1.17, not to the spam action. My bad. The point remains still valid of course...
11/18/2007 horde (at) nospam (dot) obeliks (dot) de Comment #6 Reply to this comment
Actually, this patch will not work at all.

The way maildrop condition generation works atm, each condition starts with /^
(see this line: $string .= '/^' . $condition['field'] . ':\\s*';),
which starts a regular expression.
Like I said in bug #5816, implementing numerical comparisons would require more changes to the script.

In a maildrop script, a numerical comparison (e.g. >=) for a certain header field (e.g. HeaderName) and a certain value (e.g. 5, from (int)$condition['value']) would best be implemented as
(/^HeaderName:\s*(\d+(\.\d+)?)/h && $MATCH1 >= 5)

This would work fine with the current script generation algorithm, except an opening parenthesis would have to be added at the beginning of a condition for numerical comparisons, and the second part of the condition would have to be added AFTER the regular expression is closed and also AFTER the flags are added (which is why I didn't implement this in the above mentioned bug/patch, but tried to restructure the whole thing to make it more readable).

However, If I got that right, the current HEAD of the maildrop.php however will produce something like
/^HeaderName:\s* >= 5/h
which is of course complete bogus.

Please revert the changes until a proper patch is provided.
10/31/2007 Chuck Hagenbuch Comment #5
State ⇒ Resolved
Reply to this comment
Committed, thanks!
10/31/2007 Chuck Hagenbuch Comment #4
State ⇒ Assigned
Assigned to Chuck Hagenbuch
Reply to this comment
> Certainly, I did not know where or how to set that.  We use string
> comparisons, so numeric was not of such importance to us.  I had not
> gotten to seeing whether maildrop was capable of them, but figured it
> would be useful to post what I had.

According to http://www.courier-mta.org/maildrop/maildropfilter.html >= should be just fine.
10/30/2007 steinkel (at) ctinetworks (dot) com Comment #3 Reply to this comment
> Can it do string comparisons unconditionally instead of breaking if
> numeric comparisons are requested, then?

Certainly, I did not know where or how to set that.  We use string comparisons, so numeric was not of such importance to us.  I had not gotten to seeing whether maildrop was capable of them, but figured it would be useful to post what I had.
>
> Also, what's the +2 for? (after $spam->getSpamLevel())

On the list I commented that we add two to normalize spam scores for our customers and should not have been included.
10/30/2007 Chuck Hagenbuch Comment #2
State ⇒ Feedback
Reply to this comment
Can it do string comparisons unconditionally instead of breaking if numeric comparisons are requested, then?

Also, what's the +2 for? (after $spam->getSpamLevel())
10/30/2007 steinkel (at) ctinetworks (dot) com Comment #1
Summary ⇒ spam-filtering for maildrop
Type ⇒ Enhancement
New Attachment: ingo.maildrop.php.patch Download
State ⇒ New
Priority ⇒ 2. Medium
Queue ⇒ Ingo
Reply to this comment
Attached for your consideration:  a patch to ingo/lib/Script/maildrop.php to allow maildrop scripts to use the INGO_STORAGE_ACTION_SPAM action.  As of now, it does 'string' and not 'numeric' comparisons.