7/27/25

[#4814] vlad.horde.org timeout after EHLO
Summary vlad.horde.org timeout after EHLO
Queue Horde.org Servers
Type Bug
State Resolved
Priority 2. Medium
Owners
Requester vilius (at) lnk (dot) lt
Created 12/28/2006 (6786 days ago)
Due
Updated 01/29/2007 (6754 days ago)
Assigned
Resolved 01/18/2007 (6765 days ago)
Github Issue Link
Github Pull Request
Milestone
Patch No

History
01/29/2007 09:27:13 PM vilius (at) lnk (dot) lt Comment #31
It is not happening anymore. Thanks!
01/19/2007 02:55:11 PM vilius (at) lnk (dot) lt Comment #30
Now I see another problem.



Jan 19 15:11:37 mail postfix/smtp[30232]: 91DCF10E03A0: conversation 
with lists.horde.org[199.175.137.231] timed out while sending MAIL FROM

Jan 19 15:11:50 mail postfix/smtp[30232]: 91DCF10E03A0: 
to=<horde@lists.horde.org>, relay=smtp.easydns.com[205.210.42.52]:25, 
delay=389, delays=0.08/0/383/5.1, dsn=2.0.0, status=sent (250 Ok: 
queued as 188DC5054D)



Telnet test hangs right after EHLO:

[root@mail ~]# telnet lists.horde.org 25
Trying 199.175.137.231...
Connected to lists.horde.org (199.175.137.231).
Escape character is '^]'.
EHLO mail.lnk.lt


01/18/2007 11:38:36 PM Jan Schneider Comment #29
State ⇒ Resolved
Actually, disabling TCP window scaling *did* help. We had the same 
issues not long ago with a broken router but this one was replaced. 
And I didn't know that turning off window scaling on our side would 
help too.

Anyway, for the record, on BSD this is:

sysctl net.inet.tcp.rfc1323=0
01/18/2007 11:28:22 PM cbs (at) cts (dot) ucla (dot) edu Comment #28
More information: http://kerneltrap.org/node/6723
01/18/2007 11:22:32 PM cbs (at) cts (dot) ucla (dot) edu Comment #27
I'm looking at http://lwn.net/Articles/92727/

The window scaling in Linux changed in 2.6.17. If I disable scaling 
on my inbound MX, messages come through. With scaling enabled, I get 
nothing.



Unfortunately (for me), I can't leave scaling disabled on my inbound MX.



Thinking about it more, disabling scaling on vlad/coyote may not do 
anything. If the horde machines are behind a firewall/router that's 
stripping tcp scaling from the headers one-way, then any scaling that 
my MX tries to do will be stripped, leaving me with a large size 
window and vlad/coyote with a very small window, and few or no packets 
transferred past a certain point.



This would suggest that anybody running a linux 2.6.17+ kernel on 
their MX host with tcp window scaling enabled (the default) and the 
default values for tcp_wmem and tcp_rmem should be having the same 
problems receiving mail from the horde mailing lists.
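
The failure mode described in this comment can be illustrated with a little arithmetic. The following is a simplified sketch, not a model of any particular router: the shift count of 7 and the 64 KiB window are illustrative assumptions, and the function names are hypothetical. It shows how a window advertised with scaling looks tiny to a peer that believes no scaling is in effect.

```python
def advertised_field(window_bytes: int, shift: int) -> int:
    """16-bit window field a receiver puts in the TCP header for its window."""
    return min(window_bytes >> shift, 0xFFFF)


def interpreted_window(field: int, shift: int) -> int:
    """Window size a sender derives from the field, given the shift it believes in."""
    return field << shift


# Illustrative values (assumptions, not taken from the ticket).
receiver_shift = 7
receiver_window = 65536

field = advertised_field(receiver_window, receiver_shift)

# Both ends agree on the shift: the sender sees the full window.
agreed = interpreted_window(field, receiver_shift)

# The scale option was lost in one direction: the sender assumes shift 0
# and believes the receiver can only accept a few hundred bytes.
mismatched = interpreted_window(field, 0)
```

With these numbers the sender ends up with a 512-byte window instead of 64 KiB, which matches the "very small window" symptom described above.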
01/18/2007 11:08:12 PM cbs (at) cts (dot) ucla (dot) edu Comment #26
Can you try disabling TCP window scaling?

Under Linux with a 2.6 kernel:

echo 0 > /proc/sys/net/ipv4/tcp_window_scaling



I'm not sure how to disable it under other OSes.
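
Before changing the setting, it may help to check its current value. A minimal read-only sketch in Python, assuming a Linux host that exposes the same /proc path as the command above; the function names are hypothetical.

```python
from pathlib import Path

# Same path as in the echo command above (Linux only).
WINDOW_SCALING = Path("/proc/sys/net/ipv4/tcp_window_scaling")


def parse_flag(text: str) -> bool:
    """Interpret the file contents: "0" means disabled, anything else enabled."""
    return text.strip() != "0"


def window_scaling_enabled(path: Path = WINDOW_SCALING) -> bool:
    """Read the current setting without modifying it."""
    return parse_flag(path.read_text())
```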
01/18/2007 01:43:09 PM vilius (at) lnk (dot) lt Comment #25
Seems like you are right :(
01/18/2007 01:30:53 PM Jan Schneider Comment #24
> Nope, we don't use mxbackup at the moment at all.
Sure you do:



$ host -t MX lnk.lt
lnk.lt mail is handled by 5 mail.lnk.lt.
lnk.lt mail is handled by 10 mxbackup.data.lt.



If mail.lnk.lt doesn't work, postfix tries mxbackup the next time and 
vice versa.
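
The fallback order follows from the MX preference values: the lowest number is tried first. A small sketch that parses `host -t MX` output like the lines above and sorts the relays by preference; the function name and regular expression are illustrative, not part of any tool.

```python
import re
from typing import List, Tuple

# Matches lines such as "lnk.lt mail is handled by 5 mail.lnk.lt."
MX_LINE = re.compile(r"^(\S+) mail is handled by (\d+) (\S+)$")


def parse_mx(host_output: str) -> List[Tuple[int, str]]:
    """Return (preference, hostname) pairs, lowest preference first."""
    records = []
    for line in host_output.splitlines():
        match = MX_LINE.match(line.strip())
        if match:
            # Drop the trailing dot of the fully qualified name.
            records.append((int(match.group(2)), match.group(3).rstrip(".")))
    return sorted(records)


sample = (
    "lnk.lt mail is handled by 10 mxbackup.data.lt.\n"
    "lnk.lt mail is handled by 5 mail.lnk.lt.\n"
)
mx_order = parse_mx(sample)
```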
> Also I see direct connections from vlad and coyote in mail.lnk.lt.
Successful connections? Because this is what I see on vlad:



Jan 18 05:13:07 vlad postfix/smtp[80652]: A33F214D2: 
to=<vilius@lnk.lt>, relay=mxbackup.data.lt[213.197.128.83]:25, 
delay=302, delays=0.28/0.17/301/0.42, dsn=2.0.0, status=sent (250 
2.0.0 Ok: queued as 9425F385)
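
The `delays=a/b/c/d` field in the log line above shows where the time went. In Postfix 2.3 and later the four values are, as far as I know, time before the queue manager, time in the queue manager, connection setup (including DNS, HELO/EHLO and TLS), and message transmission. A minimal parsing sketch; the stage names are my own labels, not Postfix terms.

```python
import re
from typing import Dict, Optional

DELAYS = re.compile(r"delays=([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+)")
# Hypothetical labels for the four stages.
STAGES = ("before_qmgr", "in_qmgr", "conn_setup", "transmission")


def parse_delays(log_line: str) -> Optional[Dict[str, float]]:
    """Extract the per-stage delays in seconds, or None if the field is absent."""
    match = DELAYS.search(log_line)
    if match is None:
        return None
    return dict(zip(STAGES, map(float, match.groups())))


stages = parse_delays("delay=302, delays=0.28/0.17/301/0.42, dsn=2.0.0")
```

For the line from vlad, almost all of the 302 seconds fall into the connection-setup stage, which is consistent with the first relay timing out after EHLO before the backup MX accepted the message.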
01/18/2007 01:08:03 PM vilius (at) lnk (dot) lt Comment #23
Nope, we don't use mxbackup at the moment at all.



Also I see direct connections from vlad and coyote in mail.lnk.lt.
01/18/2007 01:00:57 PM Jan Schneider Comment #22
But only because mxbackup.data.lt fixed its IP blacklists.
01/18/2007 12:47:00 PM vilius (at) lnk (dot) lt Comment #21
Seems like I started to receive mail about an hour ago. Whuupee :)
01/18/2007 12:41:08 AM cbs (at) cts (dot) ucla (dot) edu Comment #20
The only change from where I sit is that now I can't ping with any 
packets larger than 1420 bytes.  You might as well put the MTU back to 
1500.  Thanks for checking.
01/17/2007 11:37:34 PM Jan Schneider Comment #19
Nope
01/17/2007 11:23:30 PM cbs (at) cts (dot) ucla (dot) edu Comment #18
Can you try 1448?  I just tested pinging vlad with varying packet 
sizes, starting at 1500 bytes and working my way down.  I didn't get a 
response until I dropped to 1448 bytes per packet.
01/17/2007 11:08:32 PM Jan Schneider Comment #17
Doesn't seem to make a difference.
01/17/2007 10:32:17 PM cbs (at) cts (dot) ucla (dot) edu Comment #16
Can you try an MTU of 1480 or 1476?
01/17/2007 09:29:12 PM Jan Schneider Comment #15
Excuse my ignorance, but would it help if we change the MTU on vlad?
01/15/2007 10:18:44 PM cbs (at) cts (dot) ucla (dot) edu Comment #14
Anything that is short and keeps packet sizes under 1480 bytes will 
come through. Mail generated by hand (via telnet to port 25, or using 
"mail", "sendmail", or similar) is generally going to be small enough 
because it lacks the usual list-related headers.

The problem arises because a) something (either directly on coyote 
and vlad, or somewhere in the network) is filtering ICMP traffic, and 
b) something in the network path has an MTU of 1480. Because of the 
ICMP filtering, ICMP "fragmentation needed" messages aren't getting 
through, breaking path MTU discovery.
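
To make the size argument concrete, here is a simplified sketch of which TCP payloads survive such a path. It assumes IPv4 and TCP headers of 20 bytes each with no options, and that oversized packets are silently dropped because the ICMP reply never arrives; the function name is hypothetical and the 1480-byte path MTU is the figure from this comment.

```python
IPV4_HEADER = 20  # bytes, no IP options assumed
TCP_HEADER = 20   # bytes, no TCP options assumed


def delivered(payload_bytes: int, sender_mtu: int = 1500, path_mtu: int = 1480) -> bool:
    """True if the largest segment needed for the payload fits the path MTU.

    Models a path where oversized packets vanish because the ICMP
    "fragmentation needed" reply is filtered.
    """
    mss = sender_mtu - IPV4_HEADER - TCP_HEADER
    largest_segment = min(payload_bytes, mss)
    return largest_segment + IPV4_HEADER + TCP_HEADER <= path_mtu


short_test_mail = delivered(200)          # hand-typed message
list_mail = delivered(5000)               # message with full list headers
list_mail_small_mtu = delivered(5000, sender_mtu=1400)
```

Under these assumptions a short hand-typed message gets through, a normal list message stalls, and lowering the sender's interface MTU below the path MTU makes the list message deliverable again.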
01/15/2007 01:08:43 PM vilius (at) lnk (dot) lt Comment #13
Yep. One with "This is a test." and second with "Another test."

I'm scratching my head now. We are definitely missing something here.
01/15/2007 01:00:54 PM Jan Schneider Comment #12
> Yeah, I received your test mail. Strangely enough. Have you greeted
> mail.lnk.lt with HELO or EHLO request?
Both, you should have received two mails from coyote.
01/15/2007 12:45:24 PM vilius (at) lnk (dot) lt Comment #11
Yeah, I received your test mail. Strangely enough. Have you greeted 
mail.lnk.lt with HELO or EHLO request?
01/15/2007 12:22:10 PM Jan Schneider Comment #10
> The same with coyote.horde.org
>
> Dec 28 11:05:15 mail postfix/smtpd[11975]: timeout after EHLO from
> coyote.horde.org[199.175.137.230]
> Dec 28 11:05:15 mail postfix/smtpd[11975]: disconnect from
> coyote.horde.org[199.175.137.230]
I was able to connect and send from coyote to mail.lnk.lt just fine. 
But mxbackup.data.lt caught me as a spammer from an IP blacklist.
01/09/2007 03:56:29 AM Chuck Hagenbuch Comment #9
It doesn't have anything to do with postfix; it's a TCP-level problem, 
either with MTU size and/or window scaling, and ICMP filtering. We're 
looking at a new list server, on a different network, so that should 
resolve this.
01/09/2007 03:48:18 AM Michael Rubinsky Comment #8
> Could it be that vlad.horde.org is not compatible with new postfix or
> doesn't understand EHLO response?
FWIW, the machine that I *am* having the issue on is postfix 2.2.2, 
but so is my backup MX, which accepts the mail, no problem.




01/05/2007 05:21:00 PM cbs (at) cts (dot) ucla (dot) edu Comment #7
> We're asking someone to look into it, but please understand that we
> have no "staff". Our hosting is donated; we've asked them to look at
> it, but no one there works "for" Horde.
If it helps to pin-point whatever changed, I've been having problems 
since some time between the 20th and the 24th.  If there was a change 
made in that time frame to drop or filter ICMP traffic, that's 
probably the cause.
01/05/2007 04:55:08 PM Chuck Hagenbuch Comment #6
State ⇒
We're asking someone to look into it, but please understand that we 
have no "staff". Our hosting is donated; we've asked them to look at 
it, but no one there works "for" Horde.
01/05/2007 12:59:39 PM vilius (at) lnk (dot) lt Comment #5
Confirmed.



Disabling MTU discovery and setting interface MTU to 1400 does help.



Could someone from Horde staff please look into this issue? As I'm 
actively using HEAD versions it is getting very hard to follow cvs 
changes and ticket watches.
12/30/2006 01:07:00 AM cbs (at) cts (dot) ucla (dot) edu Comment #4
This thread might be relevant:



http://msgs.securepoint.com/cgi-bin/get/postfix9904/37.html

http://msgs.securepoint.com/cgi-bin/get/postfix9904/37/1.html



Basically, it looks like something in the path is filtering ICMP, 
which breaks PMTU discovery.



I tested by sending ICMP echo request packets with varying sizes.   
Anything over 1480 bytes for the total packet size was dropped.  I 
tested from two different network paths, to verify that it wasn't 
something on my end.  The two routes share the last 4 hops only:



19  154.11.4.129  36.329 ms  36.818 ms  36.967 ms
20  208.181.86.221  37.716 ms  39.063 ms  41.819 ms
21  209.53.254.36  43.078 ms  41.967 ms  41.206 ms
22  199.175.137.231  38.667 ms  38.190 ms  36.957 ms

and:

 7  154.11.4.129  143.938 ms  144.835 ms  144.333 ms
 8  208.181.86.221  197.612 ms  148.102 ms  145.105 ms
 9  209.53.254.36  145.927 ms  144.991 ms  145.115 ms
10  199.175.137.231  145.803 ms  148.961 ms  147.354 ms
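
The packet-size test described in this comment can be sketched in Python. This is an illustration rather than the commands actually used: `probe` is a hypothetical callable standing in for "send one echo request of this total size with fragmentation disallowed and report whether a reply came back", and the header sizes assume IPv4 without options.

```python
from typing import Callable

IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, echo request header


def icmp_payload_for(total_size: int) -> int:
    """Payload size to request so the whole IPv4 packet is total_size bytes."""
    return total_size - IPV4_HEADER - ICMP_HEADER


def largest_passing_size(probe: Callable[[int], bool],
                         low: int = 576, high: int = 1500) -> int:
    """Binary-search the largest total packet size for which probe() succeeds.

    Assumes probe(low) succeeds and that success is monotonic in size.
    """
    while low < high:
        mid = (low + high + 1) // 2
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low


# Stand-in for a real network probe: a path that drops anything over 1480 bytes.
def simulated_path(size: int) -> bool:
    return size <= 1480


limit = largest_passing_size(simulated_path)
```

Running the same search over two independent routes, as done above, helps confirm that the limit sits in the shared last hops rather than on the tester's side.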


12/28/2006 09:34:04 AM cbs (at) cts (dot) ucla (dot) edu Comment #3
I can confirm this.  I'm seeing the same thing with delivery to a 
sendmail MX.  Connections from vlad are timing out.  I haven't checked 
logs for anything from coyote.
12/28/2006 09:06:56 AM vilius (at) lnk (dot) lt Comment #2
The same with coyote.horde.org



Dec 28 11:05:15 mail postfix/smtpd[11975]: timeout after EHLO from 
coyote.horde.org[199.175.137.230]

Dec 28 11:05:15 mail postfix/smtpd[11975]: disconnect from 
coyote.horde.org[199.175.137.230]


12/28/2006 09:00:11 AM vilius (at) lnk (dot) lt Comment #1
Priority ⇒ 2. Medium
State ⇒ Unconfirmed
Queue ⇒ Horde.org Servers
Summary ⇒ vlad.horde.org timeout after EHLO
Type ⇒ Bug
I recently upgraded my mail server's SMTPD from postfix 2.0.x to 
postfix 2.3.3 and vlad.horde.org can not send me email anymore. 
Everything else works ok except for vlad.horde.org. This is what I see 
constantly in the logs:



Dec 28 10:36:49 mail postfix/smtpd[16383]: connect from 
vlad.horde.org[199.175.137.231]

Dec 28 10:41:47 mail postfix/smtpd[14568]: timeout after EHLO from 
vlad.horde.org[199.175.137.231]

Dec 28 10:41:47 mail postfix/smtpd[14568]: disconnect from 
vlad.horde.org[199.175.137.231]

Dec 28 10:41:49 mail postfix/smtpd[24168]: timeout after EHLO from 
vlad.horde.org[199.175.137.231]

Dec 28 10:41:49 mail postfix/smtpd[24168]: disconnect from 
vlad.horde.org[199.175.137.231]



I verified that the MTA is working perfectly:



[root@mail root]# telnet 213.197.188.3 25
Trying 213.197.188.3...
Connected to mail.lnk.lt (213.197.188.3).
Escape character is '^]'.
220 mail.lnk.lt ESMTP Windows 2003 Server
EHLO test.lnk.lt
250-mail.lnk.lt
250-PIPELINING
250-SIZE 51199999
250-VRFY
250-ETRN
250-STARTTLS
250-AUTH PLAIN LOGIN
250-AUTH=PLAIN LOGIN
250-ENHANCEDSTATUSCODES
250-8BITMIME
250 DSN
MAIL FROM: <vilius@lnk.lt>
250 2.1.0 Ok
RCPT TO: <vilius@lnk.lt>
250 2.1.5 Ok
DATA
354 End data with <CR><LF>.<CR><LF>
Subject: asd

asdasd
.
250 2.0.0 Ok: queued as 7E97F10E0AC0
quit
221 2.0.0 Bye
Connection closed by foreign host.



Could it be that vlad.horde.org is not compatible with new postfix or 
doesn't understand EHLO response?
