6.0.0-git
2019-08-25

[#13747] Speed up ActiveSync synchronisation handling
Summary Speed up ActiveSync synchronisation handling
Queue Synchronization
Queue Version Git master
Type Enhancement
State Resolved
Priority 1. Low
Owners mrubinsk (at) horde (dot) org
Requester patrick (at) spamreducer (dot) eu
Created 2014-12-09 (1720 days ago)
Due
Updated 2014-12-11 (1718 days ago)
Assigned 2014-12-10 (1719 days ago)
Resolved 2014-12-10 (1719 days ago)
Milestone
Patch No

History
2014-12-11 17:33:39 Michael Rubinsky Comment #27 Reply to this comment
How do you handle this issue on your production server? Are you 
using some FTS like Lucene? Or have you just disabled the maillog 
synchronization?
I haven't had a big performance issue, but I don't have a lot of users 
on my production box, and even less that use OL.
2014-12-11 17:23:54 patrick (at) spamreducer (dot) eu Comment #26 Reply to this comment
Having a config switch gives the admin more
control of preventing performance issues server-wide.
That's true.. I agree.

How do you handle this issue on your production server? Are you using 
some FTS like Lucene? Or have you just disabled the maillog 
synchronization?
2014-12-11 14:16:14 Michael Rubinsky Comment #25 Reply to this comment
..but wouldn't it possible to insert some IF loop arround the line 
for this annoing mailclients like OL2013? Some array with a list of 
"stupid" clients..
Possible? Sure, but why? OL2013 isn't "stupid" (at least in this 
regard). It's just maintaining a lot of folders. This would also 
happen, e.g., if you enabled the same amount of folders to be pushed 
on any other client. Having a config switch gives the admin more 
control of preventing performance issues server-wide.

2014-12-11 14:14:12 Michael Rubinsky Comment #24 Reply to this comment
Is there just no optimization possibility than upgrading the CPU 
power? Could there be using some caching?
Horde_Imap_Client already implements it's own cache. Be sure that you 
have this enabled and set to an appropriate value for your needs (see 
imp/config/backends[.local].php

Otherwise, you should investigate a FTS engine/plugin for Dovecot. I'm 
not sure offhand which one would index header data, as opposed to 
message body data, but I'm pretty sure at least one of them does.

Creating some sort of local Mailbox/Message-ID/UID cache for this 
purpose would be difficult for reasons similar to why the 
History/Maillog system doesn't already cache the data.

2014-12-11 09:52:47 patrick (at) spamreducer (dot) eu Comment #23 Reply to this comment
You can temporarily test if this is the culprit by commenting out 
line 1066 in Driver.php (which should be $folder = 
$this->_getMaillogChanges($folder, $from_ts);).
This is BOOSTER! Commenting that line out, there is virtually no delay 
anymore..

..but wouldn't it possible to insert some IF loop arround the line for 
this annoing mailclients like OL2013? Some array with a list of 
"stupid" clients..
2014-12-11 08:50:28 patrick (at) spamreducer (dot) eu Comment #22 Reply to this comment
No, it's an IMAP search command executed directly against the IMAP 
server (via Horde_Imap_Client).
Is there just no optimization possibility than upgrading the CPU 
power? Could there be using some caching?

2014-12-10 22:08:28 Michael Rubinsky State ⇒ Resolved
 
2014-12-10 21:49:30 Git Commit Comment #21 Reply to this comment
Changes have been made in Git (master):

commit 3e7bd2c22cd8db01e79f00da3fce3848f8ddfbb7
Author: Michael J Rubinsky <mrubinsk@horde.org>
Date:   Wed Dec 10 16:45:00 2014 -0500

     Allow turning off maillog synchronization.

     Bug: 13747

  .../Core/lib/Horde/Core/ActiveSync/Driver.php      |    4 +++-
  horde/config/conf.xml                              |   15 +++++++++++++--
  2 files changed, 16 insertions(+), 3 deletions(-)

http://github.com/horde/horde/commit/3e7bd2c22cd8db01e79f00da3fce3848f8ddfbb7
2014-12-10 21:30:45 Michael Rubinsky Comment #20 Reply to this comment
So there will be visible just the replied / forwarded state or 
nothing at all like in OL2013 with ActiveSync?
Ah, right. I forgot OL2013 doesn't even support displaying externally 
changed message state anyway. Yes, though - if it is disabled, no 
information about the reply/forward state would be displayed on any 
client for emails replied/forwarded via IMP. If the EAS client 
supports looking at message flags to determine this, then at least the 
fact that it was replied to will be shown, but no details such as when.
Is this header search done in SQL against IMAP?
No, it's an IMAP search command executed directly against the IMAP 
server (via Horde_Imap_Client).
This happens for EVERY mailbox that is synchronized, against every
Message-ID the maillog indicates has changed since we can't know if
the message is from the mailbox or not.  We can't really store the
mailbox name in the history log since it's possible the mailbox name
would change, or the message will be moved to a different mailbox  -
making all such lookups useless.
About which "changes" are you speaking about?
"Maillog changes" only contain details about reply/forward state. We 
only check this when the client is asking for all server changes (a 
SYNC request - this does not happen during the more frequent PING 
requests which only ask IF there are changes). After checking the IMAP 
server for new/vanished messages (a very quick operation when MODSEQ 
is active) we then query IMP's maillog to look for messages that have 
had a reply state change and lookup the UID as described previously.

It happens also on
doing some simple changes on 1 message?
When you send ANY change FROM the client (such as a change in read 
flag) it almost always triggers a SYNC request (this is client 
specific), so yes, sending a simple read flag change to the server 
could cause this behavior.

2014-12-10 20:27:24 patrick (at) spamreducer (dot) eu Comment #19 Reply to this comment
I think the only way to make this better for you is to disable 
synchronization of the maillog data (so you won't see information 
about when/if an email was replied to from IMP's UI).
So there will be visible just the replied / forwarded state or nothing 
at all like in OL2013 with ActiveSync?
The issue here is that we only have the Message-ID from the maillog 
and therefore, must perform a header search on whatever mailbox(es) 
we are currently obtaining changes for in order to get the UID.
Is this header search done in SQL against IMAP?
This happens for EVERY mailbox that is synchronized, against every 
Message-ID the maillog indicates has changed since we can't know if 
the message is from the mailbox or not.  We can't really store the 
mailbox name in the history log since it's possible the mailbox name 
would change, or the message will be moved to a different mailbox  - 
making all such lookups useless.
About which "changes" are you speaking about? It happens also on doing 
some simple changes on 1 message?

2014-12-10 20:03:56 Michael Rubinsky Comment #18 Reply to this comment
You can temporarily test if this is the culprit by commenting out line 
1066 in Driver.php (which should be $folder = 
$this->_getMaillogChanges($folder, $from_ts);).
2014-12-10 20:02:42 Michael Rubinsky Comment #17
State ⇒ Assigned
Reply to this comment
I think the only way to make this better for you is to disable 
synchronization of the maillog data (so you won't see information 
about when/if an email was replied to from IMP's UI).

The issue here is that we only have the Message-ID from the maillog 
and therefore, must perform a header search on whatever mailbox(es) we 
are currently obtaining changes for in order to get the UID. This 
happens for EVERY mailbox that is synchronized, against every 
Message-ID the maillog indicates has changed since we can't know if 
the message is from the mailbox or not.  We can't really store the 
mailbox name in the history log since it's possible the mailbox name 
would change, or the message will be moved to a different mailbox  - 
making all such lookups useless.

I'm going to have to add an option to disable the maillog sync, or 
maybe allow only INBOX to synchronize maillog data.
2014-12-10 18:39:46 patrick (at) spamreducer (dot) eu Comment #16 Reply to this comment
One more:

Put

Horde::debug(array($folder, $from_ts, $to_ts));

in same file, line 1057 (right before the line with "try {" on it)
I have closed my OL2013, inserted the debugging log line in file, 
opened OL2013 again and waited till OL is showing "All folders are up 
to date" (german translation.. :-) ); closed OL again.

There is no other client connected to this Horde installation.

This is the link: 
https://www.dropbox.com/s/bpkzi3z1ab9o6ej/horde_debug.txt.gz?dl=0
2014-12-10 18:30:48 Michael Rubinsky Comment #15 Reply to this comment
Is this from a *single* request?

It shouldn't be called more than once per folder, and it appears that 
it's returning the exact same maillog delta each time.

One more:

Put

Horde::debug(array($folder, $from_ts, $to_ts));

in same file, line 1057 (right before the line with "try {" on it)
2014-12-10 18:27:49 patrick (at) spamreducer (dot) eu Comment #14 Reply to this comment
Very interesting,
seems to be some endless loop like repeat through some messages..
2014-12-10 18:12:38 patrick (at) spamreducer (dot) eu Comment #13 Reply to this comment
2014-12-10 18:07:05 Michael Rubinsky Comment #12 Reply to this comment
I mean:

$this->_logger->debug($changes) ?
Exactly as I said:

Horde::debug($changes);

This will output into a file name horde_debug.txt in your temp directory.
2014-12-10 18:05:21 patrick (at) spamreducer (dot) eu Comment #11 Reply to this comment
I mean:

$this->_logger->debug($changes) ?
2014-12-10 18:02:00 patrick (at) spamreducer (dot) eu Comment #10 Reply to this comment

[Show Quoted Text - 21 lines]
What is the exact code to insert?

Horde_Log_Logger->debug($changes)?

2014-12-10 17:42:33 Michael Rubinsky Comment #9 Reply to this comment
Take a look at the screenshot,
don't know if you can see this time consuming statistics..
Yes, I saw that already in the cachegrind file you sent me.
2014-12-10 17:41:01 patrick (at) spamreducer (dot) eu Comment #8
New Attachment: CacheGrind_most_timeconsuming_tasks.PNG Download
Reply to this comment
Take a look at the screenshot,
don't know if you can see this time consuming statistics..

Are they useful?
2014-12-10 17:32:54 Michael Rubinsky Comment #7
Assigned to Michael Rubinsky
Reply to this comment
More detail:

Also, this is exacerbated by the fact that OL synchs every folder. So, 
if any email in any folder has had a change in the maillog, the imap 
server must perform a header search for the Message-ID.

Still, something looks screwy:

The cachegrind file shows that Horde_Core_ActiveSync_Driver::get 
ServerChanges() is called 195 times - which should match the number of 
folders that were checked for changes. During those checks, it looks 
like the Horde_Core_ActiveSync_Driver::_getMaillogChanges() method was 
called over 6k times. I seriously doubt there were THAT many changes 
during a single sync cycle.

I would next try to figure out why the maillog is returning so many 
changes since the last sync. Start in 
Horde_Core_ActiveSync_Driver::_getMaillogChanges() and use 
Horde::debug($changes) after line 3234 to see what is returned from 
the maillog.

2014-12-10 17:19:18 Michael Rubinsky Comment #6 Reply to this comment
..just to compare, same "mark as read 1 message" with an Android client..

https://www.dropbox.com/s/g15h0mc6gto70ki/mark_1_message_as_read_android.tgz?dl=0
The culprit appears to be the mailLog handling.

Because IMP's maillog (where things like when you replied to which 
message are stored) is keyed by Message-ID we need to search the 
mailbox for the message's UID. Your server is spending a very log time 
performing this search. This should only be triggered, though, if a 
message in the current folder has had a change to the maillog.

2014-12-10 17:14:49 Michael Rubinsky Comment #5 Reply to this comment
..just to compare, same "mark as read 1 message" with an Android client..

https://www.dropbox.com/s/g15h0mc6gto70ki/mark_1_message_as_read_android.tgz?dl=0
I don't think this is the complete cachegrind file, as it's missing 
quite a number of important method calls.
2014-12-10 10:00:59 patrick (at) spamreducer (dot) eu Comment #4 Reply to this comment
..just to compare, same "mark as read 1 message" with an Android client..

https://www.dropbox.com/s/g15h0mc6gto70ki/mark_1_message_as_read_android.tgz?dl=0
2014-12-10 09:38:41 patrick (at) spamreducer (dot) eu Comment #3 Reply to this comment
Hi Michael,
I've done some debugging..

Your question first:
   - we are using DOVECOT, yes. (so MODSEQ should work.., HIGHMODSEQ 
is showing up on manual selecting IMAP folders over a telnet session)
   - my client is OL2013
   - we use apache as our server with a mysql backend (no special 
configuration, just default settings)

I have uploaded a zipped cachegrind file:
https://www.dropbox.com/s/c0zixz004o27ajd/mark_1_message_as_read.tgz?dl=0

I am showing just *1* thing in this debugging file: Mark 1 message as 
read in OL2013. It produces a tremendous delay of about *119seconds* 
where the CPU is spiking at 100%.

Where could I investigate further, or have you any speedup 
optimizations I could test?
2014-12-09 18:04:31 Michael Rubinsky Comment #2
State ⇒ Feedback
Priority ⇒ 1. Low
Reply to this comment
Dear Horde team,
I have noticed some performance issues in Horde with email "power users"..

"Power users" are users having > than 150 message folders and 25k 
messages in them.. (just to have an idea about what I'm speaking)

Every time a power user logs in via ActiveSync (Outlook 2013, ..) the
I would fall into this category, though not ALL of my folders have 
that many messages. I don't see such a slowdown, but I do not use 
Apache.
apache service rises up to 100% CPU usage for 3-6 seconds.
Also, every time a power user mark just 1 message as read/unread the 
same problem arise.
The bottleneck seems to be just the Horde service (apache).
There is practically no access to harddisks and/or other hardware 
during the CPU usage spike.
FIrst, I am going to assume you are using an efficient  IMAP server 
such as Dovecot that supports, among other things, modification 
sequences. Otherwise, your performance for such a large mailbox will 
degrade.

The way the mail synchronization works for such an IMAP server (and 
I'm heavily simplifying) is that we remember the last MODSEQ of each 
mailbox and then basically perform a time-escaped loop that queries 
the IMAP server for changes in each folder since that folder's MODSEQ. 
  After each folder is queried (assuming no changes were found) we 
sleep a configurable number of seconds and then repeat the process. 
The loop (and eventually the request) exit either when a change is 
found, or the heartbeat interval is reached. There is no "looping over 
every message in the folder" as you had asked in the mailing list. 
Checking each folder for changes (should be) a very fast operation - 
limited by your IMAP server's performance. For large number of 
folders, each loop iteration might take some time, but there is 
nothing we can about that since it is the only way to detect changes. 
A unique-ish thing about OL is that it ALWAYS polls ALL folders. It's 
not configurable like just about every other EAS client out there.

On the Horde side, the only thing I can think of for such a large 
number of messages is that it might be related to the state storage 
since even for MODSEQ servers, we need to cache a (compressed) list of 
message uids that are on the device.
Is there any possibility to investigate further?
You should investigate using a tool such as xdebug to profile where 
this time is being spent.
Could there be some needless loops in Horde that could be optimized?
Anything is possible. Again, xdebug will be your friend here.
2014-12-09 17:19:46 patrick (at) spamreducer (dot) eu Comment #1
Type ⇒ Enhancement
State ⇒ New
Priority ⇒ 2. Medium
Summary ⇒ Speed up ActiveSync synchronisation handling
Queue ⇒ Synchronization
Milestone ⇒
Patch ⇒ No
Reply to this comment
Dear Horde team,
I have noticed some performance issues in Horde with email "power users"..

"Power users" are users having > than 150 message folders and 25k 
messages in them.. (just to have an idea about what I'm speaking)

Every time a power user logs in via ActiveSync (Outlook 2013, ..) the 
apache service rises up to 100% CPU usage for 3-6 seconds.
Also, every time a power user mark just 1 message as read/unread the 
same problem arise.

The bottleneck seems to be just the Horde service (apache).
There is practically no access to harddisks and/or other hardware 
during the CPU usage spike.

Is there any possibility to investigate further?
Could there be some needless loops in Horde that could be optimized?

Thank you!

Saved Queries