Summary | Improve EAS performance with large History tables |
Queue | Synchronization |
Queue Version | FRAMEWORK_5_2 |
Type | Enhancement |
State | Resolved |
Priority | 1. Low |
Owners | mrubinsk (at) horde (dot) org |
Requester | hannes.brunssen (at) ewetel (dot) de |
Created | 05/21/2016 (3368 days ago) |
Due | |
Updated | 06/01/2016 (3357 days ago) |
Assigned | 05/26/2016 (3363 days ago) |
Resolved | 05/27/2016 (3362 days ago) |
Milestone | |
Patch | No |
New Attachment: database-load.png
than i thought. Our database load was between 6 and 8 before, now it's
below 1.
object_uid was a bad idea, the index was fast for history_action add,
but slower for modify and delete,
We will test your new version on monday.
current syncStamp, we now update the state with the new value (only
implemented in the Sql state driver, not the Mongo driver).
It should be noted that this is only an issue when the client and/or
user is multiplexing all collections and not using separate
collections for each calendar/addressbook etc... In that case, the
modseq is tracked for the individual collection, so there would never
be that large of a difference in syncstamps unless the client was
turned off for years....
commit f89c2cf8fe6d6ae4f27ae244484167f72a303793
Author: Michael J Rubinsky <mrubinsk@horde.org>
Date: Thu May 26 21:39:52 2016 -0400
Ticket: 14368Update the syncstamp in the collection state.If there are no changes, but a large difference between _lastSyncStamp
and _thisSyncStamp, update the syncstamp while leaving the rest of
the state untouched.
Helps with backends that don't track syncStamps per collection
and have extremely large history datasets.
.../lib/Horde/ActiveSync/Collections.php | 6 +++
.../lib/Horde/ActiveSync/Request/Sync.php | 10 +++++
.../ActiveSync/lib/Horde/ActiveSync/State/Base.php | 9 +++++
.../ActiveSync/lib/Horde/ActiveSync/State/Sql.php | 45
++++++++++++++++++++++
4 files changed, 70 insertions(+)
http://github.com/horde/horde/commit/f89c2cf8fe6d6ae4f27ae244484167f72a303793
Assigned to Michael Rubinsky
State ⇒ Assigned
Since mysql uses the index that returns the least number of rows,
this should improve performance when the modseq range is extremely
high.
query on an idle mysql server:
700k 120k 30k
1 history_modseq 1.61 0.34 0.08
2 history_action 2.25 2.16 2.13
3 object_uid 0.62 0.65 0.64
4 history_action, object_uid 0.60 0.43 0.42
5 history_action, history_modseq, object_uid 0.50 0.13 0.02
6 history_modseq, object_uid 0.39 0.13 0.02
This was order of the preferred indexes on a mysql 5.6 server:
700k: 4, 2
120k: 5, 4, 2
30k: 5, 1
The history_action index seems to cause more problems than it helps,
but it might be useful for other queries. The fasted index was never
selected, so we will probably patch our local Horde installation to
force the last index for these queries.
It would still be nice to prevent these long modseq ranges. Would it
be possible to use a specific UPDATE query that only changes the
modseq if every other field is unchanged and the sync_timestamp is a
week in the past or could that still lead to data loss?
MySQL to use the object_uid index it's still over 1 second. The
moqseq 956528 is from April 2015. I hoped there is a way to prevent
checking the same history entries every 15 seconds for a year, at
least in these cases when a user doesn't use a collection.
Since mysql uses the index that returns the least number of rows, this
should improve performance when the modseq range is extremely high.
SELECT DISTINCT object_uid, history_id FROM horde_histories WHERE
history_modseq > 956528 AND history_modseq <= 2191866 AND
history_action = 'add' AND object_uid LIKE
'kronolith:_J9sL_cb_Xuk0gpnaFfIGA9:%';
Our MySQL server uses the history_action index, but even if I force
MySQL to use the object_uid index it's still over 1 second. The moqseq
956528 is from April 2015. I hoped there is a way to prevent checking
the same history entries every 15 seconds for a year, at least in
these cases when a user doesn't use a collection.
State ⇒ Feedback
Priority ⇒ 1. Low
This confuses me. We filter the history query on the calendar(s) being
checked, so the current value of modSeq should be inconsequential
since the query will only return entries with higher modSeq values AND
matching the calendar. Could you clarify where you see loq modSeq
values increasing "the number of history entries that has to be
checked"?
things, including preventing mirroring client-sourced changes back to
the client and the handling of pending changes. It would possible
(though admittedly a very small chance) that this could overwrite any
existing pending changes in the state. Basically this changes the sync
state without changing the synckey, which we cannot do.
but the connector is not available from the state. I don't know if
it would cause problems with mail folders, every 10000 seconds would
be 8 times per day which would be more than necessary.
Patch ⇒ No
State ⇒ New
Milestone ⇒
Queue ⇒ Synchronization
Summary ⇒ Improve EAS performance with large History tables
Type ⇒ Enhancement
Priority ⇒ 2. Medium
like calendar oder contacts, because there are no changes. Without
SYNC requests the activesync_state entries are not updated and the
sync_mod values stay relatively low. With the low sync_mod values the
number of history entries that has to be checked on every PING
requests grow larger and larger. It takes a few seconds to check for
updates in over a million history entries for a single calendar which
is a growing part of our total database load. I don't know if this is
a client bug or just normal behavior.
These lookups can be reduced if we update the sync state from time to
time. This could be done if we override the getChanges method in
Horde_ActiveSync_State_Sql:
public function getChanges(array $options = array())
{
parent::getChanges($options);
if ($this->_type == Horde_ActiveSync::REQUEST_TYPE_PING &&
empty($this->_changes) &&
$this->_lastSyncStamp < $this->_thisSyncStamp - 10000) {
$this->_lastSyncStamp = $this->_thisSyncStamp;
$this->_type = Horde_ActiveSync::REQUEST_TYPE_SYNC;
$this->save();
$this->_type = Horde_ActiveSync::REQUEST_TYPE_PING;
}
return $this->_changes;
}
The type change is a bit ugly, but the save method would clear the
sync_data otherwise. That could be changed in the save method.
The update might only be needed on collections that support modseq,
but the connector is not available from the state. I don't know if it
would cause problems with mail folders, every 10000 seconds would be 8
times per day which would be more than necessary.