Summary | Hashtable session handler doesn't unlock on session close
Queue | Horde Framework Packages |
Queue Version | Git master |
Type | Enhancement |
State | Assigned |
Priority | 1. Low |
Owners | |
Requester | arjen+horde (at) de-korte (dot) org |
Created | 05/28/2014 (4004 days ago) |
Due | |
Updated | 05/29/2014 (4003 days ago) |
Assigned | 05/29/2014 (4003 days ago) |
Resolved | |
Milestone | |
Patch | No |
commit fe943741f61c4d80764c87fefb4ec07a11252451
Author: Michael M Slusarz <slusarz@horde.org>
Date: Thu May 29 14:06:22 2014 -0600
[mms] In the Predis driver, ramp up to the maximum unlock wait
interval (Request #13231).
framework/HashTable/lib/Horde/HashTable/Predis.php | 4 ++--
framework/HashTable/package.xml | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
http://github.com/horde/horde/commit/fe943741f61c4d80764c87fefb4ec07a11252451
commit 5cd6d790db2d55491025306583d01369a4143a03
Author: Michael M Slusarz <slusarz@horde.org>
Date: Thu May 29 14:02:26 2014 -0600
Ticket #13231: session backend handling must be handled through
Horde_Session
ansel/lib/Ansel.php | 3 ++-
kronolith/lib/Ajax/Application/Handler.php | 6 ++++--
2 files changed, 6 insertions(+), 3 deletions(-)
http://github.com/horde/horde/commit/5cd6d790db2d55491025306583d01369a4143a03
But obviously not if it is set and enforced. You cut the time down
from 10 minutes -> 30 seconds in that example.
already DO close the session while fetching the calendars...or maybe
I misunderstand what you mean.
Horde_Session (I will fix this), so $session is invalid after the
session is reopened.
Second... I question whether we even need to have a writable session
at all for this call. Seems like we could add listEvents to the
$_readOnly list. The only places we are setting values in $session in
Kronolith seem to be:
attendees.php
lib/Ajax/Application/Handler.php (in saveCalendar, so irrelevant for
listEvents)
lib/View/EditEvent.php
lib/CalendarsManager.php
new.php
Other than Handler.php, the rest is UI-related code and doesn't seem
pertinent to listEvents activity.
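For illustration, a minimal sketch of what that could look like, assuming
the AJAX application class exposes the $_readOnly property referenced
above (class and property details are assumptions, not verified against
the actual Kronolith code):

<?php
// Hypothetical sketch: declare listEvents as a read-only AJAX action so
// it can run without a writable (locked) session.
class Kronolith_Ajax_Application extends Horde_Core_Ajax_Application
{
    /* Actions that never write to $session; the session would not need
     * to be locked while these run. */
    protected $_readOnly = array('listEvents');
}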
Priority ⇒ 1. Low
State ⇒ Assigned
Type ⇒ Enhancement
fairly heavy usage. Your solution bumps that up to 20
requests/second (you have to assume worst-case scenario).
assuming mt_rand is not biased towards certain values. So with an
increasing number of requests, the average will quickly stabilise
towards 10 requests/second.
other. Sure enough, running a sample of 20 mt_rand() calls produced an
average time of 82806ns. 18% from the expected "median" is not
"stabilizing" in that sample size. And if we are running
thousands/millions of retries, the problem is not in the random
distribution but rather in the session access in the first place.
And the original problem remains: you are potentially delaying the
**normal** scenario - a single request is queued - in an attempt to
make the outlier work better, since you potentially make this common
case up to 50% slower than it currently is. We need to optimize for the
normal scenario. I'm going to commit a slightly better algorithm for
this purpose, by ramping up to the 0.1 second delay.
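For illustration, a minimal sketch of the ramp-up idea from the commit
above (the client, key name, and starting interval are assumptions, not
the actual Predis driver code):

<?php
// Poll quickly so the common single-waiter case proceeds almost
// immediately; back off toward the 0.1 second maximum for outliers.
$predis  = new Predis\Client();     // assumes predis/predis is autoloaded
$lockKey = 'lock_sess_example';     // hypothetical lock key
$delay   = 10000;                   // assumed starting interval: 10 ms
while (!$predis->setnx($lockKey, 1)) {  // SETNX fails while lock is held
    usleep($delay);
    $delay = min($delay * 2, 100000);   // ramp toward the 100 ms ceiling
}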
locking handler instead of an event-based handler. WHICH IS EXPECTED
(and one of the reasons why we don't recommend the memcache
sessionhandler). That doesn't mean the assumption doesn't hold ...
especially given that the other sessionhandlers do support this
generalization.
Randomizing the queue order is something you do to prevent deadlocks.
It doesn't do anything to make a single access any faster. In some
instances it might be faster than the current strategy, but in others
it might be worse.
The limiting factor remains the median polling time.
I'm with Mike in that there definitely is no bug here - this is an
enhancement request at best. It's possible that the 0.1 sec poll time
is not ideal - but I'm not going to change that without evidence to
prove a better value, especially since the sessionhandler works and
the performance issues are known and have been known since the
beginning.
fairly heavy usage. Your solution bumps that up to 20
requests/second (you have to assume worst-case scenario).
assuming mt_rand is not biased towards certain values. So with an
increasing number of requests, the average will quickly stabilise
towards 10 requests/second.
requests SHOULD be handled on a FIFO basis. If Kronolith for some
reason sends 15 listEvents requests in a row, it is a reasonable
assumption that the 15th is the least important request. It should
not, by virtue of random luck, suddenly jump to 2nd in the queue.
handled by separate processes and it is required to get a lock on a
shared resource. If that is needed, a new request can only be
initiated when the previous one is known to be active (not waiting to
get a lock on a resource). Since this is not done, requests can (and
will) be served in random order. As the cachegrind output I sent you
also indicates: the third request that was started is served last.
calendars into a single request, and all "other" calendars into a
second request.
other ticket.
together? We still need to wait for the responses from all the
calendars before we send the response. The user would, in this case,
not see ANY calendars in their interface for 10 minutes in your above
example, no?
Seems to me a better solution, from a performance perspective, is to
manually close the session while the remote calendars are being
fetched.
already DO close the session while fetching the calendars...or maybe I
misunderstand what you mean.
into a single request, and all "other" calendars into a second request.
what needs enhancement, not the session storage.
way the javascript code was written." isn't correct. It was written
that way because we don't want the user waiting for EVERY SINGLE
calendar to be loaded into memory before we send anything back to
the client and update the UI.
from a UI perspective. Aside: it would be really nice for some sort
of PHP solution to allow for bi-directional communication regarding
this kind of polling. See http://wiki.horde.org/Project/H6TODO and/or
ReactPHP. Obviously this isn't useful now, but it's something to look
at in the future.
e.g., any slowly responding remote calendars.
possibly/easily create a pretty bad DoS situation for the current user.
Say a user has 20 remote calendars defined. (It sounds like Arjen has
~15, so this is not an excessive number). Worst case scenario: those
20 remote calendars are reachable ... meaning they won't reject
network connection attempts ... but the actual services on those
servers don't return anything. The connections just sit in wait states.
If you send 20 requests, and each request has to time out, this is 20
requests * 30 seconds (the default) = 10 minutes. Some of those PHP
requests may time out... but from my experience you can't count on this
- and this can also be disabled locally in php.ini. So you've
essentially locked the user out of accessing Horde for 10 minutes.
user because one of the (not even recommended) session handlers
doesn't work satisfactorily.
Seems to me a better solution, from a performance perspective, is to
manually close the session while the remote calendars are being
fetched. (Even better, if the session data is unchanged in the
listEvents call, you don't even need to reopen the session after it is
completed.) This is what we do in IMP when:
* listing Mailboxes
* building mailbox list
* building message display data
to prevent this kind of DoS issue.
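For illustration, the same pattern in plain PHP terms (Horde wraps this
in Horde_Session; fetchRemoteCalendars() is a hypothetical stand-in for
the slow remote fetch):

<?php
function fetchRemoteCalendars() { return array(); }  // stub for the slow fetch

session_start();
// Release the session lock before the slow part, so parallel requests
// are not serialized behind this one.
session_write_close();

$events = fetchRemoteCalendars();

// Reopen only if the call actually needs to write session data.
session_start();
$_SESSION['kronolith_events_cache'] = $events;  // hypothetical key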
it is a reasonable assumption that the 15th is the least important
request. It should not, by virtue of random luck, suddenly jump to
2nd in the queue.
that are supposed to take the least amount of time to complete, i.e.,
internal calendars. The later requests are those that may take longer,
such as remote calendars.
what needs enhancement, not the session storage.
way the javascript code was written." isn't correct. It was written
that way because we don't want the user waiting for EVERY SINGLE
calendar to be loaded into memory before we send anything back to the
client and update the UI. This would create an unnecessary wait time
for users that have, e.g., any slowly responding remote calendars.
IMO, it doesn't make any sense to force this extra wait time on the
user because one of the (not even recommended) session handlers
doesn't work satisfactorily.
fairly heavy usage. Your solution bumps that up to 20 requests/second
(you have to assume worst-case scenario).
And there is a general (albeit not mandatory) requirement that
requests SHOULD be handled on a FIFO basis. If Kronolith for some
reason sends 15 listEvents requests in a row, it is a reasonable
assumption that the 15th is the least important request. It should
not, by virtue of random luck, suddenly jump to 2nd in the queue.
From a practical perspective, this is the ONLY place in Horde where
we are having this issue. Nowhere else can I think of a place where
we have more than 1 or maybe 2 session requests queued. So we should
not be vastly altering the queue system based on a single use-case ...
especially since I am convinced that it is the use-case here that
needs enhancement, not the session storage.
New Attachment: Horde_Memcache.patch
(per request), that is already quite a bit of network/protocol
overhead.
Maybe it would be better to implement a logarithmically increasing
delay mechanism. Or for true performance, a hybrid flock/memcache
system.
lock on the session data is released. I noticed that frequently it
takes a while before the first six listEvents that are started manage
to start a session. This is due to the fact that the poll delay for
all of them is identical, so if they are started at the same time, the
subsequent polls to retry to get a lock on the session data will also
coincide. In that case, it may take up to six times 100 ms before all
processes get going. Adding some randomness to the polling would help
to minimize that, by retrying at different intervals (see the attached
patch).
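To illustrate the idea in the patch (the actual Horde_Memcache.patch may
differ; the Memcached connection, key, and TTL here are assumptions):
each waiter sleeps a random interval instead of a fixed 100 ms, so
retries that started together spread out instead of re-colliding on
every poll.

<?php
$m = new Memcached();
$m->addServer('localhost', 11211);
// add() fails while another process holds the lock key.
while (!$m->add('lock_sess_example', 1, 30)) {
    usleep(mt_rand(0, 100000));  // random 0-100 ms instead of a fixed 100 ms
}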
didn't seem to make sense to issue X # of listEvents requests as
opposed to packaging all of them together in a single call. (Can't
find a ticket, and searching my e-mails didn't uncover anything
either).
Bug: 12729
I remember somewhere bringing up exactly this issue ... that it didn't
seem to make sense to issue X # of listEvents requests as opposed to
packaging all of them together in a single call. (Can't find a ticket,
and searching my e-mails didn't uncover anything either.)
I remember being told at the time that this was necessary due to the
way the javascript code was written. Which didn't seem to be a
satisfactory resolution, IMHO, since that responded only to why it
wasn't currently being done as opposed to whether it was a good idea
or not.
Still feel that this is a valid concern.
commit e284e4ec3d7d2a2aff1b4936d9f58fbe3c3dc63a
Author: Michael M Slusarz <slusarz@horde.org>
Date: Wed May 28 18:24:57 2014 -0600
Again make clear that nobody should be using anything other than
file-based sessionhandler backends unless there is a demonstrable
reason for it
Bug #13231
horde/config/conf.xml | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)
http://github.com/horde/horde/commit/e284e4ec3d7d2a2aff1b4936d9f58fbe3c3dc63a
polling choice - 0.1 seconds.
You can't compare memcache with other sessionhandlers re: locking,
since other sessionhandlers (e.g. file, SQL) have built-in lock
mechanisms (flock, SQL row locking) that are going to be
instantaneous. For memcache, our only option is a poll-based
solution.
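For illustration, a sketch of such a poll-based lock (an illustration
under assumptions, not the actual Horde_Memcache code): add() fails
atomically when the key already exists, so each waiter polls until its
add() wins.

<?php
function acquireSessionLock(Memcached $m, $id, $ttl = 30)
{
    // add() only succeeds for one caller at a time; everyone else polls.
    while (!$m->add('lock_sess_' . $id, 1, $ttl)) {
        usleep(100000);  // the 0.1 second poll interval discussed here
    }
}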
Lowering the poll delay is not a viable option. At 10 polls/second
(per request), that is already quite a bit of network/protocol overhead.
Maybe it would be better to implement a logarithmically increasing
delay mechanism. Or for true performance, a hybrid flock/memcache
system.
Moral of the story: as I've tried to tell some of my clients, memcache
is NOT the ideal session storage backend. Memcache might be useful if
you are running a pool of PHP servers, but on a single-server setup
there remains zero reason to use anything other than file-based
sessions.
ago, which shows session_start() is spending a lot of time in
php::usleep (waiting for Horde_Memcache->lock). Would it be useful
to add them to this ticket (or send them to someone else)?
unlock at the end of each access.
working during the session.
*before* or *after* processing the calendar resource?
https://github.com/horde/horde/blob/master/kronolith/lib/Ajax/Application/Handler.php#L113
with Kronolith. Closing a session obviously just doesn't work in the
Memcache implementation, for whatever reason.
not working during the session.
*before* or *after* processing the calendar resource?
from PHP, and that should delete() the key from memcache again in
Horde_Memcache::unlock().
The problem I see (but this may be by design, I don't know) is that in
the implementation of the current Hashtable sessionhandler, the lock
seems to be removed only when the listEvents call has completed. Which
means that effectively, only one listEvents call is handled at a time,
while the other listEvents processes are waiting for the lock to be
released. I expected behavior similar to the Sql or builtin
sessionhandlers, where all listEvents processes run concurrently and
don't seem to have to wait for locks to be released. Maybe these don't
lock the session data, and that is the reason why they are a lot
faster at updating the calendar view.
less at the same time, at most one will succeed in doing so, while
the others sleep for 100 ms. If after that the session data is still
locked (or locked again), this repeats until all have completed.
working during the session.
Horde_SessionHandler_Storage_Hashtable::close() should be called from
PHP, and that should delete() the key from memcache again in
Horde_Memcache::unlock().
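For illustration, a hypothetical sketch of that close path (not the
actual Horde_SessionHandler_Storage_Hashtable code): PHP's session
machinery calls close() at session shutdown, and releasing the lock key
there is what lets queued requests proceed.

<?php
class HashtableSessionHandlerSketch
{
    private $memcached;   // assumed Memcached connection
    private $sessionId;

    public function close()
    {
        // Drop the add()ed lock key, mirroring Horde_Memcache::unlock().
        $this->memcached->delete('lock_sess_' . $this->sessionId);
        return true;
    }
}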
at the same time, at most one will succeed in doing so, while the
others sleep for 100 ms. If after that the session data is still
locked (or locked again), this repeats until all have completed.
probably nothing that Horde can handle.
object doesn't add() this lock key, that's probably nothing that Horde
can handle.
ago, which shows session_start() is spending a lot of time in
php::usleep (waiting for Horde_Memcache->lock). Would it be useful to
add them to this ticket (or send them to someone else)?
Queue ⇒ Horde Framework Packages
Type ⇒ Bug
State ⇒ Feedback
Priority ⇒ 1. Low
events are loaded. As you already noticed by using a different session
handler, this works just fine. If it doesn't work with the Hashtable
backend, the bug is there.
Priority ⇒ 1. Low
State ⇒ New
Patch ⇒ No
Milestone ⇒
Summary ⇒ listEvents will run sequentially when listing calendar events
Type ⇒ Enhancement
Queue ⇒ Kronolith
number of simultaneous connections to the same server (in this
version, six). When listing the events for my calendar, it will fire
off 15 processes, each handling a single calendar resource. It will
run 6 processes in parallel; so far, so good.
Now the hairy part. With the builtin-, File- and Sql-sessionhandlers
these 6 processes will run concurrently and all 15 are handled in
about half a second. Very acceptable performance and switching between
month views is pretty much instantaneous.
With the Hashtable sessionhandler things are quite different. Each
process will attempt to get a lock on the (same) session data that is
stored during session_start(). Obviously, only one of the 6 listEvents
calls succeeds in getting the lock (so the locking mechanism works);
the others have to wait until the session is closed again. So
effectively, the listEvents calls are handled sequentially with the
Hashtable sessionhandler. This means that switching between month
views takes well over two seconds. Not impressive at all.
As far as I can see, the listEvents call from the Calendar will not
have to change the session data. In this case, it might be useful to
open the session data read-only, in which case no locking is used (nor
needed), so multiple processes can use the same data concurrently.
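For illustration, here is what a read-only open could look like in
plain PHP; the session key is hypothetical, and newer PHP versions
(7.0+) can do this in one step with
session_start(array('read_and_close' => true)):

<?php
// Open the session, copy what listEvents needs, then release the lock
// immediately so concurrent requests are not serialized.
session_start();
$calendars = isset($_SESSION['kronolith']) ? $_SESSION['kronolith'] : array();
session_write_close();  // lock released; $calendars is still usable

// ... list events using $calendars, without holding the session lock ...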