Summary | Use memcache instead of distributed hashtable in $conf[cache][driver] |
Queue | Horde Framework Packages |
Queue Version | Git master |
Type | Bug |
State | Resolved |
Priority | 1. Low |
Owners | slusarz (at) horde (dot) org |
Requester | arjen+horde (at) de-korte (dot) org |
Created | 03/28/2014 (4106 days ago) |
Due | |
Updated | 04/01/2014 (4102 days ago) |
Assigned | 03/31/2014 (4103 days ago) |
Resolved | 04/01/2014 (4102 days ago) |
Github Issue Link | |
Github Pull Request | |
Milestone | |
Patch | No |
grab it later in two different locations ... and one of those
locations may have a different (or no) lifetime argument. So it is
not that simple.
mentioned for Horde_Service_Weather. I fail to understand your
reasoning, since the only place where this value is used, is in these
drivers. There is no way to change the $lifetime of the values that
are stored in the cache from outside the modules, so the chances that
someone will want to use the data despite it being out-of-date is
absolutely zero.
For the record, the Horde_Cache memcache driver seems to ignore the
$lifetime once the results have been placed in the _expirecache. As
far as I can tell, the $lifetime that is passed in get() will be
ignored completely except for the first time a value is retrieved from
the cache.
commit 426cfe7b75a434880dbae5b5ca7e86c60c0ad177
Author: Michael M Slusarz <slusarz@horde.org>
Date: Tue Apr 1 00:15:56 2014 -0600
[mms] Hashtable driver now respects lifetime parameter in the
get() and exists() methods (
Bug #13085)..../Cache/lib/Horde/Cache/Storage/Hashtable.php | 36
+++++++++++++++-----
framework/Cache/package.xml | 4 +-
2 files changed, 29 insertions(+), 11 deletions(-)
http://github.com/horde/horde/commit/426cfe7b75a434880dbae5b5ca7e86c60c0ad177
around for the configured default value for the cache lifetime,
while we already know it will not be used anymore after $lifetime
seconds in the future.
memcache since old entries will automatically be pruned. Dealing with
GC is just adding unnecessary overhead.
And for many purposes, it is advantageous to use the cache if it
exists, regardless of lifetime. You might want to limit the size of
your file-based cache by providing a max lifetime ... not because the
data has an expiration date but simply because you want to limit the
number of entries. But the brilliance of memcache is that it makes
this part of the calculus obsolete: the data is available until you
run out of space.
seems a waste not to use that knowledge.
grab it later in two different locations ... and one of those
locations may have a different (or no) lifetime argument. So it is
not that simple.
All this being said... I believe the issue here is simply that the
Horde_Hashtable driver doesn't currently honor the $lifetime parameter
in get():
/**
* NOTE: This driver ignores the lifetime argument.
*/
public function get($key, $lifetime = 0)
{
return $this->_hash->get($this->_params['prefix'] . $key);
}
Looks like that needs to be written (exists() also).
Client code uses Horde_Cache, which delegates the storage to the
configured storage driver. Horde_Cache::set() defaults to $lifetime
== null, which if you read the code you can see will lead to the
storage driver being sent the configured default value for $lifetime.
But in that case it is still a waste of cache, to keep entries around
for the configured default value for the cache lifetime, while we
already know it will not be used anymore after $lifetime seconds in
the future. We already know that at the time of setting the cache
value, so it seems a waste not to use that knowledge.
code uses Horde_Cache, which delegates the storage to the configured
storage driver. Horde_Cache::set() defaults to $lifetime == null,
which if you read the code you can see will lead to the storage driver
being sent the configured default value for $lifetime.
assumed that the below function would set $lifetime to '0' if the
parameter is not passed (similar to C++):
memcache, not other cache drivers. As already stated, memcache
doesn't use/need the lifetime parameter here so the default value of
zero is irrelevant.
/usr/share/php5/PEAR/Horde/Cache/Storage/Apc.php: public function
set($key, $data, $lifetime = 0)
/usr/share/php5/PEAR/Horde/Cache/Storage/Base.php: abstract public
function set($key, $data, $lifetime = 0);
/usr/share/php5/PEAR/Horde/Cache/Storage/Eaccelerator.php: public
function set($key, $data, $lifetime = 0)
/usr/share/php5/PEAR/Horde/Cache/Storage/File.php: public function
set($key, $data, $lifetime = 0)
/usr/share/php5/PEAR/Horde/Cache/Storage/Hashtable.php: public
function set($key, $data, $lifetime = 0)
/usr/share/php5/PEAR/Horde/Cache/Storage/Memcache.php: public
function set($key, $data, $lifetime = 0)
/usr/share/php5/PEAR/Horde/Cache/Storage/Memoryoverlay.php: public
function set($key, $data, $lifetime = 0)
/usr/share/php5/PEAR/Horde/Cache/Storage/Mock.php: public function
set($key, $data, $lifetime = 0)
/usr/share/php5/PEAR/Horde/Cache/Storage/Mongo.php: public function
set($key, $data, $lifetime = 0)
/usr/share/php5/PEAR/Horde/Cache/Storage/Null.php: public function
set($key, $data, $lifetime = 0)
/usr/share/php5/PEAR/Horde/Cache/Storage/Session.php: public
function set($key, $data, $lifetime = 0)
/usr/share/php5/PEAR/Horde/Cache/Storage/Sql.php: public function
set($key, $data, $lifetime = 0)
/usr/share/php5/PEAR/Horde/Cache/Storage/Stack.php: public function
set($key, $data, $lifetime = 0)
/usr/share/php5/PEAR/Horde/Cache/Storage/Xcache.php: public
function set($key, $data, $lifetime = 0)
So if the $lifetime argument is not set (like in
Horde_Service_Weather) *every* cache driver will default to a value of
'0' and the data that is put into the cache by Horde_Service_Weather
will not be GC'ed for *any* cache driver. This is the point I'm trying
to make.
assumed that the below function would set $lifetime to '0' if the
parameter is not passed (similar to C++):
memcache, not other cache drivers. As already stated, memcache doesn't
use/need the lifetime parameter here so the default value of zero is
irrelevant.
mean that it is never marked for GC. If it is omitted (therefore
passing a null), it is marked for GC based on the configured default
GC lifetime.
assumed that the below function would set $lifetime to '0' if the
parameter is not passed (similar to C++):
public function set($key, $data, $lifetime = 0)
{
$key = $this->_params['prefix'] . $key;
if ($this->_memcache->set($key . '_e', time(), $lifetime) !== false) {
$this->_memcache->set($key, $data, $lifetime);
}
}
Apparently I am wrong here, but in that case I wonder why it would be
needed to set a default at all.
State ⇒ Feedback
the cache if the code enforced lifetime (via ::get()) is reached. GC
is ONLY for the backend and even then there is no guarantee that a
given bit of data WILL be removed once the GC is reached - it only
flags it as being available for GC the next time the backend triggers
it's GC routine.
order for prevent the cache from filling up. For instance, with
WeatherUnderground (if users don't configure a location in their
preferences) the GeoIP lookups of mobile users may result in loads
of entries.
that it is never marked for GC. If it is omitted (therefore passing a
null), it is marked for GC based on the configured default GC lifetime.
will default to a lifetime of '0', which means data will never be GC.
stored in the cache by Horde_Service_Weather doesn't set a lifetime,
which (if I understand correctly) means that it will never expire and
stay in the cache forever. So while memcache may not need this, other
cache drivers may in order for prevent the cache from filling up. For
instance, with WeatherUnderground (if users don't configure a location
in their preferences) the GeoIP lookups of mobile users may result in
loads of entries.
get(), which is used to explicitly enforce a lifetime value (even if
it exists in the backend). So if the weather code needs to expire
its data after a certain period of time, it needs to use get(), not
set().
the time after which the data becomes *available* for GC, not
necessarily the time at which the data is actually invalid. GC !=
data invalidity.
But note that this value is different than the $lifetime value of
get(), which is used to explicitly enforce a lifetime value (even if
it exists in the backend). So if the weather code needs to expire its
data after a certain period of time, it needs to use get(), not set().
In other words:
- set() is used to control cache storage space
- get() is used to control whether the data itself is valid
drivers will default to a lifetime of '0', which means data will
never be GC.
the time after which the data becomes *available* for GC, not
necessarily the time at which the data is actually invalid. GC !=
data invalidity.
will default to a lifetime of '0', which means data will never be GC.
So if that is not set by Horde_Service_Weather, this is what we'll
get. That is not what we want either, because this will mean that the
cache will slowly fill up with entries that are never going to be
removed.
Queue ⇒ Horde Framework Packages
Type ⇒ Bug
State ⇒ Assigned
Priority ⇒ 1. Low
State ⇒ Feedback
time after which the data becomes *available* for GC, not necessarily
the time at which the data is actually invalid. GC != data invalidity.
None of the other cache drivers require this to be set in order for
get()'s lifetime parameter to work, and as far as I know it was like
this since forever. We use it this way in far more than just
Horde_Service_Weather. If changes to hashtable *require* this
parameter to be set in this way, it's a bc break and must be worked
around in hashtable code.
yet applied:
https://github.com/horde/horde/commit/1274e9f50c591bee47bc268df0177fb8b3b8dd8f
So the attached patches should fix Service_Weather. Sorry for the confusion.
configuring memcache as a cache driver. If you want to use
Horde_Service_Weather, don't use the Hashtable as cache. At least, for
now.
New Attachment: Wwo.patch
New Attachment: WeatherUnderground.patch
Service_Weather instead. The Horde_Service_Weather_WeatherUnderground
class should set the timeout value when storing the results in the
cache, rather than checking the value on retrieving only. Attached
patch will fix this.
Just in case: if you came here through a search engine because your
weather information is not refreshed, there will be stored values in
your cache without expiration time set, so you'll need to clear your
cache before this patch will take effect.
commit that changed this:
https://github.com/horde/horde/commit/00750d2d0241231755f412aadb0877787988a0bc
Possibly a better way to fix this, is to let the Hashtable driver take
into account the timeout value.
Priority ⇒ 1. Low
State ⇒ New
Patch ⇒ No
Milestone ⇒
Summary ⇒ Use memcache instead of distributed hashtable in $conf[cache][driver]
Type ⇒ Enhancement
Queue ⇒ Components
and as such is not a very good choice for the $conf[cache][driver]:
/**
* NOTE: This driver ignores the lifetime argument.
*/
public function get($key, $lifetime = 0)
{
return $this->_hash->get($this->_params['prefix'] . $key);
}
This breaks applications that require the cache driver to tell the
information is expired. For instance,
Horde/Service/Weather/WeatherUnderground.php will only refresh the
information if it is expired. This never happens with the distributed
hashtable, because the lifetime is ignored and once the data is
retrieved, the same (stale) data will be served indefinitely.
Before the distributed hashtable was introduced, it was possible to
directly configure memcache as a cache driver (which does support the
timeout variable). It would be great if this functionality was
restored here, since some applications seem to depend on the timeout
to be honored.