6.0.0-beta1
11/5/25

[#14258] File based cache garbage collection broken
Summary File based cache garbage collection broken
Queue Horde Framework Packages
Type Bug
State Resolved
Priority 2. Medium
Owners jan (at) horde (dot) org
Requester arjen+horde (at) de-korte (dot) org
Created 02/16/2016 (3550 days ago)
Due
Updated 02/16/2016 (3550 days ago)
Assigned 02/16/2016 (3550 days ago)
Resolved 02/16/2016 (3550 days ago)
Github Issue Link
Github Pull Request
Milestone Horde_Cache 2.5.3
Patch No

History
02/16/2016 08:00:40 PM arjen+horde (at) de-korte (dot) org Comment #5 Reply to this comment
We didn't lock the GC index in the old version either, so this 
couldn't be a reason for the regression. And we *do* lock the old 
index file while doing the migration, so the migration couldn't 
introduce this regression either.
The cache that broke things wasn't migrated. I started out with a new 
cache about a week ago, when reports came in that people saw huge 
increases in their caches. At that time I took the plunge, stopped 
PHP, cleared out the cache, changed sub from 1 to 3 and started PHP 
again. Initially all looked good, so what I saw today was not caused 
by doing a migration (as there was nothing to migrate).
Do you even see this after wiping the cache completely? This 
shouldn't be necessary of course, but with your current broken index 
files, you would have to do this anyway.
I started out fresh again today with a new cache. For the record, the 
line lenght in the freshly created cache in /var/cache/horde/ 
directory with sub=3 equals 74 characters.

The longest lines in the old cache that ran my Horde installation in 
the ground today was exactly 122304 bytes for the vast majority (> 
90%) of the horde_cache_gc files. This could be due to the fact that 
when things went south the load on the system was very high, due to 
the massive I/O that was taking place at the time and the chances of 
collisions were high.
02/16/2016 07:48:56 PM Jan Schneider Comment #4 Reply to this comment
We didn't lock the GC index in the old version either, so this 
couldn't be a reason for the regression. And we *do* lock the old 
index file while doing the migration, so the migration couldn't 
introduce this regression either.
Do you even see this after wiping the cache completely? This shouldn't 
be necessary of course, but with your current broken index files, you 
would have to do this anyway.
02/16/2016 07:37:45 PM arjen+horde (at) de-korte (dot) org Comment #3 Reply to this comment
I'm not sure if this will fully solve the problem. This patch seems to 
fix the problem of adding cache files to horde_cache_gc where they 
don't belong, but looking at the old horde_cache_gc files, it also 
looks there is an issue with locking the horde_cache_gc files somewhere.

As far as I can see, the format of these files is one cache file 
followed by a timestamp per line. Like the following entry from the 
/var/lib/php5/horde/cache/a/0/0/horde_cache_gc file (showing the 
problem that it contains a cache file that shouldn't be there that is 
fixed by this patch):

/var/lib/php5/horde/cache/0/0/0/cache_000bde1edaebadae73f8069438a0bccd 
  1455770298

But looking further down this same file, I also see lines like

/var/lib/p
5/horde/cache/1/c/6/cache_1c664138a4d63a5d7c5b1e76f81002ae      1455977208

(yes, two lines)

/var/lib/php5/horde//var/lib/php5/horde/cache/1/f/d/cache_1fd41d3ae5f1c30fd29cfc831456143345

(no timestamp and really wonky looking directory)

/var/lib/php/var/lib/php5/horde/ <20000+ characters removed> 
a/f/5/cache_af5590ec1fa2401f6cac6b4cf9a5a5d3  1456247333

It looks to me there might also be a problem with file locking. What 
happens if new entries are added to the horde_cache_gc file while in 
the midst of of GC? The above examples look to me as if two processes 
are writing to the same file simultaneously.
02/16/2016 06:57:20 PM Jan Schneider State ⇒ Resolved
Milestone ⇒ Horde_Cache 2.5.3
 
02/16/2016 06:57:01 PM Git Commit Comment #2 Reply to this comment
Changes have been made in Git (master):

commit 395d10cb8b05d3732f627eda9c7d7280bb2bf1bc
Author: Jan Schneider <jan@horde.org>
Date:   Tue Feb 16 19:56:13 2016 +0100

     Start a fresh index for each directory while migrating (Bug #14258).

  framework/Cache/lib/Horde/Cache/Storage/File.php |    2 +-
  framework/Cache/package.xml                      |    2 ++
  framework/Cache/test/Horde/Cache/FileTest.php    |   17 +++++++++++++++++
  3 files changed, 20 insertions(+), 1 deletions(-)

http://github.com/horde/horde/commit/395d10cb8b05d3732f627eda9c7d7280bb2bf1bc
02/16/2016 05:55:54 PM Jan Schneider Version ⇒
Queue ⇒ Horde Framework Packages
 
02/16/2016 05:55:05 PM Jan Schneider State ⇒ Assigned
Assigned to Jan Schneider
Priority ⇒ 2. Medium
 
02/16/2016 05:29:13 PM arjen+horde (at) de-korte (dot) org Comment #1
Priority ⇒ 3. High
Type ⇒ Bug
Summary ⇒ File based cache garbage collection broken
Queue ⇒ Horde Base
Milestone ⇒
Patch ⇒ No
New Attachment: horde_cache_gc Download
State ⇒ Unconfirmed
Reply to this comment
Since the changes in the garbage collection, the 
Horde_Cache_Storage_File is broken when using the sub parameter. It 
looks like the horde_cache_gc file that is created in each 
subdirectory contains *all* entries for the whole tree and not only 
for the leaf it is in.

I'm not sure if attaching the horde_cache_gc file works (since it is 
so large), but with sub = 3, the horde_cache_gc file in leaf 0/0/0 
will not only contain cache_000<whatever> files, but all files in the 
cache at the time of creation. This means that after some time the 
horde_cache_gc files will far exceed the contents of the cache.

Saved Queries