Summary | A large horde_vfs table can cause large memory usage during GC |
Queue | Horde Framework Packages |
Queue Version | HEAD |
Type | Bug |
State | Resolved |
Priority | 2. Medium |
Owners | |
Requester | sean (at) duke (dot) edu |
Created | 12/18/2006 (6845 days ago) |
Due | |
Updated | 12/21/2006 (6842 days ago) |
Assigned | 12/20/2006 (6843 days ago) |
Resolved | 12/21/2006 (6842 days ago) |
Github Issue Link | |
Github Pull Request | |
Milestone | |
Patch | No |
State ⇒ Resolved
1) Make the first line of the function be '$conn = $this->_connect();'
2) Change '$this->_write_db' to be '$this->db'
the latest FW_3 code. VFS_sql was changed a while ago to not load the
body of every file in the _listFolder() method (it uses the db server
to get the file sizes instead), which is probably what's driving up
your memory usage anyway. I'll keep the changes, but you should really
update.
a speed/efficiency increase. I'd probably accept a patch for one though.
I was able to get it to work, with two small changes.
1) Make the first line of the function be '$conn = $this->_connect();'
2) Change '$this->_write_db' to be '$this->db'
A couple other comments, I notice a lot of the other functions use
$this->_db->quote(), but this one doesn't. Also, will there be a
similar function put in place for sql_file?
/**
 * Garbage collect files in the VFS storage system.
 *
 * @param string $path   The VFS path to clean.
 * @param integer $secs  The minimum amount of time (in seconds) required
 *                       before a file is removed.
 */
function gc($path, $secs = 345600)
{
    $sql = 'DELETE FROM ' . $this->_params['table']
        . ' WHERE vfs_type = ? AND vfs_modified < ? AND (vfs_path = ? OR vfs_path LIKE ?)';
    $this->log($sql, PEAR_LOG_DEBUG);
    $values = array(VFS_FILE,
                    time() - $secs,
                    $this->_convertPath($path),
                    $this->_convertPath($path) . '/%');
    return $this->_write_db->query($sql, $values);
}
... and use it with the latest version of GC.php
(http://cvs.horde.org/framework/VFS/VFS/GC.php - it'll call a gc()
method if it exists in the passed-in $vfs object).
around this bug):
THRESHOLD=`date -d "12 hours ago" +%s`
DEL_COMMAND="DELETE FROM horde_vfs WHERE vfs_path = '.horde/imp/compose' AND vfs_modified < ${THRESHOLD}"
After doing this, I'm thinking it might be better to replace the
'vfs_path = '.horde/imp/compose'' check with a check on the vfs_type.
The main concern I have here is not deleting the directories.
This works for the IMP application. It may need to be made more
robust before it can serve as a general-purpose statement.
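For illustration, the vfs_type variant mentioned above could be sketched roughly as follows. This is only a sketch under assumptions: the helper name build_gc_delete is hypothetical, the table name horde_vfs is taken from the statement above, and the value 1 for file rows (VFS_FILE) is an assumption that should be checked against the VFS constants in your install. It only prints the statement; running it through the mysql client is left out.

```shell
#!/bin/sh
# Hypothetical helper: print a DELETE statement that removes only file rows
# (never folder rows) older than a given age at or below a VFS path.
# Assumption: vfs_type = 1 marks files (VFS_FILE); verify before use.
# The path is interpolated as-is, so pass trusted literals only.
build_gc_delete() {
    path=$1                  # VFS path, e.g. .horde/imp/compose
    max_age_secs=$2          # minimum age in seconds before removal
    now=${3:-$(date +%s)}    # optional fixed "now", handy for testing
    threshold=$((now - max_age_secs))
    printf "DELETE FROM horde_vfs WHERE vfs_type = 1 AND vfs_modified < %s AND (vfs_path = '%s' OR vfs_path LIKE '%s/%%')\n" \
        "$threshold" "$path" "$path"
}

# Example: files under the IMP compose path older than 12 hours.
build_gc_delete .horde/imp/compose 43200
```

Filtering on vfs_type rather than on one fixed path keeps the directory rows in place, which was the main concern, while the LIKE clause also covers files in subfolders.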
and didn't grep - my bad.
Do you happen to have the DELETE statement already, to save my lazy
self a bit more time?
State ⇒ Feedback
Queue ⇒ Horde Framework Packages
State ⇒
Priority ⇒ 2. Medium
Type ⇒ Bug
Summary ⇒ A large horde_vfs table can cause large memory usage during GC
Queue ⇒ Horde Base
State ⇒ Unconfirmed
uploads. I'm using MySQL as the Horde VFS backend.
When my horde_vfs table grew to 4GB in size, an apache process would
grow to almost 4GB in size when GC was run. This caused quite a bit
of thrashing on boxes that only have 4GB of memory.
After looking at the code, when GC is run, with the SQL backend, it
ends up doing a select to load the whole table contents into memory.
This presumably gets put into a buffer in the mysql libs before being
loaded into PHP, and thus before the PHP memory limit can take effect.
The unfortunate part is that all SQL needs for doing this cleanup is a
single DELETE statement, then the mysql server will do all the work.
As such, I suggest allowing each Horde VFS backend to have its own GC code.