[#3359] Make use of streams for speed and memory efficiency
Summary Make use of streams for speed and memory efficiency
Queue Horde Framework Packages
Queue Version HEAD
Type Enhancement
State Resolved
Priority 2. Medium
Owners slusarz (at) horde (dot) org
Requester chuck (at) horde (dot) org
Created 2006-01-28 (5470 days ago)
Updated 2010-01-13 (4024 days ago)
Assigned 2008-11-09 (4454 days ago)
Resolved 2009-06-30 (4221 days ago)
Milestone Horde 4.0
Patch No

2009-07-01 07:13:14 Michael Slusarz Comment #8
> Nice! Do you happen to have any numbers about how this works out,
> either speed or memory_limit needs or ... ?
FWIW, I opened a 5 MB text message (approx. 3.5 MB once base64-decoded)
using IMP 4 and IMP 5.  This message exceeds the maximum size filter,
so the only option offered is to download it.  According to the memory
stats from the debug log, IMP 4 required approx. 26 MB for the download
and IMP 5 required approx. 16 MB.  However, I have no idea how much of
this is attributable to streams vs. previous improvements in the code
(IMAP, MIME, IMP, etc.).
2009-07-01 07:02:46 Michael Slusarz Comment #7
> Nice! Do you happen to have any numbers about how this works out,
> either speed or memory_limit needs or ... ?
Not really.  In general terms, using streams in the Imap library
(when grabbing body parts/text) eliminates the need to read (some of)
this data into memory.  As it stands, data under 2 MB is kept in
memory and anything larger is spooled to disk.  So for smaller data
there will probably be no gain (my guess is that performance might
even be slightly reduced due to the stream overhead), but for larger
files we cap memory usage at 2 MB.
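
The spill-to-disk behaviour described above can be sketched with PHP's php://temp wrapper. This is only an illustration, not the Imap library's actual code: the 2 MB limit mirrors the figure quoted above, and the variable names are hypothetical.

```php
<?php
// Assumption: a 2 MB in-memory limit, matching the figure mentioned above.
$limit = 2 * 1024 * 1024;

// php://temp keeps data in memory until maxmemory is exceeded,
// then transparently moves it to a temporary file on disk.
$stream = fopen('php://temp/maxmemory:' . $limit, 'r+');

// Write 3 MB in 8 KB chunks so the data crosses the 2 MB threshold.
$chunk = str_repeat('x', 8192);
for ($i = 0; $i < 384; $i++) {
    fwrite($stream, $chunk);
}

// The caller sees an ordinary seekable stream either way.
$written = ftell($stream);
fclose($stream);
```

The calling code does not need to know whether the data ended up in memory or on disk, which is what makes the cap cheap to apply.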

The bigger gains come in Horde_Mime_Part.  For large body parts, we
can pass the stream object we grab from Imap_Client directly to
Horde_Mime_Part.  If the data is already in 7bit, 8bit or binary
format, we can reuse this stream without any further action.  If not,
we have to copy this stream to a new stream while running it through
a decoding filter.  Again, we are using temp streams, so the maximum
memory usage during this activity should be approx. 4 MB (2 MB for
each of the streams), and we can easily free up the 2 MB for the
first stream after we are finished.
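
A minimal sketch of the copy-through-a-filter step, assuming base64 as the transfer encoding. The source stream here is a stand-in built in the example itself, not a real Imap_Client result, and none of this is taken from Horde_Mime_Part.

```php
<?php
// Stand-in for an encoded body part stream (hypothetical test data).
$source = fopen('php://temp', 'r+');
fwrite($source, base64_encode('Hello, streams!'));
rewind($source);

// Decode while reading from the source stream.
stream_filter_append($source, 'convert.base64-decode', STREAM_FILTER_READ);

// Copy the decoded bytes into a second temp stream.
$decoded = fopen('php://temp', 'r+');
stream_copy_to_stream($source, $decoded);

// The source stream is no longer needed, so release it right away.
fclose($source);

rewind($decoded);
$text = stream_get_contents($decoded);
fclose($decoded);
```

Closing the source as soon as the copy finishes corresponds to freeing the first stream's share of memory.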

Certain memory bottlenecks unfortunately still require strings and 
there is not much we can do about this (DB/VFS still require strings, 
so this doesn't help memory usage for large attachments; Mail doesn't 
allow us to pass a stream so we have to create the entire text of the 
message before we can send).

Right now, my focus is more on making sure I didn't break anything too badly.
2009-07-01 02:34:49 Chuck Hagenbuch Comment #6
Nice! Do you happen to have any numbers about how this works out, 
either speed or memory_limit needs or ... ?
2009-06-30 05:00:54 Michael Slusarz Comment #5
State ⇒ Resolved
Added to Horde_Mime_Part.  Implemented socket return for body-ish
return types in Horde_Imap_Client.  Added support for downloads and
retrieving body parts in imp.  Closing ticket: there can probably be
further optimizations, but this is the low-hanging fruit.
2009-03-26 05:43:28 Michael Slusarz Comment #4
Some notes on some testing:

Will want to use this format:


See http://us.php.net/manual/en/wrappers.php.php

Don't use file_get_contents()/stream_get_contents().  On my test file 
(11 MB of text data), file_get_contents() required 23 MB.  This is 
much more efficient:

$b = '';
while (!feof($a)) {
     $b .= fread($a, 8192);
}

This used only 300-400KB over the file size.

Same with writing - don't use file_put_contents() for large data.
fwrite in chunks (4096 is probably good).

And use a temp stream (php://temp).  It keeps the first X MB
(configurable) in memory and writes anything beyond that to disk.
Reading the 11 MB file via the method above, writing it to a temp
stream using fwrite(), then rewinding the temp stream pointer and
reading the entire file using fread() took only 14 MB total.
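
The read/write/rewind sequence from these notes might look roughly like the sketch below. The 1 MB throwaway file stands in for the 11 MB test file, and all names are hypothetical. No memory figures are claimed for it.

```php
<?php
// Create a small throwaway source file (stand-in for the real test file).
$path = tempnam(sys_get_temp_dir(), 'src');
file_put_contents($path, str_repeat('0123456789abcdef', 65536)); // 1 MB

$in = fopen($path, 'rb');
$temp = fopen('php://temp', 'r+');

// Copy to the temp stream in 8 KB chunks instead of loading the whole file.
while (!feof($in)) {
    $chunk = fread($in, 8192);
    if ($chunk === false) {
        break;
    }
    fwrite($temp, $chunk);
}
fclose($in);
unlink($path);

// Rewind and read the data back, again chunk by chunk.
rewind($temp);
$total = 0;
while (!feof($temp)) {
    $total += strlen(fread($temp, 8192));
}
fclose($temp);
```

Calling memory_get_peak_usage() before and after such a run is one way to reproduce the kind of comparison described above.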
2008-11-09 01:24:37 Chuck Hagenbuch Comment #3
Assigned to Michael Slusarz
State ⇒ Assigned
Un-stalling since Michael S. is working on this area now.
2007-02-05 01:07:45 Chuck Hagenbuch Version ⇒ HEAD
Queue ⇒ Horde Framework Packages
2007-02-05 01:04:45 Chuck Hagenbuch Comment #2
Summary ⇒ Make use of streams for speed and memory efficiency
State ⇒ Stalled
For instance, imap_savebody can take a stream, and if we need to do 
decoding, the most common cases can be handled by the base64 and 
quoted-printable stream filters:
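
The example originally posted here is not preserved. As an illustrative sketch only, PHP's built-in convert.quoted-printable-decode filter can be attached to a stream like this (the input data is made up, and convert.base64-decode is attached the same way):

```php
<?php
// Hypothetical quoted-printable input held in a temp stream.
$encoded = fopen('php://temp', 'r+');
fwrite($encoded, 'na=C3=AFve =3D simple');
rewind($encoded);

// Decode transparently as the stream is read.
stream_filter_append($encoded, 'convert.quoted-printable-decode', STREAM_FILTER_READ);

$plain = stream_get_contents($encoded);
fclose($encoded);
```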


Some compression can be done on streams as well:
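
The original example is missing here as well. One possible illustration, assuming the zlib extension is loaded, uses the zlib.deflate and zlib.inflate stream filters. The data and variable names are hypothetical.

```php
<?php
$plain = str_repeat("streams save memory\n", 1000);

// Compress while writing into a temp stream.
$stream = fopen('php://temp', 'r+');
$deflate = stream_filter_append($stream, 'zlib.deflate', STREAM_FILTER_WRITE);
fwrite($stream, $plain);

// Removing the filter flushes any data still buffered inside it.
stream_filter_remove($deflate);

// Decompress while reading the same stream back.
rewind($stream);
stream_filter_append($stream, 'zlib.inflate', STREAM_FILTER_READ);
$roundTrip = stream_get_contents($stream);
fclose($stream);
```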


(there's a reader stream for zip, not sure about write support)

All of this will take some significant re-jiggering of the MIME libs, 
but I think it's worth it to focus on streams and other newer PHP 
features that'll help us save memory and gain speed in the MIME rewrite.
I'm moving this to a Horde 4 milestone and to the framework queue 
because of that.
2006-01-28 22:10:38 Chuck Hagenbuch Comment #1
Type ⇒ Enhancement
State ⇒ Accepted
Priority ⇒ 2. Medium
Summary ⇒ Look in to using imap_savebody when it's available
Queue ⇒ IMP
