
[#648] MIME.php wrapHeaders corrupting filenames
Summary MIME.php wrapHeaders corrupting filenames
Queue Horde Framework Packages
Type Bug
State Resolved
Priority 2. Medium
Owners slusarz (at) horde (dot) org
Requester slusarz (at) horde (dot) org
Created 09/29/2004 (7601 days ago)
Due
Updated 10/26/2004 (7574 days ago)
Assigned 09/29/2004 (7601 days ago)
Resolved 10/26/2004 (7574 days ago)
Github Issue Link
Github Pull Request
Milestone
Patch No

History
10/26/2004 06:34:51 AM Michael Slusarz Comment #4
State ⇒ Resolved
Closing - everything seems to be working fine (for now at least).
10/18/2004 05:47:27 AM Michael Slusarz Comment #3
I just implemented something that I believe handles the issues raised 
in this report.  I will leave this bug report open for a while to make 
sure this has been fixed correctly.
09/29/2004 04:59:03 AM Michael Slusarz Comment #2
This was my reply:

I've just started to take a look at this, but a quick comment on this solution.

Although this is the correct way to break these lines according to RFC 2231, there is a *boatload* of mailers that don't support this. So implementing it this way is out of the question, at least for right now (IMP, for example, supports decoding RFC 2231 encoded strings, but we have to do it in an extremely hackish way as c-client/PHP doesn't even support this format).

Now that I have been thinking about this for a few minutes... isn't this the same problem and/or potential solution I discussed here:

http://marc.theaimsgroup.com/?l=horde-dev&m=108334367512331&w=2

If this solution doesn't work, most likely we will have to just have the line be longer than 78 characters since that is the only way I can see right now that would work with most/all mailers.
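
For readers unfamiliar with the RFC 2231 continuation format mentioned above, here is a minimal sketch of how a receiving mailer could reassemble `name*0`, `name*1`, ... sections into one value. The function name `joinContinuations` and the assumption that the parameters have already been parsed into an array are hypothetical; this is not IMP's actual decoder, and it ignores the charset/language extensions of RFC 2231.

```php
<?php
// Hypothetical helper (not Horde/IMP code): merge RFC 2231 parameter
// continuations such as filename*0, filename*1 into a single parameter.
function joinContinuations(array $params)
{
    $pieces = array();
    $result = array();
    foreach ($params as $key => $value) {
        if (preg_match('/^([^*]+)\*(\d+)$/', $key, $m)) {
            // Collect numbered sections under their base attribute name.
            $pieces[$m[1]][(int)$m[2]] = $value;
        } else {
            $result[$key] = $value;
        }
    }
    foreach ($pieces as $name => $sections) {
        ksort($sections);               // sections may arrive out of order
        $result[$name] = implode('', $sections);
    }
    return $result;
}

// Values taken from the "Pine with a filename > 78" example in this report.
$joined = joinContinuations(array(
    'filename*0' => 'Mid-Pgm Assessment Form000000000000000 this is a test and this is another test and th',
    'filename*1' => 'is is a third test and just one more for kicks.doc',
));
echo $joined['filename'], "\n";
```

A mailer that lacks this step sees two unknown parameters instead of a filename, which is the interoperability concern raised above.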
09/29/2004 04:58:02 AM Michael Slusarz Comment #1
State ⇒ Assigned
Priority ⇒ 2. Medium
Type ⇒ Bug
Summary ⇒ MIME.php wrapHeaders corrupting filenames
Queue ⇒ Horde Framework Packages
Assigned to Michael Slusarz
The following function in the MIME framework module is, under certain circumstances, taking long filenames which have spaces in them and replacing a space in the filename with a tab:

     function wrapHeaders($header, $text, $eol = "\r\n")
     {
         /* Remove any existing linebreaks. */
         $text = preg_replace("/\r?\n\s?/", ' ', $text);

         /* Wrap the line. */
         $line = wordwrap(rtrim($header) . ': ' . rtrim($text), 75, $eol . "\t");

         /* Make sure there are no empty lines. */
         $line = preg_replace("/" . $eol . "\t\s*" . $eol . "\t/", "/" . $eol . "\t/", $line);

         return substr($line, strlen($header) + 2);
     }
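
The behaviour can be reproduced outside Horde with a few lines. This is a minimal sketch that calls `wordwrap()` the same way the function above does, using the header value from the "Horde" example below; the variable names are only illustrative.

```php
<?php
// Illustrative reproduction, not Horde code.
$eol    = "\r\n";
$header = 'Content-Disposition';
$text   = 'attachment; filename="Mid-Pgm Assessment Form000000000000000.doc"';

// Same call as in wrapHeaders(): wordwrap() replaces the last space that
// fits within 75 columns with the break string, here CRLF + tab.
$line = wordwrap($header . ': ' . $text, 75, $eol . "\t");

// Unfolding as described in RFC 2822 removes only the CRLF, so the tab
// stays behind inside the quoted filename.
$unfolded = str_replace($eol, '', $line);

echo json_encode($line), "\n";
echo json_encode($unfolded), "\n";
```

Because the break lands inside the quoted-string, a mailer that unfolds strictly ends up with a tab where the space in the filename used to be.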



Example:

Horde:
Content-Type: application/msword; name="Mid-Pgm Assessment
         Form000000000000000.doc"
Content-Disposition: attachment; filename="Mid-Pgm Assessment
         Form000000000000000.doc"
Content-Transfer-Encoding: base64

Horde with filename > 78 and no spaces:
Content-Type: application/msword;
         name="Mid-Pgm_Assessment_Form0000000000000_this_is_a_test_and_this_is_another_test_and_this_is_a_third_test_and_just_one_more_for_kicks.doc"
Content-Disposition: attachment;
         filename="Mid-Pgm_Assessment_Form0000000000000_this_is_a_test_and_this_is_another_test_and_this_is_a_third_test_and_just_one_more_for_kicks.doc"
Content-Transfer-Encoding: base64





Here are some examples of how other mailers construct this:

Pine:
Content-Type: APPLICATION/msword; name="Mid-Pgm Assessment Form000000000000000.doc"
Content-Transfer-Encoding: BASE64
Content-Disposition: attachment; filename="Mid-Pgm Assessment Form000000000000000.doc"

Pine with a filename > 78:
Content-Type: APPLICATION/msword; name*0="Mid-Pgm Assessment Form000000000000000 this is a test and this is another test and th";
         name*1="is is a third test and just one more for kicks.doc"
Content-Transfer-Encoding: BASE64
Content-Disposition: attachment; filename*0="Mid-Pgm Assessment Form000000000000000 this is a test and this is another test and th";
         filename*1="is is a third test and just one more for kicks.doc"

Pine with a filename > 78 and no spaces:
Content-Type: APPLICATION/msword; name*0=Mid-Pgm_Assessment_Form0000000000000_this_is_a_test_and_this_is_another_test_and_;
         name*1="this_is_a_third_test_and_just_one_more_for_kicks.doc"
Content-Transfer-Encoding: BASE64
Content-Disposition: attachment; filename*0=Mid-Pgm_Assessment_Form0000000000000_this_is_a_test_and_this_is_another_test_and_;
         filename*1="this_is_a_third_test_and_just_one_more_for_kicks.doc"

Mulberry:
Content-Type: application/msword;
  name="Mid-Pgm Assessment Form000000000000000.doc"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
  filename="Mid-Pgm Assessment Form000000000000000.doc"; size=25088

Mulberry with a filename > 78:
Content-Type: application/msword;
  name="Mid-Pgm Assessment Form000000000000000 this is a test and this is another test and this is a third test and just one more for kicks.doc"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
  filename="Mid-Pgm Assessment Form000000000000000 this is a test and this is another test and this is a third test and just one more for kicks.doc";
  size=24064

Mulberry with a filename > 78 and no spaces:
Content-Type: application/msword;
   name="Mid-Pgm_Assessment_Form0000000000000_this_is_a_test_and_this_is_another_test_and_this_is_a_third_test_and_just_one_more_for_kicks.doc"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
   filename="Mid-Pgm_Assessment_Form0000000000000_this_is_a_test_and_this_is_another_test_and_this_is_a_third_test_and_just_one_more_for_kicks.doc";
  size=24064





The following patch replaces the tab character with a space. That at least avoids embedding a funky character in the quoted attachment filename, which some mailers cannot make sense of and therefore include verbatim in the name. It does not, however, deal with a long filename made up only of alphanumeric characters (no spaces to break at):



diff -r1.132 MIME.php
809c809
<         $line = wordwrap(rtrim($header) . ': ' . rtrim($text), 75, $eol . "\t");
---
>         $line = wordwrap(rtrim($header) . ': ' . rtrim($text), 75, $eol . " ");
812c812
<         $line = preg_replace("/" . $eol . "\t\s*" . $eol . "\t/", "/" . $eol . "\t/", $line);
---
>         $line = preg_replace("/" . $eol . " \s*" . $eol . " /", "/" . $eol . " /", $line);

The Pine name*<n> notation looks like an interesting way to handle this.

From RFC 2822:

    There are two limits that this standard places on the number of
    characters in a line. Each line of characters MUST be no more than
    998 characters, and SHOULD be no more than 78 characters, excluding
    the CRLF.

    The 998 character limit is due to limitations in many implementations
    which send, receive, or store Internet Message Format messages that
    simply cannot handle more than 998 characters on a line. Receiving
    implementations would do well to handle an arbitrarily large number
    of characters in a line for robustness sake. However, there are so
    many implementations which (in compliance with the transport
    requirements of [RFC2821]) do not accept messages containing more
    than 1000 character including the CR and LF per line, it is important
    for implementations not to create such messages.

    The more conservative 78 character recommendation is to accommodate
    the many implementations of user interfaces that display these
    messages which may truncate, or disastrously wrap, the display of
    more than 78 characters per line, in spite of the fact that such
    implementations are non-conformant to the intent of this
    specification (and that of [RFC2821] if they actually cause
    information to be lost). Again, even though this limitation is put on
    messages, it is encumbant upon implementations which display messages
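
To make the two limits concrete, here is a minimal sketch that classifies a generated header by its longest physical line. The helper name `checkLineLengths` and its return strings are hypothetical; it counts bytes with `strlen()`, which matches characters only for plain ASCII headers.

```php
<?php
// Hypothetical helper (not part of Horde): report how a folded header
// fares against the RFC 2822 limits of 78 (SHOULD) and 998 (MUST).
function checkLineLengths($headerBlock)
{
    $longest = 0;
    // Measure each physical line, excluding the CRLF itself.
    foreach (preg_split("/\r?\n/", $headerBlock) as $physicalLine) {
        $longest = max($longest, strlen($physicalLine));
    }
    if ($longest > 998) {
        return 'violates MUST (998)';
    }
    if ($longest > 78) {
        return 'exceeds SHOULD (78)';
    }
    return 'ok';
}

echo checkLineLengths("Content-Type: application/msword;\r\n name=\"short.doc\""), "\n";
```

The long no-space filename examples above fall into the middle category: longer than 78 characters, but well inside the hard limit.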





I think since the character limit is a "MUST be no more than 998" and a "SHOULD be no more than 78" then there are the following options:

  - use spaces instead of tabs to indent continuation lines on MIME part headers
  - start a new continuation line each time a semi-colon is encountered outside of a quoted-string unless it is the trailing character
  - limit each of these lines to 998 or 78:

    - either truncate the value portion of the header attribute to make the overall length of the line less than 998

      or

    - use the attribute_key*<n> syntax to break up quoted-strings so that no line exceeds 78 characters
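
The second option depends on telling a separating semi-colon from one inside a quoted-string. A minimal sketch of that split, assuming a simple character scan is acceptable; the function name `splitParams` is hypothetical and comments in parentheses (RFC 2822 "comments") are not handled.

```php
<?php
// Hypothetical helper: split a header value on semi-colons that are
// outside of a quoted-string, dropping empty pieces.
function splitParams($text)
{
    $parts = array();
    $current = '';
    $inQuotes = false;
    $len = strlen($text);
    for ($i = 0; $i < $len; $i++) {
        $ch = $text[$i];
        if ($ch === '\\' && $inQuotes && $i + 1 < $len) {
            // Quoted-pair: copy the backslash and the escaped character as-is.
            $current .= $ch . $text[++$i];
            continue;
        }
        if ($ch === '"') {
            $inQuotes = !$inQuotes;
        }
        if ($ch === ';' && !$inQuotes) {
            if (trim($current) !== '') {
                $parts[] = trim($current);
            }
            $current = '';
            continue;
        }
        $current .= $ch;
    }
    if (trim($current) !== '') {
        $parts[] = trim($current);
    }
    return $parts;
}

$parts = splitParams('attachment; filename="a;b c.doc"; size=25088');
// One attribute per line, continuation lines indented with a single space.
echo implode(";\r\n ", $parts), "\n";
```

Each resulting piece can then be checked against the 78 or 998 limit on its own, and only the pieces that are still too long need truncating or the attribute_key*<n> treatment.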



I was thinking of replacing the call with something like the following (this hasn't been syntactically checked or anything):

     function wrapHeaders($header, $text, $eol = "\r\n")
     {
         /* Remove any existing linebreaks. */
         $text = trim(preg_replace("/\r?\n\s?/", ' ', $text));
         $header = trim($header);

         /* Short enough for a single line ("+ 2" accounts for ': '). */
         if ((strlen($header) + 2 + strlen($text)) < 75) {
             return $text;
         }

         /* need a more accurate separator regex here (this one also splits
          * inside quoted-strings) but this is just for demonstrative
          * purposes */
         $attrs = array_map('trim', preg_split('/;/', $text, -1, PREG_SPLIT_NO_EMPTY));

         $lines = array();
         foreach ($attrs as $i => $attr) {
             /* if this is the first line account for the length of the
              * header addition, otherwise it is just a single whitespace
              * indent to account for */
             $prefix = ($i == 0) ? $header . ': ' : ' ';
             $offset = strlen($prefix);

             if (($offset + strlen($attr)) < 75) {
                 $lines[] = $prefix . $attr;
                 continue;
             }

             $attrItems = explode('=', $attr, 2);

             /* if the separator isn't found in the attribute then
              * the value should probably not be folded.
              * just make sure it doesn't exceed 995
              */
             if (count($attrItems) < 2) {
                 $lines[] = $prefix . substr($attr, 0, 995 - $offset);
                 continue;
             }

             $attrName = $attrItems[0];
             $attrValue = trim($attrItems[1], '"');

             /* leave room for '*<n>=', the two quotes and the ';' */
             $width = max(1, 75 - ($offset + strlen($attrName) + 6));
             foreach (str_split($attrValue, $width) as $c => $chunk) {
                 /* only the first chunk carries the original prefix */
                 $lines[] = (($c == 0) ? $prefix : ' ') . $attrName . '*' . $c . '="' . $chunk . '"';
             }
         }

         $line = implode(';' . $eol, $lines);
         return substr($line, strlen($header) + 2);
     }



I think there should also be some code in place to deal with displaying these long filenames at the top of the message in HTML. I think the anchor text should be truncated to a certain number of characters and an alt tag with the full string should be added.

Comments?
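
A minimal sketch of that display idea, under a few assumptions: the helper name `attachmentLink`, the 40-character default and the `view.php` URL are all hypothetical, and the full name goes into a `title` attribute because anchors do not have an `alt` attribute. Lengths are counted in bytes, so multi-byte filenames would need `mb_substr()` instead.

```php
<?php
// Hypothetical helper (not IMP code): show a shortened filename as the link
// text while keeping the complete filename available as a tooltip.
function attachmentLink($url, $filename, $maxChars = 40)
{
    $label = $filename;
    if (strlen($label) > $maxChars) {
        // Reserve three characters for the ellipsis.
        $label = substr($label, 0, $maxChars - 3) . '...';
    }
    return '<a href="' . htmlspecialchars($url, ENT_QUOTES) . '"'
        . ' title="' . htmlspecialchars($filename, ENT_QUOTES) . '">'
        . htmlspecialchars($label, ENT_QUOTES) . '</a>';
}

echo attachmentLink(
    'view.php?id=1',
    'Mid-Pgm_Assessment_Form0000000000000_this_is_a_test_and_this_is_another_test.doc'
), "\n";
```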



--

Sam Nicolary
