[#1609] Incorrect encoding by MIME::encode() on some UTF-8 strings
Summary Incorrect encoding by MIME::encode() on some UTF-8 strings
Queue Horde Base
Queue Version 3.0.3
Type Bug
State Resolved
Priority 1. Low
Owners jan (at) horde (dot) org, slusarz (at) horde (dot) org
Requester horde (at) ndn (dot) no
Created 2005-03-22 (5112 days ago)
Updated 2005-04-22 (5081 days ago)
Assigned 2005-03-25 (5109 days ago)
Resolved 2005-04-22 (5081 days ago)
Patch No

2005-04-22 09:58:14 horde (at) ndn (dot) no Comment #6
I'm sorry I haven't replied.

I tried to apply the patch, but it didn't apply to the Horde versions
I am using (3.0 and 3.0.2).

I will try the patch later. To solve the problem here, I've just
hardcoded the use of UTF-8, as described earlier.

2005-04-22 09:50:22 Jan Schneider Comment #5
State ⇒ Resolved
No feedback.
2005-03-25 01:36:41 Michael Slusarz Comment #4
State ⇒ Feedback
I think this is the same issue as Bug #1621.  Could you try the patch
there and see if that fixes things for you?  I was able to reproduce
your error, and that patch fixed it for me.
2005-03-23 12:52:03 horde (at) ndn (dot) no Comment #3
New Attachment: phpshell.php.html
Yet another update: adding the Unicode option of course breaks
processing of headers when sending mail with the ISO-8859-1 charset.
Perhaps MIME::encode() should somehow specify the character set in use
so that the string can be split correctly (i.e. not inside a
character). :)

I've included the output from the function when using the PHP shell. 
MIME::_encode, which encodes a single word, gets the encoding right.
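
One way to read the suggestion above is to choose the regex mode from the charset of the text. The following is only a sketch for a current PHP release (7.0 or later), not the patch from Bug #1621; split_words() and its fallback behaviour are hypothetical and not part of Horde.

```php
<?php
// Hypothetical helper (not Horde's API): pick the regex mode from the charset.
function split_words(string $text, string $charset): array
{
    // Enable UTF-8 mode only when the text is declared to be UTF-8. With /u,
    // preg_match_all() returns false on bytes that are not valid UTF-8,
    // which is what happens to ISO-8859-1 text containing 8-bit characters.
    $modifier = (strcasecmp($charset, 'UTF-8') === 0) ? 'u' : '';
    $result = preg_match_all(
        '/([^\s]+)([\s]*)/' . $modifier,
        $text,
        $matches,
        PREG_SET_ORDER
    );
    if ($result === false) {
        // Assumed fallback: keep the text as one unsplit word rather than lose it.
        return [[$text, $text, '']];
    }
    return $matches;
}

// "Åretur fra ferie" as UTF-8 bytes and as ISO-8859-1 bytes.
$utf8   = split_words("\xC3\x85retur fra ferie", 'utf-8');
$latin1 = split_words("\xC5retur fra ferie", 'iso-8859-1');

printf("%d words (UTF-8), %d words (ISO-8859-1)\n", count($utf8), count($latin1));
```

Each returned entry has the same shape as the $matches rows produced by the original line in MIME.php: the full match, the word, and the trailing whitespace.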
2005-03-23 04:43:50 Chuck Hagenbuch Assigned to Jan Schneider
Assigned to Michael Slusarz
2005-03-23 00:13:39 horde (at) ndn (dot) no Comment #2
A little update: I just recalled that this problem _has_ been
previously observed on the system I thought was OK without the fix,
but I am unable to reproduce it now.
2005-03-22 23:53:48 horde (at) ndn (dot) no Comment #1
Type ⇒ Bug
State ⇒ Unconfirmed
Priority ⇒ 1. Low
Summary ⇒ Incorrect encoding by MIME::encode() on some UTF-8 strings
Queue ⇒ Horde Base
While investigating a problem with the Norwegian character "Å" (capital
"å"), which caused incorrectly encoded headers when sending mail with
UTF-8 (but not ISO-8859-1), I tracked it to line 142 in
lib/Horde/MIME.php:

$size = preg_match_all('/([^\s]+)([\s]*)/', $text, $matches, PREG_SET_ORDER);

In my case, adding the Unicode option (/u) to the regex solved the problem:

$size = preg_match_all('/([^\s]+)([\s]*)/u', $text, $matches, PREG_SET_ORDER);
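
The effect of the modifier can be checked in isolation. This is a minimal sketch for a current PHP release (7.4 or later, not the PHP 4 used in this report); words_as_hex() is a hypothetical helper, not part of Horde. It prints each matched word as hex so that a split between the two bytes of "Å" (0xC3 0x85) is visible. Whether the byte-mode pattern actually splits there is assumed to depend on the PCRE build and the active locale, so only the /u result should be relied on.

```php
<?php
// "Åretur" as raw UTF-8 bytes: "Å" is U+00C5, encoded as 0xC3 0x85.
$text = "\xC3\x85retur";

// Hypothetical helper (not part of Horde): run the word-splitting
// pattern and return each matched word as a hex string.
function words_as_hex(string $pattern, string $text): array
{
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);
    return array_map(fn(array $m): string => bin2hex($m[1]), $matches);
}

// Byte mode: the subject is treated as single bytes, so the result
// may differ between systems.
$byteMode = words_as_hex('/([^\s]+)([\s]*)/', $text);

// UTF-8 mode: the subject is treated as code points, so the two
// bytes of "Å" cannot be separated.
$utf8Mode = words_as_hex('/([^\s]+)([\s]*)/u', $text);

print_r($byteMode);
print_r($utf8Mode);
```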

It seems preg_match_all does not always handle multibyte characters
(e.g. Norwegian Å). On a system with PHP 4.3.10 and Apache/1.3.33,
the bug appeared, as shown by this Amavis alert with "Åretur" as the
subject:
X-Amavis-Alert: BAD HEADER Non-encoded 8-bit data (char 85 hex) in 
message header 'Subject'

   Subject: Re: =?utf-8?b?ww==?=\205retur\n

A var_dump of $matches would show the mangled first character as the 
first entry in the array, with "retur" in the second entry.
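
The header in the alert is consistent with the split described above. The sketch below reconstructs it under the assumption that the pattern yielded the word "\xC3", the separator "\x85", and then the word "retur"; this is inferred from the report, not taken from a captured var_dump. encode_word() is a hypothetical stand-in for per-word encoding and is not Horde's MIME::_encode().

```php
<?php
// Assumed reconstruction of the split: [word, trailing separator] pairs.
$pieces = [["\xC3", "\x85"], ["retur", ""]];

// Hypothetical stand-in for per-word encoding (not Horde's MIME::_encode()).
function encode_word(string $word, string $charset): string
{
    // Pure ASCII words are passed through unchanged.
    if (!preg_match('/[\x80-\xFF]/', $word)) {
        return $word;
    }
    // Words with 8-bit bytes become a base64 encoded-word (RFC 2047).
    return '=?' . $charset . '?b?' . base64_encode($word) . '?=';
}

// Encode word by word, copying the separators through unencoded.
$broken = '';
foreach ($pieces as [$word, $space]) {
    $broken .= encode_word($word, 'utf-8') . $space;
}

// Encoding the whole word keeps both bytes of "Å" together.
$intact = encode_word("\xC3\x85retur", 'utf-8');

// A raw byte >= 0x80 left in a header is what the alert complains about.
$hasRaw8bit = fn(string $header): bool => (bool) preg_match('/[\x80-\xFF]/', $header);

var_dump($hasRaw8bit($broken));
var_dump($hasRaw8bit($intact));
```

Under that assumption, the lone byte 0xC3 is encoded as "ww==" and the byte 0x85 is left raw in the header, which matches the Subject line quoted in the alert.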

On another system running PHP 4.3.9 and Apache/1.3.31, the bug did NOT appear.

I'm not sure whether this is a bug with other character sets, or
whether turning on multibyte character support in PHP would solve the
problem.