
[#2988] unable to compose email messages in character sets other than utf-8 or iso-8859-*
Summary unable to compose email messages in character sets other than utf-8 or iso-8859-*
Queue IMP
Queue Version FRAMEWORK_3
Type Bug
State Not A Bug
Priority 1. Low
Owners
Requester leena.heino (at) uta (dot) fi
Created 11/17/2005 (7194 days ago)
Due
Updated 11/20/2005 (7191 days ago)
Assigned 11/17/2005 (7194 days ago)
Resolved 11/20/2005 (7191 days ago)
Github Issue Link
Github Pull Request
Milestone
Patch No

History
11/20/2005 12:54:55 AM leena (dot) heino (at) uta (dot) fi Comment #9
> Fix your system so that it supports UTF-8.

I wish I could, but unfortunately that is not always possible.

Anyway, thanks for your help. You gave good pointers and hints, and with
those I was able to write a piece of code that should fix this situation
locally.


11/20/2005 12:02:23 AM Jan Schneider Comment #8
State ⇒ Not A Bug
Fix your system so that it supports UTF-8.
11/19/2005 10:51:57 PM leena (dot) heino (at) uta (dot) fi Comment #7
> Correct. And this is the expected behaviour. So what?

This behaviour prevents users from writing email messages in other
charsets. After all, they could be using browsers or systems where they
are unable to change the settings.




11/19/2005 09:24:34 PM Jan Schneider Comment #6
Correct. And this is the expected behaviour. So what?
11/19/2005 06:44:38 PM leena (dot) heino (at) uta (dot) fi Comment #5
> Huh?

If I compose a message and choose e.g. koi8-r from the charset menu in
the compose window while the interface charset (NLS::getCharset) is
iso-8859-1, then IMP tries to convert the text from iso-8859-1 to koi8-r.
imp/compose.php uses String::convertCharset for the charset conversion,
and String::convertCharset uses iconv() or mb_convert_encoding(). Both
iconv() and mb_convert_encoding() mess up the conversion from iso-8859-1
to koi8-r.



iconv() and mb_convert_encoding() try to convert the input string so
that the resulting string has only those characters that are found in
both character sets. This means that if two 8-bit character sets are
incompatible, most or all 8-bit characters are lost in the conversion.
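
The loss described above can be reproduced outside of IMP. The following is a minimal sketch in Python rather than PHP, so it does not call iconv() or mb_convert_encoding(); Python's `errors="replace"` is used here only as a stand-in for a converter that substitutes "?" for characters missing from the target charset. The sample text is made up for illustration.

```python
# Text as the server would see it when the interface charset is ISO-8859-1.
text = "Grüße, café"

# Transcode to KOI8-R. The characters ü, ß and é have no KOI8-R equivalent,
# so a substituting converter replaces each of them with "?".
converted = text.encode("koi8_r", errors="replace")

print(converted)  # b'Gr??e, caf?'
```

Only the ASCII part of the text survives, because ASCII is the only range the two charsets share in this sample.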


11/19/2005 05:38:02 PM Jan Schneider Comment #4
Huh?
11/19/2005 04:59:54 PM leena (dot) heino (at) uta (dot) fi Comment #3
> I can't reproduce this. Sending messages for example in KOI8-R works
> fine here.
> This will of course only work if the interface is either in UTF-8 or
> KOI8-R (or any compatible charset).

Would it be possible to extend IMP so that, if both the input and the
desired output (the character set chosen by the user) are 8-bit
character sets, IMP would just relabel the text with the chosen
character set and not apply any character set conversion functions to
the input?
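
The difference between converting and relabelling can be sketched as follows. This is an illustration in Python, not IMP code, and it assumes that the bytes arriving from the browser really are KOI8-R even though the interface charset says ISO-8859-1; the sample word is made up.

```python
# Bytes as typed by a user whose browser is actually sending KOI8-R.
raw = "Привет".encode("koi8_r")

# Conversion path: the bytes are taken to be ISO-8859-1 and then transcoded
# to KOI8-R. None of the resulting Latin letters exist in KOI8-R.
misread = raw.decode("iso-8859-1")
transcoded = misread.encode("koi8_r", errors="replace")

# Relabelling path: the bytes are left untouched and only tagged as KOI8-R.
relabelled = raw.decode("koi8_r")

print(misread)     # ðÒÉ×ÅÔ
print(transcoded)  # b'??????'
print(relabelled)  # Привет
```

Relabelling only gives a readable result when the assumption about the incoming bytes holds; if the bytes are genuinely ISO-8859-1 text, tagging them as KOI8-R produces the wrong characters.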


11/17/2005 02:15:08 PM Jan Schneider Comment #2
State ⇒ Feedback
I can't reproduce this. Sending messages for example in KOI8-R works fine here.

This will of course only work if the interface is either in UTF-8 or 
KOI8-R (or any compatible charset).
11/17/2005 01:20:16 PM leena (dot) heino (at) uta (dot) fi Comment #1
Priority ⇒ 1. Low
Type ⇒ Bug
Summary ⇒ unable to compose email messages in character sets other than utf-8 or iso-8859-*
Queue ⇒ IMP
State ⇒ Unconfirmed
If a user tries to compose a message in a character set other than utf-8
or iso-8859-*, the characters in the mail will be illegible in the
chosen character set.



What seems to happen is that the server decides that the input charset
is e.g. ISO-8859-1 (or UTF-8 converted to ISO-8859-1) and then invokes
some sort of character conversion routine to transform the ISO-8859-1
encoded text to e.g. KOI8-R. This routine fails because you cannot
convert e.g. ISO-8859-1 encoded text to KOI8-R, and the result is that
all the 8-bit characters are lost and converted to "?" characters.



What the program should do instead is check that both of those
character sets are 8-bit, single-byte character sets and, if so, just
tag the message with the character set that the user has chosen without
running any conversion routines. Conversion routines should only be
used when converting a multibyte charset to a single-byte charset.
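
The rule proposed above could look roughly like this. It is a hypothetical sketch in Python, not the code IMP uses and not the local fix mentioned later in this ticket: `SINGLE_BYTE`, `is_single_byte` and `prepare_body` are names invented for the example, and the list of single-byte charsets is deliberately incomplete.

```python
import codecs

# Hypothetical allow-list of single-byte charsets, normalised to the names
# Python's codec registry reports. A real list would be much longer.
SINGLE_BYTE = {
    codecs.lookup(name).name
    for name in ("iso-8859-1", "iso-8859-15", "koi8-r", "windows-1251")
}


def is_single_byte(charset: str) -> bool:
    """Return True if the charset is on the single-byte allow-list."""
    return codecs.lookup(charset).name in SINGLE_BYTE


def prepare_body(raw: bytes, interface_charset: str, chosen_charset: str) -> bytes:
    """Return the bytes to send, to be labelled with chosen_charset."""
    if is_single_byte(interface_charset) and is_single_byte(chosen_charset):
        # Both charsets are single-byte: keep the bytes and only relabel.
        return raw
    # Otherwise transcode, substituting "?" for unrepresentable characters.
    text = raw.decode(interface_charset, errors="replace")
    return text.encode(chosen_charset, errors="replace")


koi8 = "Привет".encode("koi8_r")
print(prepare_body(koi8, "iso-8859-1", "koi8-r") == koi8)                        # True
print(prepare_body("Привет".encode("utf-8"), "utf-8", "koi8-r").decode("koi8_r"))  # Привет
```

The relabelling branch trusts that the incoming bytes are already in the chosen charset, which is the assumption this report makes; when the input is UTF-8, ordinary conversion is used.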

