Summary | unable to compose email messages in character sets other than utf-8 or iso-8859-* |
Queue | IMP |
Queue Version | FRAMEWORK_3 |
Type | Bug |
State | Not A Bug |
Priority | 1. Low |
Owners | |
Requester | leena.heino (at) uta (dot) fi |
Created | 11/17/2005 (7194 days ago) |
Due | |
Updated | 11/20/2005 (7191 days ago) |
Assigned | 11/17/2005 (7194 days ago) |
Resolved | 11/20/2005 (7191 days ago) |
Github Issue Link | |
Github Pull Request | |
Milestone | |
Patch | No |
Anyway, thanks for your help. You gave good pointers and hints, and with
those I was able to write a piece of code that should fix this situation
locally.
State ⇒ Not A Bug
charsets. After all they could be using browsers or systems where they
are unable to change settings.
compose window and the interface is set to iso-8859-1
(NLS::getCharset) then Imp tries to convert the iso-8859-1 to koi8-r.
String::convertCharset is used in imp/compose.php to do the charset
conversion. String::convertCharset uses iconv() or
mb_convert_encoding(). iconv() and mb_convert_encoding() mess up the
conversion from iso-8859-1 to koi8-r.
iconv() and mb_convert_encoding() try to convert the input string so
that the resulting string has only those characters that are found in
both character sets. This means that if two 8-bit character sets are
incompatible, most or all 8-bit characters are lost in the conversion.
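The loss described above can be reproduced outside IMP. The following Python sketch is only an illustration, not the code path in imp/compose.php: it assumes the typed Cyrillic text reached the server as KOI8-R bytes, that the server treated those bytes as iso-8859-1, and it uses Python's codecs (with errors="replace") in place of iconv() or mb_convert_encoding(). The sample string and all variable names are hypothetical.

```python
# Illustration only: Python codecs stand in for iconv()/mb_convert_encoding().
original = "Привет"                      # hypothetical text typed by the user

# Bytes as they would arrive if the text was entered in KOI8-R.
raw = original.encode("koi8_r")

# The server assumes the interface charset (iso-8859-1) and decodes accordingly.
misread = raw.decode("iso-8859-1")       # accented Latin letters, not Cyrillic

# Converting that misread text to KOI8-R fails for every character, because
# none of these Latin-1 letters exist in KOI8-R; each one becomes "?".
converted = misread.encode("koi8_r", errors="replace")
print(converted)                         # b'??????'

# Leaving the bytes untouched and only labelling them as KOI8-R keeps the text.
relabelled = raw.decode("koi8_r")
print(relabelled == original)            # True
```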
fine here.
This will of course only work if the interface is either in UTF-8 or
KOI8-R (or any compatible charset).
desired output (the character set chosen by the user) are 8-bit
character sets then IMP would just rename the character set and it
would not try to apply any character set conversion functions to the input.
State ⇒ Feedback
This will of course only work if the interface is either in UTF-8 or
KOI8-R (or any compatible charset).
Priority ⇒ 1. Low
Type ⇒ Bug
Summary ⇒ unable to compose email messages in character sets other than utf-8 or iso-8859-*
Queue ⇒ IMP
State ⇒ Unconfirmed
or iso-8859-* the characters in the mail will be illegible in the
chosen character set.
What seems to happen is that the server decides that the input charset is,
e.g., ISO-8859-1 (or UTF-8 converted to ISO-8859-1), and then invokes some
sort of character conversion routine to transform the ISO-8859-1 encoded
text to, e.g., KOI8-R. This routine fails because you cannot meaningfully
convert ISO-8859-1 encoded text to KOI8-R, and the
result is that all the 8-bit characters are lost and
converted to "?" characters.
What the program should do instead is check that both of those
character sets are 8-bit, single-byte character sets and just tag the message
with the character set that the user has chosen; it should not try to use
any conversion routines. Conversion routines should only be used if
you are trying to convert a multibyte charset to a single-byte charset.
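The proposal in this report can be sketched as follows. This is a minimal Python illustration of the requested behaviour, not what IMP does (the ticket was closed as Not A Bug). The function name prepare_body, the SINGLE_BYTE set, and the charset list are all hypothetical, and the sketch assumes the raw bytes really were typed in the chosen charset, which is the case where relabelling is safe.

```python
from typing import Tuple

# Hypothetical list of single-byte charsets; a real one would be much longer.
SINGLE_BYTE = {"iso-8859-1", "iso-8859-5", "koi8-r", "windows-1251"}


def prepare_body(raw: bytes, interface_charset: str,
                 chosen_charset: str) -> Tuple[bytes, str]:
    """Return the message bytes and the charset label to tag them with."""
    src = interface_charset.lower()
    dst = chosen_charset.lower()

    # Both charsets are single-byte: do not convert, only relabel the bytes.
    if src in SINGLE_BYTE and dst in SINGLE_BYTE:
        return raw, dst

    # Otherwise (e.g. a UTF-8 interface) a real conversion is needed.
    text = raw.decode(src)
    return text.encode(dst, errors="replace"), dst


# Usage: bytes typed as KOI8-R while the interface claims iso-8859-1.
koi8_bytes = "Привет".encode("koi8_r")
body, label = prepare_body(koi8_bytes, "ISO-8859-1", "KOI8-R")
print(body == koi8_bytes, label)

# Usage: a UTF-8 interface, where conversion to KOI8-R works normally.
utf8_body, utf8_label = prepare_body("Привет".encode("utf-8"), "UTF-8", "KOI8-R")
print(utf8_body == koi8_bytes, utf8_label)
```

Relabelling is only correct when the browser actually submitted bytes in the chosen charset; if the text was genuinely ISO-8859-1, tagging it as KOI8-R would display the wrong letters.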