Tickets :: Comment on [#3101] Wrong mime-encoding of subject header

>> When I write a new mail with umlaut characters in the subject,
>> sometimes spaces seem to be missing because the subject seems to be
>> wrongly encoded. For example with:
>>
>> Subject: Infos über kIZ für die Nightline
>>
>> IMP encodes it as:
>>
>> Subject: Infos =?iso-8859-1?b?/GJlcg==?= kIZ =?iso-8859-1?b?Zvxy?= die
>> 	Nightline
>>
>> Note that before "Nightline" there is a new-line and a tab.

> Yes, and this is perfectly acceptable. It is how you break a MIME
> header. The MUA should convert all newlines and leading white space
> in a header to a single space.

>> In RFC2047 it is written:
>>
>> When displaying a particular header field that contains multiple
>> 'encoded-word's, any 'linear-white-space' that separates a pair of
>> adjacent 'encoded-word's is ignored. (This is to allow the use of
>> multiple 'encoded-word's to represent long strings of unencoded text,
>> without having to separate 'encoded-word's where spaces occur in the
>> unencoded text.)

> Correct.

>> That is, I think that you need to put the newline between two encoded
>> words so that the newline is ignored.

> No, this is not what the RFC says. The RFC says that if you put a
> newline between two encoded words, then the space is ignored. So it
> allows you to break in the middle of a word, for example, if that
> encoded word would cause the line to exceed 78 characters (plus CRLF)
> in length. However, there is no requirement to break a line this way.

>> Mutt does encode the same subject like this:
>>
>> Subject: Infos =?iso-8859-1?Q?=FCber_kIZ_f=FC?=
>>  =?iso-8859-1?Q?r?= die Nightline

> It looks like mutt uses a different, less complex algorithm. They
> encode spaces within the encoded string using '_'. IMP (actually
> Horde) only encodes spaces when two consecutive words both contain
> characters that require encoding.
>
> This results in (at least in my opinion) strings that are easier to
> read when viewed undecoded (i.e. in the message source). The IMP
> string of:
>
>   Infos =?iso-8859-1?b?/GJlcg==?= kIZ =?iso-8859-1?b?Zvxy?= die Nightline
>
> is logically viewed by me as:
>
>   Infos <some encoded word> kIZ <some encoded word> die Nightline
>
> While the mutt way of doing things:
>
>   Infos =?iso-8859-1?Q?=FCber_kIZ_f=FC?= =?iso-8859-1?Q?r?= die Nightline
>
> is logically viewed by me as:
>
>   Infos <some encoded word> <some encoded word> die Nightline
>
> The 'kIZ' in the mutt example is completely lost in the encoded stuff.
>
> Long story short - both ways of encoding are correct according to the
> RFCs. So if your mail reader is entering extra spaces between words
> with either encoding, then your mail reader is broken.
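
For anyone who wants to check the "both are correct" claim without trusting a particular mail reader, here is a minimal sketch. It uses Python's standard email.header module as a stand-in decoder; this is not Horde or mutt code. The helper name decode_subject and the final whitespace collapse are my own assumptions, added only to mimic what an MUA does after unfolding.

```python
from email.header import decode_header, make_header

# The two raw Subject values from this ticket, including the folding
# (newline followed by leading whitespace) exactly as each client sent it.
IMP_SUBJECT = "Infos =?iso-8859-1?b?/GJlcg==?= kIZ =?iso-8859-1?b?Zvxy?= die\n\tNightline"
MUTT_SUBJECT = "Infos =?iso-8859-1?Q?=FCber_kIZ_f=FC?=\n =?iso-8859-1?Q?r?= die Nightline"


def decode_subject(raw_value: str) -> str:
    """Decode RFC 2047 encoded-words in a raw header value.

    Hypothetical helper for illustration only. decode_header() drops
    whitespace that separates two adjacent encoded-words, and
    make_header() joins the decoded pieces back into one string.
    """
    text = str(make_header(decode_header(raw_value)))
    # Assumption: collapse leftover folding whitespace to single spaces,
    # as the comment above says an MUA should do.
    return " ".join(text.split())


if __name__ == "__main__":
    for label, raw in (("IMP", IMP_SUBJECT), ("mutt", MUTT_SUBJECT)):
        print(f"{label}: {decode_subject(raw)}")
```

Both inputs should come out as the same subject line. Note the difference in the mutt value: the fold sits between two encoded-words, so the whitespace there is dropped and "fü" plus "r" rejoin into one word, while in the IMP value the fold sits between plain words and becomes an ordinary space.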