Tickets :: [#11073] Charset is not set for the "View Source"

Summary	Charset is not set for the "View Source"
Queue	IMP
Queue Version	Git master
Type	Bug
State	Not A Bug
Priority	3. High
Owners
Requester	ak (at) lps (dot) komi (dot) ru
Created	03/13/2012 (4884 days ago)
Due	03/13/2012 (4884 days ago)
Updated	03/14/2012 (4883 days ago)
Assigned
Resolved	03/13/2012 (4884 days ago)
Github Issue Link
Github Pull Request
Milestone
Patch	No

03/14/2012 06:19:38 AM	Michael Slusarz	Comment #6
> With second and third examples patch will not alter content type, as I scan only rfc-822 headers (before first empty line) and these charset are out of scope- these are mime's!

Exactly. RFC822 headers MUST be in ASCII. So sending in US-ASCII (or UTF-8, or ISO-8859-1) is perfectly fine.

03/13/2012 11:14:44 AM	ak (at) lps (dot) komi (dot) ru	Comment #5
1) The only difference between the patched and unpatched versions is that the patch uses the charset from the message headers (not from the MIME part kludge inside the message, as in your second and third examples) instead of the charset that Apache adds by default. That is all.

2) For your examples: with the first there will be no visible change, I suppose (base64 encoding means only 7-bit characters). In any case it will be no worse than the default charset set by the web server. With the second and third examples the patch will not alter the content type, because I scan only the RFC 822 headers (before the first empty line) and those charsets are out of scope: they belong to MIME parts.

03/13/2012 10:53:28 AM	Michael Slusarz	Comment #4
Again, no. Here's a message that will be completely broken in the display:

    Content-Type: text/plain; charset=iso-2022-jp
    Content-Transfer-Encoding: base64

The output message will be garbage.

Here's another message that will be completely broken:

    Content-Type: multipart/mixed; boundary=1

    --1
    Content-Type: text/plain; charset=iso-8859-1

    [Text]
    --1
    Content-Type: text/plain; charset=utf-8

    [Text]
    --1--

Again, if you want a quick display of the part's contents THAT MAY NOT BE 100% ACCURATE, you can use View Source. If you need the EXACT CONTENTS, you MUST use save as. We do NOT guarantee that View Source outputs an accurate representation of the data. Because that is impossible to do in the browser (or, for that matter, any text editor). That's because MIME messages are NOT designed to be viewed as a single file! A MIME message != a discrete file as is commonly used in an OS.
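
The two cases in Comment #4 can be illustrated with a small sketch. The sample strings and the isValidUtf8() helper are assumptions made up for this illustration and are not IMP code. It builds a two-part message like the second example, with 8-bit bodies, and shows that the raw source is not valid under a single charset label, while a base64 body is plain ASCII whatever charset its part declares.

```php
<?php
// Hypothetical sample data, not taken from the ticket.
$latin1Part = "caf\xE9";      // "café" encoded as ISO-8859-1
$utf8Part   = "caf\xC3\xA9";  // "café" encoded as UTF-8

// Raw source of a two-part message whose bodies are sent as 8-bit text.
$source = "Content-Type: multipart/mixed; boundary=1\r\n\r\n"
    . "--1\r\nContent-Type: text/plain; charset=iso-8859-1\r\n\r\n"
    . $latin1Part . "\r\n"
    . "--1\r\nContent-Type: text/plain; charset=utf-8\r\n\r\n"
    . $utf8Part . "\r\n"
    . "--1--\r\n";

// Hypothetical helper: PCRE's /u modifier makes preg_match() fail on
// byte sequences that are not valid UTF-8.
function isValidUtf8(string $bytes): bool
{
    return preg_match('//u', $bytes) === 1;
}

// The UTF-8 part alone is valid UTF-8, the Latin-1 part is not, so the
// message source as a whole cannot be labelled UTF-8 correctly.
var_dump(isValidUtf8($utf8Part));
var_dump(isValidUtf8($latin1Part));
var_dump(isValidUtf8($source));

// A base64 body (first example) consists of ASCII characters only, so the
// charset declared for the part does not change how the source displays.
$base64Body = base64_encode("\x1B\$BF|K\\8l\x1B(B"); // sample ISO-2022-JP bytes
$isAscii = preg_match('/^[\x00-\x7F]*$/', $base64Body) === 1;
var_dump($isAscii);
```

Labelling the same source as ISO-8859-1 instead would not help either: the two bytes of the UTF-8 "é" would then display as "Ã©".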

03/13/2012 10:16:44 AM	ak (at) lps (dot) komi (dot) ru	Comment #3
1) A MIME message usually has a content type of neither 'text/plain' nor 'text/html' but 'multipart/mixed' or similar. And this patch sets the charset only when the 'content type' header itself carries a charset, i.e. for 'text/plain' and 'text/html'!

2) If you send just text/plain, then Apache adds its default charset itself, as far as I know.

3) Users often cannot change the charset in the browser (me, for example) when the window has no toolbars or menus. Sic!

4) About rare or not rare cases: it is now very common for autogenerated mail to be sent as "Content-encoding: 8bit" with non-Latin-1 text following the initial 7-bit characters. Ask more people from countries with non-Latin alphabets.

5) "Save as" is not the best solution, but it works.

03/13/2012 09:55:32 AM	Michael Slusarz	Comment #2 State ⇒ Not A Bug
No. A MIME message could theoretically have a different charset for each part - you can't just choose the first part and use that as the charset. Additionally, the charset is worthless information since most messages will be encoded in US-ASCII. The charset parameter of Content-Type deals with the UNencoded data.

To accurately download the contents of a message, you need to use the "Save As" feature. View Source is not meant to be a canonical representation of the data.

03/13/2012 08:59:57 AM	ak (at) lps (dot) komi (dot) ru	Comment #1 Priority ⇒ 3. High Type ⇒ Bug Summary ⇒ Charset is not set for the "View Source" Due ⇒ 03/13/2012 Queue ⇒ IMP Milestone ⇒ Patch ⇒ No State ⇒ Unconfirmed
One more necessary thing for users of non-Latin alphabets. When the "View Source" view [actionID=view_source] is generated, the content type is set to 'text/plain' without a charset. To be correct, the lines after 194 in the file /webroot/imp/view.php should be:

    case 'view_source':
        $msg = $contents->fullMessageText(array('stream' => true));
        rewind($msg);
        while (!feof($msg)) {
            $line = fgets($msg, 4096);
            if ($line !== false) {
                if (strlen($line) == 0) break 1;
                if (strpos($line, "charset=") > 0 && ($pos = strpos($line, ";")) > 0) {
                    $ct = substr($line, $pos);
                    break 1;
                }
            }
        }
        fseek($msg, 0, SEEK_END);
        $headertext = 'text/plain';
        if (isset($ct)) $headertext .= $ct;
        $browser->downloadHeaders('Message Source', $headertext, true, ftell($msg));

INSTEAD OF

    case 'view_source':
        $msg = $contents->fullMessageText(array('stream' => true));
        fseek($msg, 0, SEEK_END);
        $browser->downloadHeaders('Message Source', 'text/plain', true, ftell($msg));
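
For illustration only, here is a minimal sketch of the header scan the requester describes, limited to the top-level header block. The names topLevelCharset() and streamFromString() and the sample message are hypothetical and are not part of IMP. One detail worth noting: fgets() returns each line with its line terminator, so a strlen($line) == 0 test never matches a blank line; the sketch strips the terminator before testing. It also unfolds continuation lines and matches the header name case-insensitively.

```php
<?php
// Hypothetical helper, not part of IMP: returns the charset named in the
// top-level Content-Type header of a message stream, or null if none.
function topLevelCharset($stream): ?string
{
    rewind($stream);
    $headers = '';
    while (($line = fgets($stream, 4096)) !== false) {
        if (rtrim($line, "\r\n") === '') {
            break; // blank line: end of the header block
        }
        $headers .= $line;
    }
    // Unfold continuation lines (line break followed by space or tab).
    $headers = preg_replace('/\r?\n[ \t]+/', ' ', $headers);
    $pattern = '/^Content-Type:[^\r\n]*?;[ \t]*charset[ \t]*=[ \t]*"?([^";\s]+)/im';
    if (preg_match($pattern, $headers, $m) === 1) {
        return strtolower($m[1]);
    }
    return null;
}

// Hypothetical helper: wraps a string in an in-memory stream for the demo.
function streamFromString(string $text)
{
    $fp = fopen('php://memory', 'r+');
    fwrite($fp, $text);
    return $fp;
}

// Made-up sample message with a folded Content-Type header.
$single = "Subject: test\r\n"
    . "Content-Type: text/plain;\r\n charset=\"KOI8-R\"\r\n"
    . "\r\n"
    . "body text mentioning charset=utf-8\r\n";

// Multipart message: the charsets live in the part headers, after the
// first blank line, so the top-level scan finds none.
$multipart = "Content-Type: multipart/mixed; boundary=1\r\n\r\n"
    . "--1\r\nContent-Type: text/plain; charset=utf-8\r\n\r\n[Text]\r\n--1--\r\n";

var_dump(topLevelCharset(streamFromString($single)));
var_dump(topLevelCharset(streamFromString($multipart)));
```

Even with a correct scan, the result describes only a single-part body sent without a transfer encoding. For multipart or base64 messages it returns nothing useful, which is the objection raised in the replies.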