6.0.0-git
2018-12-15

[#14770] HTML export does not encode special chars (umlauts)
Summary HTML export does not encode special chars (umlauts)
Queue Wicked
Queue Version Git master
Type Bug
State Feedback
Priority 2. Medium
Owners
Requester birnbacs (at) gmail (dot) com
Created 2018-01-19 (330 days ago)
Due
Updated 2018-01-29 (320 days ago)
Assigned 2018-01-29 (320 days ago)
Resolved
Milestone
Patch No

History
2018-01-29 17:21:04 Jan Schneider Comment #4
State ⇒ Feedback
The page is exported perfectly fine; the only issue is that it's only 
an HTML fragment, not a complete, valid HTML document. Thus it doesn't 
contain any charset information. If you load it with the correct 
charset, UTF-8, the page is displayed perfectly fine though.
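
To illustrate the point about the missing charset information, here is a minimal sketch of wrapping an exported fragment in a complete HTML document that declares UTF-8. The function name wrapHtmlFragment() and its parameters are hypothetical and not part of Wicked; this is an assumption about how a wrapper could look, not how the export is implemented.

```php
<?php
// Hypothetical helper, not part of Wicked: wraps an exported HTML fragment
// in a minimal HTML5 document that declares its charset as UTF-8.
function wrapHtmlFragment(string $fragment, string $title = 'Export'): string
{
    // Escape the title so it cannot break the surrounding markup.
    $safeTitle = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');

    return "<!DOCTYPE html>\n"
        . "<html>\n<head>\n"
        . "<meta charset=\"UTF-8\">\n"
        . "<title>{$safeTitle}</title>\n"
        . "</head>\n<body>\n"
        . $fragment . "\n"
        . "</body>\n</html>\n";
}

// The fragment itself is passed through unchanged.
$document = wrapHtmlFragment('<p>äöüß</p>', 'Umlaut test');
```

With the meta element in place, a browser should no longer have to guess the encoding when the file is opened from disk.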
2018-01-23 15:13:43 birnbacs (at) gmail (dot) com Comment #3
New Attachment: Umlaut_HTML_entities.zip
I realised that all HTML entities are affected, so I changed the 
render function.
Not sure how to create a search string that catches all HTML entities, though.


Update: using PHP function htmlentities() for replacement now. Please 
ignore previous comment.
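
For readers without the attachment, the following is a rough sketch of how htmlentities() could be applied to already rendered markup. It is not the attached code: the helper name encodeNonAscii() is hypothetical, and restricting the replacement to non-ASCII characters is an assumption made here, because calling htmlentities() on the whole string would also turn the tags themselves into &lt; and &gt;.

```php
<?php
// Hypothetical helper, not taken from the attachment: replaces only
// non-ASCII characters with named entities, so existing tags and plain
// ASCII text are left untouched.
function encodeNonAscii(string $html): string
{
    $result = preg_replace_callback(
        '/[^\x00-\x7F]/u',                 // any single non-ASCII code point
        static function (array $m): string {
            // Characters without a named HTML 4.01 entity are returned as-is.
            return htmlentities($m[0], ENT_QUOTES, 'UTF-8');
        },
        $html
    );

    // preg_replace_callback() returns null on error (e.g. invalid UTF-8);
    // fall back to the unmodified input in that case.
    return $result ?? $html;
}

$encoded = encodeNonAscii('<p>Größe: äöü</p>');
// $encoded is '<p>Gr&ouml;&szlig;e: &auml;&ouml;&uuml;</p>'
```

Output produced this way is pure ASCII, so it displays the same regardless of which charset the browser assumes.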


Again, the following line goes into wicked/lib/Page.php, around line 525:

               $this->_proc->insertRule('Umlaut');

And the enclosed files:

Render_Xhtml_Umlaut.php =(goes to)=> 
wicked/lib/Text_Wiki/Render/Xhtml/Umlaut.php
Parse_Default_Umlaut.php =(goes to)=> 
wicked/lib/Text_Wiki/Parse/Default/Umlaut.php
2018-01-23 13:56:02 birnbacs (at) gmail (dot) com Comment #2
New Attachment: Umlaut.zip
I have fixed the misbehaviour, trying to follow the existing 
practices, that is, using Text_Wiki.

Relative to the wicked directory, I added the following line into 
lib/Page.php, line 525 (at the end of the 'case Xhtml' branch):

             $this->_proc->insertRule('Umlaut');

I also wrote two files, which are enclosed:

Render_Xhtml_Umlaut.php =(goes to)=> lib/Text_Wiki/Render/Xhtml/Umlaut.php
Parse_Default_Umlaut.php =(goes to)=> lib/Text_Wiki/Parse/Default/Umlaut.php

In retrospect I should have given them less confusing names.

This is my first contribution to speak of; constructive feedback is 
appreciated.



2018-01-19 17:13:23 birnbacs (at) gmail (dot) com Comment #1
Type ⇒ Bug
State ⇒ Unconfirmed
Priority ⇒ 2. Medium
Summary ⇒ HTML export does not encode special chars (umlauts)
Queue ⇒ Wicked
Milestone ⇒
Patch ⇒ No
A Wicked page may be downloaded in different formats, one of which is 
HTML (others include LaTeX and structured text). I use this feature 
for printing out buck sheets that will then be processed manually on 
paper.

My pages contain German umlauts (äöüß) and are displayed correctly 
only in browsing mode. The exported file has garbled sequences around 
those characters and does not print out well.

In either document the umlauts are not HTML-encoded (&auml;, &ouml;, 
&uuml;, &szlig;). It appears that the browsing document explicitly uses 
UTF-8 while the export version does not.
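
The garbled sequences can be reproduced with a short sketch. It assumes that the browser falls back to a single-byte charset such as ISO-8859-1 when no charset is declared, and that PHP's mbstring extension is available; neither assumption is confirmed in this ticket.

```php
<?php
// "ä" encoded as UTF-8 is the two bytes C3 A4.
$utf8 = "\xC3\xA4";

// Simulate a reader that misinterprets those two bytes as ISO-8859-1:
// each byte becomes its own character, re-encoded here as UTF-8 for display.
$garbled = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo $garbled, "\n"; // prints Ã¤
```

One umlaut turns into two unrelated characters, which matches the kind of garbling described above.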
