6/7/25

[#14770] HTML export does not encode special chars (umlauts)
Summary HTML export does not encode special chars (umlauts)
Queue Wicked
Queue Version Git master
Type Bug
State No Feedback
Priority 2. Medium
Owners
Requester birnbacs (at) gmail (dot) com
Created 01/19/2018 (2696 days ago)
Due
Updated 03/29/2019 (2262 days ago)
Assigned 01/29/2018 (2686 days ago)
Resolved 03/29/2019 (2262 days ago)
Github Issue Link
Github Pull Request
Milestone
Patch No

History
03/29/2019 01:26:31 PM Jan Schneider State ⇒ No Feedback
 
01/29/2018 05:21:04 PM Jan Schneider Comment #4
State ⇒ Feedback
The page is exported perfectly fine; the only issue is that it's only 
an HTML fragment, not a complete, valid HTML document. Thus it doesn't 
contain any charset information. If you load it with the correct 
charset, UTF-8, the page is displayed perfectly fine.
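
To illustrate the point about the missing charset information, here is a minimal sketch of wrapping an exported fragment in a complete HTML document that declares UTF-8. The helper name wrapFragment() and its parameters are hypothetical and not part of Wicked; this is one possible workaround on the user's side, not how the export is implemented.

```php
<?php
// Hypothetical helper, not part of Wicked: wraps an exported HTML fragment
// in a minimal document that declares its character set.
function wrapFragment(string $fragment, string $title = 'Export', string $charset = 'UTF-8'): string
{
    // Escape the title only; the fragment is already HTML and is inserted as-is.
    $safeTitle = htmlspecialchars($title, ENT_QUOTES, $charset);

    return "<!DOCTYPE html>\n"
        . "<html>\n<head>\n"
        . "<meta charset=\"{$charset}\">\n"
        . "<title>{$safeTitle}</title>\n"
        . "</head>\n<body>\n"
        . $fragment . "\n"
        . "</body>\n</html>\n";
}

// Example: a fragment containing umlauts, as it might come out of the export.
echo wrapFragment('<p>äöüß</p>', 'Umlaut test');
```

With the meta element in place, a browser no longer has to guess the encoding when the file is opened from disk.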
01/23/2018 03:13:43 PM birnbacs (at) gmail (dot) com Comment #3
New Attachment: Umlaut_HTML_entities.zip
I realised that all HTML entities are affected, so I changed the 
render function.
Not sure how to create a search string that catches all HTML entities, though.


Update: using PHP function htmlentities() for replacement now. Please 
ignore previous comment.
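
For readers without the attachment, the following sketch shows what htmlentities() does to text containing umlauts. It is an illustration only and does not reproduce the attached rule files. Note that htmlentities() also escapes markup characters, so it is suited to plain text rather than already rendered HTML; the second call uses mb_encode_numericentity() (requires the mbstring extension) as one alternative that leaves markup untouched. The sample strings are made up.

```php
<?php
// Sample text; made up for illustration.
$text = 'Größe <b>';

// htmlentities() replaces every character that has a named entity,
// including the markup characters < and >.
$named = htmlentities($text, ENT_QUOTES | ENT_HTML401, 'UTF-8');
echo $named, "\n";

// Alternative for already rendered HTML: convert only non-ASCII characters
// to numeric entities, leaving tags intact.
// Map format: [first code point, last code point, offset, mask].
$html = '<p>ä</p>';
$numeric = mb_encode_numericentity($html, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');
echo $numeric, "\n";
```

The first call yields named entities such as `&ouml;` and `&szlig;` but also turns `<b>` into `&lt;b&gt;`; the second keeps the `<p>` tags and emits `&#228;` for the umlaut.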


Again, the following line goes into wicked/lib/Page.php, around line 525:

               $this->_proc->insertRule('Umlaut');

And the enclosed files:

Render_Xhtml_Umlaut.php =(goes to)=> 
wicked/lib/Text_Wiki/Render/Xhtml/Umlaut.php
Parse_Default_Umlaut.php =(goes to)=> 
wicked/lib/Text_Wiki/Parse/Default/Umlaut.php
01/23/2018 01:56:02 PM birnbacs (at) gmail (dot) com Comment #2
New Attachment: Umlaut.zip
I have fixed the misbehaviour, trying to follow the existing 
practices, that is, using Text_Wiki.

Relative to the wicked directory, I added the following line into 
lib/Page.php, line 525 (at the end of the 'case Xhtml' branch):

             $this->_proc->insertRule('Umlaut');

I also wrote two files, which are enclosed:

Render_Xhtml_Umlaut.php =(goes to)=> lib/Text_Wiki/Render/Xhtml/Umlaut.php
Parse_Default_Umlaut.php =(goes to)=> lib/Text_Wiki/Parse/Default/Umlaut.php

In retrospect I should have given them less confusing names.

This is my first contribution to speak of; constructive feedback is 
appreciated.



01/19/2018 05:13:23 PM birnbacs (at) gmail (dot) com Comment #1
Priority ⇒ 2. Medium
Type ⇒ Bug
Summary ⇒ HTML export does not encode special chars (umlauts)
Queue ⇒ Wicked
Milestone ⇒
Patch ⇒ No
State ⇒ Unconfirmed
A Wicked page may be downloaded in different formats, one of which is 
HTML (others include LaTeX and structured text). I use this feature 
for printing out buck sheets that will then be processed manually on 
paper.

My pages contain German umlauts (äöüß) and are displayed correctly 
only in browsing mode. The exported file has garbled sequences around 
those characters and does not print well.

In neither document are the umlauts HTML-encoded (&amp;auml;, &amp;ouml;, 
&amp;uuml;, &amp;szlig;). It appears that the browsing document explicitly uses 
UTF-8 while the export version does not.
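
The garbled sequences described above are consistent with UTF-8 bytes being read as a single-byte encoding. The sketch below reproduces that effect for one character, assuming the viewer falls back to ISO-8859-1; the ticket does not say which fallback encoding was actually used. It requires the mbstring extension.

```php
<?php
// "ä" as stored in a UTF-8 file: the two bytes C3 A4.
$utf8 = "\xC3\xA4";
echo bin2hex($utf8), "\n";

// Simulate a viewer that treats each byte as one ISO-8859-1 character
// (an assumption) and show what it would display.
$garbled = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
echo $garbled, "\n";
```

A single umlaut therefore shows up as two unrelated characters, which is why declaring or selecting UTF-8 is enough to make the exported page display correctly.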
