6.0.0-git
2019-03-23

[#7805] Charset trouble when saving page names in db backend
Summary Charset trouble when saving page names in db backend
Queue Wicked
Type Bug
State Resolved
Priority 1. Low
Owners chuck (at) horde (dot) org
Requester lfbm.andamentos (at) gmail (dot) com
Created 2008-12-24 (3741 days ago)
Due
Updated 2008-12-31 (3734 days ago)
Assigned 2008-12-28 (3737 days ago)
Resolved 2008-12-31 (3734 days ago)
Milestone
Patch No

History
2008-12-31 02:49:28 Chuck Hagenbuch Comment #15
Assigned to Chuck Hagenbuch
State ⇒ Resolved
Patch looks excellent - committed, thanks!
2008-12-31 02:48:56 CVS Commit Comment #14
2008-12-30 17:30:23 lfbm (dot) andamentos (at) gmail (dot) com Comment #13
New Attachment: patch.txt
OK, it should be fixed now. A few more SQL queries needed charset
handling, especially in the getPages() and renamePage() functions.

While I was changing the file, I also added two functions
(_convertToDriver() and _convertFromDriver()) and used them to replace
all the long String::convert... calls, which makes the file more
readable and follows the same style as the Turba driver.
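
A minimal sketch of what such a helper pair could look like. Only the
two helper names come from the comment above. The class name, the
constructor, the roundTrip() method and the use of PHP's
mb_convert_encoding() in place of Horde's String::convertCharset() are
assumptions made for illustration, not the contents of the attached
patch.

```php
<?php
// Hypothetical stand-in for the Wicked SQL driver. Only the two helper
// names are taken from the ticket; their bodies are assumptions.
class ExampleDriver
{
    private $appCharset;
    private $dbCharset;

    public function __construct($appCharset, $dbCharset)
    {
        $this->appCharset = $appCharset;
        $this->dbCharset  = $dbCharset;
    }

    // Application charset -> database charset, for values sent in queries.
    protected function _convertToDriver($value)
    {
        return mb_convert_encoding($value, $this->dbCharset, $this->appCharset);
    }

    // Database charset -> application charset, for values read from rows.
    protected function _convertFromDriver($value)
    {
        return mb_convert_encoding($value, $this->appCharset, $this->dbCharset);
    }

    // Hypothetical method: simulates storing a name and reading it back.
    public function roundTrip($pagename)
    {
        $stored = $this->_convertToDriver($pagename);
        return $this->_convertFromDriver($stored);
    }
}

$driver = new ExampleDriver('UTF-8', 'ISO-8859-1');
$name   = "Tribut\xC3\xA1rio"; // "Tributário" encoded as UTF-8
var_dump($driver->roundTrip($name) === $name);
```

Keeping both directions in one place makes it harder to convert a
value on the way into the database and forget the reverse conversion
on the way out.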



Renaming pages with special chars works just fine now, and
permissions are working great as well.

The patch is attached.

Thanks.
2008-12-30 06:10:58 lfbm (dot) andamentos (at) gmail (dot) com Comment #12
I did the tests. The charset defined for the field in the db makes no
difference.

I saved data in both situations (when the charset for the page_name
field is latin1_swedish_ci, the default, and when it is
utf8_unicode_ci).

In either case, the page name is saved correctly only when using the
patched sql.php.

But with the patched sql.php a lot of things break, such as the
identification of already created pages (which, as a consequence, I
think, also breaks the identification of a renamed page during the
rename operation).

This shows why pages are not identified as already created:

http://img377.imageshack.us/img377/504/variablesbn0.jpg

Also, permissions are affected, because page names are displayed
incorrectly:

http://img367.imageshack.us/img367/6181/permsgn5.jpg
2008-12-30 05:04:48 Chuck Hagenbuch Comment #11
Taken from Chuck Hagenbuch
It would certainly help narrow things down if you tried it to see if 
it fixes the problem.
2008-12-30 04:00:06 lfbm (dot) andamentos (at) gmail (dot) com Comment #10
The trouble is that the $page variable in Wikilink.php returns the
page name just fine ('Tributário'), but the $list array, which is
pulled from $this->getConf('pages'), returns the page name as
'Tribut�rio', so the test "$exists = in_array($page, $list)" fails
(line 76 of Wikilink.php).
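
The failing check can be reproduced in isolation. This is a sketch
under assumptions: it supposes that $page arrives as UTF-8 while the
entries of $list were read from the database as ISO-8859-1 without
conversion. The ticket does not state the actual charsets, and
mb_convert_encoding() stands in here for Horde's own conversion call.

```php
<?php
// Assumed values: $page is UTF-8, $list holds unconverted ISO-8859-1 names.
$page = "Tribut\xC3\xA1rio";              // "Tributário" in UTF-8
$list = array('Penal', "Tribut\xE1rio");  // "Tributário" in ISO-8859-1

// in_array() compares the strings byte by byte, so the same name in two
// different encodings never matches.
$existsRaw = in_array($page, $list);

// Converting every list entry to the application charset first makes
// the comparison meaningful again.
$normalized = array_map(
    function ($name) {
        return mb_convert_encoding($name, 'UTF-8', 'ISO-8859-1');
    },
    $list
);
$existsNormalized = in_array($page, $normalized);

var_dump($existsRaw, $existsNormalized);
```

Names made only of ASCII characters, such as 'Penal', have identical
bytes in both encodings, which would explain why pages without special
chars are recognized.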



After the patch provided in this bug report, the page_name is saved
correctly in the db (as 'Tributário'). The table's default charset is
latin1.

Asking on #php, I was told I should change the charset of the table's
fields to UTF-8.

I'm confused, because the fields of my turba_objects table, for
example, are also latin1 and I have never had trouble with arbitrary
characters in them. On the other hand, phpMyAdmin is capable of
displaying the page_names in the wicked_pages table just fine, which
makes me guess this is something that can be handled on the fly in
PHP, without changing any db field attributes.

If that's the case, is this a task for Wicked or for Text_Wiki? Or
should I really change the db field charset?
2008-12-28 21:23:35 lfbm (dot) andamentos (at) gmail (dot) com Comment #9
> What's an example of something you're using as a page name? I'm not
> getting Text_Wiki to even identify page names with unicode characters
> in them.
'Tributário', for example. I'm attaching the screenshots in order to clarify.

This is my wikihome. Both 'Penal' and 'Tributário' pages exist, but
the link for 'Tributário' is still red (although it works if clicked):

http://img136.imageshack.us/img136/8862/wikihomefp4.jpg

This is the 'Tributário' page. After the patch, the name is displayed
correctly and also in the db the name is correctly saved:

http://img171.imageshack.us/img171/7743/tributarioez1.jpg  (page screenshot)

http://img525.imageshack.us/img525/4858/dbny6.jpg  (Database screenshot)

And when searching for 'Tributário', it works flawlessly, but still
there's the same red coloured link problem in the results page:

http://img230.imageshack.us/img230/7728/searchue3.jpg
2008-12-28 20:38:16 Chuck Hagenbuch Comment #8
What's an example of something you're using as a page name? I'm not 
getting Text_Wiki to even identify page names with unicode characters 
in them.
2008-12-28 19:48:43 lfbm (dot) andamentos (at) gmail (dot) com Comment #7
> > Are these pages that were created before the patch, or after?
> I tested in both situations. New pages are also not identified as
> already created.
>
> PS: Pages without any special chars (created before or after the
> patch) are recognized just fine.
Oh, and I forgot to mention: I have renamed in the db the pages
created before the patches, so they can still be opened. That works.
The trouble is the 'red' links.
2008-12-28 18:53:06 lfbm (dot) andamentos (at) gmail (dot) com Comment #6
> Are these pages that were created before the patch, or after?
I tested in both situations. New pages are also not identified as
already created.

PS: Pages without any special chars (created before or after the
patch) are recognized just fine.
2008-12-28 18:49:03 Chuck Hagenbuch Comment #5
State ⇒ Feedback
Are these pages that were created before the patch, or after?
2008-12-28 18:32:03 lfbm (dot) andamentos (at) gmail (dot) com Comment #4
> A bunch more convertCharset calls were also needed. Should be fixed now.
Seems to be working great! But the application is unable to identify a
page with special characters in the name as "already created", thus
making the link to it red (as if the page didn't exist), although you
can open the page normally.
2008-12-28 04:46:08 Chuck Hagenbuch Comment #3
Assigned to Chuck Hagenbuch
State ⇒ Resolved
A bunch more convertCharset calls were also needed. Should be fixed now.
2008-12-24 19:28:52 lfbm (dot) andamentos (at) gmail (dot) com Comment #1
Type ⇒ Bug
State ⇒ Unconfirmed
Priority ⇒ 1. Low
Summary ⇒ Charset trouble when saving page names in db backend
Queue ⇒ Wicked
Milestone ⇒
Patch ⇒ No
Horde 3.3.3-cvs

Wicked 1.0-cvs



When I create a page with special characters in the name, like
"Circunstância", the page is saved in my database as "Circunstância".

It would be better to get the names saved exactly as they were written
(as in Turba), to ease future migrations and manual queries.

If I change line 616 of wicked/lib/Driver/sql.php from '$pagename,' to:

'String::convertCharset($pagename, NLS::getCharset(), $this->getCharset()),'

then the page name is saved correctly in my db backend.

But then, when displaying the page name in the application, I get:

'Circunst�ncia'
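
This symptom is consistent with converting the name on the way into
the database but not on the way out. A minimal sketch, assuming a
UTF-8 application charset and an ISO-8859-1 driver charset (the ticket
states neither), with mb_convert_encoding() standing in for
String::convertCharset():

```php
<?php
// Assumed charsets; the actual configuration is not given in the ticket.
$appCharset = 'UTF-8';
$dbCharset  = 'ISO-8859-1';

$pagename = "Circunst\xC3\xA2ncia"; // "Circunstância" encoded as UTF-8

// Write path: convert from the application charset to the driver charset.
$stored = mb_convert_encoding($pagename, $dbCharset, $appCharset);

// Read path without conversion: the single byte 0xE2 is not valid UTF-8
// on its own, so a UTF-8 page shows a replacement character in its place.
$rawIsValidUtf8 = mb_check_encoding($stored, 'UTF-8');

// Read path with the reverse conversion restores the original name.
$readConverted = mb_convert_encoding($stored, $appCharset, $dbCharset);

var_dump($rawIsValidUtf8);
var_dump($readConverted === $pagename);
```

Under these assumptions, every query that reads page_name needs the
reverse conversion, not only the one that writes it.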



Thanks
