Summary | db_migrate and incorrect charset handling |
Queue | Horde Framework Packages |
Queue Version | Git master |
Type | Bug |
State | Resolved |
Priority | 1. Low |
Owners | jan (at) horde (dot) org, mrubinsk (at) horde (dot) org |
Requester | leena.heino (at) uta (dot) fi |
Created | 03/02/2011 (5233 days ago) |
Due | |
Updated | 04/06/2011 (5198 days ago) |
Assigned | 04/04/2011 (5200 days ago) |
Resolved | 04/06/2011 (5198 days ago) |
Github Issue Link | |
Github Pull Request | |
Milestone | |
Patch | No |
State ⇒ Resolved
impossible due to SQLite issues, but nothing we can do about that.
real world anyway, so while it's a pain, I'm not sure how much of a
problem it is...
database to do case-insensitive searches, it's completely insane that
it doesn't work by default with SQLite. I'm surprised it didn't break
anything else yet. This makes SQLite pretty useless for any real-world
usage of Horde.
Fix case-insensitive filtering of duplicate tags (Bug #9617).
This simplifies the _checkTags() method a lot too. Unfortunately it
doesn't work at all with SQLite, so unit tests are rather useless.
3 files changed, 19 insertions(+), 26 deletions(-)
http://git.horde.org/horde-git/-/commit/a90c671771adbbb1aa08576a2b9d13e011ca6790
Add failing test for bug #9617.
1 files changed, 5 insertions(+), 4 deletions(-)
http://git.horde.org/horde-git/-/commit/55802691eafbb6931b93e8c96ee8d2d4fd5b441b
This makes unit testing this stuff a PITA.
WHERE tag_name IN (...)" won't work in this case, because it is case
sensitive.
The correct solution would be to delegate the lowercasing to the
database, but at least for SQLite this doesn't seem to work. "SELECT
LOWER('TYÖ')" returns "tyÖ" there. It works fine in MySQL though.
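The SQLite behaviour described above can be reproduced outside of Horde. The following is a minimal sketch in Python, not the fix that was committed: it compares SQLite's built-in LOWER() with a user-defined SQL function that delegates to Python's Unicode-aware lowercasing. The function name `py_lower` is an assumption made for this illustration. The built-in result is only printed, because it depends on how SQLite was built (with the ICU extension it may handle non-ASCII characters).

```python
import sqlite3


def unicode_lower(value):
    """Lowercase using Python's Unicode-aware str.lower(); pass NULL through."""
    return None if value is None else value.lower()


conn = sqlite3.connect(":memory:")

# "py_lower" is a hypothetical name chosen for this sketch.
conn.create_function("py_lower", 1, unicode_lower)

# Built-in LOWER(): in a stock SQLite build only ASCII letters are folded.
builtin = conn.execute("SELECT LOWER('TYÖ')").fetchone()[0]

# User-defined function: folds non-ASCII letters as well.
custom = conn.execute("SELECT py_lower('TYÖ')").fetchone()[0]

print("built-in LOWER():", builtin)
print("py_lower():      ", custom)
```

A comparable approach in a PHP test setup would be to register a custom SQL function on the SQLite connection, which would only help the unit tests and not real SQLite deployments.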
tags TYÖ and työ are considered equal, right?
State ⇒ Feedback
TaggerTest.php that demonstrates the broken behavior?
State ⇒ Assigned
Assigned to Jan Schneider
bytes, therefore it will not work correctly with UTF-8 encoded
strings that contain non-ASCII characters.
strtolower()/strtoupper() work correctly with
multibyte charset like utf-8.
ambiguously: "Note that 'alphabetic' is determined by the current
locale"
But if we look at PHP's source code for strtoupper(), it works byte by
byte, therefore it will not work correctly with UTF-8 encoded strings
that contain non-ASCII characters.
Excerpt from ext/standard/string.c:
char *php_strtoupper(char *s, size_t len)
{
    unsigned char *c, *e;

    c = (unsigned char *)s;
    e = (unsigned char *)c+len;

    /* Each byte is passed to toupper() on its own. */
    while (c < e) {
        *c = toupper(*c);
        c++;
    }
    return s;
}
The non-ASCII characters in UTF-8 are multibyte. Therefore using
PHP's strtoupper()/strtolower() will not work correctly with UTF-8
encoded strings that contain non-ASCII characters.
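The effect of byte-wise case conversion can be illustrated with a short sketch. This is Python rather than PHP and only approximates php_strtoupper(): it uppercases the UTF-8 bytes with an ASCII-only conversion, which is the behaviour toupper() has in the C locale, and compares the result with a Unicode-aware conversion. The helper name `bytewise_upper` is an assumption made for this example.

```python
def bytewise_upper(text):
    """Uppercase the UTF-8 bytes one at a time, touching only ASCII letters.

    bytes.upper() converts only ASCII lowercase letters, so the two bytes
    that encode a character such as "ö" are left unchanged.
    """
    raw = text.encode("utf-8")
    return raw.upper().decode("utf-8")


tag = "työ"
print(bytewise_upper(tag))  # ASCII letters change, "ö" stays lowercase
print(tag.upper())          # Unicode-aware: every letter is uppercased
```

In a single-byte locale the byte-wise approach can do worse than leave characters untouched, because individual bytes of a multibyte sequence may themselves be mapped to other values.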
strtolower()/strtoupper() work correctly with multibyte charset like
utf-8.
should not assume that strtolower()/strtoupper() work correctly with
multibyte charsets like UTF-8.
Should the code use mb_strtoupper()/mb_strtolower() or Horde_String
instead of strtolower()/strtoupper()?
Bug #9617: Fix property name.
1 files changed, 2 insertions(+), 2 deletions(-)
http://git.horde.org/horde-git/-/commit/7d484c517ddde6a6818845ea1b33be3f20c36c89
Fix charset handling in tagger
Bug: 9617
3 files changed, 26 insertions(+), 8 deletions(-)
http://git.horde.org/horde-git/-/commit/5ac680c13a93b32df97fc5bb3c29cb3c4b8e4cbb
Summary ⇒ db_migrate and incorrect charset handling
However, there was the missing conversion from the database to utf-8
in the migration script that probably made the extra conversion in
your patch necessary. This has been fixed.
DEBUG: SQL SELECT user_id, user_name FROM `rampage_users` WHERE
user_name IN ('ntllt')
DEBUG: SQL QUERY FAILED: Duplicate entry 'ntllt' for key
rampage_users_user_name' INSERT INTO `rampage_users` (user_name)
VALUES ('ntllt')
content. Working on it, and updated the title of the ticket to reflect
the actual problem.
Need to convert to utf-8 when reading the category before tagging.
Bug: 9617
1 files changed, 2 insertions(+), 2 deletions(-)
http://git.horde.org/horde-git/-/commit/3baf0b99425470dfdd77e02de1da4f32bf4851ff
New Attachment: Tagger.php.patch
Need to convert from database's charset before comparing
Bug: 9617
1 files changed, 1 insertions(+), 1 deletions(-)
http://git.horde.org/horde-git/-/commit/0f36ff69b64c96e3ab91d6f479fa34cb15a451a9
Another bug has appeared. Output from debug log:
DEBUG: SQL SELECT user_id, user_name FROM `rampage_users` WHERE
user_name IN ('ntllt')
DEBUG: SQL QUERY FAILED: Duplicate entry 'ntllt' for key
rampage_users_user_name' INSERT INTO `rampage_users` (user_name)
VALUES ('ntllt')
Assigned to Michael Rubinsky
State ⇒ Feedback
Use Horde_String::lower
Possibly fixes
Bug: 9617
1 files changed, 1 insertions(+), 1 deletions(-)
http://git.horde.org/horde-git/-/commit/ae5714cbb917c43e184f942db0e1b5f3197f679f
Priority ⇒ 1. Low
State ⇒ Unconfirmed
Patch ⇒ No
Milestone ⇒
Summary ⇒ db_migrate and duplicate tags in rampage
Type ⇒ Bug
Queue ⇒ Horde Framework Packages
Either MySQL is case-insensitive here or the migration script is at
fault, but it seems that you cannot add tags to rampage_tags if the
tags differ only by case.
E.g. these tags are considered the same:
TYÖ
työ
If those tags exist in the old data, then db_migrate will fail with the error:
QUERY FAILED: Duplicate entry 'työ' for key 'rampage_tags_tag_name'
INSERT INTO `rampage_tags` (tag_name) VALUES ('työ')
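One way to avoid this kind of duplicate-key failure is to collapse tags that differ only by case before inserting them. The sketch below is in Python and is not the code used by db_migrate or the tagger; the function name `dedupe_tags` is hypothetical. It also assumes that Unicode case folding matches what the database's collation treats as equal, which may not hold for every collation.

```python
def dedupe_tags(tags):
    """Return tags with case-insensitive duplicates removed.

    The first spelling encountered is kept; later spellings that fold to
    the same key are dropped.
    """
    seen = set()
    unique = []
    for tag in tags:
        key = tag.casefold()  # Unicode-aware, so "TYÖ" and "työ" collide
        if key not in seen:
            seen.add(key)
            unique.append(tag)
    return unique


print(dedupe_tags(["TYÖ", "työ", "Work"]))
```

Keeping the first spelling is an arbitrary choice for this sketch; a migration could equally normalise every tag to lowercase.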