Summary | Horde_String::validUtf8 fails to validate valid UTF8 |
Queue | Horde Framework Packages |
Queue Version | Git master |
Type | Bug |
State | Resolved |
Priority | 1. Low |
Owners | slusarz (at) horde (dot) org |
Requester | samuel (at) sheepflock (dot) de |
Created | 01/02/2013 (4556 days ago) |
Due | |
Updated | 01/09/2013 (4549 days ago) |
Assigned | 01/04/2013 (4554 days ago) |
Resolved | 01/09/2013 (4549 days ago) |
Github Issue Link | |
Github Pull Request | |
Milestone | |
Patch | No |
strips all non 7 bit characters
This works in English but strips the whole text in other
languages. Maybe a less heavy-handed approach is possible: stripping
only the offending characters and replacing them with equal byte symbols
by calling this logic.
Stripping to 7 bit characters is a last-ditch effort when we can't
determine what the encoding is. If we don't know what encoding the text
is in, how are we to know what the equal byte symbols are?
The reason we have to strip the non 7 bit characters is that if we
send invalid UTF-8 data over wbxml, it can completely break the sync
and even crash clients like iOS. Again, we only do this as a last-ditch
effort when the incoming email contains improper character
encoding information.
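The fallback described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Horde's actual code: the function name stripTo7Bit is hypothetical, and preg_match() with the /u modifier stands in for whatever validity check the real code uses.

```php
<?php
/**
 * Hypothetical helper (not part of Horde): keep text that is already
 * well-formed UTF-8, otherwise drop every byte with the high bit set.
 */
function stripTo7Bit($text)
{
    $text = strval($text);
    // With the /u modifier PCRE rejects malformed UTF-8 subjects, so the
    // empty pattern returns 1 only if $text is well-formed UTF-8.
    if (preg_match('//u', $text) === 1) {
        return $text;
    }
    // Without /u the pattern is applied byte by byte, so this removes
    // every byte in the range 0x80-0xFF and keeps plain ASCII.
    return preg_replace('/[\x80-\xFF]/', '', $text);
}
```

Fed the ISO-8859-1 bytes of "Grüßen" ("Gr\xFC\xDFen"), this sketch returns "Gren", which illustrates why the approach costs so much text in languages other than English.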
strips all non 7 bit characters
This works in English but strips the whole text in other
languages. Maybe a less heavy-handed approach is possible: stripping only
the offending characters and replacing them with equal byte symbols by
calling this logic.
http://www.unicode.org/versions/Unicode6.0.0/ch03.pdf
An implementation of the table that I crafted is below.
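static public function validUtf8($text)
{
    $text = strval($text);
    $len = strlen($text);
    for ($i = 0; $i < $len; $i++) {
        $c = ord($text[$i]);
        if ($c < 128) {
            continue; // plain ASCII
        }
        // Default range of the second byte; narrowed below for the lead
        // bytes that Table 3-7 of the Unicode standard treats specially.
        $min = 128;
        $max = 191;
        if ($c > 244) {
            return false; // lead bytes F5..FF are never valid
        } elseif ($c > 239) {
            $bytes = 4;
            if ($c == 240) { $min = 144; } // F0 must be followed by 90..BF
            if ($c == 244) { $max = 143; } // F4 must be followed by 80..8F
        } elseif ($c > 223) {
            $bytes = 3;
            if ($c == 224) { $min = 160; } // E0 must be followed by A0..BF
            if ($c == 237) { $max = 159; } // ED must be followed by 80..9F
        } elseif ($c > 193) {
            $bytes = 2;
        } else {
            return false; // stray continuation byte, or overlong lead C0/C1
        }
        // Check the length before reading any following byte.
        if (($i + $bytes) > $len) {
            return false;
        }
        $c1 = ord($text[$i + 1]);
        if (($c1 < $min) || ($c1 > $max)) {
            return false;
        }
        while ($bytes > 1) {
            $i++;
            $c = ord($text[$i]);
            if (($c < 128) || ($c > 191)) {
                return false;
            }
            $bytes--;
        }
    }
    return true;
}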
Warning: I am not a programmer.
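For comparison, the same table of well-formed byte sequences (Table 3-7 in the chapter linked above) can be written as a byte-level regular expression. This is a sketch, not the code used in Horde_String; the function name isWellFormedUtf8 is made up for the example, and the possessive quantifier is an assumption made to limit backtracking on long input.

```php
<?php
/**
 * Hypothetical helper: true if $text consists only of well-formed
 * UTF-8 byte sequences. The pattern runs without the /u modifier, so
 * every \xNN escape refers to a single byte.
 */
function isWellFormedUtf8($text)
{
    $pattern = '/\A(?:
          [\x00-\x7F]                        # ASCII
        | [\xC2-\xDF][\x80-\xBF]             # 2 bytes
        | \xE0[\xA0-\xBF][\x80-\xBF]         # 3 bytes, no overlongs
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # 3 bytes
        | \xED[\x80-\x9F][\x80-\xBF]         # 3 bytes, no surrogates
        | \xF0[\x90-\xBF][\x80-\xBF]{2}      # 4 bytes, no overlongs
        | [\xF1-\xF3][\x80-\xBF]{3}          # 4 bytes
        | \xF4[\x80-\x8F][\x80-\xBF]{2}      # 4 bytes, up to U+10FFFF
    )*+\z/x';
    return preg_match($pattern, strval($text)) === 1;
}
```

Each alternative corresponds to one row of the table, which makes the special cases for the lead bytes E0, ED, F0 and F4 easy to compare against the loop-based version.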
Assigned to Michael Slusarz
State ⇒ Resolved
what fixes this issue. Reverting that commit causes the test to fail
for me and reapplying the commit causes it to pass.
file from git and see if it fixes the issue. The released package has
an older one. Not sure why, but the git version works without issues for
me, and the released version doesn't. Probably a bug somewhere in the
validUtf8 logic.
might be at issue. For obvious reasons, this needs to be 0. If this
is 1, all sorts of things are going to be broken in Horde.
I am not on git, only horde stable packages.
New Attachment: boolean_result.png
Grüßen';
var_dump(Horde_String::validUtf8($test));
=> boolean false
--------------------------------------%<--------------------------------------
$test = 'ö ä
Grü ßen';
var_dump(Horde_String::validUtf8($test));
=> boolean true
--------------------------------------%<--------------------------------------
$test = 'öä
Grü ßen';
var_dump(Horde_String::validUtf8($test));
=> boolean false
root@wds:/usr/share/php/tests/Horde_Util/Horde/Util# phpunit StringTest.php
PHPUnit 3.7.10 by Sebastian Bergmann.
Configuration read from /usr/share/php/tests/Horde_Util/Horde/Util/phpunit.xml
.S.S..S.........
Time: 0 seconds, Memory: 3.50Mb
OK, but incomplete or skipped tests!
Tests: 16, Assertions: 90, Skipped: 3.
root@wds:/usr/share/php/tests/Horde_Util/Horde/Util#
your PHP include_path.
I tried this on my installation too:
$test = 'ö ä ü ß
Mit freundlichen Grüßen';
var_dump(Horde_String::validUtf8($test));
If I change "Grüßen" to "Grü ßen", the result is true.
Then I changed "ö ä" to "öä" and it fails again.
Are two UTF-8 characters not allowed directly after each other?
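On the question of adjacent characters: nothing in UTF-8 forbids one multi-byte character from directly following another. One way to see what is actually being tested is to dump the bytes of the string. The sketch below assumes the input is stored as UTF-8 and uses PCRE's /u modifier as an independent check; the helper name describeUtf8 is made up for this example.

```php
<?php
/**
 * Hypothetical helper: hex dump of the raw bytes plus an independent
 * well-formedness check that does not involve Horde_String.
 */
function describeUtf8($text)
{
    return array(
        'hex'   => bin2hex($text),
        // preg_match() with /u returns 1 only for well-formed UTF-8.
        'valid' => preg_match('//u', $text) === 1,
    );
}

// "öä" written as explicit UTF-8 bytes, so the result does not depend
// on the encoding of this source file or of the terminal.
$info = describeUtf8("\xC3\xB6\xC3\xA4");
echo $info['hex'], ' ', var_export($info['valid'], true), "\n";
// prints "c3b6c3a4 true"
```

If the hex dump of a string pasted into the PHP shell differs from c3b6c3a4, the input did not arrive as UTF-8 in the first place, which would be a separate problem from the validator.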
false as well but different format:
boolean false
I am not sure phpunit works in my setup; here is the result:
root@wds:/usr/share/php/tests/Horde_Util/Horde/Util# phpunit StringTest.php
PHP Warning: require_once(Horde/Test/Bootstrap.php): failed to open
stream: No such file or directory in
/usr/share/php/tests/Horde_Util/Horde/Util/bootstrap.php on line 2
PHP Stack trace:
PHP 1. {main}() /usr/bin/phpunit:0
PHP 2. PHPUnit_TextUI_Command::main() /usr/bin/phpunit:46
PHP 3. PHPUnit_TextUI_Command->run()
/usr/share/php/PHPUnit/TextUI/Command.php:130
PHP 4. PHPUnit_TextUI_Command->handleArguments()
/usr/share/php/PHPUnit/TextUI/Command.php:139
PHP 5. PHPUnit_TextUI_Command->handleBootstrap()
/usr/share/php/PHPUnit/TextUI/Command.php:620
PHP 6. PHPUnit_Util_Fileloader::checkAndLoad()
/usr/share/php/PHPUnit/TextUI/Command.php:867
PHP 7. PHPUnit_Util_Fileloader::load()
/usr/share/php/PHPUnit/Util/Fileloader.php:79
PHP 8. include_once() /usr/share/php/PHPUnit/Util/Fileloader.php:95
PHP Fatal error: require_once(): Failed opening required
'Horde/Test/Bootstrap.php'
(include_path='.:/usr/share/php:/usr/share/pear') in
/usr/share/php/tests/Horde_Util/Horde/Util/bootstrap.php on line 2
PHP Stack trace:
PHP 1. {main}() /usr/bin/phpunit:0
PHP 2. PHPUnit_TextUI_Command::main() /usr/bin/phpunit:46
PHP 3. PHPUnit_TextUI_Command->run()
/usr/share/php/PHPUnit/TextUI/Command.php:130
PHP 4. PHPUnit_TextUI_Command->handleArguments()
/usr/share/php/PHPUnit/TextUI/Command.php:139
PHP 5. PHPUnit_TextUI_Command->handleBootstrap()
/usr/share/php/PHPUnit/TextUI/Command.php:620
PHP 6. PHPUnit_Util_Fileloader::checkAndLoad()
/usr/share/php/PHPUnit/TextUI/Command.php:867
PHP 7. PHPUnit_Util_Fileloader::load()
/usr/share/php/PHPUnit/Util/Fileloader.php:79
PHP 8. include_once() /usr/share/php/PHPUnit/Util/Fileloader.php:95
root@wds:/usr/share/php/tests/Horde_Util/Horde/Util#
http://wiki.horde.org/Doc/Dev/TestH5#toc8
You also need to make sure you are running a somewhat recent version
of PHP. Versions of PHP distributed via a package (e.g. Debian) are
not acceptable.
upgrade-all ok: channel://pear.horde.org/Horde_Mime-2.0.2
upgrade-all ok: channel://pear.horde.org/Horde_Imap_Client-2.4.2
upgrade-all ok: channel://pear.horde.org/Horde_Core-2.1.4
upgrade-all ok: channel://pear.horde.org/Horde_ActiveSync-2.1.1
--> bool(false)
New Attachment: Screen Shot 2013-01-05 at 12.33.52 PM.png
also show fail.
Even quoting (single and double) the string, it still fails.
Kinglok, Fong
State ⇒ Feedback
validates correctly:
slusarz@bigworm % phpunit StringTest.php
PHPUnit 3.7.10 by Sebastian Bergmann.
Configuration read from
/disk2/src/horde/framework/Util/test/Horde/Util/phpunit.xml
.S.S..S.........
Time: 0 seconds, Memory: 5.50Mb
OK, but incomplete or skipped tests!
Tests: 16, Assertions: 91, Skipped: 3.
commit c29e62131c9582c59890a8031f909be7d8e4ccbb
Author: Michael M Slusarz <slusarz@horde.org>
Date: Fri Jan 4 14:09:43 2013 -0700
Validation test for Bug #11930
framework/Util/test/Horde/Util/StringTest.php | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
http://git.horde.org/horde-git/-/commit/c29e62131c9582c59890a8031f909be7d8e4ccbb
Priority ⇒ 1. Low
State ⇒ Unconfirmed
New Attachment: horde_php_shell_utf8.png
Patch ⇒ No
Milestone ⇒
Queue ⇒ Horde Framework Packages
Summary ⇒ Horde_String::validUtf8 fails to validate valid UTF8
Type ⇒ Bug
$test = 'ö ä ü ß
Mit freundlichen Grüßen';
var_dump(Horde_String::validUtf8($test));
Result:
bool(false)
Debian Squeeze Server with php5 5.3.3-7+squeeze14
INSTALLED PACKAGES, CHANNEL PEAR.HORDE.ORG:
===========================================
PACKAGE VERSION STATE
Horde_ActiveSync 2.0.14 stable
Horde_Alarm 2.0.2 stable
Horde_Argv 2.0.2 stable
Horde_Auth 2.0.1 stable
Horde_Autoloader 2.0.1 stable
Horde_Browser 2.0.2 stable
Horde_Cache 2.0.1 stable
Horde_Cli 2.0.1 stable
Horde_Compress 2.0.1 stable
Horde_Constraint 2.0.1 stable
Horde_Controller 2.0.1 stable
Horde_Core 2.1.3 stable
Horde_Crypt 2.1.0 stable
Horde_Crypt_Blowfish 1.0.1 stable
Horde_Data 2.0.1 stable
Horde_Date 2.0.1 stable
Horde_Date_Parser 2.0.1 stable
Horde_Db 2.0.1 stable
Horde_Editor 2.0.1 stable
Horde_ElasticSearch 1.0.1 stable
Horde_Exception 2.0.1 stable
Horde_Feed 2.0.1 stable
Horde_Form 2.0.1 stable
Horde_Group 2.0.1 stable
Horde_History 2.0.1 stable
Horde_Http 2.0.1 stable
Horde_Icalendar 2.0.1 stable
Horde_Image 2.0.1 stable
Horde_Imap_Client 2.4.1 stable
Horde_Imsp 2.0.1 stable
Horde_Injector 2.0.1 stable
Horde_Itip 2.0.1 stable
Horde_Kolab_Format 2.0.1 stable
Horde_Kolab_Server 2.0.1 stable
Horde_Kolab_Session 2.0.1 stable
Horde_Kolab_Storage 2.0.2 stable
Horde_ListHeaders 1.0.1 stable
Horde_Lock 2.0.1 stable
Horde_Log 2.0.1 stable
Horde_LoginTasks 2.0.1 stable
Horde_Mail 2.0.3 stable
Horde_Memcache 2.0.1 stable
Horde_Mime 2.0.1 stable
Horde_Mime_Viewer 2.0.1 stable
Horde_Nls 2.0.1 stable
Horde_Notification 2.0.1 stable
Horde_Oauth 2.0.1 stable
Horde_Perms 2.0.1 stable
Horde_Prefs 2.0.1 stable
Horde_Rdo 2.0.1 stable
Horde_Role 1.0.1 stable
Horde_Routes 2.0.1 stable
Horde_Rpc 2.0.2 stable
Horde_Scribe 2.0.1 stable
Horde_Secret 2.0.2 stable
Horde_Serialize 2.0.1 stable
Horde_Service_Facebook 2.0.1 stable
Horde_Service_Twitter 2.0.1 stable
Horde_Service_Weather 2.0.1 stable
Horde_SessionHandler 2.0.1 stable
Horde_Share 2.0.1 stable
Horde_SpellChecker 2.0.1 stable
Horde_Stream 1.2.0 stable
Horde_Stream_Filter 2.0.1 stable
Horde_Stream_Wrapper 2.0.1 stable
Horde_Support 2.0.2 stable
Horde_SyncMl 2.0.1 stable
Horde_Template 2.0.1 stable
Horde_Text_Diff 2.0.1 stable
Horde_Text_Filter 2.0.3 stable
Horde_Text_Filter_Csstidy 2.0.1 stable
Horde_Text_Flowed 2.0.1 stable
Horde_Thrift 2.0.1 stable
Horde_Timezone 1.0.1 stable
Horde_Token 2.0.1 stable
Horde_Translation 2.0.1 stable
Horde_Tree 2.0.1 stable
Horde_Url 2.0.1 stable
Horde_Util 2.0.2 stable
Horde_Vfs 2.0.3 stable
Horde_View 2.0.1 stable
Horde_Xml_Element 2.0.1 stable
Horde_Xml_Wbxml 2.0.1 stable
content 2.0.1 stable
horde 5.0.2 stable
imp 6.0.2 stable
ingo 3.0.1 stable
kronolith 4.0.2 stable
mnemo 4.0.1 stable
nag 4.0.1 stable
trean 1.0.0beta2 beta
turba 4.0.1 stable