| Summary | Word paste document |
| Queue | Horde Base |
| Queue Version | 3.0.1 |
| Type | Bug |
| State | Not A Bug |
| Priority | 2. Medium |
| Owners | |
| Requester | ralph (at) islandroots (dot) ca |
| Created | 06/02/2005 (1075 days ago) |
| Due | |
| Updated | 03/24/2008 (49 days ago) |
| Assigned | 12/19/2007 (145 days ago) |
| Resolved | 01/16/2008 (117 days ago) |
| Attachments | compose.js.patch |
| Milestone | |
| Patch | |
Pasting Word documents into email messages isn't friendly either. I think the best solution is to teach users. There is no golden way to handle this, at least until PHP or browsers provide a better way to reliably detect charsets.
Then what character set are you supposed to use when ASCII doesn't cut it? Can we ask the user to manually choose another character set if their message won't fit in ASCII, but they're using the default of ASCII? Dropping the extra characters silently is not user-friendly.
UTF-8 is not a good default for email messages. Unlike in browsers, UTF-8 support in email clients is not that widespread yet, and the market is much more diverse too.
What if we default to UTF-8 and downgrade to ASCII if we pass something similar to Data_ldif's _is_safe_string() check? The user always has the option to override this...
mbstring doesn't support all charsets that we do.
Something like:
$str = "שלום";
echo mb_detect_encoding($str, "ASCII, ISO-8859-15, ISO-8859-1, ISO-8859-2, KOI8-R, ISO-8859-7, ISO-8859-6, ISO-8859-8, ISO-2022-JP, BIG-5, EUC-KR, UTF-8") . "\n";
Maybe this should be opened as a new bug, but why does the user have to select a character set at all? Fat mail clients (including Mail.app and alpine) seem to detect the character set automatically (with the option to override).
Don't we want that feature?
State ⇒ Not A Bug
I agree with Chuck.
This just seems annoying to me, especially since if there's a non-printing char that isn't visible, the user will just be confused.
I would be fine with running a smart-quotes-fixing function (we have the cleanascii text_filter routine already) if we're in ascii or iso-8859-1, but otherwise I think this goes back to being Not A Bug.
New Attachment: compose.js.patch
Agreed. But this seems useful though overly simplified...
We do this already iirc. But only having the choice between utf-8 and ascii is not what we want to do.
pine/alpine has a function pith/send.c:set_charset_possibly_to_ascii(), which does exactly this. If no non-ASCII characters are found, then convert text to ASCII.
> Would it be crazy to check for character set conversion problems and warn
> the user before sending?
This is not possible unfortunately.
> Or make UTF-8 the default and downgrade to
> ISO-8859-1 when UTF-8 isn't needed?
No, that's not possible either. You can't determine which charset you might need from a utf-8 stream.
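For reference, the ASCII downgrade that pine/alpine is described as doing can be sketched roughly as below. This is only an illustration of the idea, not pine's or Horde's actual code: is_ascii_only() and pick_sending_charset() are hypothetical names, and the sketch only covers the ASCII case, since it says nothing about which 8-bit charset would fit otherwise.

```php
<?php
// Hypothetical helper (not part of Horde or pine): report whether a
// message body consists solely of 7-bit ASCII bytes.
function is_ascii_only($text)
{
    // Any byte outside 0x00-0x7F means the text needs a wider charset.
    return !preg_match('/[^\x00-\x7F]/', $text);
}

// Hypothetical helper: label the message US-ASCII when that is lossless,
// otherwise keep the charset the user selected in the compose view.
function pick_sending_charset($text, $selected)
{
    return is_ascii_only($text) ? 'US-ASCII' : $selected;
}
```

With this, a body containing only plain ASCII would go out as US-ASCII, while anything else keeps the user's choice.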
Thanks. Changing the sending charset fixed the problem. Would it be crazy to check for character set conversion problems and warn the user before sending? Or make UTF-8 the default and downgrade to ISO-8859-1 when UTF-8 isn't needed?
Because you usually don't send messages in utf-8. If you choose utf-8 in the compose view, those characters are probably rendered fine at the recipient.
My source document is utf8. It's a plain text file. And the IMP UI is set to utf8. Why is character set conversion happening?
This won't help. It's a charset conversion problem.
Can't we convert the illegal characters using something like this code:
http://shiflett.org/blog/2005/oct/convert-smart-quotes-with-php
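function convert_smart_quotes($string)
{
    // Windows-1252 bytes for the curly single quotes, the curly double
    // quotes, and the long dash.
    $search = array(chr(145),
                    chr(146),
                    chr(147),
                    chr(148),
                    chr(151));
    // Plain ASCII equivalents, in the same order.
    $replace = array("'",
                     "'",
                     '"',
                     '"',
                     '-');
    return str_replace($search, $replace, $string);
}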
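Note that the function above matches single Windows-1252 bytes. If the pasted text arrives as UTF-8, as described elsewhere in this ticket, the same characters are three-byte sequences, so those single bytes would not match them and could even hit continuation bytes of other characters. A rough UTF-8 variant might look like the following. The name convert_smart_quotes_utf8() is made up for this sketch, and adding the en dash is an assumption beyond the original list.

```php
<?php
// Hypothetical UTF-8 variant of convert_smart_quotes(); not Horde code.
function convert_smart_quotes_utf8($string)
{
    // Keys are the UTF-8 byte sequences of the typographic characters.
    $map = array(
        "\xE2\x80\x98" => "'",  // U+2018 left single quotation mark
        "\xE2\x80\x99" => "'",  // U+2019 right single quotation mark
        "\xE2\x80\x9C" => '"',  // U+201C left double quotation mark
        "\xE2\x80\x9D" => '"',  // U+201D right double quotation mark
        "\xE2\x80\x93" => '-',  // U+2013 en dash (assumed addition)
        "\xE2\x80\x94" => '-',  // U+2014 em dash
    );
    // strtr() with an array replaces whole sequences in a single pass.
    return strtr($string, $map);
}
```

Whether this should run at all would still depend on the sending charset, since the replacement is lossy.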
State ⇒ Not A Bug
Word uses illegal characters. We can't really do much about it.
State ⇒ Unconfirmed
Type ⇒ Bug
Priority ⇒ 2. Medium
Queue ⇒ Horde Base
Summary ⇒ Word paste document
When I cut and paste a Word text document into a new Horde email letter it looks fine. However, when it is sent and received, all quotes, apostrophes and dashes (" ' - ) are turned into question marks (?) in the text.