[#2072] Word paste document
Summary Word paste document
Queue Horde Base
Queue Version 3.0.1
Type Bug
State Not A Bug
Priority 2. Medium
Owners
Requester ralph (at) islandroots (dot) ca
Created 06/02/2005 (1075 days ago)
Due
Updated 03/24/2008 (49 days ago)
Assigned 12/19/2007 (145 days ago)
Resolved 01/16/2008 (117 days ago)
Attachments compose.js.patch
Milestone
Patch

History
03/24/2008 Jan Schneider Comment #20
Pasting Word documents into email messages isn't friendly either. I think the best solution is to teach users. There is no ideal way to handle this, at least until PHP or browsers provide a way to reliably detect charsets.
03/24/2008 Matt Selsky Comment #19
Then what character set are you supposed to use when ASCII doesn't cut it?  Can we ask the user to manually choose another character set if their message won't fit in ASCII, but they're using the default of ASCII?  Dropping the extra characters silently is not user-friendly.
03/24/2008 Jan Schneider Comment #18
UTF-8 is not a good default for email messages. Unlike in browsers, UTF-8 support in email clients is not that widespread yet, and the market is much more diverse too.
03/24/2008 Matt Selsky Comment #17
What if we default to UTF-8 and downgrade to ASCII if we pass something similar to Data_ldif's _is_safe_string() check?  The user always has the option to override this...
02/14/2008 Jan Schneider Comment #16
mbstring doesn't support all charsets that we do.
02/14/2008 Matt Selsky Comment #15
Something like:

$str = "שלום";

echo mb_detect_encoding($str, "ASCII, ISO-8859-15, ISO-8859-1, ISO-8859-2, KOI8-R, ISO-8859-7, ISO-8859-6, ISO-8859-8, ISO-2022-JP, BIG-5, EUC-KR, UTF-8") . "\n";
02/14/2008 Matt Selsky Comment #14
Maybe this should be opened as a new bug, but why does the user have to select a character set at all?  Fat mail clients (including Mail.app and alpine) seem to detect the character set automatically (with the option to override).

Don't we want that feature?
01/16/2008 Jan Schneider Comment #13
State ⇒ Not A Bug
I agree with Chuck.
01/06/2008 Chuck Hagenbuch Comment #12
This seems just annoying to me, especially since if there's a non-printing char that isn't visible, the user will just be confused.

I would be fine with running a smart-quotes-fixing function (we have the cleanascii text_filter routine already) if we're in ascii or iso-8859-1, but otherwise I think this goes back to being Not A Bug.
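
A smart-quotes-fixing pass of the kind described above could look roughly like the sketch below. It is not the cleanascii routine mentioned in the comment; the function name ascii_punctuation() is hypothetical, and the sketch assumes the compose text is UTF-8 at the point where the filter runs.

```php
<?php
// Hypothetical helper, not part of Horde: replace the typographic
// punctuation that Word inserts with plain ASCII equivalents.
// Assumes $utf8 is valid UTF-8.
function ascii_punctuation(string $utf8): string
{
    return strtr($utf8, [
        "\u{2018}" => "'",   // left single quotation mark
        "\u{2019}" => "'",   // right single quotation mark / apostrophe
        "\u{201C}" => '"',   // left double quotation mark
        "\u{201D}" => '"',   // right double quotation mark
        "\u{2013}" => '-',   // en dash
        "\u{2014}" => '-',   // em dash
        "\u{2026}" => '...', // horizontal ellipsis
    ]);
}

echo ascii_punctuation("\u{201C}It\u{2019}s fine\u{201D}"), "\n";
```

Characters that are not in the table pass through unchanged, so this only helps with the punctuation named in the original report.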
12/19/2007 Chuck Hagenbuch State ⇒ Feedback
 
12/19/2007 Matt Selsky Comment #11
New Attachment: compose.js.patch
Agreed.  But this seems useful though overly simplified...
12/17/2007 Jan Schneider Comment #10
We do this already iirc. But only having the choice between utf-8 and ascii is not what we want to do.
12/17/2007 Matt Selsky Comment #9
pine/alpine has a function, pith/send.c:set_charset_possibly_to_ascii(), which does exactly this.  If no non-ASCII characters are found, it converts the text to ASCII.
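
The downgrade described above can be sketched in PHP as follows. This is not a port of the alpine function; choose_sending_charset() is a hypothetical name, and the UTF-8 fallback is an assumption made for illustration.

```php
<?php
// Hypothetical helper, not part of Horde or alpine: label the message
// US-ASCII when every byte is 7-bit, otherwise keep the fallback charset.
function choose_sending_charset(string $body, string $fallback = 'UTF-8'): string
{
    // Without the /u modifier the pattern is matched byte by byte, so
    // preg_match() returns 1 as soon as one byte lies outside 0x00-0x7F.
    return preg_match('/[^\x00-\x7F]/', $body) === 1 ? $fallback : 'US-ASCII';
}

echo choose_sending_charset("Hello, world"), "\n";
echo choose_sending_charset("caf\u{E9}"), "\n";
```

This only decides between ASCII and a single fallback; it does not pick among several 8-bit charsets.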
12/17/2007 Jan Schneider Comment #8
> Would it be
> crazy to check for character set conversion problems and warn the
> user before sending?
This is not possible unfortunately.
> Or make UTF-8 the default and downgrade to
> ISO-8859-1 when UTF-8 isn't needed?
No, that's not possible either. You can't determine which charset you might need from a UTF-8 stream.
12/17/2007 Matt Selsky Comment #7
Thanks.  Changing the sending charset fixed the problem.  Would it be crazy to check for character set conversion problems and warn the user before sending?  Or make UTF-8 the default and downgrade to ISO-8859-1 when UTF-8 isn't needed?
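
One narrow form of the pre-send check asked about above is a round trip: convert the text to one known target charset and back, then compare with the original. The sketch below assumes the text is valid UTF-8, that the mbstring extension is loaded, and that mbstring supports the target charset. converts_cleanly() is a hypothetical name, not a Horde function.

```php
<?php
// Hypothetical helper: true when $utf8 survives a round trip through
// $target unchanged, i.e. no character had to be substituted.
function converts_cleanly(string $utf8, string $target): bool
{
    $encoded = mb_convert_encoding($utf8, $target, 'UTF-8');
    $decoded = mb_convert_encoding($encoded, 'UTF-8', $target);
    return $decoded === $utf8;
}

var_dump(converts_cleanly("caf\u{E9}", 'ISO-8859-1'));
var_dump(converts_cleanly("\u{201C}quoted\u{201D}", 'ISO-8859-1'));
```

This answers only whether one given charset can hold the text. It does not choose a charset, and it is limited to the charsets mbstring knows.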
12/17/2007 Jan Schneider Comment #6
Because you usually don't send messages in utf-8. If you choose utf-8 in the compose view, those characters will probably render fine for the recipient.
12/17/2007 Matt Selsky Comment #5
My source document is utf8.  It's a plain text file.  And the IMP UI is set to utf8.  Why is character set conversion happening?
12/17/2007 Jan Schneider Comment #4
This won't help. It's a charset conversion problem.
12/17/2007 Matt Selsky Comment #3
Can't we convert the illegal characters using something like this code:

http://shiflett.org/blog/2005/oct/convert-smart-quotes-with-php

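function convert_smart_quotes($string)
{
    // Byte values are Windows-1252 code points, so this only matches
    // text that is still in that single-byte encoding.
    $search = array(chr(145),   // left single quote
                    chr(146),   // right single quote
                    chr(147),   // left double quote
                    chr(148),   // right double quote
                    chr(151));  // em dash

    $replace = array("'",
                     "'",
                     '"',
                     '"',
                     '-');

    return str_replace($search, $replace, $string);
}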
06/02/2005 Chuck Hagenbuch Comment #2
State ⇒ Not A Bug
Word uses illegal characters. We can't really do much about it.
06/02/2005 ralph (at) islandroots (dot) ca Comment #1
State ⇒ Unconfirmed
Type ⇒ Bug
Priority ⇒ 2. Medium
Queue ⇒ Horde Base
Summary ⇒ Word paste document
When I cut and paste a Word text document into a new Horde email letter it looks fine. However, when it is sent and received, all quotes, apostrophes and dashes (" ' - ) are turned into question marks (?) in the text.
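
The reported symptom can be reproduced outside Horde with a short sketch. It assumes the pasted text is UTF-8, that the sending charset is ISO-8859-1, and that mbstring uses its default substitute character. simulate_send() is a hypothetical name and is not how IMP itself converts messages.

```php
<?php
// Hypothetical helper: convert compose text to the sending charset the
// way a lossy conversion would, substituting unrepresentable characters.
function simulate_send(string $utf8, string $charset): string
{
    return mb_convert_encoding($utf8, $charset, 'UTF-8');
}

// Word-style curly quotes, apostrophe and en dash (not in ISO-8859-1).
$pasted = "\u{201C}It\u{2019}s here\u{201D} \u{2013} done";
echo simulate_send($pasted, 'ISO-8859-1'), "\n";
```

The typographic characters have no ISO-8859-1 equivalent, so each one is replaced during conversion, while the plain ASCII text around them is left alone.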