Tickets :: [#14618] Attachments with special file names (RFC 2231)

Summary	Attachments with special file names (RFC 2231)
Queue	IMP
Queue Version	6.2.18
Type	Bug
State	Resolved
Priority	2. Medium
Owners	jan (at) horde (dot) org
Requester	wahnes (at) uni-koeln (dot) de
Created	04/18/2017 (2999 days ago)
Due
Updated	10/20/2017 (2814 days ago)
Assigned	04/27/2017 (2990 days ago)
Resolved	04/28/2017 (2989 days ago)
Github Issue Link
Github Pull Request
Milestone
Patch	No

10/20/2017 08:33:46 PM	Git Commit	Comment #10
Changes have been made in Git (FRAMEWORK_5_2):
commit db0c29770f694135bd118803bafa32fa50e6a107
Author: Jan Schneider <jan@horde.org>
Date: Fri, 28 Apr 2017 18:00:17 +0200
[jan] Fix filename charset of certain attachments (Bug #14618).
 M docs/CHANGES
 M package.xml
https://github.com/horde/imp/commit/db0c29770f694135bd118803bafa32fa50e6a107

10/20/2017 08:33:46 PM	Git Commit	Comment #9
Changes have been made in Git (FRAMEWORK_5_2):
commit e82bd3c0818f25beb0cfd86bc1735fe4c39086d8
Author: Jan Schneider <jan@horde.org>
Date: Fri, 28 Apr 2017 17:59:38 +0200
The header charset for attachments is always UTF-8 (Bug #14618).
 M lib/Compose.php
https://github.com/horde/imp/commit/e82bd3c0818f25beb0cfd86bc1735fe4c39086d8

05/03/2017 04:32:51 PM	wahnes (at) uni-koeln (dot) de	Comment #8
Many thanks for this bugfix. After sending test e-mails with IMP 6.2.19, it turns out that few receiving mail programs actually profit from this fix, mutt being one of the notable exceptions. Others, like Thunderbird and Open-Xchange, already accepted "us-ascii" in lieu of "utf-8", and Outlook does not recognize filename encoding according to RFC 2231 at all. The same sad thing goes for GMX's web interface: no RFC 2231 support there either.

05/03/2017 09:42:10 AM	Git Commit	Comment #7
Changes have been made in Git (master):
commit 91590bd39d92d614442afb56c48bb5c17ca1b8cb
Author: Jan Schneider <jan@horde.org>
Date: Fri Apr 28 18:00:17 2017 +0200
[jan] Fix filename charset of certain attachments (Bug #14618).
 imp/package.xml | 1 +
 1 file changed, 1 insertion(+)
http://github.com/horde/horde/commit/91590bd39d92d614442afb56c48bb5c17ca1b8cb

04/28/2017 04:03:26 PM	Jan Schneider	State ⇒ Resolved Assigned to Jan Schneider

04/28/2017 04:00:29 PM	Git Commit	Comment #6
Changes have been made in Git (FRAMEWORK_5_2):
commit 65e2461a1f7fcc5a29080f37f90d84e0431bf0fa
Author: Jan Schneider <jan@horde.org>
Date: Fri Apr 28 18:00:17 2017 +0200
[jan] Fix filename charset of certain attachments (Bug #14618).
 imp/docs/CHANGES | 1 +
 imp/package.xml | 2 ++
 2 files changed, 3 insertions(+)
http://github.com/horde/horde/commit/65e2461a1f7fcc5a29080f37f90d84e0431bf0fa

04/28/2017 04:00:28 PM	Git Commit	Comment #5
Changes have been made in Git (FRAMEWORK_5_2):
commit d121cc674c814ccc5d820c5e3d9e86d7028bfba1
Author: Jan Schneider <jan@horde.org>
Date: Fri Apr 28 17:57:22 2017 +0200
The header charset for attachments is always UTF-8 (Bug #14618).
 imp/lib/Compose.php | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
http://github.com/horde/horde/commit/d121cc674c814ccc5d820c5e3d9e86d7028bfba1

04/28/2017 03:57:49 PM	Git Commit	Comment #4
Changes have been made in Git (master):
commit 8a852241683e302f740b40c718d0140d8bb00ab5
Author: Jan Schneider <jan@horde.org>
Date: Fri Apr 28 17:57:22 2017 +0200
The header charset for attachments is always UTF-8 (Bug #14618).
 imp/lib/Compose.php | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
http://github.com/horde/horde/commit/8a852241683e302f740b40c718d0140d8bb00ab5
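
The idea stated in this commit message, that the filename parameter is always labelled and encoded as UTF-8 regardless of the attachment's content, can be illustrated with a small Python sketch. This is not the code from `Compose.php`; the function name `rfc2231_segments` and the `width` parameter are hypothetical and exist only for this example. It uses the `name*0*=` continuation syntax defined by RFC 2231 and never splits a percent-encoded byte across two segments.

```python
import re
from typing import List
from urllib.parse import quote, unquote


def rfc2231_segments(param: str, value: str,
                     charset: str = "utf-8", width: int = 60) -> List[str]:
    """Hypothetical helper: encode a parameter value in RFC 2231 style.

    The value is always encoded in `charset` (UTF-8 by default),
    independently of whatever charset the attachment body uses.
    """
    # Percent-encode everything except letters, digits and "_.-~".
    encoded = quote(value, safe="", encoding=charset)
    # Treat each "%XX" triplet as one unit so it is never split.
    tokens = re.findall(r"%[0-9A-Fa-f]{2}|.", encoded)

    chunks, current = [], ""
    for tok in tokens:
        if current and len(current) + len(tok) > width:
            chunks.append(current)
            current = ""
        current += tok
    chunks.append(current)

    if len(chunks) == 1:
        return [f"{param}*={charset}''{chunks[0]}"]
    # Only the first segment carries the charset'language' prefix.
    return [
        f"{param}*{i}*={charset + chr(39) * 2 if i == 0 else ''}{chunk}"
        for i, chunk in enumerate(chunks)
    ]


long_name = ("File with a long name coñtaïning strånge characters "
             "but pure ASCII content.txt")
segments = rfc2231_segments("name", long_name)
print(";\n".join(segments))

# Round trip: strip the parameter names and the charset prefix, then decode.
joined = "".join(s.split("=", 1)[1] for s in segments)
assert unquote(joined[len("utf-8''"):], encoding="utf-8") == long_name
```

Because pure ASCII text is also valid UTF-8, using UTF-8 unconditionally is safe for ASCII-only file names as well.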

04/28/2017 02:06:53 PM	wahnes (at) uni-koeln (dot) de	Comment #3 New Attachment: testfiles-with-strange-names.tar
Here's another try with a tar file containing the said files. It opens OK on my Linux box with UTF-8 locale, but I don't know if there are any standards for the charset in a .tar file, given it's such an old format. The zip file might be using UTF-16 or such, I don't know.

04/27/2017 07:35:08 PM	Jan Schneider	Comment #2 State ⇒ Feedback
Well, at least the files that you had in the archive have no valid charset at all. This may be due to the packaging, but the names looked like double-encoded UTF-8.
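
For readers unfamiliar with the term, the following Python sketch shows one common way double-encoded UTF-8 arises: UTF-8 bytes are misread as ISO-8859-1 and then encoded as UTF-8 a second time. That this is what happened to the names in the zip archive is an assumption; the comment above only says they looked that way.

```python
name = "coñtaïning"

# Correct UTF-8: "ñ" becomes the two bytes C3 B1, "ï" becomes C3 AF.
utf8 = name.encode("utf-8")
print(utf8)  # b'co\xc3\xb1ta\xc3\xafning'

# Double encoding: misread those bytes as ISO-8859-1, then encode again.
double = utf8.decode("iso-8859-1").encode("utf-8")
print(double)  # b'co\xc3\x83\xc2\xb1ta\xc3\x83\xc2\xafning'

# Reversing the two steps recovers the original name.
repaired = double.decode("utf-8").encode("iso-8859-1").decode("utf-8")
print(repaired == name)  # True
```

Each non-ASCII character ends up as four bytes instead of two, which is why such names display as sequences like "Ã±" when decoded as UTF-8.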

04/18/2017 11:03:58 AM	wahnes (at) uni-koeln (dot) de	Comment #1 Patch ⇒ No State ⇒ Unconfirmed New Attachment: horde attachment examples rfc 2231.zip Milestone ⇒ Queue ⇒ IMP Summary ⇒ Attachments with special file names (RFC 2231) Type ⇒ Bug Priority ⇒ 2. Medium
In some cases, the RFC 2231 encoding of the file name for attached files is wrong, causing trouble on the receiving side of email with such attachments. This seems to happen when the content of an attachment is pure ASCII but the filename contains non-ASCII characters.
Example: Given a file by the name of "File with a long name coñtaïning strånge characters but pure ASCII content.txt" that does, as the name implies, contain only ASCII characters and thus will have a MIME encoding of "Content-Type: text/plain". When attaching this file, the file name will be encoded like this:
    name0=us-ascii''File%20with%20a%20long%20name%20co%F1ta%EFning%20str%E5ng;
    name1=e%20characters%20but%20pure%20ASCII%20content.txt
Note that the charset used for the encoding of the filename (given before the first single-quote character in the "name0" line) is "us-ascii" in this case. Clearly, this cannot be right, as ASCII does not contain the character "ñ". In fact, ASCII does not contain any character with a hex code above 0x7F, so an encoding that uses the hex code "F1" with "us-ascii" must be wrong. The actual charset would be ISO-8859-1 or similar, as it contains the "n" with tilde at position 0xF1 (241 decimal).
This error does not happen, however, when the attachment's content has non-ASCII characters in it. When attaching a file that has both non-ASCII content and a non-ASCII name, the encoding generated by Horde is fine. For example, a file by the name of "Example file with name coñtaïning strånge characters which has non-ASCII content too.txt" that does in fact contain non-ASCII content (e.g. the string "Hallo Bärbel") is encoded correctly. In this case, the encoding generated would be
    name0=utf-8''Example%20file%20with%20name%20co%C3%B1ta%C3%AFning%20str;
    name1=%C3%A5nge%20characters%20which%20has%20non-ASCII%20content%20too.tx;
    name2=t
which is perfectly right. For instance, the "ñ" is encoded here as two bytes in UTF-8, 0xC3 and 0xB1, which is correct.
The root cause of the problem seems to be that Horde uses the charset of the attachment's content to encode the attachment's filename. This is wrong because the filename can use a different encoding than the content. The issue also manifests itself when an attachment contains non-ASCII characters but its filename is pure ASCII: the filename will be encoded as "UTF-8". This does not cause real problems because any ASCII text is also valid UTF-8 text, but it supports my assumption that Horde wrongly uses the content's charset in the place where the filename's charset should be used.
I will attach a zip file with four files that can be used to illustrate the problem:
1. File with ASCII content and ASCII filename --> OK.
2. File with ASCII content and non-ASCII filename --> wrong.
3. File with non-ASCII content and ASCII filename --> OK in a way because UTF-8 is a superset of ASCII.
4. File with non-ASCII content and non-ASCII filename --> OK.
I hope the file names will be preserved correctly in the zip file. This zip file was generated using Microsoft Windows's built-in ZIP functionality, so the file names might not be recognized as they should be everywhere. If you are unable to read them, I will try some other way to send them.
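
The charset argument above can be checked mechanically. The following is a minimal Python sketch, not part of Horde; the helper `decodes_as` is a hypothetical name introduced only for this example. It percent-decodes the first segment of the problematic header and tests which charset label the resulting bytes are actually valid for.

```python
from urllib.parse import quote, unquote_to_bytes

# Value of the first segment from the report, without the "us-ascii''" label.
wrong = "File%20with%20a%20long%20name%20co%F1ta%EFning%20str%E5ng"
raw = unquote_to_bytes(wrong)


def decodes_as(data: bytes, charset: str) -> bool:
    """Hypothetical helper: True if `data` is valid in `charset`."""
    try:
        data.decode(charset)
        return True
    except UnicodeDecodeError:
        return False


print(decodes_as(raw, "us-ascii"))    # False: 0xF1 is outside ASCII
print(decodes_as(raw, "iso-8859-1"))  # True
print(raw.decode("iso-8859-1"))       # File with a long name coñtaïning strång

# With a UTF-8 label, "ñ" has to be sent as the two bytes C3 B1.
print(quote("ñ", encoding="utf-8"))   # %C3%B1
```

The bytes in the header are therefore ISO-8859-1 (or a similar single-byte charset), while the label claims "us-ascii", which matches case 2 in the list above.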