View Issue Details

ID: 0001009
Project: SOGo
Category: Web Mail
View Status: public
Last Update: 2011-01-06 00:26
Reporter: Marcel
Assigned To: ludovic
Priority: normal
Severity: minor
Reproducibility: always
Status: resolved
Resolution: fixed
Product Version: 1.3.4
Target Version: 1.3.5
Fixed in Version: 1.3.5
Summary: 0001009: Long lines with non-ASCII characters split illegally (and then not displayed correctly)
Description

When sending mail with non-ASCII characters from within SOGo mail, long lines on input seem to be handled as follows:

  1. The mail is represented as UTF-8
  2. Text is converted to quoted-printable
  3. Lines longer than 990 characters are split after every 990 characters (Postfix may be the one doing this, but in that case SOGo should already hand over RFC-compliant input)

This can split a quoted-printable escape sequence across two lines, which causes a decoding error when the mail is displayed: the quoted-printable/UTF-8 message is rendered as if it were quoted-printable/ISO-8859-1, which makes it hard or impossible to read.

Requested fix: Lines should be split at around 900 characters, and never inside a quoted-printable sequence, only before or after one.
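
A minimal C sketch of the requested rule, for illustration only: it is not SOGo or SOPE code, and the helper name `qp_safe_split` is hypothetical. Given already-encoded quoted-printable text and a maximum length, it backs the split position up so that it never lands between `=` and its two hex digits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of SOPE: return the largest split position
 * <= max_len that does not fall inside a "=XX" escape of the encoded text.
 * The first part of the split is enc[0 .. result). */
size_t qp_safe_split(const char *enc, size_t len, size_t max_len) {
    assert(enc != NULL);
    if (len <= max_len)
        return len;                /* short enough, nothing to split */
    size_t pos = max_len;
    /* A '=' one or two characters before pos means pos is inside "=XX". */
    if (pos >= 1 && enc[pos - 1] == '=')
        return pos - 1;
    if (pos >= 2 && enc[pos - 2] == '=')
        return pos - 2;
    return pos;
}
```

With the text `ab=E2=80=9Ccd` and a maximum of 10, the helper returns 8, so the sequence `=9C` moves to the next line intact. To keep a multi-byte UTF-8 character together after decoding, the break itself would need to be a quoted-printable soft line break (`=` followed by CRLF) rather than a bare newline.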

Additional Information

Here are the offending lines created by SOGo, as received (non-encoded characters were replaced with "I" to anonymize the actual content). The sequence =E2=80=9C is split between "=" and "9":

I=C3=BCI III IIIIIII I=C3=A4IIII III=C3=9FI II III IIIIIII IIIIIIII: =E2=80=9EIII IIIIIII I=C3=BCIIII IIIIIIIIII IIIIIIIIIII IIIIII=E2=80=9C, IIII III IIIIIIIIIIIIIIII. IIIII II=C3=B6II IIIIIIIII- IIIII II IIIIIIIIIIIIIIIII-IIIIIIIII =C3=BCIIII IIIII IIIIIIII II IIIIIIIIII, I=C3=B6 - II II IIII II, III IIIIIIIIIIII II II=C3=B6IIII III IIIIII IIIII II IIIIIIIIII. IIII IIII, I=C3=BCII IIIIII I=C3=A4II IIII III I=C3=A4IIIII IIIII IIII. III IIIII IIIII IIIIIII III IIIIIIIII II IIIII, III III I=C3=BCIIIII, II- I=C3=BCI III IIIIIIIIII IIIIIIIIIII IIIIIIIIII- IIIII IIIIIIIII IIIIIII. II I=C3=A4IIIIII IIIIIIIII III IIIIIIIIII- IIIIII IIIIII III IIIIIIII III IIIIIIII, III IIII III III=C3=BCIIIe =C3=9CIIIIIIIIIIII III IIIIIIIII IIIIIII III=C3=9FI. II III II II- IIIII IIIIII IIIIIII III IIIIIIIIII- III II IIIIIIIIIIIIII IIII IIIII IIIII IIIIII IIIIIIIII: II=C3=BCIIIIII IIII II IIIII IIII III II IIII III IIIIIIIIIIIIIIII. IIIIIIIIIII IIIIIIIII III IIIIIII =E2=80=9EII- IIIIIIIIIIII=E2=80=
9C IIIIIIIIIIIII; IIIIIII II- III III III IIIIIIIIIIIIIII, IIII III- IIII I=C3=BCI III IIIIIII. III IIIII IIII IIII IIIIII IIII IIII II III IIIII IIIII IIIIIIII IIIIIIIIIII- IIIIIII IIIIIIIIIII. II III IIIII IIIII- III III IIIIIIIII IIIIIIIIII IIIII IIIII IIIIIIIIII; IIII IIIII III IIIIIIII- III III IIIIIIIIIIIII, III IIIIII III- III III, IIIIII IIIIIIIII, IIII IIIII- IIIII IIIIIIII =E2=80=93 III IIIII IIIIII, IIIII IIIIIIIIIIII, =E2=80=9EIIIII III IIIIIIIIII III- IIIIIIIII=E2=80=9C.

You can reproduce this by pasting the following line on an empty line in a new SOGo mail message:

IüI III IIIIIII IäIIII IIIßI II III IIIIIII IIIIIIII: „III IIIIIII IüIIII IIIIIIIIII IIIIIIIIIII IIIIII“, IIII III IIIIIIIIIIIIIIII. IIIII IIöII IIIIIIIII- IIIII II IIIIIIIIIIIIIIIII-IIIIIIIII üIIII IIIII IIIIIIII II IIIIIIIIII, Iö - II II IIII II, III IIIIIIIIIIII II IIöIIII III IIIIII IIIII II IIIIIIIIII. IIII IIII, IüII IIIIII IäII IIII III IäIIIII IIIII IIII. III IIIII IIIII IIIIIII III IIIIIIIII II IIIII, III III IüIIIII, II- IüI III IIIIIIIIII IIIIIIIIIII IIIIIIIIII- IIIII IIIIIIIII IIIIIII. II IäIIIIII IIIIIIIII III IIIIIIIIII- IIIIII IIIIII III IIIIIIII III IIIIIIII, III IIII III IIIüIIII ÜIIIIIIIIIIII III IIIIIIIII IIIIIII IIIßI. II III II II- IIIII IIIIII IIIIIII III IIIIIIIIII- III II IIIIIIIIIIIIII IIII IIIII IIIII IIIIII IIIIIIIII: IIüIIIIII IIII II IIIII IIII III II IIII III IIIIIIIIIIIIIIII. IIIIIIIIIII IIIIIIIII III IIIIIII „II- IIIIIIIIIIII“ IIIIIIIIIIIII; IIIIIII II- III III III IIIIIIIIIIIIIII, IIII III- IIII IüI III IIIIIII. III IIIII IIII IIII IIIIII IIII IIII II III IIIII IIIII IIIIIIII IIIIIIIIIII- IIIIIII IIIIIIIIIII. II III IIIII IIIII- III III IIIIIIIII IIIIIIIIII IIIII IIIII IIIIIIIIII; IIII IIIII III IIIIIIII- III III IIIIIIIIIIIII, III IIIIII III- III III, IIIIII IIIIIIIII, IIII IIIII- IIIII IIIIIIII – III IIIII IIIIII, IIIII IIIIIIIIIIII, „IIIII III IIIIIIIIII III- IIIIIIIII“.

The arriving message will have the following text instead of what was pasted (notice the lonely "9C"):

I�I III IIIIIII IäIIII IIIÃI II III IIIIIII IIIIIIII: �III IIIIIII I�IIII IIIIIIIIII IIIIIIIIIII IIIIII�, IIII III IIIIIIIIIIIIIIII. IIIII IIöII IIIIIIIII- IIIII II IIIIIIIIIIIIIIIII-IIIIIIIII �IIII IIIII IIIIIIII II IIIIIIIIII, Iö - II II IIII II, III IIIIIIIIIIII II IIöIIII III IIIIII IIIII II IIIIIIIIII. IIII IIII, I�II IIIIII IäII IIII III IäIIIII IIIII IIII. III IIIII IIIII IIIIIII III IIIIIIIII II IIIII, III III I�IIIII, II- I�I III IIIIIIIIII IIIIIIIIIII IIIIIIIIII- IIIII IIIIIIIII IIIIIII. II IäIIIIII IIIIIIIII III IIIIIIIIII- IIIIII IIIIII III IIIIIIII III IIIIIIII, III IIII III III�IIII �IIIIIIIIIIII III IIIIIIIII IIIIIII IIIÃI. II III II II- IIIII IIIIII IIIIIII III IIIIIIIIII- III II IIIIIIIIIIIIII IIII IIIII IIIII IIIIII IIIIIIIII: II�IIIIII IIII II IIIII IIII III II IIII III IIIIIIIIIIIIIIII. IIIIIIIIIII IIIIIIIII III IIIIIII �II- IIIIIIIIIIII╠9C IIIIIIIIIIIII; IIIIIII II- III III III IIIIIIIIIIIIIII, IIII III- IIII I�I III IIIIIII. III IIIII IIII IIII IIIIII IIII IIII II III IIIII IIIII IIIIIIII IIIIIIIIIII- IIIIIII IIIIIIIIIII. II III IIIII IIIII- III III IIIIIIIII IIIIIIIIII IIIII IIIII IIIIIIIIII; IIII IIIII III IIIIIIII- III III IIIIIIIIIIIII, III IIIIII III- III III, IIIIII IIIIIIIII, IIII IIIII- IIIII IIIIIIII ╄ III IIIII IIIIII, IIIII IIIIIIIIIIII, �IIIII III IIIIIIIIII III- IIIIIIIII�.

Tags: No tags attached.

Activities

Marcel

2010-11-23 20:55

reporter   ~0001890

According to RFC 2045, quoted-printable lines MUST actually be split at or before position 76 (not 900, as I recommended).
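
The 76-character limit can be checked mechanically. The following C sketch assumes the limit excludes the trailing CRLF but includes the `=` of a soft line break, as RFC 2045 section 6.7 describes; the function name `qp_lines_within_limit` is hypothetical and not part of SOPE.

```c
#include <assert.h>
#include <stddef.h>

#define QP_MAX_LINE 76  /* RFC 2045, section 6.7: max encoded line length */

/* Hypothetical checker: returns 1 if no line of the encoded text is longer
 * than 76 characters, not counting the line terminator, and 0 otherwise.
 * Any '\r' is treated as part of the terminator and is not counted. */
int qp_lines_within_limit(const char *enc, size_t len) {
    assert(enc != NULL);
    size_t line_len = 0;
    for (size_t i = 0; i < len; i++) {
        if (enc[i] == '\n') {
            line_len = 0;              /* new line starts */
        } else if (enc[i] == '\r') {
            continue;                  /* terminator byte, not counted */
        } else if (++line_len > QP_MAX_LINE) {
            return 0;                  /* this line is too long */
        }
    }
    return 1;
}
```

A line of 75 characters followed by `=` and CRLF passes, while 77 characters without a break do not.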

Marcel

2011-01-05 22:01

reporter   ~0002001

ludovic, I don't understand why this was postponed beyond 1.3.5, when this is a clear violation of the RFC and seems like something that should be easy to fix, especially as it also affects French accents :-) [as well as my beloved German umlauts]

I had a quick look at the source and attached a possible fix. Unfortunately, I do not have the whole build environment, so I cannot easily verify the code. The patch also fixes a potential buffer overflow problem in the existing code.

BTW: Is it intended that NGDecodeQuotedPrintableX() erases "=\n\n"? "=\r\n" and "=\n" should be considered soft newlines, but "=\n\n" should IMHO be considered a soft newline followed by a real newline.
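
To illustrate the behaviour suggested above, here is a small C decoder sketch. It is not the code of NGDecodeQuotedPrintableX(), whose actual behaviour is not shown in this report; the name `qp_decode_sketch` and its signature are assumptions. It drops `=\r\n` and `=\n` as soft newlines and passes any following newline through as a real one.

```c
#include <assert.h>
#include <stddef.h>

/* Value of a hex digit, or -1 if c is not one. */
static int hex_val(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Hypothetical decoder sketch. Writes at most dst_len bytes to dst and
 * returns the number of bytes written, or -1 if dst is too small. */
long qp_decode_sketch(const char *src, size_t src_len,
                      char *dst, size_t dst_len) {
    assert(src != NULL && dst != NULL);
    size_t out = 0;
    for (size_t i = 0; i < src_len; ) {
        if (src[i] == '=' && i + 1 < src_len && src[i + 1] == '\n') {
            i += 2;                /* soft newline "=\n": drop exactly one LF */
            continue;
        }
        if (src[i] == '=' && i + 2 < src_len &&
            src[i + 1] == '\r' && src[i + 2] == '\n') {
            i += 3;                /* soft newline "=\r\n" */
            continue;
        }
        if (out >= dst_len)
            return -1;             /* destination buffer exhausted */
        if (src[i] == '=' && i + 2 < src_len &&
            hex_val(src[i + 1]) >= 0 && hex_val(src[i + 2]) >= 0) {
            dst[out++] = (char)(hex_val(src[i + 1]) * 16 + hex_val(src[i + 2]));
            i += 3;                /* decoded one "=XX" escape */
        } else {
            dst[out++] = src[i++]; /* literal byte, including real newlines */
        }
    }
    return (long)out;
}
```

Under this sketch, `a=\n\nb` decodes to `a`, a real newline, and `b`, and an escape sequence that follows a soft newline, as in `=E2=80=` CRLF `=9C`, is rejoined into the three bytes E2 80 9C.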

2011-01-05 22:02

 

sope-core-NGExtensions-NGQuotedPrintableCoding.m.diff (2,271 bytes)   
diff -u old/NGQuotedPrintableCoding.m new/NGQuotedPrintableCoding.m
--- old/NGQuotedPrintableCoding.m	2011-01-05 22:53:05.000000000 +0100
+++ new/NGQuotedPrintableCoding.m	2011-01-05 22:58:10.000000000 +0100
@@ -98,11 +98,12 @@
   char         *des    = NULL;
   unsigned int desLen  = 0;
 
-  desLen = length *3;
+  // length/64*3 should be plenty for soft newlines
+  desLen = (length + length/64) *3;
   des = NGMallocAtomic(sizeof(char) * desLen);
 
   desLen = NGEncodeQuotedPrintable(bytes, length, des, desLen);
-  
+
   return (int)desLen != -1
     ? [NSData dataWithBytesNoCopy:des length:desLen]
     : nil;
@@ -270,33 +271,51 @@
                             char *_dest, unsigned _destLen) {
   unsigned cnt      = 0;
   unsigned destCnt  = 0;
+  unsigned lineStart= destCnt;
   char     hexT[16] = {'0','1','2','3','4','5','6','7','8',
                        '9','A','B','C','D','E','F'};
-  
+
   if (_srcLen > _destLen)
     return -1;
-  
+
   for (cnt = 0; (cnt < _srcLen) && (destCnt < _destLen); cnt++) {
+    if (destCnt - lineStart > 70) { // Possibly going to exceed 76 chars this line
+      if (_destLen - destCnt > 2) {
+        _dest[destCnt++] = '=';
+        _dest[destCnt++] = '\r';
+        _dest[destCnt++] = '\n';
+        lineStart = destCnt;
+      }
+      else
+        break;
+    }
     char c = _src[cnt];
     if (c == 95) {  // we encode the _, otherwise we'll always decode it as a space!
-      _dest[destCnt++] = '=';
-      _dest[destCnt++] = '5';
-      _dest[destCnt++] = 'F';
+      if (_destLen - destCnt > 2) {
+        _dest[destCnt++] = '=';
+        _dest[destCnt++] = '5';
+        _dest[destCnt++] = 'F';
+      }
+      else
+        break;
     }
     else if ((c == 9)  ||
-        (c == 10) ||
         (c == 13) ||
         ((c > 31) && (c < 61)) ||
         ((c > 61) && (c < 127))) { // no quoting
       _dest[destCnt++] = c;
     }
+    else if (c == 10) { // Reset line length counter
+      _dest[destCnt++] = c;
+      lineStart = destCnt;
+    }
     else { // need to be quoted
       if (_destLen - destCnt > 2) {
         _dest[destCnt++] = '=';
         _dest[destCnt++] = hexT[(c >> 4) & 15];
         _dest[destCnt++] = hexT[c & 15];
       }
-      else 
+      else
         break;
     }
   }
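
As a rough check of the new buffer size, the C sketch below compares the allocation from the first hunk, `(length + length/64) * 3`, with the output the patched loop would produce when every input byte has to be quoted. Both function names are hypothetical, and the loop is a simplified model of the diff as attached, not of the code that was finally committed.

```c
#include <assert.h>
#include <stddef.h>

/* Buffer size the first hunk allocates for `length` input bytes. */
size_t patched_alloc(size_t length) {
    assert(length <= (size_t)-1 / 4);  /* keep this sketch free of overflow */
    return (length + length / 64) * 3;
}

/* Output size if every input byte is quoted as "=XX", following the rule in
 * the second hunk: emit "=\r\n" once more than 70 chars are on the line. */
size_t worst_case_needed(size_t length) {
    size_t out = 0, line_start = 0;
    for (size_t i = 0; i < length; i++) {
        if (out - line_start > 70) {
            out += 3;                  /* soft line break "=\r\n" */
            line_start = out;
        }
        out += 3;                      /* quoted byte "=XX" */
    }
    return out;
}
```

Under this model, 240 fully non-ASCII bytes need 747 output bytes while 729 are allocated, so the bounds checks in the loop would stop encoding early rather than overflow. Text that is mostly ASCII stays well within the allocation.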
ludovic

2011-01-05 22:32

administrator   ~0002002

I'll look at your fix and integrate it for sure if it works.

It has been postponed because in v1.3.6, we will drop sope-mime and use Pantomime instead.

Marcel

2011-01-05 22:48

reporter   ~0002005

Ah, I see!
If it works, feel free to use it; otherwise I'll have to wait for 1.3.6

ludovic

2011-01-06 00:23

administrator   ~0002007

I've applied your patch. You could investigate your decode question further and submit another patch if appropriate.

ludovic

2011-01-06 00:26

administrator   ~0002008

http://mtn.inverse.ca/revision/diff/cea3fd5edf797a3a5aa3b5fcb0ec00b98278842e/with/b605553eb6d92d6d091473a50c3148ef02fc28f7

Issue History

Date Modified Username Field Change
2010-11-23 19:50 Marcel New Issue
2010-11-23 20:55 Marcel Note Added: 0001890
2010-11-24 16:26 ludovic Target Version => 1.3.5
2011-01-05 20:30 ludovic Target Version 1.3.5 =>
2011-01-05 22:01 Marcel Note Added: 0002001
2011-01-05 22:02 Marcel File Added: sope-core-NGExtensions-NGQuotedPrintableCoding.m.diff
2011-01-05 22:32 ludovic Note Added: 0002002
2011-01-05 22:48 Marcel Note Added: 0002005
2011-01-06 00:23 ludovic Note Added: 0002007
2011-01-06 00:25 ludovic Target Version => 1.3.5
2011-01-06 00:26 ludovic Note Added: 0002008
2011-01-06 00:26 ludovic Status new => resolved
2011-01-06 00:26 ludovic Fixed in Version => 1.3.5
2011-01-06 00:26 ludovic Resolution open => fixed
2011-01-06 00:26 ludovic Assigned To => ludovic