View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0003510 | SOGo | Backend Mail | public | 2016-02-09 13:18 | 2017-01-09 20:49 |
Reporter | Jens Erat | Assigned To | ludovic | ||
Priority | high | Severity | feature | Reproducibility | always |
Status | resolved | Resolution | fixed | ||
Product Version | 2.3.7 | ||||
Fixed in Version | 3.2.5 | ||||
Summary | 0003510: DOS through uploading large attachments | ||||
Description | SOGo seems to have a memory leak with attachment uploads, possibly only affecting attachments that could not be stored on the IMAP server (because the resulting mail would be too large and is denied). It is possible to crash the whole service/server by (repeatedly) uploading attachments. See the attached Zabbix memory graph. The test machine used for reproducing the issue is equipped with 2GB of RAM and 16GB of swap space. The same issue probably also happened multiple times on our production server, not because of an attacker, but because of normal users with failed uploads. On our live setup, free memory gradually gets consumed and later swapped out. After some time, service performance suddenly drops to barely usable, probably once the first non-garbage information is swapped out. Once the upload has completed, it looks like SOGo copies the attachment (into memcached?) and then eliminates the copy in the sogod process. The memcached copy stays forever/until the SOGo service is restarted. | ||||
Steps To Reproduce |
| ||||
Tags | No tags attached. | ||||
|
I've just tried a few times with different file sizes. I got the following errors:
1GB:
2016-02-10 08:28:35.298 sogod[8139] ERROR: could not parse body part at index 10 in multipart <HttpRequest: method=POST uri=/SOGo/so/sogo1/Mail/0/folderDrafts/newDraft1455110765-1/save header={accept = "application/json, text/javascript, /; q=0.01"; "accept-encoding" = "gzip, deflate"; "accept-language" = "en-US,en;q=0.8,fr;q=0.6"; "content-length" = 1048577226; "content-type" = "multipart/form-data; boundary=----WebKitFormBoundarymrXtDVSekpUhFByq"; cookie = "0xHIGHFLYxSOGo=basic%20VTZWNWR4MEduOU00VjErMEx4S2JkU0pqV1dnZlpnVk1Nb0E5SDNDVWxiNk5pdWVCY25ENWJETllkd3JzTTZ5UmN6VDdZVU8yWUQ4SU1na3FwVEFZcXFadis0WHZBMStvQ0d2dEoyTThQRUNtMi9zaWxtR3NQQ0lHSVBCZXhyOVhTenFCSVMwZXNVWlAyelZDNENPaXhqUXh2c1RsZ21ISDlYU0NadU5qWGlhZCtzdUdUR3B6My9BR3VoVFhwUWRvNVcwTEZYNGxhc0NELzluUWlGQ1VGdz09OlpOVG5VaFFxcmc1Z1QzdWpub1RQcEE9PQ%3D%3D"; host = sogoludo; origin = "http://sogoludo"; referer = "http://sogoludo/SOGo/so/sogo1/Mail/0/folderDrafts/newDraft1455110765-1/edit"; "user-agent" = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.82 Safari/537.36"; "x-forwarded-for" = "172.20.20.42"; "x-forwarded-host" = sogoludo; "x-forwarded-server" = "sogo.example.com"; "x-requested-with" = XMLHttpRequest; "x-webobjects-remote-user" = "(null)"; "x-webobjects-server-port" = 80; "x-webobjects-server-protocol" = "HTTP/1.0"; } body=(null)>: <NSException: 0x7fff64000930> NAME:NSMallocException REASON:Default zone has run out of memory INFO:(null)
256MB:
Feb 10 08:36:10 sogod [8139]: <0x0x7fff64013210[NGImap4Client]> ERROR(-[NGImap4Client _processUnknownCommandParserException:]): catched non-IMAP4 parsing exception NSMallocException: Default zone has run out of memory
128MB:
2016-02-10 08:37:19.249 sogod[8139] WARNING(-[NGBufferedStream writeBytes:count:]): got passed in length 175MB (183929292 bytes, errcode=2147483647) ...
In all three cases, sogod's memory consumption stayed stable:
sogo 8139 12.0 0.8 330460 17232 pts/0 S+ 08:25 1:41 /usr/local/sbin/sogod -WOUseWatchDog NO -WONoDetach YES -WOPort 20000 -WOWorkersCount 1 -WOLogFile - -WOPidFile /tmp/sogo.pidr
So I suspect this is happening at another stage. What errors are you getting? |
|
We had some error messages that processes were killed because they used more than the maximum amount of memory configured in the SOGo configuration (512MB for us). I'm out of office today and tomorrow, and cannot reproduce it on my home instance (but this has a different setup than the university systems). I will have to report back on Friday. We could observe this issue on both our test and production servers. They both have a memcached attached, and the frontend Apache server runs on another machine. I think I observed an increased memory consumption on memcached, but am not sure about that. I will have to reproduce the issue again to monitor that (and this time, be more careful not to completely DoS the test instance by accident). Does the behavior change if you set WOUseWatchDog to YES and a WOWorkersCount larger than one? |
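For reference, a minimal sogo.conf sketch of the watchdog/worker settings being discussed here -- the values are only illustrative, and if I remember correctly WOWatchDogRequestTimeout is expressed in minutes (defaulting to 10):

    {
      /* run with the watchdog and more than one worker, so a single hung
         upload request does not block the whole service */
      WOUseWatchDog = YES;
      WOWorkersCount = 3;
      /* minutes a child may spend on one request before the watchdog kills it */
      WOWatchDogRequestTimeout = 10;
    }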
|
From our test machine:
2016-02-09 13:26:42.346 sogod[22771] WARNING(-[NGBufferedStream writeBytes:count:]): got passed in length 274MB (287388529 bytes, errcode=2147483647) ...
After that there were no more log entries; the machine was dead, with 100% CPU (2 cores) and swapping until no space was left. |
|
sogod.log (31,557 bytes)
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 3958 sogo 18 0 1243m 914m 5332 D 3.3 45.5 1:19.65 sogod 3993 root 34 19 251m 692 692 S 0.0 0.0 0:00.52 yum-updatesd 3953 sogo 16 0 242m 5440 3516 S 0.0 0.3 0:05.41 sogod 3965 sogo 16 0 240m 5400 3464 S 0.0 0.3 0:01.84 sogod 3976 sogo 16 0 234m 5084 3136 S 0.0 0.2 0:01.48 sogod 3957 sogo 16 0 221m 4816 2868 S 0.0 0.2 0:01.58 sogod 3963 sogo 16 0 221m 4760 2824 S 0.0 0.2 0:01.57 sogod 3980 sogo 16 0 221m 4784 2832 S 0.0 0.2 0:01.43 sogod PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 3953 sogo 18 0 7923m 314m 2108 R 14.4 15.6 1:24.41 sogod 3958 sogo 18 0 3595m 6716 1948 D 0.0 0.3 1:51.87 sogod 3976 sogo 18 0 2591m 1.4g 238m D 0.7 72.6 0:31.64 sogod 3993 root 34 19 251m 676 676 S 0.0 0.0 0:00.52 yum-updatesd 3965 sogo 16 0 240m 2820 1988 S 0.0 0.1 0:02.02 sogod 3957 sogo 16 0 223m 2708 1900 D 0.0 0.1 0:01.82 sogod 3963 sogo 15 0 221m 2416 1724 S 0.0 0.1 0:01.71 sogod 21424 sogo 16 0 220m 4340 2192 D 0.3 0.2 0:00.09 sogod PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 3953 sogo 18 0 10.9g 1.8g 2096 R 7.8 91.4 1:36.49 sogod 3976 sogo 18 0 4942m 23m 10m D 8.4 1.2 0:57.97 sogod 2757 root 11 -10 28704 22m 1744 S 0.0 1.1 0:00.01 iscsiuio 3484 ntp 15 0 23420 5052 3924 S 0.0 0.2 0:00.08 ntpd 2763 root 5 -10 5100 3048 1904 S 0.0 0.1 0:00.01 iscsid 21424 sogo 16 0 237m 2928 2068 D 1.1 0.1 0:00.78 sogod 21466 sogo 16 0 220m 2624 1804 S 1.5 0.1 0:00.24 sogod PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ 21424 sogo 18 0 7920m 580m 371m D 8.0 28.9 1:14.40 21518 sogo 18 0 4942m 1.3g 219m D 4.3 68.3 1:02.52 3963 sogo 18 0 2590m 107m 101m D 0.0 5.3 0:38.22 3993 root 34 19 251m 672 672 S 0.0 0.0 0:00.52 3965 sogo 16 0 240m 6412 2700 D 0.3 0.3 0:08.81 3957 sogo 16 0 238m 5796 2848 D 0.0 0.3 0:09.43 21592 sogo 16 0 220m 3188 2080 S 0.0 0.2 0:00.11 21593 sogo 16 0 220m 2588 2108 S 0.0 0.1 0:00.15 3954 sogo 16 0 220m 2460 1648 S 0.0 0.1 0:05.70 3955 sogo 16 0 220m 2464 1648 S 0.0 0.1 0:07.76 Feb 12 10:04:54 sogod [3958]: 134.34.1.247 "POST /SOGo/so/pop15959/Mail/0/mailboxes HTTP/1.0" 200 2841/0 1.561 - - 4K Feb 12 10:04:55 sogod [3958]: 134.34.1.247 "POST /SOGo/so/pop15959/Mail/0/folderINBOX/unseenCount HTTP/1.0" 200 13/0 0.129 - - 8K Feb 12 10:04:56 sogod [3958]: 134.34.1.247 "POST /SOGo/so/pop15959/Mail/foldersState HTTP/1.0" 200 371/0 0.006 - - 0 Feb 12 10:04:56 sogod [3958]: 134.34.1.247 "GET /SOGo/so/pop15959/Mail/0/compose HTTP/1.0" 302 0/0 0.069 - - 16K Feb 12 10:04:57 sogod [3958]: 134.34.1.247 "GET /SOGo/so/pop15959/Mail/0/folderDrafts/newDraft1455267896-1/edit HTTP/1.0" 200 7353/0 0.259 27782 73% 128K Feb 12 10:06:45 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 3958 has been hanging in the same request for 1 minutes Feb 12 10:07:45 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 3958 has been hanging in the same request for 2 minutes Feb 12 10:08:45 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 3958 has been hanging in the same request for 3 minutes Feb 12 10:09:45 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 3958 has been hanging in the same request for 4 minutes Feb 12 10:10:48 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 3958 has been hanging in the same request for 5 minutes Feb 12 10:11:03 sogod [3820]: [WARN] <0x0x2adc8c7cb5d8[WOWatchDogChild]> pid 3953 has been hanging in the same request for 1 minutes Feb 12 10:11:46 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 3958 has been hanging in the same request for 6 minutes Feb 12 10:12:02 sogod 
[3820]: [WARN] <0x0x2adc8c7cb5d8[WOWatchDogChild]> pid 3953 has been hanging in the same request for 2 minutes Feb 12 10:12:48 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 3958 has been hanging in the same request for 7 minutes Feb 12 10:13:01 sogod [3820]: [WARN] <0x0x2adc8c7cb5d8[WOWatchDogChild]> pid 3953 has been hanging in the same request for 3 minutes Feb 12 10:13:46 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 3958 has been hanging in the same request for 8 minutes Feb 12 10:14:03 sogod [3820]: [WARN] <0x0x2adc8c7cb5d8[WOWatchDogChild]> pid 3953 has been hanging in the same request for 4 minutes Feb 12 10:14:45 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 3958 has been hanging in the same request for 9 minutes Feb 12 10:15:03 sogod [3820]: [ERROR] <0x0x2adc8c76e738[WOWatchDogChild]> FAILURE receiving status for child 3980 Feb 12 10:15:03 sogod [3820]: [ERROR] <0x0x2adc8c76e738[WOWatchDogChild]> socket: <NGActiveSocket[0x0x2adc8c735338]: mode=rw address=(null) connectedTo=<0x0x2adc8c705e18[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s> Feb 12 10:15:05 sogod [3820]: [ERROR] <0x0x2adc8c76e738[WOWatchDogChild]> exception: <NGSocketTimedOutException: 0x2adc8c6e1de8> NAME:NGSocketTimedOutException REASON:the socket was shutdown INFO:{errno = 11; error = "Resource temporarily unavailable"; stream = "{object = <NGActiveSocket[0x0x2adc8c735338]: mode=rw address=(null) connectedTo=<0x0x2adc8c705e18[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s>;}"; } Feb 12 10:15:05 sogod [3820]: <0x0x2adc8c76e738[WOWatchDogChild]> sending terminate signal to pid 3980 Feb 12 10:15:05 sogod [3820]: [ERROR] <0x0x2adc8c76e738[WOWatchDogChild]> FAILURE notifying child 3980 Feb 12 10:15:05 sogod [3820]: <0x0x2adc8c76e738[WOWatchDogChild]> sending terminate signal to pid 3980 Feb 12 10:15:06 sogod [3820]: [WARN] <0x0x2adc8c7cb5d8[WOWatchDogChild]> pid 3953 has been hanging in the same request for 5 minutes Feb 12 10:15:33 sogod [3957]: 134.34.1.247 "POST /SOGo/so/pop15959/Mail/0/folderINBOX/unseenCount HTTP/1.0" 200 13/0 25.804 - - 4M Feb 12 10:15:37 sogod [3980]: 134.34.1.247 "POST /SOGo/so/pop15959/Mail/0/folderINBOX/uids HTTP/1.0" 200 938/92 34.141 2883 67% 5M Feb 12 10:15:46 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 3958 has been hanging in the same request for 10 minutes Feb 12 10:15:56 sogod [3820]: <0x0x2adc8c76e738[WOWatchDogChild]> child 3980 exited Feb 12 10:15:56 sogod [3820]: <0x0x2adc8c574b38[WOWatchDog]> child spawned with pid 21424 Feb 12 10:16:01 sogod [3820]: [WARN] <0x0x2adc8c7cb5d8[WOWatchDogChild]> pid 3953 has been hanging in the same request for 6 minutes Feb 12 10:16:08 sogod [3820]: [WARN] <0x0x2adc8c775fb8[WOWatchDogChild]> pid 3976 has been hanging in the same request for 1 minutes Feb 12 10:16:46 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 3958 has been hanging in the same request for 11 minutes Feb 12 10:17:06 sogod [3820]: [WARN] <0x0x2adc8c7cb5d8[WOWatchDogChild]> pid 3953 has been hanging in the same request for 7 minutes Feb 12 10:17:11 sogod [3820]: [WARN] <0x0x2adc8c775fb8[WOWatchDogChild]> pid 3976 has been hanging in the same request for 2 minutes Feb 12 10:17:49 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 3958 has been hanging in the same request for 12 minutes Feb 12 10:18:01 sogod [3820]: [WARN] <0x0x2adc8c7cb5d8[WOWatchDogChild]> pid 3953 has been hanging in the same request for 8 minutes Feb 12 
10:18:08 sogod [3820]: [WARN] <0x0x2adc8c775fb8[WOWatchDogChild]> pid 3976 has been hanging in the same request for 3 minutes Feb 12 10:18:45 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 3958 has been hanging in the same request for 13 minutes Feb 12 10:19:01 sogod [3820]: [WARN] <0x0x2adc8c7cb5d8[WOWatchDogChild]> pid 3953 has been hanging in the same request for 9 minutes Feb 12 10:19:08 sogod [3820]: [WARN] <0x0x2adc8c775fb8[WOWatchDogChild]> pid 3976 has been hanging in the same request for 4 minutes Feb 12 10:19:45 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 3958 has been hanging in the same request for 14 minutes Feb 12 10:20:10 sogod [3820]: [ERROR] <0x0x2adc8c76e738[WOWatchDogChild]> FAILURE receiving status for child 21424 Feb 12 10:20:10 sogod [3820]: [ERROR] <0x0x2adc8c76e738[WOWatchDogChild]> socket: <NGActiveSocket[0x0x2adc8c787e88]: mode=rw address=(null) connectedTo=<0x0x2adc8c70ebf8[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s> Feb 12 10:20:13 sogod [3820]: [ERROR] <0x0x2adc8c76e738[WOWatchDogChild]> exception: <NGSocketTimedOutException: 0x2adc8c7bc2a8> NAME:NGSocketTimedOutException REASON:the socket was shutdown INFO:{errno = 11; error = "Resource temporarily unavailable"; stream = "{object = <NGActiveSocket[0x0x2adc8c787e88]: mode=rw address=(null) connectedTo=<0x0x2adc8c70ebf8[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s>;}"; } Feb 12 10:20:13 sogod [3820]: <0x0x2adc8c76e738[WOWatchDogChild]> sending terminate signal to pid 21424 Feb 12 10:20:13 sogod [3820]: [ERROR] <0x0x2adc8c76e738[WOWatchDogChild]> FAILURE notifying child 21424 Feb 12 10:20:13 sogod [3820]: <0x0x2adc8c76e738[WOWatchDogChild]> sending terminate signal to pid 21424 Feb 12 10:20:15 sogod [3820]: [WARN] <0x0x2adc8c7cb5d8[WOWatchDogChild]> pid 3953 has been hanging in the same request for 10 minutes Feb 12 10:20:15 sogod [3820]: [WARN] <0x0x2adc8c775fb8[WOWatchDogChild]> pid 3976 has been hanging in the same request for 5 minutes Feb 12 10:20:31 sogod [21424]: <0x0x2adc8ca37c98[SOGoCache]> Cache cleanup interval set every 900.000000 seconds Feb 12 10:20:31 sogod [21424]: <0x0x2adc8ca37c98[SOGoCache]> Using host(s) 'localhost' as server(s) Feb 12 10:20:48 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> safety belt -- sending KILL signal to pid 3958 Feb 12 10:20:50 sogod [3820]: <0x0x2adc8c7954f8[WOWatchDogChild]> child 3958 exited Feb 12 10:20:50 sogod [3820]: <0x0x2adc8c7954f8[WOWatchDogChild]> (terminated due to signal 9) Feb 12 10:20:50 sogod [3820]: <0x0x2adc8c574b38[WOWatchDog]> child spawned with pid 21466 Feb 12 10:21:01 sogod [3820]: [WARN] <0x0x2adc8c7cb5d8[WOWatchDogChild]> pid 3953 has been hanging in the same request for 11 minutes Feb 12 10:21:08 sogod [3820]: [WARN] <0x0x2adc8c775fb8[WOWatchDogChild]> pid 3976 has been hanging in the same request for 6 minutes Feb 12 10:21:15 sogod [3820]: [WARN] <0x0x2adc8c76e738[WOWatchDogChild]> pid 21424 has been hanging in the same request for 1 minutes Feb 12 10:22:03 sogod [3820]: [WARN] <0x0x2adc8c7cb5d8[WOWatchDogChild]> pid 3953 has been hanging in the same request for 12 minutes Feb 12 10:22:08 sogod [3820]: [WARN] <0x0x2adc8c775fb8[WOWatchDogChild]> pid 3976 has been hanging in the same request for 7 minutes Feb 12 10:22:15 sogod [3820]: [WARN] <0x0x2adc8c76e738[WOWatchDogChild]> pid 21424 has been hanging in the same request for 2 minutes Feb 12 10:23:03 sogod [3820]: [WARN] <0x0x2adc8c7cb5d8[WOWatchDogChild]> pid 3953 has 
been hanging in the same request for 13 minutes Feb 12 10:23:08 sogod [3820]: [WARN] <0x0x2adc8c775fb8[WOWatchDogChild]> pid 3976 has been hanging in the same request for 8 minutes Feb 12 10:23:15 sogod [3820]: [WARN] <0x0x2adc8c76e738[WOWatchDogChild]> pid 21424 has been hanging in the same request for 3 minutes Feb 12 10:24:03 sogod [3820]: [WARN] <0x0x2adc8c7cb5d8[WOWatchDogChild]> pid 3953 has been hanging in the same request for 14 minutes Feb 12 10:24:10 sogod [3820]: [WARN] <0x0x2adc8c775fb8[WOWatchDogChild]> pid 3976 has been hanging in the same request for 9 minutes Feb 12 10:24:17 sogod [3820]: [WARN] <0x0x2adc8c76e738[WOWatchDogChild]> pid 21424 has been hanging in the same request for 4 minutes Feb 12 10:26:27 sogod [3820]: [ERROR] <0x0x2adc8c7954f8[WOWatchDogChild]> FAILURE receiving status for child 21466 Feb 12 10:26:32 sogod [3820]: [ERROR] <0x0x2adc8c7954f8[WOWatchDogChild]> socket: <NGActiveSocket[0x0x2adc8c7d8778]: mode=rw address=(null) connectedTo=<0x0x2adc8c7776e8[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s> Feb 12 10:26:39 sogod [3820]: [ERROR] <0x0x2adc8c7954f8[WOWatchDogChild]> exception: <NGSocketTimedOutException: 0x2adc8c7cff38> NAME:NGSocketTimedOutException REASON:the socket was shutdown INFO:{errno = 11; error = "Resource temporarily unavailable"; stream = "{object = <NGActiveSocket[0x0x2adc8c7d8778]: mode=rw address=(null) connectedTo=<0x0x2adc8c7776e8[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s>;}"; } Feb 12 10:26:39 sogod [3820]: <0x0x2adc8c7954f8[WOWatchDogChild]> sending terminate signal to pid 21466 Feb 12 10:26:39 sogod [3820]: [ERROR] <0x0x2adc8c7954f8[WOWatchDogChild]> FAILURE notifying child 21466 Feb 12 10:26:39 sogod [3820]: <0x0x2adc8c7954f8[WOWatchDogChild]> sending terminate signal to pid 21466 Feb 12 10:26:40 sogod [3820]: [WARN] <0x0x2adc8c7cb5d8[WOWatchDogChild]> safety belt -- sending KILL signal to pid 3953 Feb 12 10:26:40 sogod [3820]: [WARN] <0x0x2adc8c775fb8[WOWatchDogChild]> pid 3976 has been hanging in the same request for 10 minutes Feb 12 10:26:42 sogod [3820]: [ERROR] <0x0x2adc8c76e738[WOWatchDogChild]> FAILURE receiving status for child 21424 Feb 12 10:26:42 sogod [3820]: [ERROR] <0x0x2adc8c76e738[WOWatchDogChild]> socket: <NGActiveSocket[0x0x2adc8c787e88]: mode=rw address=(null) connectedTo=<0x0x2adc8c70ebf8[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s> Feb 12 10:26:43 sogod [3820]: [ERROR] <0x0x2adc8c76e738[WOWatchDogChild]> exception: <NGSocketTimedOutException: 0x2adc8c7cffb8> NAME:NGSocketTimedOutException REASON:the socket was shutdown INFO:{errno = 11; error = "Resource temporarily unavailable"; stream = "{object = <NGActiveSocket[0x0x2adc8c787e88]: mode=rw address=(null) connectedTo=<0x0x2adc8c70ebf8[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s>;}"; } Feb 12 10:26:43 sogod [3820]: <0x0x2adc8c76e738[WOWatchDogChild]> sending terminate signal to pid 21424 Feb 12 10:26:43 sogod [3820]: [ERROR] <0x0x2adc8c76e738[WOWatchDogChild]> FAILURE notifying child 21424 Feb 12 10:26:43 sogod [3820]: <0x0x2adc8c76e738[WOWatchDogChild]> sending terminate signal to pid 21424 Feb 12 10:26:43 sogod [3820]: <0x0x2adc8c7cb5d8[WOWatchDogChild]> child 3953 exited Feb 12 10:26:43 sogod [3820]: <0x0x2adc8c7cb5d8[WOWatchDogChild]> (terminated due to signal 9) Feb 12 10:26:44 sogod [3820]: <0x0x2adc8c574b38[WOWatchDog]> child spawned with pid 21509 Feb 12 10:26:57 sogod [21466]: 
<0x0x2adc8ca13a58[SOGoCache]> Cache cleanup interval set every 900.000000 seconds Feb 12 10:26:57 sogod [21466]: <0x0x2adc8ca13a58[SOGoCache]> Using host(s) 'localhost' as server(s) Feb 12 10:27:01 sogod [3957]: 134.34.1.247 "POST /SOGo/so/pop15959/Mail/0/folderINBOX/unseenCount HTTP/1.0" 200 13/0 15.502 - - 3M Feb 12 10:27:08 sogod [3820]: [WARN] <0x0x2adc8c775fb8[WOWatchDogChild]> pid 3976 has been hanging in the same request for 11 minutes Feb 12 10:27:12 sogod [21466]: [ERROR] <0x0x2adc8c884818[WOHttpTransaction]> client disconnected during delivery of response for <WORequest[0x0x2adc8ca01e18]: method=POST uri=/SOGo/so/pop15959/Mail/0/folderINBOX/uids app=SOGo rqKey=so rqPath=pop15959/Mail/0/folderINBOX/uids> (len=938): the socket was shutdown Feb 12 10:27:12 sogod [21466]: 134.34.1.247 "POST /SOGo/so/pop15959/Mail/0/folderINBOX/uids HTTP/1.0" 200 938/92 24.313 2883 67% 5M Feb 12 10:27:18 sogod [3820]: <0x0x2adc8c7954f8[WOWatchDogChild]> child 21466 exited Feb 12 10:27:18 sogod [3820]: <0x0x2adc8c574b38[WOWatchDog]> child spawned with pid 21518 Feb 12 10:27:48 sogod [3820]: [WARN] <0x0x2adc8c7923b8[WOWatchDogChild]> pid 3963 has been hanging in the same request for 1 minutes Feb 12 10:28:12 sogod [3820]: [WARN] <0x0x2adc8c775fb8[WOWatchDogChild]> pid 3976 has been hanging in the same request for 12 minutes Feb 12 10:28:51 sogod [3820]: [WARN] <0x0x2adc8c7923b8[WOWatchDogChild]> pid 3963 has been hanging in the same request for 2 minutes Feb 12 10:29:08 sogod [3820]: [WARN] <0x0x2adc8c775fb8[WOWatchDogChild]> pid 3976 has been hanging in the same request for 13 minutes Feb 12 10:29:49 sogod [3820]: [WARN] <0x0x2adc8c7923b8[WOWatchDogChild]> pid 3963 has been hanging in the same request for 3 minutes Feb 12 10:30:06 sogod [3820]: [ERROR] <0x0x2adc8c7954f8[WOWatchDogChild]> FAILURE receiving status for child 21518 Feb 12 10:30:06 sogod [3820]: [ERROR] <0x0x2adc8c7954f8[WOWatchDogChild]> socket: <NGActiveSocket[0x0x2adc8c7b53b8]: mode=rw address=(null) connectedTo=<0x0x2adc8c7926f8[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s> Feb 12 10:30:08 sogod [3820]: [ERROR] <0x0x2adc8c7954f8[WOWatchDogChild]> exception: <NGSocketTimedOutException: 0x2adc8c761ad8> NAME:NGSocketTimedOutException REASON:the socket was shutdown INFO:{errno = 11; error = "Resource temporarily unavailable"; stream = "{object = <NGActiveSocket[0x0x2adc8c7b53b8]: mode=rw address=(null) connectedTo=<0x0x2adc8c7926f8[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s>;}"; } Feb 12 10:30:08 sogod [3820]: <0x0x2adc8c7954f8[WOWatchDogChild]> sending terminate signal to pid 21518 Feb 12 10:30:08 sogod [3820]: [ERROR] <0x0x2adc8c7954f8[WOWatchDogChild]> FAILURE notifying child 21518 Feb 12 10:30:08 sogod [3820]: <0x0x2adc8c7954f8[WOWatchDogChild]> sending terminate signal to pid 21518 Feb 12 10:30:08 sogod [3820]: [WARN] <0x0x2adc8c775fb8[WOWatchDogChild]> pid 3976 has been hanging in the same request for 14 minutes Feb 12 10:30:10 sogod [3820]: [ERROR] <0x0x2adc8c7954f8[WOWatchDogChild]> FAILURE receiving status for child 21518 Feb 12 10:30:10 sogod [3820]: [ERROR] <0x0x2adc8c7954f8[WOWatchDogChild]> socket: <NGActiveSocket[0x0x2adc8c7b53b8]: mode=rw address=(null) connectedTo=<0x0x2adc8c7926f8[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s> Feb 12 10:30:10 sogod [3820]: [ERROR] <0x0x2adc8c7954f8[WOWatchDogChild]> exception: <NGSocketTimedOutException: 0x2adc8c761b58> NAME:NGSocketTimedOutException REASON:the socket 
was shutdown INFO:{errno = 11; error = "Resource temporarily unavailable"; stream = "{object = <NGActiveSocket[0x0x2adc8c7b53b8]: mode=rw address=(null) connectedTo=<0x0x2adc8c7926f8[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s>;}"; } Feb 12 10:30:11 sogod [3820]: <0x0x2adc8c7954f8[WOWatchDogChild]> sending terminate signal to pid 21518 Feb 12 10:30:11 sogod [3820]: [ERROR] <0x0x2adc8c7954f8[WOWatchDogChild]> FAILURE notifying child 21518 Feb 12 10:30:11 sogod [3820]: <0x0x2adc8c7954f8[WOWatchDogChild]> sending terminate signal to pid 21518 Feb 12 10:30:24 sogod [21518]: <0x0x2adc8ca379a8[SOGoCache]> Cache cleanup interval set every 900.000000 seconds Feb 12 10:30:25 sogod [21518]: <0x0x2adc8ca379a8[SOGoCache]> Using host(s) 'localhost' as server(s) Feb 12 10:30:46 sogod [3820]: [WARN] <0x0x2adc8c7923b8[WOWatchDogChild]> pid 3963 has been hanging in the same request for 4 minutes Feb 12 10:31:08 sogod [3820]: [WARN] <0x0x2adc8c775fb8[WOWatchDogChild]> safety belt -- sending KILL signal to pid 3976 Feb 12 10:31:09 sogod [3820]: <0x0x2adc8c775fb8[WOWatchDogChild]> child 3976 exited Feb 12 10:31:09 sogod [3820]: <0x0x2adc8c775fb8[WOWatchDogChild]> (terminated due to signal 9) Feb 12 10:31:09 sogod [3820]: <0x0x2adc8c574b38[WOWatchDog]> child spawned with pid 21555 Feb 12 10:31:11 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 21518 has been hanging in the same request for 1 minutes Feb 12 10:31:51 sogod [3820]: [WARN] <0x0x2adc8c7923b8[WOWatchDogChild]> pid 3963 has been hanging in the same request for 5 minutes Feb 12 10:32:16 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 21518 has been hanging in the same request for 2 minutes Feb 12 10:33:02 sogod [3820]: [WARN] <0x0x2adc8c7923b8[WOWatchDogChild]> pid 3963 has been hanging in the same request for 6 minutes Feb 12 10:33:11 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 21518 has been hanging in the same request for 3 minutes Feb 12 10:33:51 sogod [3820]: [WARN] <0x0x2adc8c7923b8[WOWatchDogChild]> pid 3963 has been hanging in the same request for 7 minutes Feb 12 10:34:14 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 21518 has been hanging in the same request for 4 minutes Feb 12 10:34:46 sogod [3820]: [WARN] <0x0x2adc8c7923b8[WOWatchDogChild]> pid 3963 has been hanging in the same request for 8 minutes Feb 12 10:34:58 sogod [3820]: [ERROR] <0x0x2adc8c775fb8[WOWatchDogChild]> FAILURE receiving status for child 21555 Feb 12 10:35:00 sogod [3820]: [ERROR] <0x0x2adc8c775fb8[WOWatchDogChild]> socket: <NGActiveSocket[0x0x2adc8c86bc48]: mode=rw address=(null) connectedTo=<0x0x2adc8c6d6fb8[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s> Feb 12 10:35:06 sogod [3820]: [ERROR] <0x0x2adc8c775fb8[WOWatchDogChild]> exception: <NGSocketTimedOutException: 0x2adc8c76e408> NAME:NGSocketTimedOutException REASON:the socket was shutdown INFO:{errno = 11; error = "Resource temporarily unavailable"; stream = "{object = <NGActiveSocket[0x0x2adc8c86bc48]: mode=rw address=(null) connectedTo=<0x0x2adc8c6d6fb8[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s>;}"; } Feb 12 10:35:06 sogod [3820]: <0x0x2adc8c775fb8[WOWatchDogChild]> sending terminate signal to pid 21555 Feb 12 10:35:06 sogod [3820]: [ERROR] <0x0x2adc8c775fb8[WOWatchDogChild]> FAILURE notifying child 21555 Feb 12 10:35:06 sogod [3820]: <0x0x2adc8c775fb8[WOWatchDogChild]> sending terminate signal to pid 21555 Feb 12 10:35:08 sogod 
[3820]: [ERROR] <0x0x2adc8c7cb5d8[WOWatchDogChild]> FAILURE receiving status for child 21509 Feb 12 10:35:08 sogod [3820]: [ERROR] <0x0x2adc8c7cb5d8[WOWatchDogChild]> socket: <NGActiveSocket[0x0x2adc8c7c6ce8]: mode=rw address=(null) connectedTo=<0x0x2adc8c6e2b78[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s> Feb 12 10:35:09 sogod [3820]: [ERROR] <0x0x2adc8c7cb5d8[WOWatchDogChild]> exception: <NGSocketTimedOutException: 0x2adc8c76e488> NAME:NGSocketTimedOutException REASON:the socket was shutdown INFO:{errno = 11; error = "Resource temporarily unavailable"; stream = "{object = <NGActiveSocket[0x0x2adc8c7c6ce8]: mode=rw address=(null) connectedTo=<0x0x2adc8c6e2b78[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s>;}"; } Feb 12 10:35:09 sogod [3820]: <0x0x2adc8c7cb5d8[WOWatchDogChild]> sending terminate signal to pid 21509 Feb 12 10:35:09 sogod [3820]: [ERROR] <0x0x2adc8c7cb5d8[WOWatchDogChild]> FAILURE notifying child 21509 Feb 12 10:35:09 sogod [3820]: <0x0x2adc8c7cb5d8[WOWatchDogChild]> sending terminate signal to pid 21509 Feb 12 10:35:11 sogod [3820]: [ERROR] <0x0x2adc8c7ab458[WOWatchDogChild]> FAILURE receiving status for child 3957 Feb 12 10:35:11 sogod [3820]: [ERROR] <0x0x2adc8c7ab458[WOWatchDogChild]> socket: <NGActiveSocket[0x0x2adc8c830898]: mode=rw address=(null) connectedTo=<0x0x2adc8c77ccc8[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s> Feb 12 10:35:11 sogod [3820]: [ERROR] <0x0x2adc8c7ab458[WOWatchDogChild]> exception: <NGSocketTimedOutException: 0x2adc8c759c18> NAME:NGSocketTimedOutException REASON:the socket was shutdown INFO:{errno = 11; error = "Resource temporarily unavailable"; stream = "{object = <NGActiveSocket[0x0x2adc8c830898]: mode=rw address=(null) connectedTo=<0x0x2adc8c77ccc8[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s>;}"; } Feb 12 10:35:11 sogod [3820]: <0x0x2adc8c7ab458[WOWatchDogChild]> sending terminate signal to pid 3957 Feb 12 10:35:11 sogod [3820]: [ERROR] <0x0x2adc8c7ab458[WOWatchDogChild]> FAILURE notifying child 3957 Feb 12 10:35:11 sogod [3820]: <0x0x2adc8c7ab458[WOWatchDogChild]> sending terminate signal to pid 3957 Feb 12 10:35:11 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 21518 has been hanging in the same request for 5 minutes Feb 12 10:35:13 sogod [3820]: [ERROR] <0x0x2adc8c77a928[WOWatchDogChild]> FAILURE receiving status for child 3965 Feb 12 10:35:13 sogod [3820]: [ERROR] <0x0x2adc8c77a928[WOWatchDogChild]> socket: <NGActiveSocket[0x0x2adc8c77c6b8]: mode=rw address=(null) connectedTo=<0x0x2adc8c759448[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s> Feb 12 10:35:13 sogod [3820]: [ERROR] <0x0x2adc8c77a928[WOWatchDogChild]> exception: <NGSocketTimedOutException: 0x2adc8c79ba18> NAME:NGSocketTimedOutException REASON:the socket was shutdown INFO:{errno = 11; error = "Resource temporarily unavailable"; stream = "{object = <NGActiveSocket[0x0x2adc8c77c6b8]: mode=rw address=(null) connectedTo=<0x0x2adc8c759448[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s>;}"; } Feb 12 10:35:13 sogod [3820]: <0x0x2adc8c77a928[WOWatchDogChild]> sending terminate signal to pid 3965 Feb 12 10:35:13 sogod [3820]: [ERROR] <0x0x2adc8c77a928[WOWatchDogChild]> FAILURE notifying child 3965 Feb 12 10:35:13 sogod [3820]: <0x0x2adc8c77a928[WOWatchDogChild]> sending terminate signal to pid 3965 Feb 12 10:35:26 sogod [21509]: 
<0x0x2adc8c9d79f8[SOGoCache]> Cache cleanup interval set every 900.000000 seconds Feb 12 10:35:26 sogod [21509]: <0x0x2adc8c9d79f8[SOGoCache]> Using host(s) 'localhost' as server(s) Feb 12 10:35:26 sogod [21555]: <0x0x2adc8ca31748[SOGoCache]> Cache cleanup interval set every 900.000000 seconds Feb 12 10:35:27 sogod [21555]: <0x0x2adc8ca31748[SOGoCache]> Using host(s) 'localhost' as server(s) Feb 12 10:35:46 sogod [3820]: [WARN] <0x0x2adc8c7923b8[WOWatchDogChild]> pid 3963 has been hanging in the same request for 9 minutes Feb 12 10:36:09 sogod [3820]: [WARN] <0x0x2adc8c775fb8[WOWatchDogChild]> pid 21555 has been hanging in the same request for 1 minutes Feb 12 10:36:11 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 21518 has been hanging in the same request for 6 minutes Feb 12 10:36:13 sogod [3820]: [WARN] <0x0x2adc8c7cb5d8[WOWatchDogChild]> pid 21509 has been hanging in the same request for 1 minutes Feb 12 10:36:14 sogod [3820]: [WARN] <0x0x2adc8c7ab458[WOWatchDogChild]> pid 3957 has been hanging in the same request for 1 minutes Feb 12 10:36:31 sogod [21509]: 134.34.1.247 "POST /SOGo/so/pop15959/Mail/0/folderINBOX/unseenCount HTTP/1.0" 200 13/0 74.708 - - 6M Feb 12 10:36:33 sogod [21555]: [ERROR] <0x0x2adc8ca032e8[WOHttpTransaction]> client disconnected during delivery of response for <WORequest[0x0x2adc8ca1f038]: method=POST uri=/SOGo/so/pop15959/Mail/0/folderINBOX/uids app=SOGo rqKey=so rqPath=pop15959/Mail/0/folderINBOX/uids> (len=938): the socket was shutdown Feb 12 10:36:34 sogod [21555]: 134.34.1.247 "POST /SOGo/so/pop15959/Mail/0/folderINBOX/uids HTTP/1.0" 200 938/92 88.569 2883 67% 6M Feb 12 10:36:48 sogod [3820]: [WARN] <0x0x2adc8c7923b8[WOWatchDogChild]> pid 3963 has been hanging in the same request for 10 minutes Feb 12 10:36:55 sogod [3820]: <0x0x2adc8c7cb5d8[WOWatchDogChild]> child 21509 exited Feb 12 10:36:55 sogod [3820]: <0x0x2adc8c775fb8[WOWatchDogChild]> child 21555 exited Feb 12 10:36:57 sogod [3820]: <0x0x2adc8c574b38[WOWatchDog]> child spawned with pid 21592 Feb 12 10:36:58 sogod [3820]: <0x0x2adc8c574b38[WOWatchDog]> child spawned with pid 21593 Feb 12 10:37:11 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 21518 has been hanging in the same request for 7 minutes Feb 12 10:37:14 sogod [3820]: [WARN] <0x0x2adc8c7ab458[WOWatchDogChild]> pid 3957 has been hanging in the same request for 2 minutes Feb 12 10:37:47 sogod [3820]: [WARN] <0x0x2adc8c7923b8[WOWatchDogChild]> pid 3963 has been hanging in the same request for 11 minutes Feb 12 10:38:11 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 21518 has been hanging in the same request for 8 minutes Feb 12 10:38:14 sogod [3820]: [WARN] <0x0x2adc8c7ab458[WOWatchDogChild]> pid 3957 has been hanging in the same request for 3 minutes Feb 12 10:38:47 sogod [3820]: [WARN] <0x0x2adc8c7923b8[WOWatchDogChild]> pid 3963 has been hanging in the same request for 12 minutes Feb 12 10:39:11 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 21518 has been hanging in the same request for 9 minutes Feb 12 10:39:14 sogod [3820]: [WARN] <0x0x2adc8c7ab458[WOWatchDogChild]> pid 3957 has been hanging in the same request for 4 minutes Feb 12 10:39:50 sogod [3820]: [WARN] <0x0x2adc8c7923b8[WOWatchDogChild]> pid 3963 has been hanging in the same request for 13 minutes Feb 12 10:40:03 sogod [3820]: [ERROR] <0x0x2adc8c7cb5d8[WOWatchDogChild]> FAILURE receiving status for child 21593 Feb 12 10:40:05 sogod [3820]: [ERROR] <0x0x2adc8c7cb5d8[WOWatchDogChild]> socket: 
<NGActiveSocket[0x0x2adc8c7b6bf8]: mode=rw address=(null) connectedTo=<0x0x2adc8c7341c8[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s> Feb 12 10:40:21 sogod [3820]: [ERROR] <0x0x2adc8c7cb5d8[WOWatchDogChild]> exception: <NGSocketTimedOutException: 0x2adc8c9001c8> NAME:NGSocketTimedOutException REASON:the socket was shutdown INFO:{errno = 11; error = "Resource temporarily unavailable"; stream = "{object = <NGActiveSocket[0x0x2adc8c7b6bf8]: mode=rw address=(null) connectedTo=<0x0x2adc8c7341c8[NGLocalSocketAddress]: /tmp/_ngsocket_3820_0x2adc8c535df8_000> receive-timeout=1.000s>;}"; } Feb 12 10:40:21 sogod [3820]: <0x0x2adc8c7cb5d8[WOWatchDogChild]> sending terminate signal to pid 21593 Feb 12 10:40:21 sogod [3820]: [ERROR] <0x0x2adc8c7cb5d8[WOWatchDogChild]> FAILURE notifying child 21593 Feb 12 10:40:22 sogod [3820]: <0x0x2adc8c7cb5d8[WOWatchDogChild]> sending terminate signal to pid 21593 Feb 12 10:40:22 sogod [3820]: [WARN] <0x0x2adc8c7954f8[WOWatchDogChild]> pid 21518 has been hanging in the same request for 10 minutes Feb 12 10:40:22 sogod [3820]: [WARN] <0x0x2adc8c7ab458[WOWatchDogChild]> pid 3957 has been hanging in the same request for 5 minutes |
|
I reproduced the issue -- uploading a single 1GB file. I uploaded a log excerpt starting from the time I uploaded the attachment. Interestingly, one process after the other gets occupied and dies after some time -- without the attachment being uploaded again. I'm sorry I forgot to put timestamps near the top outputs, but they're chosen at arbitrary times anyway (not related to any special events in the log). At least they are in order, and one can correlate what's going on using the process IDs. |
|
After restarting, the problem occurred again. There were leftover temporary files like the one below; deleting them and starting the SOGo service again finally resolved the issue (until uploading a large attachment again).
-rw------- 1 sogo sogo 1,4G 12. Feb 10:12 OGo0F7156BDA1E501.tmp |
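Not from the report, just a cleanup sketch: assuming the upload spool files really live in /tmp and follow the OGo*.tmp pattern shown above, leftover ones could be listed and aged out with find -- verify the directory, owner and age threshold before deleting anything.

    # list spool files owned by sogo that are older than two hours
    find /tmp -maxdepth 1 -user sogo -name 'OGo*.tmp' -mmin +120 -ls
    # same selection, but delete the files
    find /tmp -maxdepth 1 -user sogo -name 'OGo*.tmp' -mmin +120 -delete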
|
If you try to reproduce the issue on a very fast LAN connection to the SOGo server, does that work? I guess you have the issue when a user uploads a very large file and the transfer takes longer than the watchdog timeout. |
|
The connection speed is not an issue; we're connected with Fast Ethernet (and in some places already Gigabit). The 1GB attachment was completely uploaded in about a minute, and then the upload failed with an error message in the browser saying that the message could not be stored. The same happens when uploading smaller chunks that are transmitted in a few seconds, but you need to upload more of them for notable memory consumption. |
|
In the trace above, we have:
Feb 09 13:28:41 sogod [22771]: [ERROR] <0x0x2b540d5d8778[WOHttpTransaction]> client disconnected during delivery of response for <WORequest[0x0x2b540d633f98]: method=POST uri=/SOGo/so/pop15959/Mail/0/folderDrafts/newDraft1455020593-1/save app=SOGo rqKey=so rqPath=pop15959/Mail/0/folderDrafts/newDraft1455020593-1/save> (len=382): the socket was shutdown
Do you have a proxy in front of sogod that could drop the connection? |
|
We have only the Apache reverse proxy in front of it (on another machine), which handles HTTPS, delivers the static content and does the usual reverse-proxy rewrites. |
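As an aside, not something from this thread: an additional safety net could be to cap request bodies at that reverse proxy with Apache's LimitRequestBody directive. The path and the ~100MB value below are only examples, and whether the limit is actually enforced for proxied bodies depends on the Apache version, so it would need testing.

    # hypothetical excerpt from the reverse proxy vhost
    <Location /SOGo>
        # refuse request bodies larger than ~100 MB (value is in bytes)
        LimitRequestBody 104857600
    </Location>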
|
Would it be possible for me to have root access with gdb on the test machine and you reproduce the problem while I hook on the sogod processes with gdb? |
|
OK, I assume your IP and key from 2015-02-09 are still valid? But the security team has already left for the weekend, so the firewall cannot be opened before next week. |
|
I think I have a solution to propose to you:
1- We introduce a SOPE parameter to limit the POST/PUT size.
2- If SOPE sees it, it doesn't try to parse the multipart data being uploaded and returns a proper exception to the HTTP client.
3- The server process is immediately made available again at this point; no big memory consumption is required.
4- When the HTTP client (like SOGo's web interface) finishes the upload (into no man's land, really), it'll get the return code generated in step 2 and can tell the user "You've uploaded a file that is too big."
What do you think? |
|
What happens if:
A warning message for users trying to upload whole DVD images and wondering why it fails all the time (this has happened more than once!) would be a great thing, but I think the memory leak is a separate issue that also has to be addressed. If you have trouble sending large attachments for debugging (we have a decent internet connection at our university, so if your upstream is large enough, you should be fine anyway), feel free to get in touch with either of us (our mail addresses are also valid XMPP aliases) to attach one locally. |
|
To answer your questions:
1- SOPE exposes the Content-Length. Any HTTP 1.1 client should send it during a POST/PUT - otherwise, right now, SOPE will reject the request. HTTP 0.9 and 1.0 clients won't even work with SOGo anyway.
2- We are talking about setting a limit here to a fairly large value - say 100MB or even more. You want to protect the application from any large upload, not just file attachments, so that fix is important.
3- I don't understand your point about "too large to be sent to the IMAP server". If there's a limit there, it can be honoured in SOGo.
I've not been able to see if a memory leak is present - ALL the tests I have done showed no leak, just large memory consumption due to the data needing to be encoded here and there for multiple operations in SOGo. |
|
Your points regarding (1) and (2) seem legit; I didn't consider that this limit covers the whole request size. With respect to (3), users might hit message size limitations and/or quota rules when uploading a new mail as a draft/sent mail. This might be what triggers the "memory leak", whatever it is; at least it reproducibly occupies memory and does not release it again. Have a look at the top output and log I uploaded. I sent exactly one HTTP request uploading a 1GB file. This occupied multiple sogod workers, one after the other. All originally occupied ones were killed after some time, and others took over the job. I'm not sure this is exactly the same issue we see on the production machine (which loses a lot of memory over time that is not freed again, not even during low-traffic hours, until it starts swapping after a week or two and gets horribly slow). At least there's the indication that we observed a massive amount of memory lost after a user tried to upload a rather large file (and failed with it). |
|
I'll try to set a small quota (say 10MB) and upload a 50MB file. It should fail and expose the issue if any. Assuming there's an issue and it is fixed regarding the quota, we'll go ahead and implement the proposed solution which should fix the issue you are facing. |
|
I see no memory leak when trying to attach a file and going over quota. |
|
The patch (updated from the nightly repository, which already includes it) did not fully resolve the issue. I set the maximum allowed attachment size to 30MB. When uploading single large attachments, the issue indeed seems resolved (users trying to upload DVD images do not crash the system any more). When uploading a large number of 25MB attachments, the system still crashes. Some of the more interesting log lines:
Feb 17 11:35:01 sogod [14813]: [ERROR] <0x0x2b5c7eb34308[WOWatchDogChild]> exception: <NGSocketTimedOutException: 0x2b5c8a5e1468> NAME:NGSocketTimedOutException REASON:the socket was shutdown INFO:{errno = 11; error = "Die Ressource ist zur Zeit nicht verf\U00FCgbar"; stream = "{object = <NGActiveSocket[0x0x2b5c89c16928]: mode=rw address=(null) connectedTo=<0x0x2b5c7eaedcd8[NGLocalSocketAddress]: /tmp/_ngsocket_14813_0x2b5c85e0bd98_000> receive-timeout=1.000s>;}"; }
Feb 17 11:35:59 sogod [14843]: [ERROR] <0x0x2b5c7ede32f8[WOHttpTransaction]> client disconnected during delivery of response for <WORequest[0x0x2b5c8cbed1b8]: method=POST uri=/SOGo/so/pop15959/Mail/0/folderDrafts/newDraft1455705129-1/save app=SOGo rqKey=so rqPath=pop15959/Mail/0/folderDrafts/newDraft1455705129-1/save> (len=381): the socket was shutdown
With the machine deep in swap, I had to killall -9 the sogod processes and clean up /tmp/OGo* before restarting SOGo. |
|
Are you sure you updated SOPE? Because what you sent does not make sense. The worker should NOT be hanging for a few minutes; it should NOT try to read the content after seeing that the Content-Length is greater than 30MB. ALL my tests worked in that regard. |
|
I think that the upload limit is indeed working -- it blocks single, large attachments. It does not seem to work for multiple, smaller attachments -- each of them is small enough, but together they're still too large for a single mail (and there is some kind of memory leak). I'm not sure whether this is a "traditional" memory leak: cleaning up the /tmp/OGo* files sometimes seems to be required when restarting the service. Is it possible that SOGo picks those up automatically when restarted and tries to perform again whatever it failed at before (resulting in those files being stuck)? We installed both SOPE and SOGo from your nightly repository on our test server:
yum list sope
Loaded plugins: fastestmirror
|
|
I had another try with the 20160218 packages, problem still exists. |
|
Do you upload them one by one or all at the same time? What limit have you defined? |
|
The maximum size is still 30MB on the test server. I uploaded a bunch of 25MB files at the same time. |
|
The problem is that the message keeps growing when you do that. For example, if your WOMaxUploadSize is set to 25MB and you upload 20 files of 25MB (for 500MB), it'll work, because your browser will do 20 uploads of 25MB files. Now where it gets tricky is that the email message on the SOGo server will grow by 25MB each time an upload is completed - because the mail is saved to the IMAP Drafts folder every time an attachment upload completes. Now that takes memory, A LOT of it. A 500MB message will need at least 1 to 1.5GB of memory to be correctly saved on the IMAP server - that is due to the MIME generation of the message before it is saved to the IMAP server, and also to the save operation on the IMAP server itself. I suggest we add SOGoMaximumMessageSize - so that whenever that limit is reached, SOGo will prevent any further attachment additions. |
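Once both knobs exist, the per-upload and per-message limits discussed here could be combined in sogo.conf roughly as below -- a sketch only: the 30MB/50MB figures are illustrative, the parameter ended up being called SOGoMaximumMessageSizeLimit (see the changesets below), and as far as I know both values are expressed in kilobytes, which is worth double-checking against the installation guide.

    {
      /* reject any single POST/PUT larger than ~30 MB before the body is parsed */
      WOMaxUploadSize = 30720;
      /* refuse further attachments once the draft would exceed ~50 MB in total */
      SOGoMaximumMessageSizeLimit = 51200;
    }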
|
sogo: v2 32bb1456 2016-02-16 16:28 |
(feat) attachment size upload using WOMaxUploadSize (fixes 0003510 and 0003135) |
Affected Issues 0003510 |
|
mod - NEWS |
mod - UI/MailerUI/UIxMailEditor.m |
sogo: v2 3f68fb7b 2016-12-28 10:18 |
(feat) added new SOGoMaximumMessageSizeLimit config parameter (fixes 0003510) |
Affected Issues 0003510 |
|
mod - SoObjects/Mailer/SOGoDraftObject.m |
mod - SoObjects/SOGo/SOGoSystemDefaults.h |
mod - SoObjects/SOGo/SOGoSystemDefaults.m |
sogo: master 2cebee42 2016-12-28 10:18 |
(feat) added new SOGoMaximumMessageSizeLimit config parameter (fixes 0003510) Conflicts: SoObjects/Mailer/SOGoDraftObject.m |
Affected Issues 0003510 |
|
mod - Documentation/SOGoInstallationGuide.asciidoc |
mod - SoObjects/Mailer/SOGoDraftObject.m |
mod - SoObjects/SOGo/SOGoSystemDefaults.h |
mod - SoObjects/SOGo/SOGoSystemDefaults.m |
sogo: maint 81df371e 2016-12-28 10:18 Committer: francis |
(feat) added new SOGoMaximumMessageSizeLimit config parameter (fixes 0003510) |
Affected Issues 0003510 |
|
mod - SoObjects/Mailer/SOGoDraftObject.m |
mod - SoObjects/SOGo/SOGoSystemDefaults.h |
mod - SoObjects/SOGo/SOGoSystemDefaults.m |
Date Modified | Username | Field | Change |
---|---|---|---|
2016-02-09 13:18 | Jens Erat | New Issue | |
2016-02-09 13:18 | Jens Erat | File Added: zabbix-memory.png | |
2016-02-09 13:18 | Jens Erat | File Added: zabbix-memory-production.png | |
2016-02-10 13:40 | ludovic | Note Added: 0009488 | |
2016-02-10 14:14 | Jens Erat | Note Added: 0009489 | |
2016-02-12 09:11 | Christian Mack | Note Added: 0009514 | |
2016-02-12 09:48 | Jens Erat | File Added: sogod.log | |
2016-02-12 09:51 | Jens Erat | Note Added: 0009516 | |
2016-02-12 10:04 | Jens Erat | Note Added: 0009517 | |
2016-02-12 12:31 | ludovic | Note Added: 0009520 | |
2016-02-12 12:38 | Jens Erat | Note Added: 0009522 | |
2016-02-12 13:39 | ludovic | Note Added: 0009525 | |
2016-02-12 13:53 | Christian Mack | Note Added: 0009526 | |
2016-02-12 14:07 | ludovic | Note Added: 0009527 | |
2016-02-12 14:25 | Christian Mack | Note Added: 0009528 | |
2016-02-12 15:02 | ludovic | Note Added: 0009529 | |
2016-02-12 15:24 | Jens Erat | Note Added: 0009530 | |
2016-02-12 16:11 | ludovic | Note Added: 0009531 | |
2016-02-12 16:22 | Jens Erat | Note Added: 0009532 | |
2016-02-12 16:24 | ludovic | Note Added: 0009533 | |
2016-02-12 16:43 | ludovic | Note Added: 0009534 | |
2016-02-16 21:29 | ludovic | Changeset attached | => sogo v2 32bb1456 |
2016-02-16 21:29 | ludovic | Assigned To | => ludovic |
2016-02-16 21:29 | ludovic | Resolution | open => fixed |
2016-02-16 21:30 | ludovic | Status | new => resolved |
2016-02-16 21:30 | ludovic | Fixed in Version | => 2.3.9 |
2016-02-17 11:01 | Jens Erat | Note Added: 0009561 | |
2016-02-17 11:01 | Jens Erat | Status | resolved => feedback |
2016-02-17 11:01 | Jens Erat | Resolution | fixed => reopened |
2016-02-18 00:10 | ludovic | Note Added: 0009563 | |
2016-02-18 08:40 | Jens Erat | Note Added: 0009564 | |
2016-02-18 08:40 | Jens Erat | Status | feedback => assigned |
2016-02-18 09:14 | Jens Erat | Note Added: 0009565 | |
2016-02-19 01:50 | ludovic | Note Added: 0009567 | |
2016-02-23 09:45 | Jens Erat | Note Added: 0009581 | |
2016-12-28 13:03 | ludovic | Severity | crash => major |
2016-12-28 14:08 | ludovic | Note Added: 0011114 | |
2016-12-28 14:08 | ludovic | Severity | major => feature |
2016-12-28 15:19 | ludovic | Changeset attached | => sogo v2 3f68fb7b |
2016-12-28 15:24 | ludovic | Changeset attached | => sogo master 2cebee42 |
2016-12-28 15:26 | ludovic | Status | assigned => resolved |
2016-12-28 15:26 | ludovic | Fixed in Version | 2.3.9 => 3.2.5 |
2016-12-28 15:26 | ludovic | Resolution | reopened => fixed |
2017-01-09 20:49 | francis | Changeset attached | => sogo maint 81df371e |