Zimbra Times Out

csilvertooth
Posts: 7
Joined: Sun Jun 03, 2018 6:28 pm

Zimbra Times Out

Post by csilvertooth »

Hi Everyone,

I am losing my mind trying to work out the kinks in my Zimbra install. I have been having this intermittent issue for a long, long time. In summary, when my server is accessed through its public IP it will randomly stall/hang/time out, whatever you want to call it. While it is hung I can't access the web site at all. A simple curl command to my server will stall, yet I can ping the IP without issue. While the external IP connection is hung, I can still connect to the Zimbra server with the same email account on the internal network perfectly fine. It is maddening.

To give you an idea, here is what I have tried so far:

Completely rebuilt Zimbra on a new CentOS 7 server, multiple times
Opened the firewall completely
Rebooted my Cisco ASA firewalls
Changed from a Public -> NAT IP configuration to a direct IP through the DMZ

I have tried to mess with the DoS filter and some other settings in a variety of incarnations, to no avail. The only thing I haven't really tried is getting rid of the proxy stuff that Zimbra automagically sets up.

If anyone could point me in the right direction for capturing the specific logs, etc., that would be great. I want to say this all started happening around version 8 of ZCS. I never had issues like this until then.
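One way to pin down when the stalls happen is to log timed TCP connects from an outside machine, so the failures can be lined up with firewall and server logs. The following is only a minimal sketch, assuming Python 3 on the external client; the function names are made up for illustration and are not part of Zimbra or any tool mentioned in this thread.

```python
import socket
import time
from datetime import datetime


def probe(host, port, timeout=5.0):
    """Try one TCP connect; return (ok, seconds_elapsed)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        # Covers timeouts (handshake never completed) and refusals alike.
        return False, time.monotonic() - start


def format_result(host, port, ok, elapsed, now=None):
    """Build one timestamped log line for a probe result."""
    stamp = (now or datetime.now()).isoformat(timespec="seconds")
    status = "ok" if ok else "FAILED"
    return f"{stamp} {host}:{port} {status} {elapsed:.2f}s"


def watch(host, port, interval=10.0, count=None):
    """Probe repeatedly and print each result; count=None runs until interrupted."""
    done = 0
    while count is None or done < count:
        ok, elapsed = probe(host, port)
        print(format_result(host, port, ok, elapsed))
        done += 1
        if count is None or done < count:
            time.sleep(interval)
```

Something like watch("mail.example.com", 443) left running from home would print one line per attempt; the host name here is a placeholder. A connect that fails only after the full timeout while ping still answers points at the TCP handshake rather than at routing.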

My VM settings:
2 CPUs
8 GB RAM
30 GB disk for OS
50 GB disk for data

All flash disk

I have a whopping 7 users, so performance or contention should never really be an issue. My pipe to the internet in my colo is 500 Mb/s, and my home connection is 1 Gb/s.

Any hints, tips, questions, anything would help. I am at a loss.

Thanks,

Chris
Attachments
Screenshot 2018-06-03 11.26.37.png (20.17 KiB)
phoenix
Ambassador
Posts: 27278
Joined: Fri Sep 12, 2014 9:56 pm
Location: Liverpool, England

Re: Zimbra Times Out

Post by phoenix »

I'd suggest you take a look at the ESMTP TLS configuration of your firewall, as it can cause problems for mail servers: https://www.cisco.com/c/en/us/support/d ... .html#anc9
Regards

Bill

csilvertooth
Posts: 7
Joined: Sun Jun 03, 2018 6:28 pm

Re: Zimbra Times Out

Post by csilvertooth »

io-asa-primary(config)# policy-map global_policy
io-asa-primary(config-pmap)# class inspection_default
io-asa-primary(config-pmap-c)# no inspect esmtp
ERROR: Inspection not installed or parameters do not match
io-asa-primary(config-pmap-c)#


I am not inspecting ESMTP. Other thoughts?
csilvertooth
Posts: 7
Joined: Sun Jun 03, 2018 6:28 pm

Re: Zimbra Times Out

Post by csilvertooth »

phoenix wrote:I'd suggest you take a look at the ESMTP TLS Configuration of your firewall, it can cause problems for mail servers: https://www.cisco.com/c/en/us/support/d ... .html#anc9

Also, I should mention that my Zimbra setup includes EFA as the spam filter. The MX record points to the EFA appliance, which then transfers the mail to the Zimbra server.
csilvertooth
Posts: 7
Joined: Sun Jun 03, 2018 6:28 pm

Re: Zimbra Times Out

Post by csilvertooth »

A little more info. When I run sh conn port 443 on the Cisco ASA, I see this:

TCP outside homeip:51722 dmz-174 dmzipofzimbraserver:443, idle 0:00:01, bytes 0, flags SaAB

SaAB means this:

S - awaiting inside SYN
a - awaiting outside ACK to SYN
A - awaiting inside ACK to SYN
B - initial SYN from outside

I then go to my Zimbra server and the connection is hung. I can't click on anything without the little blue spinny thing of death appearing.

So the ASA has seen the client's SYN and is still waiting for the reply (the SYN/ACK) from the Zimbra server.
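To watch for this across many connections, the connection lines can be decoded mechanically. This is a rough sketch, assuming the line format quoted above; the table holds only the four flags listed in this post (the ASA defines many more), and the function names are hypothetical.

```python
import re

# Only the four flags quoted in this thread; the ASA defines many more.
ASA_CONN_FLAGS = {
    "S": "awaiting inside SYN",
    "a": "awaiting outside ACK to SYN",
    "A": "awaiting inside ACK to SYN",
    "B": "initial SYN from outside",
}

# Matches the tail of a connection line: "idle 0:00:01, bytes 0, flags SaAB"
CONN_TAIL = re.compile(r"idle\s+([\d:]+),\s+bytes\s+(\d+),\s+flags\s+(\S+)")


def parse_conn_line(line):
    """Extract idle time, byte count and flags; return None if no match."""
    match = CONN_TAIL.search(line)
    if match is None:
        return None
    idle, nbytes, flags = match.groups()
    return {"idle": idle, "bytes": int(nbytes), "flags": flags}


def decode_flags(flags):
    """Translate each flag letter, marking letters missing from the table."""
    return [ASA_CONN_FLAGS.get(ch, f"{ch}: not in this table") for ch in flags]


def handshake_incomplete(flags):
    """True while any 'awaiting' flag (S, a, A) is still set."""
    return any(ch in "SaA" for ch in flags)
```

A line that keeps showing bytes 0 together with an incomplete handshake is a connection where the three-way handshake never finished, which matches the hang seen in the browser.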
csilvertooth
Posts: 7
Joined: Sun Jun 03, 2018 6:28 pm

Re: Zimbra Times Out

Post by csilvertooth »

I think I may have found a solution.

I did some dumps of the packets traveling from an external client to the Zimbra server. I could see that the packet left the Cisco ASA and made it to the server. For whatever reason the server would not send the ACK back. This was random, of course: it would work fine 90% of the time, but then suddenly a client would just have this problem.

I had already applied some of the TCP tuning parameters Zimbra recommends, like the ones below.

net.ipv4.tcp_fin_timeout=15
net.ipv4.tcp_tw_reuse=1
net.ipv4.tcp_tw_recycle=1

What has stopped the problem so far was adding the following to sysctl.conf:

net.ipv4.tcp_timestamps=0

and applying it live with this command: sysctl -w net.ipv4.tcp_timestamps=0

I will let this run for a day and see if it crops up again. I'm not sure if this is related to CentOS, the VMXNET 3 adapter, the HTTP protocol, or what.
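A server that receives a SYN but never answers can be cross-checked against the kernel's own counters. The sketch below assumes a Linux /proc/net/netstat in its usual paired header/value layout, and assumes that the TcpExt counter PAWSPassive (which netstat -s reports as passive connections rejected because of time stamp) is the relevant one here; that link is an assumption, not something confirmed in this thread.

```python
def parse_netstat(text):
    """Parse /proc/net/netstat-style text into {prefix: {counter: value}}."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    result = {}
    # Rows come in pairs: a row of counter names, then a row of values.
    for names, values in zip(rows[0::2], rows[1::2]):
        prefix = names[0].rstrip(":")
        result[prefix] = dict(zip(names[1:], (int(v) for v in values[1:])))
    return result


def paws_passive(path="/proc/net/netstat"):
    """Return the PAWSPassive counter, or None if this kernel lacks it."""
    with open(path) as handle:
        stats = parse_netstat(handle.read())
    return stats.get("TcpExt", {}).get("PAWSPassive")
```

Calling paws_passive() before and after a stall and comparing the two numbers would show whether the kernel rejected incoming connections on timestamp grounds during that window.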
csilvertooth
Posts: 7
Joined: Sun Jun 03, 2018 6:28 pm

Re: Zimbra Times Out

Post by csilvertooth »

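So I reset tcp_timestamps back to normal: net.ipv4.tcp_timestamps=1

I then changed the reuse and recycle settings to 0, as having those at 1 causes issues for users behind a NAT IP, according to some research on the internet. As I understand it, with tcp_tw_recycle enabled the server compares TCP timestamps per source IP, and clients sharing one public address have different clocks, so some of their SYNs get silently dropped. That would also explain why turning timestamps off made the problem go away.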

I have had zero problems in 4 hours. Normally it only took 15 mins or so before the problems cropped up.

See this article for a better understanding of the issues that can arise from changing these settings:

https://vincent.bernat.im/en/blog/2014- ... tate-linux

I would advise against setting recycle and reuse to 1, which is the opposite of what Zimbra advises in their performance tuning guidelines. I think it would be good for someone from Zimbra to explain why anyone should set these, especially given what you read about those settings on the internet.

Chris
csilvertooth
Posts: 7
Joined: Sun Jun 03, 2018 6:28 pm

Re: Zimbra Times Out

Post by csilvertooth »

Update on this issue.

After making the changes listed in my last post, I still see no signs of the original problem. It would appear that with the current software, OS, and network this has been cured.

Final Configuration:
CentOS Linux release 7.5.1804
Zimbra 8.8
VMware ESXi 6.5
Open VM Tools - latest
Zextras (Latest)
LetsEncrypt for SSL

Cisco ASA 5520 (Border Router/Firewall in HA Configuration)
Catalyst 3750 Switch
HP DL 380 G9 Host Server - 1Gb NICs
EVA for SAN as well as Intel DC3600 2TB Flash

Sysctl.conf Parameters

net.ipv4.tcp_fin_timeout=15
net.ipv4.tcp_tw_reuse=0
net.ipv4.tcp_tw_recycle=0
net.ipv4.tcp_timestamps=1
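Since sysctl.conf is only read at boot or on reload, it can be worth confirming that the running kernel really has these values. A minimal sketch, assuming the standard /proc/sys layout on Linux; the helper names are invented for illustration.

```python
from pathlib import Path

# Values from the final configuration in this thread.
WANTED = {
    "net.ipv4.tcp_fin_timeout": "15",
    "net.ipv4.tcp_tw_reuse": "0",
    "net.ipv4.tcp_tw_recycle": "0",
    "net.ipv4.tcp_timestamps": "1",
}


def sysctl_path(key, root="/proc/sys"):
    """Map a dotted sysctl key to its file, e.g. net.ipv4.x -> root/net/ipv4/x."""
    return Path(root, *key.split("."))


def check(wanted=WANTED, root="/proc/sys"):
    """Return {key: (wanted, actual)} for every key that does not match.

    actual is None when the key does not exist at all; note that
    tcp_tw_recycle was removed from Linux in 4.12, so on newer kernels
    it will always show up here as missing.
    """
    report = {}
    for key, want in wanted.items():
        path = sysctl_path(key, root)
        actual = path.read_text().strip() if path.exists() else None
        if actual != want:
            report[key] = (want, actual)
    return report
```

An empty result from check() means the live settings match the list above; anything else shows the wanted and actual value side by side.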

If you have any other questions, let me know. Now on to the next problem: the web interface doesn't always update when emails are deleted on other devices.

Chris