Stopping zimlet webapp... takes 10-15 minutes

Discuss your pilot or production implementation with other Zimbra admins or our engineers.
L. Mark Stone
Elite member
Posts: 1888
Joined: Wed Oct 09, 2013 11:35 am
Location: Portland, Maine
ZCS/ZD Version: 8.8.10 Network Edition

Re: Stopping zimlet webapp... takes 10-15 minutes

Postby L. Mark Stone » Fri Sep 14, 2018 5:28 pm

PaperAdvocate wrote: Both files are the same except for two things: the order of the line items is different, and the value of innodb_buffer_pool_size is different; 4163895296 for the server with the delay and 2511535718 for the server that is fine.

The 4163895296 value was given to me by Zimbra support when I sent them my logs, so this is why it's different, but the issue was present prior to changing this value.

On both servers, innodb_max_dirty_pages_pct = 30.


Dunno, but the InnoDB Buffer Pool should ideally be at least 1.25x the size of the InnoDB databases. If the databases are bigger than the buffer pool, some portions of the databases get paged out to disk. If that's what's happening in your case, then MariaDB likely needs to pull that data in from swap and write it to disk before allowing itself to shut down, and that can add to the shutdown time (users will also notice periodic "stalls" in UI responsiveness).

You can use a tool like mysqltuner.pl to get the InnoDB database size, and you can also use the M suffix to set the pool size in megabytes rather than bytes. Much easier to read and less likely to cause a typo.

Here you can see one system where, in the fullness of time, with more and larger mailboxes, I eventually needed to increase the buffer pool to 7GB. A number of clients have buffer pools of 20GB or more, so it's a good thing to check periodically. Be sure to add more RAM to your server if needed, too!

Code:

zimbra@zimbra:~$ cat conf/my.cnf | grep -i innodb_buffer
# innodb_buffer_pool_size        = 5047721164
# innodb_buffer_pool_size        = 6144M
innodb_buffer_pool_size        = 7168M
zimbra@zimbra:~$
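
To turn the 1.25x rule of thumb above into a concrete number, here is a small shell sketch. The db_mb value and the 1 GB rounding step are illustrative assumptions; take the real InnoDB data size from mysqltuner.pl:

```shell
# Suggest a buffer pool >= 1.25x the InnoDB data size, rounded up to a 1 GB boundary.
# db_mb=5600 is a placeholder; substitute the data size mysqltuner.pl reports.
db_mb=5600
pool_mb=$(( (db_mb * 5 / 4 + 1023) / 1024 * 1024 ))
echo "innodb_buffer_pool_size = ${pool_mb}M"
```

With a 5600 MB data size this suggests 7168M, i.e. the 7GB setting shown above.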


Hope that helps,
Mark


___________________________________
L. Mark Stone
Mission Critical Email - Zimbra VAR/BSP/Training Partner https://www.missioncriticalemail.com/
Zeta Alliance http://www.zetalliance.org/
PaperAdvocate
Posts: 16
Joined: Tue Oct 11, 2016 9:28 pm

Re: Stopping zimlet webapp... takes 10-15 minutes

Postby PaperAdvocate » Fri Sep 14, 2018 6:55 pm

Thank you. I will check those values.

I haven't timed it precisely, but it seems to be timing out at 10 minutes, which is a common default timeout value. And there is a "timeout" listed in the shutdown process:

Code:

Apr  3 23:37:49 mail postfix/amavisd/smtpd[11477]: timeout after END-OF-MESSAGE from localhost[127.0.0.1]
Apr  3 23:37:49 mail postfix/amavisd/smtpd[11477]: disconnect from localhost[127.0.0.1] ehlo=1 mail=1 rcpt=1 data=1 commands=4


On researching "timeout after END-OF-MESSAGE from localhost" I found several posts linking the behavior to SpamAssassin or amavisd-new. So I looked through my conf folder for differences in those configs and found only that the server with the delay was missing the IPv6 address entries in @mynetworks. I will change this just in case and try again.
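
For reference, a @mynetworks line in amavisd.conf that includes the IPv6 loopback typically looks something like the fragment below. The RFC 1918 ranges are illustrative, and on Zimbra this file is generated from a template, so check how your build manages it before editing by hand:

```
@mynetworks = qw( 127.0.0.0/8 [::1] 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 );
```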

On the server with the timeout, there are additional SpamAssassin-related files that aren't present on the server without issues: a directory named ~/conf/sa with a file inside named salocal.cf, and, in the root of the conf folder, salocal.cf and salocal.cf.blacklist. I'm going to try to clean those up and see if that changes anything.

Re: Stopping zimlet webapp... takes 10-15 minutes

Postby L. Mark Stone » Sat Sep 15, 2018 12:24 am

FWIW I always remove every trace of IPv6 before deploying Zimbra on an operating system. When IPv6 is ubiquitous, I'll remove every trace of IPv4 before deploying Zimbra on an operating system.

In my experience, Zimbra works well on either an IPv4 OR an IPv6 system, but I have always found "something" (as Roseanne Roseannadanna would say...) when both are present on a Zimbra server.
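
A quick way to spot leftover IPv6 entries before a deployment is to scan the hosts file. check_hosts_ipv6 below is just a hypothetical helper sketch, not a Zimbra tool:

```shell
# Print any hosts-file line whose address field contains a colon (i.e., IPv6),
# and exit non-zero if none are found.
check_hosts_ipv6() {
    awk '$1 ~ /:/ { print; found = 1 } END { exit !found }' "$1"
}

# Demo against a sample file; on a real server, run it against /etc/hosts.
printf '127.0.0.1 localhost\n::1 localhost\n10.0.0.5 mail.example.com\n' > /tmp/hosts.sample
check_hosts_ipv6 /tmp/hosts.sample && echo "IPv6 entries found"
```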

Hope that helps,
Mark
___________________________________
L. Mark Stone
Mission Critical Email - Zimbra VAR/BSP/Training Partner https://www.missioncriticalemail.com/
Zeta Alliance http://www.zetalliance.org/
andrey.ivanov
Posts: 10
Joined: Wed Aug 08, 2018 8:44 am

Re: Stopping zimlet webapp... takes 10-15 minutes

Postby andrey.ivanov » Mon Sep 17, 2018 7:25 am

I've sometimes seen shutdowns that take a long time; usually it's because some connections are established from the proxy to the backend and are active or in the TIME_WAIT state. What I usually do to avoid it:
* stop the proxy and memcached services
* on the mailbox server, monitor the connections coming from the proxy using iptstate -t. I've found that connections in the TIME_WAIT state cause the shutdown delays (at least for me), so I wait until they disappear (usually about 2 minutes)
* then stop all the other Zimbra services

Using this sequence to stop the Zimbra mailbox server, I no longer have the delays I had in the past.
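
The sequence above can be sketched as a short runbook fragment. zmproxyctl, zmmemcachedctl, and zmcontrol are the standard Zimbra control commands, but the ss filter and the PROXY_IP placeholder are assumptions to adapt for your environment; this is an operational sketch against a live cluster, not something to run as-is:

```shell
# On the proxy node (as the zimbra user):
zmproxyctl stop
zmmemcachedctl stop   # wherever memcached runs

# On the mailbox server (as the zimbra user):
PROXY_IP=192.0.2.10   # placeholder: your proxy's address
# Wait for proxy connections (including TIME_WAIT) to drain,
# equivalent to watching them in iptstate -t.
while ss -tan | grep -q "$PROXY_IP"; do
    sleep 10
done
zmcontrol stop        # now stop the remaining services
```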

Re: Stopping zimlet webapp... takes 10-15 minutes

Postby PaperAdvocate » Fri Sep 21, 2018 9:52 pm

@ L. Mark Stone

I cleaned up the extra SpamAssassin conf files and other misc items to make the server match the one without the issue, without result. I set vm.swappiness=1 (as 0 now means disabling swap altogether), also without result. I ran mysqltuner.pl and got these results (which are the same as the results on the server without the shutdown delay):

Code:

[OK] InnoDB File per table is activated
[OK] InnoDB buffer pool / data size: 3.9G/661.9M
[OK] Ratio InnoDB log file size / InnoDB Buffer pool size: 500.0M * 2/3.9G should be equal 25%
[!!] InnoDB buffer pool instances: 8
[--] InnoDB Buffer Pool Chunk Size not used or defined in your version
[OK] InnoDB Read buffer efficiency: 100.00% (754767404 hits/ 754803238 total)
[!!] InnoDB Write Log efficiency: 85.19% (206209 hits/ 242065 total)
[OK] InnoDB log waits: 0.00% (0 waits / 35856 writes)


I'm still curious about the amavis interaction in the delay... do you know if there is a way to bypass/disable amavis for troubleshooting?

@andrey.ivanov
Thank you for the info, I'll look into it. My concern is automating this for a graceful shutdown in the event of a power outage (which means I won't be logged on to do the steps manually).
