Zimbra slows to a crawl

desi
Posts: 7
Joined: Thu Oct 25, 2018 1:17 pm

Zimbra slows to a crawl

Post by desi »

I am running ZCS 8.7.1 Community edition with about 3500 mailboxes and 5.5 TB of stored mail. Server has 32 GB RAM and 8 cores (physical box). O/S Ubuntu 12.04.5 LTS

The system works fine and performance is great, except that every six months or so it slows to a crawl for about 3 days. It is so bad that most users cannot even get the logon screen displayed, except at night when no one is on.

Java CPU usage remains high during this period. I have checked the logs and do not see anything unusual. Also checked crontab jobs and don't see anything that runs that infrequently.

This has happened about 4 times over the last couple of years and is driving me nuts. Any ideas?
pup_seba
Outstanding Member
Posts: 687
Joined: Sat Sep 13, 2014 2:43 am
Location: Tarragona - Spain
Contact:

Re: Zimbra slows to a crawl

Post by pup_seba »

Hi,

Let us know the output of these commands:
free -g
cat /opt/zimbra/conf/my.cnf | grep -i innodb_buffer_pool_size
zmlocalconfig | grep -i heap
du -sh /opt/zimbra/db/data

Also, what kind of disk/raid/storage protocol do you use?
vavai
Advanced member
Posts: 174
Joined: Thu Nov 14, 2013 2:41 pm
Location: Indonesia
ZCS/ZD Version: 0
Contact:

Re: Zimbra slows to a crawl

Post by vavai »

desi wrote: I am running ZCS 8.7.1 Community edition with about 3500 mailboxes and 5.5 TB of stored mail. Server has 32 GB RAM and 8 cores (physical box). O/S Ubuntu 12.04.5 LTS

The system works fine and performance is great, except that every six months or so it slows to a crawl for about 3 days. It is so bad that most users cannot even get the logon screen displayed, except at night when no one is on.

Java CPU usage remains high during this period. I have checked the logs and do not see anything unusual. Also checked crontab jobs and don't see anything that runs that infrequently.

This has happened about 4 times over the last couple of years and is driving me nuts. Any ideas?
Ubuntu 12.04.5 is quite old. Apart from this problem, I would suggest upgrading to 16.04 and moving Zimbra to a multi-server setup. In my experience, running all services on a single physical box performs worse than running multi-server on top of a virtualized system, even on similar server specs.

As for your problem, you can monitor /opt/zimbra/log/mailbox.log for relevant entries while clients are experiencing the slowness.
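
A minimal sketch of that kind of monitoring, assuming each mailbox.log entry starts with a `YYYY-MM-DD HH:MM:SS,mmm LEVEL` prefix (check your own file, the layout may differ). The function name `count_by_hour` is made up for illustration. It counts WARN and ERROR lines per hour so that a slowdown window stands out from normal days.

```shell
#!/bin/sh
# Count WARN/ERROR lines per hour from a log read on stdin.
# ASSUMPTION: entries look like "YYYY-MM-DD HH:MM:SS,mmm LEVEL ...".
count_by_hour() {
    awk '$3 == "WARN" || $3 == "ERROR" {
             hour = $1 " " substr($2, 1, 2)   # e.g. "2018-10-25 13"
             n[hour " " $3]++
         }
         END { for (k in n) print k, n[k] }' | sort
}

# Usage (path taken from the post above):
#   count_by_hour < /opt/zimbra/log/mailbox.log
```

Comparing the hourly counts from a slow day against a normal day is usually more telling than reading the raw log.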
desi
Posts: 7
Joined: Thu Oct 25, 2018 1:17 pm

Re: Zimbra slows to a crawl

Post by desi »

free -g
             total       used       free     shared    buffers     cached
Mem:            31         31          0          0          0         20
-/+ buffers/cache:          9         21
Swap:           31          0         31

cat /opt/zimbra/conf/my.cnf | grep -i innodb_buffer_pool_size
#innodb_buffer_pool_size = 474338304
innodb_buffer_pool_size = 13743895360

zmlocalconfig | grep -i heap
mailboxd_java_heap_memory_percent = 25
mailboxd_java_heap_new_size_percent = 25
mailboxd_java_heap_size = 6426
zimbra_activesync_syncstate_item_cache_heap_size = 10M

du -sh /opt/zimbra/db/data
29G /opt/zimbra/db/data

For storage we use 1.2 TB 10K SAS drives. 14 drives, 2 physical volumes, 1 logical volume (LVM).
RAID 10
L. Mark Stone
Ambassador
Posts: 2796
Joined: Wed Oct 09, 2013 11:35 am
Location: Portland, Maine, US
ZCS/ZD Version: 10.0.6 Network Edition
Contact:

Re: Zimbra slows to a crawl

Post by L. Mark Stone »

I agree with Vavai that now would be a good time to reexamine your architecture and upgrade. I would also say that 8.7.1, and indeed many of the early 8.7.x releases, had a number of bugs, so even just doing an in-place upgrade to 8.7.11 might solve your problem while you plan the upgrade to Zimbra 8.8 on Ubuntu 16.04.

Hope that helps,
Mark
___________________________________
L. Mark Stone
Mission Critical Email - Zimbra VAR/BSP/Training Partner https://www.missioncriticalemail.com/
AWS Certified Solutions Architect-Associate
pup_seba
Outstanding Member
Posts: 687
Joined: Sat Sep 13, 2014 2:43 am
Location: Tarragona - Spain
Contact:

Re: Zimbra slows to a crawl

Post by pup_seba »

With a 29GB database and ~13GB assigned to the buffer pool, and given the other output you posted, my bet would be to increase the RAM on that server so you can accommodate your whole DB in that buffer. I read in a post Mark Stone once wrote (maybe this one, but I'm not sure: https://www.missioncriticalemail.com/20 ... uidelines/) that he recommends using a 1.25 factor for that. I have followed his recommendation ever since, so I would suggest you do the same.

So your buffer size should be something like 37GB (29GB x 1.25 is about 36.25GB), which is actually pretty big. So I would optimize those DBs first (https://wiki.zimbra.com/wiki/DB_not_rel ... eting_data) and then add the necessary memory.

This would be a short-term workaround (most likely). With 3500 accounts I have to subscribe to the general recommendation here and suggest you re-architect. My suggestion would be P2V to new boxes on the latest version. Use the ZeXtras incremental migration to migrate. For the final architecture, you could consider something like this (personal preference): 2 LDAP MMR + 3 stores + 2 MTA + 1 Zimbra Docs. After the migration, keep an eye on your database, and consider that it is easier to have a lot of little pieces than a few really big ones (imho).
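
The arithmetic behind that number can be sketched as below. This is only an illustration: the function names `recommended_pool_gb` and `bytes_to_gib` are invented here, and the 1.25 factor is the rule of thumb quoted in this thread, not something the script verifies.

```shell
#!/bin/sh
# Suggested InnoDB buffer pool size: data size x 1.25, rounded up to whole GB.
# ASSUMPTION: 1.25 is the rule-of-thumb factor mentioned in this thread.
recommended_pool_gb() {
    data_gb=$1
    # Integer math: multiply by 125, divide by 100, rounding up.
    echo $(( (data_gb * 125 + 99) / 100 ))
}

# Convert a byte count (as written in my.cnf) to whole GiB, rounded down.
bytes_to_gib() {
    echo $(( $1 / 1073741824 ))
}

recommended_pool_gb 29        # data size reported by: du -sh /opt/zimbra/db/data
bytes_to_gib 13743895360      # current innodb_buffer_pool_size from my.cnf
```

Run as-is, this prints 37 and then 12, i.e. the suggested pool is roughly three times what is configured today and more than the 32 GB of RAM in the box.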

Regards,
L. Mark Stone
Ambassador
Posts: 2796
Joined: Wed Oct 09, 2013 11:35 am
Location: Portland, Maine, US
ZCS/ZD Version: 10.0.6 Network Edition
Contact:

Re: Zimbra slows to a crawl

Post by L. Mark Stone »

I confess I missed the bit about the innodb buffer pool size, but yes, 1.25x is from the MySQL documentation to provide room for temporary tables and other MySQL internal bits.

One other thing to add as you plan your upgrade migration:

The ZeXtras backup supports 1:many, many:1 and 1:1 mailbox restores. That’s another way of saying the ZeXtras backup/restore engine does not rely at all on mailbox-specific structural attributes, because the Import process rebuilds LDAP and MySQL data from scratch—that’s why the zimbraId value changes after a Disaster Recovery or Migration Import.

So you could build a multi-server farm with 2-3 mailbox servers and do the migration directly. Zimbra will spread the restored mailboxes across your new mailbox servers.

Hope that helps,
Mark
desi
Posts: 7
Joined: Thu Oct 25, 2018 1:17 pm

Re: Zimbra slows to a crawl

Post by desi »

Thanks for the responses guys.

I guess I just have to wait out the current slowdown. If past history holds true, performance should be back to normal by Monday.

Certainly we will look at upgrading/restructuring going forward. The present server is maxed out on hard drives and is at 85% used, so we would be getting new hardware anyway.

Does anyone know if MySQL does any automatic housekeeping every 6 months or so? I really wonder why the slowdown is so cyclic. The system is normally very responsive and easily handles our workload (most of my users are very light email users).
pup_seba
Outstanding Member
Posts: 687
Joined: Sat Sep 13, 2014 2:43 am
Location: Tarragona - Spain
Contact:

Re: Zimbra slows to a crawl

Post by pup_seba »

Not that I'm aware of. If you read the links we provided, you'll see that, at least for "defrag", it does not defragment automatically. You could use a script to optimize database by database, or use mysqlcheck to do all databases at once (that's how I do it). I see that the zmdbintegrityreport command has an option to optimize, but I have never used it and I never dug into what it really does :) Most likely it is a wrapper for the mysqlcheck command, though...

[root@zcsstore01 ~]# /opt/zimbra/libexec/zmdbintegrityreport --help
Usage: /opt/zimbra/libexec/zmdbintegrityreport [-m] [-o] [-r] [-v] [-h]
-m emails report to admin account, otherwise report is presented on stdout
-o attempt auto optimization of tables
-r attempt auto repair of tables
-v verbose output
-h help
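
For the mysqlcheck route, a minimal sketch is below. It only prints the command so you can review it before running it by hand. The helper name `optimize_cmd` is made up, and the default socket path and the root user are assumptions about a typical Zimbra install; confirm both against your own my.cnf. The `--optimize`, `--all-databases`, `--user`, `--password` and `--socket` options are standard mysqlcheck options.

```shell
#!/bin/sh
# Print a mysqlcheck command that optimizes every database.
# ASSUMPTIONS: the default socket path and the root user are placeholders;
# check the socket setting in /opt/zimbra/conf/my.cnf before using this.
optimize_cmd() {
    socket=${1:-/opt/zimbra/data/tmp/mysql/mysql.sock}
    printf 'mysqlcheck --optimize --all-databases --user=root --password --socket=%s\n' "$socket"
}

# Print the command for review; --password with no value makes mysqlcheck prompt.
optimize_cmd
```

Since optimizing locks access to the tables, something like this belongs in a maintenance window, not in the middle of the working day.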

I opened a bug in Bugzilla asking to include a defrag in the crontab by default (kind of a hard decision to make, as optimizing the DB locks access to it)... but that was before I knew that nobody at Zimbra cares about Bugzilla anymore, so yeah... :) Anyway, here it is in case you want to take a look at it.

https://bugzilla.zimbra.com/show_bug.cgi?id=109048