Zimbra KVM locking up

mattcen
Posts: 17
Joined: Sat Sep 13, 2014 1:08 am

Zimbra KVM locking up

Post by mattcen »

Hi all,
I have an Ubuntu 10.04 physical host running KVM. This host runs several Ubuntu 10.04 VMs, one of which is running Zimbra (7.0.1_GA_3105.UBUNTU10_64 UBUNTU10_64 NETWORK).
Five times since September last year, just this VM has locked up completely at seemingly irregular intervals. I can find no pattern as to why it's happening, but I would like to be able to either confirm or discount Zimbra as a cause.
When the lock-up occurs, I can't ping the VM, view its console (using the KVM VNC console; it just sits there, not even loading a blank screen), or shut down the VM. When I try to shut it down, nothing appears to happen, and if I then try to kill the VM, the dead (zombie) process sticks around and KVM thinks it is still running, so it won't let me start the VM back up without rebooting the physical host. In 4 out of 5 cases, all other VMs and the host OS were unaffected. In the 5th, all VMs locked up.
Has anybody seen anything like this before?
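One way to narrow this down the next time it happens is to look at the state the kernel reports for the VM's process on the host. A true zombie shows state Z in /proc/<pid>/stat, whereas a process that cannot be killed because it is blocked inside the kernel (often on I/O) usually shows state D. A minimal Python sketch, assuming you already know the PID; the helper names and the sample line are made up for illustration:

```python
def parse_proc_state(stat_line):
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so split after the LAST closing parenthesis.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    return rest[0]


def describe_state(state):
    """Map the states relevant to an unkillable process to a short hint."""
    hints = {
        "Z": "zombie: already exited, waiting for its parent to reap it",
        "D": "uninterruptible sleep: blocked in the kernel, often on I/O",
    }
    return hints.get(state, "other state: " + state)


def read_proc_state(pid):
    """Read the state of a live process from /proc (Linux only)."""
    with open("/proc/%d/stat" % pid) as f:
        return parse_proc_state(f.read())


# Hypothetical sample line, shortened; real lines have many more fields.
sample = "4242 (kvm) D 1 4242 4242 0 -1"
print(describe_state(parse_proc_state(sample)))
```

On the host, read_proc_state(pid) with the PID of the stuck VM process would return the same one-letter code for the real process.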
Olorin
Advanced member
Posts: 57
Joined: Fri Sep 12, 2014 10:34 pm

Zimbra KVM locking up

Post by Olorin »

I have seen such behaviour of KVM on Ubuntu and Debian.
If there is room for testing, I would try to take this to RHEL or Fedora hosts and VMs.
Having said that, can you elaborate on the host and VM configuration and resources available?
friedmar
Advanced member
Posts: 86
Joined: Fri Sep 12, 2014 10:01 pm

Zimbra KVM locking up

Post by friedmar »

I could not confirm this specific behavior. But I also found some irregular behavior of the Zimbra KVM guest. No problems with other guests!
So from my point of view it is worth Zimbra looking into this.
Olorin
Advanced member
Posts: 57
Joined: Fri Sep 12, 2014 10:34 pm

Zimbra KVM locking up

Post by Olorin »

[quote user="friedmar"]I could not confirm this specific behavior. But I also found some irregular behavior of the Zimbra KVM guest. No problems with other guests![/QUOTE]
Well, I know of a large (~5k users) Zimbra setup running on RHEL+KVM for years now, and no such issues were encountered.
Zimbra seems to be a bit of a resource hog, probably due to the Java stack it uses (still light as a feather compared to MS Exchange, though). There might be some tweaks on the Java memory management side of things, but I'm not aware of them.
BTW, my own setup is restarted every 24 hours by the backup script. Any chance you're running 24/7?
Tripple
Outstanding Member
Posts: 285
Joined: Sat Sep 13, 2014 12:22 am

Zimbra KVM locking up

Post by Tripple »

[quote user="Olorin"]BTW, my own setup is restarted every 24 hours by the backup script. Any chance you're running 24/7?[/QUOTE]
Do you mean Zimbra or your complete server?
Olorin
Advanced member
Posts: 57
Joined: Fri Sep 12, 2014 10:34 pm

Zimbra KVM locking up

Post by Olorin »

Just the Zimbra services.
mattcen
Posts: 17
Joined: Sat Sep 13, 2014 1:08 am

Zimbra KVM locking up

Post by mattcen »

Hi all, thank you all for your responses, and sorry for the delay in getting back with more information.
Zimbra is configured to do an incremental backup nightly (plus a full backup weekly), though I don't know whether that is supposed to restart Zimbra services in the process. If it doesn't, then yes, Zimbra is not being restarted regularly to my knowledge (except for a reboot of the entire host once a month).
More detail on my setup:
* I have an Ubuntu 10.04 host running KVM with libvirt.
* Each VM's storage is a raw LV for the OS and, for those that need it, another raw LV for data. The Zimbra VM has 4 LVs: / (2.0G of 7.9G used), /opt (356G of 493G used), /opt/zimbra/backup (814G of 1.2T used), and swap (~9G).
* The VMs themselves are also running Ubuntu 10.04. It's worth noting that these systems haven't had security patches applied for a long time, despite my best efforts (the customer is holding this up a bit). I'm hoping that once these updates happen, the problem will go away, but in the interim, I'm looking into other possible causes.
* The Zimbra VM has 8GB RAM dedicated to it, as well as 4 cores of an AMD Opteron(tm) Processor 6134. The whole server has 2 of these processors.
* I'm not in a position to be able to test this setup on Red Hat-based hosts.
Hope this is enough information to evoke some new suggestions. Thanks again for your time.
Olorin
Advanced member
Posts: 57
Joined: Fri Sep 12, 2014 10:34 pm

Zimbra KVM locking up

Post by Olorin »

[quote user="mattcen"]* I have an Ubuntu 10.04 host running KVM with libvirt.[/QUOTE]
This might be the issue in its own right. Any chance of trying this VM on another machine, with RHEL or a derivative running KVM?
[QUOTE]* The VMs themselves are also running Ubuntu 10.04. It's worth noting that these systems haven't had security patches applied for a long time, despite my best efforts (the customer is holding this up a bit). I'm hoping that once these updates happen, the problem will go away, but in the interim, I'm looking into other possible causes.[/QUOTE]
I have seen Ubuntu, especially 10 and 11, doing amazing blunders in conjunction with KVM and libvirt, both as hosts and guests. Internet forums are full of reports of VMs that lock up, die and generally misbehave in ways that could not be reproduced on Fedora or RHEL. This is very easy to prove, and I am not writing it to start yet another distro war (even though I am definitely not a fan of Ubuntu as a server of any kind).
[QUOTE]* The Zimbra VM has 8GB RAM dedicated to it, as well as 4 cores of an AMD Opteron(tm) Processor 6134. The whole server has 2 of these processors.[/QUOTE]
Without looking those CPUs up, does this mean the Zimbra VM has more virtual cores than the host has physical cores? If so, this is definitely a problem. But I guess those are multi-core CPUs and I am wrong here.
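The comparison is simple arithmetic: the vCPUs given to one guest against the physical cores in the host. A rough Python sketch; the cores-per-socket value is an assumption (the thread does not state it, so check /proc/cpuinfo or the vendor spec for the Opteron 6134), and the function name is made up:

```python
def vcpus_exceed_host(vcpus, sockets, cores_per_socket):
    """True if a single guest has more vCPUs than the host has physical cores."""
    return vcpus > sockets * cores_per_socket


# From the thread: 4 vCPUs for the Zimbra VM, 2 sockets in the host.
# ASSUMPTION: 8 cores per socket; verify this on the actual hardware.
print(vcpus_exceed_host(vcpus=4, sockets=2, cores_per_socket=8))
```

With 4 vCPUs and 2 sockets, this only comes out True if each processor had a single core.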
[QUOTE]* I'm not in a position to be able to test this setup on Red Hat-based hosts.[/QUOTE]
It shouldn't be hard, really: get a backup of /opt from your customer and run it in your own local sandbox for a few days, but do it on RHEL or CentOS, both as host and guest.
Brad_C
Advanced member
Posts: 106
Joined: Sat Sep 13, 2014 2:33 am

Zimbra KVM locking up

Post by Brad_C »

[quote user="Olorin"]I have seen Ubuntu, especially 10 and 11, doing amazing blunders in conjunction with KVM and libvirt, both as hosts and guests.[/QUOTE]
Just a counter datapoint, but here we've run Zimbra on an Ubuntu 10.04 guest on a Debian host with no issues at all for over 12 months now. I'm sure you can find plenty of evidence of it misbehaving, but then if you look you can probably find evidence of RH-based VMs misbehaving too.
If you have a fault that causes the guest to become an unkillable zombie, it's not the fault of the guest!
I'd be looking very hard at the host, particularly the version of qemu-kvm that is being run.
I keep ours updated from recently compiled kvm git trees. I certainly would not rely on any packaged version from any distribution. There are always tangible performance and stability increases running a more recent emulator and host kernel.
Olorin
Advanced member
Posts: 57
Joined: Fri Sep 12, 2014 10:34 pm

Zimbra KVM locking up

Post by Olorin »

[quote user="Brad_C"]Just a counter datapoint, but here we've run Zimbra on an Ubuntu 10.04 guest on a Debian host with no issues at all for over 12 months now. I'm sure you can find plenty of evidence of it misbehaving, but then if you look you can probably find evidence of RH-based VMs misbehaving too.[/QUOTE]
I've been specializing in KVM-based virtualization since 2008, and so far I've amassed quite a lot of statistics. I really don't mean to start a flamewar, and would rather leave the discussion at this. Considering where KVM and libvirt development and QA happen, it is no wonder Fedora and RHEL behave better.
[QUOTE]If you have a fault that causes the guest to become an unkillable zombie, it's not the fault of the guest![/QUOTE]
Not exactly. This can be the fault of the guest kernel drivers. I've seen an older virtio_serial driver and, once, a misconfigured in-guest I/O scheduler cause exactly this.
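For anyone wanting to check the in-guest I/O scheduler: on Linux, the file /sys/block/<device>/queue/scheduler lists the available schedulers with the active one in square brackets. A small Python sketch; the device name vda and the sample string are assumptions for illustration, not taken from the poster's system:

```python
import re


def active_scheduler(scheduler_line):
    """Return the scheduler shown in [brackets], or None if none is marked."""
    match = re.search(r"\[([^\]]+)\]", scheduler_line)
    return match.group(1) if match else None


def read_scheduler(device):
    """Read the active scheduler for a block device from sysfs (Linux only)."""
    with open("/sys/block/%s/queue/scheduler" % device) as f:
        return active_scheduler(f.read())


# Hypothetical content of /sys/block/vda/queue/scheduler
print(active_scheduler("noop deadline [cfq]"))
```

Inside the guest, read_scheduler("vda") would report the same thing for the real device, if that is what the disk is called there.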
[QUOTE]I keep ours updated from recently compiled kvm git trees. I certainly would not rely on any packaged version from any distribution. There are always tangible performance and stability increases running a more recent emulator and host kernel.[/QUOTE]
Well, this explains why you are not seeing the typical Ubuntu/Debian behaviour. I wouldn't put anything from upstream git in production though, not if I want to keep my job at least, but that's up to you of course.