zmlogswatchctl is not running after upgrade of Zimbra and Ubuntu

Posted: Sun Oct 11, 2020 4:04 am
by boldlookup
I recently upgraded Zimbra from 8.6.0 to 8.8.15 and then upgraded my server from Ubuntu 14.04 to 16.04 and then to 18.04.

There were no issues during these upgrades. On zmcontrol start everything seems fine, but zmcontrol status reports that logger is stopped and zmlogswatchctl is not running, as shown below.

Code: Select all

zimbra@zimbra:~$ zmcontrol status
Host ***.com
	amavis                  Running
	antispam                Running
	antivirus               Running
	ldap                    Running
	logger                  Stopped
		zmlogswatchctl is not running
	mailbox                 Running
	memcached               Running
	mta                     Running
	opendkim                Running
	proxy                   Running
	service webapp          Running
	snmp                    Running
	spell                   Running
	stats                   Running
	zimbra webapp           Running
	zimbraAdmin webapp      Running
	zimlet webapp           Running
	zmconfigd               Running
zimbra@zimbra:~$ 
I have tried reinstalling rsyslog and re-running zmsyslogsetup:

Code: Select all

apt remove rsyslog && apt install rsyslog

Code: Select all

root@zimbra:~# /opt/zimbra/libexec/zmsyslogsetup
updateRsyslogd: Updating /etc/rsyslog.d/50-default.conf...done.
root@zimbra:~# 
But the problem persists. Any suggestions on how this can be fixed?

Re: zmlogswatchctl is not running after upgrade of Zimbra and Ubuntu

Posted: Sun Oct 11, 2020 8:10 pm
by JDunphy
Swatch is a Perl program. Sometimes a missing Perl module that it requires is the reason it aborts instead of starting.

Look inside /opt/zimbra/log/zmlogswatch.out for a possible reason.
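
If a missing module is the cause, Perl normally reports it as "Can't locate Some/Module.pm in @INC". Here is a rough sketch for pulling those module names out of the log. The function name missing_perl_modules is made up for this example, and it assumes the error actually ended up in zmlogswatch.out.

```shell
#!/bin/sh
# Hypothetical helper (not part of Zimbra): list Perl modules that a log
# file reports as missing, based on Perl's "Can't locate X.pm in @INC" error.
missing_perl_modules() {
  file=${1:-/opt/zimbra/log/zmlogswatch.out}
  [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
  # Keep only the path in front of ".pm in @INC", then turn
  # Some/Module into Some::Module and drop duplicates.
  sed -n "s/.*Can't locate \([^ ]*\)\.pm in @INC.*/\1/p" "$file" |
    sed 's#/#::#g' | sort -u
}
```

Run it as the zimbra user with no argument to read the default path. No output means the log contains no "Can't locate" errors, so the cause is probably something else.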

From the file /opt/zimbra/bin/zmlogswatchctl you should see something like this.

Code: Select all

    /opt/zimbra/common/bin/swatchdog --config-file=${configfile} \
      --use-cpan-file-tail --pid-file=${pidfile} --daemon \
      --script-dir=${zimbra_tmp_directory} \
      --tail-file /var/log/zimbra-stats.log > $logfile 2>&1
Where logfile=${zimbra_log_directory}/zmlogswatch.out

Here is an example of swatch running:

Code: Select all

% ps aux |grep swatch
zimbra   12547  0.0  0.1  38172 10364 ?        SN   03:14   0:00 /usr/bin/perl /opt/zimbra/common/bin/swatchdog --config-file=/opt/zimbra/conf/swatchrc --use-cpan-file-tail --script-dir=/opt/zimbra/data/tmp --tail-file /var/log/zimbra.log
zimbra   12560  0.0  0.3  79120 26620 ?        SN   03:14   0:20 /usr/bin/perl /opt/zimbra/data/tmp/.swatchdog_script.12547
You can also run the command directly from the command line to see the possible reason it won't start.

Code: Select all

# su - zimbra
% /usr/bin/perl /opt/zimbra/common/bin/swatchdog --config-file=/opt/zimbra/conf/swatchrc --use-cpan-file-tail --script-dir=/opt/zimbra/data/tmp --tail-file /var/log/zimbra.log
or better yet...

Code: Select all

# su - zimbra
% cd /opt/zimbra/data/tmp/
% ls -al .swatchdog_script*
And run one of the scripts present like this:

Code: Select all

% perl .swatchdog_script.XXXX
Ref: swatch lets you write rules in a simple config language (see /opt/zimbra/conf/swatchrc) and generates a Perl program from them. That program is the .swatchdog_script.XXXX file. It looks for patterns in a log file and runs actions when they match.

HTH,

Jim

Re: zmlogswatchctl is not running after upgrade of Zimbra and Ubuntu

Posted: Thu Oct 15, 2020 2:45 am
by boldlookup
Everything looks fine. Swatch is running it seems...

Code: Select all

zimbra@zimbra:~/data/tmp$ ps aux | grep swatch
zimbra    6044  0.0  0.2  37820 13028 pts/0    S    04:24   0:00 /usr/bin/perl /opt/zimbra/common/bin/swatchdog --config-file=/opt/zimbra/conf/swatchrc --use-cpan-file-tail --script-dir=/opt/zimbra/data/tmp --tail-file /var/log/zimbra.log
zimbra    6054  0.4  0.7  98100 39340 pts/0    S    04:24   0:00 /usr/bin/perl /opt/zimbra/data/tmp/.swatchdog_script.6044
zimbra   10301  0.0  0.0  11464  1080 pts/0    S+   04:28   0:00 grep swatch
I can run the script manually - no errors

Code: Select all

zimbra@zimbra:~/data/tmp$ /usr/bin/perl /opt/zimbra/common/bin/swatchdog --config-file=/opt/zimbra/conf/swatchrc --use-cpan-file-tail --script-dir=/opt/zimbra/data/tmp --tail-file /var/log/zimbra.log

*** swatchdog version 3.2.4 (pid:12605) started at Thu Oct 15 04:31:48 CEST 2020
Or run the script

Code: Select all

zimbra@zimbra:~/data/tmp$ /usr/bin/perl /opt/zimbra/data/tmp/.swatchdog_script.6044

*** swatchdog version 3.2.4 (pid:6044) started at Thu Oct 15 04:33:29 CEST 2020
What is strange is that there are many scripts in that directory

Code: Select all

zimbra@zimbra:~/data/tmp$ ls -la .swatchdog_script*
-rw-r----- 1 zimbra zimbra 2982 Oct 10 06:13 .swatchdog_script.10050
-rw-r----- 1 zimbra zimbra 2982 Oct 10 06:01 .swatchdog_script.11980
-rw-r----- 1 zimbra zimbra 2982 Oct 11 05:18 .swatchdog_script.15230
-rw-r----- 1 zimbra zimbra 2982 Oct 10 04:29 .swatchdog_script.20624
-rw-r----- 1 zimbra zimbra 2982 Oct 11 05:22 .swatchdog_script.22581
-rw-r----- 1 zimbra zimbra 2982 Oct 11 05:50 .swatchdog_script.23289
-rw-r----- 1 zimbra zimbra 2982 Oct 10 05:47 .swatchdog_script.2383
-rw-r----- 1 zimbra zimbra 2982 Oct 10 06:11 .swatchdog_script.2547
-rw-r----- 1 zimbra zimbra 2982 Oct 10 05:54 .swatchdog_script.2550
-rw-r----- 1 zimbra zimbra 2982 Oct 11 05:00 .swatchdog_script.27207
-rw-r----- 1 zimbra zimbra 2982 Oct 10 09:45 .swatchdog_script.28334
-rw-r----- 1 zimbra zimbra 2982 Oct 11 05:54 .swatchdog_script.30574
-rw-r----- 1 zimbra zimbra 2982 Oct 10 05:47 .swatchdog_script.3235
-rw-r----- 1 zimbra zimbra 2982 Oct 10 09:47 .swatchdog_script.32397
-rw-r----- 1 zimbra zimbra 2982 Oct 11 05:07 .swatchdog_script.3291
-rw-r----- 1 zimbra zimbra 2982 Oct 10 06:11 .swatchdog_script.3573
-rw-r----- 1 zimbra zimbra 2982 Oct 10 05:56 .swatchdog_script.7425
-rw-r----- 1 zimbra zimbra 2982 Oct 11 05:43 .swatchdog_script.8606

Re: zmlogswatchctl is not running after upgrade of Zimbra and Ubuntu

Posted: Thu Oct 15, 2020 3:36 am
by JDunphy
That is normal... It doesn't clean up after itself if it aborts or doesn't shut down correctly. The only ones you need are those listed in your ps output. The others are old and you can delete them. Files like .swatchdog_script.20624 are generated from the swatchrc files, where 20624 would be the pid in this example. You should have 2 swatchdog scripts running when you do a ps.
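
A rough sketch for finding the leftovers, assuming (as the ps output earlier in the thread suggests) that the numeric suffix is the pid of the swatchdog process that generated the file. The function name list_stale_swatch_scripts is made up for this example. It only prints candidates and deletes nothing.

```shell
#!/bin/sh
# Hypothetical helper (not part of Zimbra): print the .swatchdog_script.PID
# files whose PID no longer belongs to a process we can signal.
list_stale_swatch_scripts() {
  dir=${1:-/opt/zimbra/data/tmp}
  for f in "$dir"/.swatchdog_script.*; do
    [ -e "$f" ] || continue            # glob matched nothing
    pid=${f##*.}                       # text after the last dot
    case $pid in
      ''|*[!0-9]*) continue ;;         # suffix is not a pid, skip it
    esac
    # kill -0 sends no signal; it only tests whether the pid can be signalled.
    if ! kill -0 "$pid" 2>/dev/null; then
      printf '%s\n' "$f"
    fi
  done
}
```

Run it as the zimbra user and compare the list with ps aux | grep swatch before removing anything. If a pid has been reused by an unrelated zimbra process, the file is treated as live, so the check errs on the side of keeping files.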

Re: zmlogswatchctl is not running after upgrade of Zimbra and Ubuntu

Posted: Tue Oct 20, 2020 1:33 am
by boldlookup
When I run the script manually it starts without errors

Code: Select all

zimbra@zimbra:~/data/tmp$ /usr/bin/perl /opt/zimbra/common/bin/swatchdog --config-file=/opt/zimbra/conf/swatchrc --use-cpan-file-tail --script-dir=/opt/zimbra/data/tmp --tail-file /var/log/zimbra.log

*** swatchdog version 3.2.4 (pid:6615) started at Tue Oct 20 03:20:39 CEST 2020

zimbra@zimbra:~/data/tmp$ ps aux | grep swatch
zimbra    6615  0.1  0.2  37564 12716 pts/0    S+   03:20   0:00 /usr/bin/perl /opt/zimbra/common/bin/swatchdog --config-file=/opt/zimbra/conf/swatchrc --use-cpan-file-tail --script-dir=/opt/zimbra/data/tmp --tail-file /var/log/zimbra.log
zimbra    6616  0.8  0.7  97880 38960 pts/0    S+   03:20   0:00 /usr/bin/perl /opt/zimbra/data/tmp/.swatchdog_script.6615
zimbra    7200  0.0  0.0  11464  1012 pts/1    S+   03:21   0:00 grep swatch
zimbra   10586  0.0  0.2  37856 10548 ?        S    Oct19   0:00 /usr/bin/perl /opt/zimbra/common/bin/swatchdog --config-file=/opt/zimbra/conf/swatchrc --use-cpan-file-tail --script-dir=/opt/zimbra/data/tmp --tail-file /var/log/zimbra.log
zimbra   10597  0.0  0.6  98096 34044 ?        S    Oct19   0:36 /usr/bin/perl /opt/zimbra/data/tmp/.swatchdog_script.10586
But status remains down

Code: Select all

zimbra@zimbra:~/data/tmp$ zmcontrol status
Host *****
	amavis                  Running
	antispam                Running
	antivirus               Running
	ldap                    Running
	logger                  Stopped
		zmlogswatchctl is not running
	mailbox                 Running
	memcached               Running
	mta                     Running
	opendkim                Running
	proxy                   Running
	service webapp          Running
	snmp                    Running
	spell                   Running
	stats                   Running
	zimbra webapp           Running
	zimbraAdmin webapp      Running
	zimlet webapp           Running
	zmconfigd               Running



Re: zmlogswatchctl is not running after upgrade of Zimbra and Ubuntu

Posted: Thu Oct 22, 2020 7:44 pm
by JDunphy
When you start it manually, you need to replicate what the script does, including the pid files. zmlogswatchctl runs the following lines when you ask it for status:

Code: Select all


pidfile=${zimbra_log_directory}/logswatch.pid
zmrrdfetchpidfile=${zimbra_log_directory}/zmrrdfetch-server.pid


getpid()
{
  if [ -f ${pidfile} ]; then
    pid=$(cat ${pidfile})
  fi
  if [ -f ${zmrrdfetchpidfile} ]; then
    zmrrdfetchpid=$(cat ${zmrrdfetchpidfile})
  fi
}

checkrunning()
{
  getpid
  if [ "x$pid" = "x" ]; then
    running=0
  else
    kill -0 $pid 2> /dev/null
    if [ $? != 0 ]; then
      pid=""
      running=0
    else
      running=1
    fi
  fi
}
So you can manually do the following to see how/why zmlogswatchctl might think the programs it starts and stops aren't running.

Code: Select all

% cd /opt/zimbra/log
% cat logswatch.pid; cat zmrrdfetch-server.pid 
19693
19795
% ps aux |egrep '(19693|19795)'
zimbra   19693  0.0  0.2  56596 17628 ?        SNs  04:50   0:15 /opt/zimbra/common/bin/swatchdog --config-file=/opt/zimbra/conf/logswatchrc --use-cpan-file-tail --pid-file=/opt/zimbra/log/logswatch.pid --daemon --script-dir=/opt/zimbra/data/tmp --tail-file /var/log/zimbra-stats.log
zimbra   19795  0.0  0.0  60340  7828 ?        SN   04:50   0:00 zmlogger: zmrrdfetch: server
The kill -0 construct from above is an odd way to determine whether something is running. It doesn't send any signal; the kernel simply reports success or failure depending on whether the process exists and you have permission to signal it. Most of the Zimbra status scripts use this paradigm, which is why a lot of forum posts say to try removing the pid file in case it exists with the wrong permissions and Zimbra can't overwrite it. Note: what if pids wrap and something else is running, or the pid file contains a low pid number, you have rebooted, and one of the other programs now has that pid? Anyway...
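
To make the pid file check concrete, here is a minimal sketch of the same kill -0 test as a standalone function. The name pidfile_status and the four result words are made up for this example; this is not how zmlogswatchctl itself reports status.

```shell
#!/bin/sh
# Hypothetical helper (not part of Zimbra): report what a pid file says.
# Prints one of: missing, invalid, stale, running.
pidfile_status() {
  file=$1
  [ -f "$file" ] || { echo missing; return; }
  pid=$(cat "$file" 2>/dev/null)
  case $pid in
    ''|*[!0-9]*) echo invalid; return ;;   # empty, unreadable or not a number
  esac
  # kill -0 delivers nothing; it only asks whether this pid could be signalled.
  if kill -0 "$pid" 2>/dev/null; then
    echo running
  else
    echo stale
  fi
}
```

For example, pidfile_status /opt/zimbra/log/logswatch.pid. Because of the pid reuse problem, "running" only means that some process with that pid can be signalled; ps -p followed by the pid shows whether it really is swatchdog.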

start works like this. Note the amount of information that is saved in /var/log/zimbra.log

Code: Select all

Is there a swatch configuration file? If not, log that "logswatchrc is missing"
Are the programs already running? If so, log that "logwatch is already running"
Is the logger service enabled? If not, log that "logger service is not enabled! failed"
Start swatchdog with --pid-file saved to the location shown above. Note: it will also start zmlogger from the swatchrc action.
Check that it is running and, if it is, return back to zmcontrol status that everything is running
Normally, whenever something doesn't run from zmcontrol status - the first thing you do is something like:

Code: Select all

# su - zimbra
% zmlogswatchctl restart
zmcontrol [start|stop|restart|status] calls zmlogswatchctl with the same argument. If that fails to fix it, look at the pid files in /opt/zimbra/log for permission problems or tail -f /var/log/zimbra.log as you attempt to restart it and look for why.
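
For the permission check, a small sketch that reports whether the current user could rewrite the two pid files named in the script excerpt above. The function name check_pidfile_access is made up for this example, and the default directory assumes zimbra_log_directory is /opt/zimbra/log.

```shell
#!/bin/sh
# Hypothetical helper (not part of Zimbra): check whether the current user
# could rewrite the pid files that zmlogswatchctl reads.
check_pidfile_access() {
  dir=${1:-/opt/zimbra/log}
  for name in logswatch.pid zmrrdfetch-server.pid; do
    f=$dir/$name
    if [ ! -e "$f" ]; then
      echo "$name: absent"
    elif [ -w "$f" ]; then
      echo "$name: writable"
    else
      echo "$name: NOT writable by $(id -un)"
    fi
  done
}
```

Run it after su - zimbra, since root can write to almost anything and would hide the problem. A "NOT writable" line points at the stale pid file case described above.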

HTH,

Jim