zmlogswatchctl is not running after upgrade of Zimbra and Ubuntu
Posted: Sun Oct 11, 2020 4:04 am
by boldlookup
I recently upgraded Zimbra from 8.6.0 to 8.8.15, and then upgraded my server from Ubuntu 14.04 to 16.04 and then to 18.04.
There were no issues during these upgrades. After zmcontrol start everything seems fine, but zmcontrol status reports that logger is stopped and zmlogswatchctl is not running, as shown below.
Code: Select all
zimbra@zimbra:~$ zmcontrol status
Host ***.com
amavis Running
antispam Running
antivirus Running
ldap Running
logger Stopped
zmlogswatchctl is not running
mailbox Running
memcached Running
mta Running
opendkim Running
proxy Running
service webapp Running
snmp Running
spell Running
stats Running
zimbra webapp Running
zimbraAdmin webapp Running
zimlet webapp Running
zmconfigd Running
zimbra@zimbra:~$
I have tried the following: reinstalling rsyslog and re-running zmsyslogsetup.
Code: Select all
apt remove rsyslog && apt install rsyslog
Code: Select all
root@zimbra:~# /opt/zimbra/libexec/zmsyslogsetup
updateRsyslogd: Updating /etc/rsyslog.d/50-default.conf...done.
root@zimbra:~#
But the problem persists. Any suggestions on how this can be fixed?
Re: zmlogswatchctl is not running after upgrade of Zimbra and Ubuntu
Posted: Sun Oct 11, 2020 8:10 pm
by JDunphy
Swatch is a Perl program. A missing Perl module that it requires is sometimes the reason it aborts on startup.
Look inside /opt/zimbra/log/zmlogswatch.out for a possible reason.
In /opt/zimbra/bin/zmlogswatchctl you should see something like this:
Code: Select all
/opt/zimbra/common/bin/swatchdog --config-file=${configfile} \
--use-cpan-file-tail --pid-file=${pidfile} --daemon \
--script-dir=${zimbra_tmp_directory} \
--tail-file /var/log/zimbra-stats.log > $logfile 2>&1
Where logfile=${zimbra_log_directory}/zmlogswatch.out
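If a missing module is the cause, Perl normally reports it as "Can't locate Some/Module.pm in @INC ...". A minimal sketch for pulling those names out of the output file follows. The function name missing_perl_modules is made up for this example, and the default path is simply the one mentioned above.

```shell
#!/bin/sh
# List Perl modules reported as missing in a swatchdog output file.
# $1: file to scan (defaults to the zmlogswatch.out path mentioned above).
# missing_perl_modules is a hypothetical helper, not part of Zimbra.
missing_perl_modules() {
    logfile="${1:-/opt/zimbra/log/zmlogswatch.out}"
    [ -r "$logfile" ] || { echo "cannot read $logfile" >&2; return 2; }
    # Perl reports a missing module as: Can't locate Foo/Bar.pm in @INC ...
    sed -n "s/.*Can't locate \([^ ]*\.pm\) in @INC.*/\1/p" "$logfile" | sort -u
}
```

Run it as the zimbra user with no argument to scan the default file. Empty output only means no such message was found, not that swatchdog is healthy.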
Here is an example of swatch running:
Code: Select all
% ps aux |grep swatch
zimbra 12547 0.0 0.1 38172 10364 ? SN 03:14 0:00 /usr/bin/perl /opt/zimbra/common/bin/swatchdog --config-file=/opt/zimbra/conf/swatchrc --use-cpan-file-tail --script-dir=/opt/zimbra/data/tmp --tail-file /var/log/zimbra.log
zimbra 12560 0.0 0.3 79120 26620 ? SN 03:14 0:20 /usr/bin/perl /opt/zimbra/data/tmp/.swatchdog_script.12547
You can also run the command directly from the command line to see the possible reason it won't start.
Code: Select all
# su - zimbra
% /usr/bin/perl /opt/zimbra/common/bin/swatchdog --config-file=/opt/zimbra/conf/swatchrc --use-cpan-file-tail --script-dir=/opt/zimbra/data/tmp --tail-file /var/log/zimbra.log
or better yet...
Code: Select all
# su - zimbra
% cd /opt/zimbra/data/tmp/
% ls -al .swatchdog_script*
And run one of the scripts present like this:
Ref: swatch lets you write rules in a simple config language (see /opt/zimbra/conf/swatchrc) and then generates a Perl program from them. That program is the .swatchdog_script.XXXX file. It looks for patterns in a log file and runs actions when they match.
HTH,
Jim
Re: zmlogswatchctl is not running after upgrade of Zimbra and Ubuntu
Posted: Thu Oct 15, 2020 2:45 am
by boldlookup
Everything looks fine. Swatch is running it seems...
Code: Select all
zimbra@zimbra:~/data/tmp$ ps aux | grep swatch
zimbra 6044 0.0 0.2 37820 13028 pts/0 S 04:24 0:00 /usr/bin/perl /opt/zimbra/common/bin/swatchdog --config-file=/opt/zimbra/conf/swatchrc --use-cpan-file-tail --script-dir=/opt/zimbra/data/tmp --tail-file /var/log/zimbra.log
zimbra 6054 0.4 0.7 98100 39340 pts/0 S 04:24 0:00 /usr/bin/perl /opt/zimbra/data/tmp/.swatchdog_script.6044
zimbra 10301 0.0 0.0 11464 1080 pts/0 S+ 04:28 0:00 grep swatch
I can run the script manually - no errors
Code: Select all
zimbra@zimbra:~/data/tmp$ /usr/bin/perl /opt/zimbra/common/bin/swatchdog --config-file=/opt/zimbra/conf/swatchrc --use-cpan-file-tail --script-dir=/opt/zimbra/data/tmp --tail-file /var/log/zimbra.log
*** swatchdog version 3.2.4 (pid:12605) started at Thu Oct 15 04:31:48 CEST 2020
Or run the script
Code: Select all
zimbra@zimbra:~/data/tmp$ /usr/bin/perl /opt/zimbra/data/tmp/.swatchdog_script.6044
*** swatchdog version 3.2.4 (pid:6044) started at Thu Oct 15 04:33:29 CEST 2020
What is strange is that there are many scripts in that directory
Code: Select all
zimbra@zimbra:~/data/tmp$ ls -la .swatchdog_script*
-rw-r----- 1 zimbra zimbra 2982 Oct 10 06:13 .swatchdog_script.10050
-rw-r----- 1 zimbra zimbra 2982 Oct 10 06:01 .swatchdog_script.11980
-rw-r----- 1 zimbra zimbra 2982 Oct 11 05:18 .swatchdog_script.15230
-rw-r----- 1 zimbra zimbra 2982 Oct 10 04:29 .swatchdog_script.20624
-rw-r----- 1 zimbra zimbra 2982 Oct 11 05:22 .swatchdog_script.22581
-rw-r----- 1 zimbra zimbra 2982 Oct 11 05:50 .swatchdog_script.23289
-rw-r----- 1 zimbra zimbra 2982 Oct 10 05:47 .swatchdog_script.2383
-rw-r----- 1 zimbra zimbra 2982 Oct 10 06:11 .swatchdog_script.2547
-rw-r----- 1 zimbra zimbra 2982 Oct 10 05:54 .swatchdog_script.2550
-rw-r----- 1 zimbra zimbra 2982 Oct 11 05:00 .swatchdog_script.27207
-rw-r----- 1 zimbra zimbra 2982 Oct 10 09:45 .swatchdog_script.28334
-rw-r----- 1 zimbra zimbra 2982 Oct 11 05:54 .swatchdog_script.30574
-rw-r----- 1 zimbra zimbra 2982 Oct 10 05:47 .swatchdog_script.3235
-rw-r----- 1 zimbra zimbra 2982 Oct 10 09:47 .swatchdog_script.32397
-rw-r----- 1 zimbra zimbra 2982 Oct 11 05:07 .swatchdog_script.3291
-rw-r----- 1 zimbra zimbra 2982 Oct 10 06:11 .swatchdog_script.3573
-rw-r----- 1 zimbra zimbra 2982 Oct 10 05:56 .swatchdog_script.7425
-rw-r----- 1 zimbra zimbra 2982 Oct 11 05:43 .swatchdog_script.8606
Re: zmlogswatchctl is not running after upgrade of Zimbra and Ubuntu
Posted: Thu Oct 15, 2020 3:36 am
by JDunphy
That is normal. It doesn't clean up after itself if it aborts or doesn't shut down correctly. The only ones you need are those listed in your ps output. The others are old and can be deleted. A file such as .swatchdog_script.20624 is generated from the swatchrc files, where 20624 would be the pid in this example. You should have 2 swatchdog scripts running when you do a ps.
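A rough sketch for finding the leftovers, assuming the number after the last dot is the pid of the swatchdog process that generated the file (that is how the ps output earlier in the thread lines up). The function name stale_swatch_scripts is made up for this example, and it only prints candidates; it deletes nothing.

```shell
#!/bin/sh
# Print generated swatchdog scripts whose PID suffix no longer names a live process.
# $1: script directory (defaults to the --script-dir value seen in the thread).
# stale_swatch_scripts is a hypothetical helper, not part of Zimbra.
stale_swatch_scripts() {
    dir="${1:-/opt/zimbra/data/tmp}"
    for f in "$dir"/.swatchdog_script.*; do
        [ -e "$f" ] || continue          # glob matched nothing
        pid="${f##*.}"                   # text after the last dot
        case "$pid" in
            ''|*[!0-9]*) continue ;;     # skip names without a numeric suffix
        esac
        # kill -0 sends no signal; it only reports whether the PID can be signalled.
        kill -0 "$pid" 2>/dev/null || echo "$f"
    done
}
```

Run it as the zimbra user (or root), otherwise kill -0 can fail on permissions and a live script would be listed by mistake. If an unrelated process has since reused one of the pids, that file is simply not listed, which errs on the safe side. Review the list before removing anything.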
Re: zmlogswatchctl is not running after upgrade of Zimbra and Ubuntu
Posted: Tue Oct 20, 2020 1:33 am
by boldlookup
When I run the script manually it starts without errors
Code: Select all
zimbra@zimbra:~/data/tmp$ /usr/bin/perl /opt/zimbra/common/bin/swatchdog --config-file=/opt/zimbra/conf/swatchrc --use-cpan-file-tail --script-dir=/opt/zimbra/data/tmp --tail-file /var/log/zimbra.log
*** swatchdog version 3.2.4 (pid:6615) started at Tue Oct 20 03:20:39 CEST 2020
zimbra@zimbra:~/data/tmp$ ps aux | grep swatch
zimbra 6615 0.1 0.2 37564 12716 pts/0 S+ 03:20 0:00 /usr/bin/perl /opt/zimbra/common/bin/swatchdog --config-file=/opt/zimbra/conf/swatchrc --use-cpan-file-tail --script-dir=/opt/zimbra/data/tmp --tail-file /var/log/zimbra.log
zimbra 6616 0.8 0.7 97880 38960 pts/0 S+ 03:20 0:00 /usr/bin/perl /opt/zimbra/data/tmp/.swatchdog_script.6615
zimbra 7200 0.0 0.0 11464 1012 pts/1 S+ 03:21 0:00 grep swatch
zimbra 10586 0.0 0.2 37856 10548 ? S Oct19 0:00 /usr/bin/perl /opt/zimbra/common/bin/swatchdog --config-file=/opt/zimbra/conf/swatchrc --use-cpan-file-tail --script-dir=/opt/zimbra/data/tmp --tail-file /var/log/zimbra.log
zimbra 10597 0.0 0.6 98096 34044 ? S Oct19 0:36 /usr/bin/perl /opt/zimbra/data/tmp/.swatchdog_script.10586
But status remains down
Code: Select all
zimbra@zimbra:~/data/tmp$ zmcontrol status
Host *****
amavis Running
antispam Running
antivirus Running
ldap Running
logger Stopped
zmlogswatchctl is not running
mailbox Running
memcached Running
mta Running
opendkim Running
proxy Running
service webapp Running
snmp Running
spell Running
stats Running
zimbra webapp Running
zimbraAdmin webapp Running
zimlet webapp Running
zmconfigd Running
Re: zmlogswatchctl is not running after upgrade of Zimbra and Ubuntu
Posted: Thu Oct 22, 2020 7:44 pm
by JDunphy
When you start it manually, you need to replicate what the script does, including the pid files. zmlogswatchctl executes the following lines when you ask for status, to decide whether it is running.
Code: Select all
pidfile=${zimbra_log_directory}/logswatch.pid
zmrrdfetchpidfile=${zimbra_log_directory}/zmrrdfetch-server.pid
getpid()
{
if [ -f ${pidfile} ]; then
pid=$(cat ${pidfile})
fi
if [ -f ${zmrrdfetchpidfile} ]; then
zmrrdfetchpid=$(cat ${zmrrdfetchpidfile})
fi
}
checkrunning()
{
getpid
if [ "x$pid" = "x" ]; then
running=0
else
kill -0 $pid 2> /dev/null
if [ $? != 0 ]; then
pid=""
running=0
else
running=1
fi
fi
}
So you can manually do the following to see how/why zmlogswatchctl might think the programs it starts/stops aren't running.
Code: Select all
% cd /opt/zimbra/log
% cat logswatch.pid; cat zmrrdfetch-server.pid
19693
19795
% ps aux |egrep '(19693|19795)'
zimbra 19693 0.0 0.2 56596 17628 ? SNs 04:50 0:15 /opt/zimbra/common/bin/swatchdog --config-file=/opt/zimbra/conf/logswatchrc --use-cpan-file-tail --pid-file=/opt/zimbra/log/logswatch.pid --daemon --script-dir=/opt/zimbra/data/tmp --tail-file /var/log/zimbra-stats.log
zimbra 19795 0.0 0.0 60340 7828 ? SN 04:50 0:00 zmlogger: zmrrdfetch: server
The kill -0 construct from above is an odd way to determine if something is running. It doesn't send any signal; the kernel simply reports success or failure depending on whether you would be able to send a signal to that process, which requires both that the process exists and that you have the correct permissions. Most of the Zimbra status scripts use this paradigm, which is why a lot of forum posts say to try removing the pid file in case it existed with the wrong permissions and Zimbra couldn't overwrite it. Note: what if pids wrap and something else is running, or the pid file contains a low pid number, you have rebooted, and one of the other programs now has that pid? Anyway...
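To illustrate the pid reuse concern, here is a sketch that does the same kill -0 test and then also checks that the process command line contains the text you expect. This extra check is my own addition and is not something zmlogswatchctl does. The function name pidfile_state is hypothetical, and reading /proc assumes Linux (with ps as a fallback).

```shell
#!/bin/sh
# Report whether a pid file points at a live process whose command line matches.
# $1: pid file, $2: text expected in the command line (e.g. swatchdog).
# Prints one of: missing, stale, mismatch, running.
# pidfile_state is a hypothetical helper, not part of Zimbra.
pidfile_state() {
    pidfile="$1"; expect="$2"
    [ -f "$pidfile" ] || { echo missing; return; }
    pid=$(cat "$pidfile")
    case "$pid" in
        ''|*[!0-9]*) echo stale; return ;;   # empty or non-numeric content
    esac
    # Same test as checkrunning(): no signal is sent, only the result is used.
    kill -0 "$pid" 2>/dev/null || { echo stale; return; }
    # Extra step: read the command line (Linux /proc, else ps) to catch PID reuse.
    cmd=$(tr '\000' ' ' 2>/dev/null < "/proc/$pid/cmdline") ||
        cmd=$(ps -o args= -p "$pid" 2>/dev/null)
    case "$cmd" in
        *"$expect"*) echo running ;;
        *)           echo mismatch ;;
    esac
}
```

For example, pidfile_state /opt/zimbra/log/logswatch.pid swatchdog run as the zimbra user. A result of stale or mismatch would be consistent with zmcontrol status reporting the service as not running even though some swatchdog process exists.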
start works like this. Note the amount of information that is saved in /var/log/zimbra.log
Code: Select all
Is there a swatch configuration file? If not, log that "logswatchrc is missing"
Are the programs already running? If so, log that "logwatch is already running"
Is the logger service enabled? If not, log that "logger service is not enabled! failed"
Start swatchdog with --pid-file saved to the location shown above. Note: it will also start zmlogger from the swatchrc action.
Check that it is running and, if it is, return back to zmcontrol status that everything is running
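The first two checks in that list can be tried by hand with something like the sketch below. This is not the code from zmlogswatchctl: the function name logswatch_preflight is made up, the default paths are the ones visible in the ps output earlier, and the "is the logger service enabled" check is left out because I am not reproducing how the real script does it.

```shell
#!/bin/sh
# Walk through the first two start checks and print the first problem found.
# $1: swatch config file, $2: pid file (defaults follow paths seen in the thread).
# logswatch_preflight is a hypothetical helper, not part of Zimbra.
logswatch_preflight() {
    conf="${1:-/opt/zimbra/conf/logswatchrc}"
    pidfile="${2:-/opt/zimbra/log/logswatch.pid}"
    if [ ! -f "$conf" ]; then
        echo "logswatchrc is missing"; return 1
    fi
    pid=$(cat "$pidfile" 2>/dev/null)
    case "$pid" in
        ''|*[!0-9]*) pid="" ;;           # no usable PID recorded
    esac
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "logwatch is already running"; return 1
    fi
    echo "ok to start"
}
```

If this prints "ok to start" but the service still shows as stopped after a restart, the remaining steps (service enabled, swatchdog actually starting and writing its pid file) are where to look next.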
Normally, whenever something doesn't run from zmcontrol status - the first thing you do is something like:
Code: Select all
# su - zimbra
% zmlogswatchctl restart
zmcontrol [start|stop|restart|status] calls zmlogswatchctl with the same argument. If that fails to fix it, look at the pid files in /opt/zimbra/log for permission problems or tail -f /var/log/zimbra.log as you attempt to restart it and look for why.
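For the permission check, a small sketch that lists pid files the current user cannot overwrite. The function name unwritable_pidfiles is made up for this example, and it assumes the pid files end in .pid as the two shown above do.

```shell
#!/bin/sh
# List pid files in a directory that the current user cannot overwrite.
# $1: directory holding the pid files (defaults to the one named above).
# unwritable_pidfiles is a hypothetical helper, not part of Zimbra.
unwritable_pidfiles() {
    dir="${1:-/opt/zimbra/log}"
    for f in "$dir"/*.pid; do
        [ -e "$f" ] || continue      # glob matched nothing
        [ -w "$f" ] || ls -l "$f"    # show owner and mode of the problem file
    done
}
```

Run it after su - zimbra so the test reflects what the Zimbra scripts can do. Running it as root is not useful because root can write to almost anything.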
HTH,
Jim