Sporadic account auth failure / timeout

tjessy
Posts: 7
Joined: Tue May 25, 2021 2:14 pm

Sporadic account auth failure / timeout

Post by tjessy »

We have been receiving sporadic reports from our clients for some days now. A few of their accounts are unable to access their mail from Outlook / Thunderbird,
while all the other accounts work fine from the same location. The weirdest case is a single machine where one out of three accounts became
inaccessible inside Outlook while the other two kept working just fine...

Upon investigating we found:
- the account is accessible via the web interface without any problems (so the IP is not locked out)
- the account is working as expected, it is in Active state
- no relevant errors found in zimbra.log, mailbox.log, zmmailboxd.out, authd.log, nginx.log
- there is no sign of attempted connections in authd.log at all for these accounts, so they are not locked out because of password failures (this is handled by fail2ban)
- after a server restart they work for a few hours or days, but they always end up in a timeout loop when trying to connect
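
The log check in the list above can be sketched as a small shell helper that counts how many lines in each log mention the affected account. This is only an illustration: the function name `count_account_hits`, the example address and the log paths in the usage comment are assumptions (the paths are the usual defaults on a Zimbra 8.8 install and may differ on yours).

```shell
#!/bin/bash
# Sketch only: count_account_hits is a hypothetical helper name.

# Print, for each log file given, how many lines mention the account.
count_account_hits() {
    local account="$1"; shift
    local f
    for f in "$@"; do
        if [ -r "$f" ]; then
            # -c counts matching lines, -F treats the address as a literal string.
            printf '%s: %s\n' "$f" "$(grep -cF -- "$account" "$f")"
        else
            printf '%s: not readable\n' "$f"
        fi
    done
}

# Usage (address and paths are placeholders / assumed defaults):
#   count_account_hits user@example.com \
#       /var/log/zimbra.log /opt/zimbra/log/mailbox.log /opt/zimbra/log/nginx.log
```

A count of 0 in every file for the affected time window would match the symptom described here: the client times out, but nothing about the attempt reaches the logs.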

It seems that they somehow get stuck at the authentication phase and time out, but without any obvious log records it's hard to tell.
Has anyone experienced this before?

This server has been working fine for almost 2 years with zero problems.

It is a standalone 8.8.15_GA_3918.FOSS P8 on an Ubuntu 16.04 LTS
phoenix
Ambassador
Posts: 27278
Joined: Fri Sep 12, 2014 9:56 pm
Location: Liverpool, England

Re: Sporadic account auth failure / timeout

Post by phoenix »

tjessy wrote: It is a standalone 8.8.15_GA_3918.FOSS P8 on an Ubuntu 16.04 LTS
Why are you not on the latest patch release? It is important to keep your server up to date, so I'd suggest you do that first and then see if your problem still occurs.
Regards

Bill

tjessy
Posts: 7
Joined: Tue May 25, 2021 2:14 pm

Re: Sporadic account auth failure / timeout

Post by tjessy »

Thanks for the quick reply - scheduled for Friday, wanna minimize downtime. :)
federgiannini
Posts: 6
Joined: Thu May 20, 2021 6:45 pm

Re: Sporadic account auth failure / timeout

Post by federgiannini »

The problem seems to be memcached. When it happens, try:

Code:

    su - zimbra
    zmmemcachedctl stop

I pulled my hair out for a few weeks, then noticed (logging in with openssl) that port 7993 worked while 993 did not, so I decided to try stopping that service. For some unknown reason, memcached does not work properly.
Hope this helps someone.
federgiannini
Posts: 6
Joined: Thu May 20, 2021 6:45 pm

Re: Sporadic account auth failure / timeout

Post by federgiannini »

Okay,
much better now. My first reply was written on the run and I had no time to go into detail.
First of all, my environment:

Code:

    Release 8.8.15.GA.3869.UBUNTU18.64 UBUNTU18_64 FOSS edition, Patch 8.8.15_P21.

The problems began when I installed the latest patch; before that, everything worked like a charm. As tjessy said: no log, no trace, nothing at all.
As you can understand, troubleshooting without any trace is very difficult. After pulling my hair out for at least 3 weeks, I did the following:
- tried to connect via telnet from the client machine on port 993 ... nothing came back
- tried to connect with the user account and password from within the server itself on port 993 ... nothing
- finally, I ran tcpdump on the server and repeated the step above. Unfortunately I no longer have the capture, but it showed a RST (reset). Those few bytes opened my mind
- remembering the Zimbra proxy ports, I tried to log in with openssl on port 7993 ... BINGO, the account worked
So I stopped the Zimbra memcached service for a few days (as shown in my previous quick response), just for testing purposes, and had no more problems.
My opinion is that, for some strange reason, the memcached service crashes somewhere (but without logs it is difficult to say where).
I ended up stopping the memcached service for good ... or at least until a new patch is released.
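
The comparison described above (993 through the proxy versus 7993 directly on the mailbox server) can be sketched as a shell helper. It is only a sketch under assumptions: the function names `probe_imaps` and `diagnose`, the 10-second timeout and the wording of the verdicts are made up for illustration, and reading "993 fails, 7993 works" as a proxy-side problem is the conclusion drawn in this thread, not an official diagnosis. It only checks for the IMAP greeting and does not attempt a login, so it would miss a failure that appears only after authentication.

```shell
#!/bin/bash
# Sketch only: probe_imaps and diagnose are hypothetical helper names.

# Succeed if an IMAP greeting ("* OK ...") arrives over TLS within 10 seconds.
probe_imaps() {
    local host="$1" port="$2"
    # Send LOGOUT so the server closes the session after greeting us;
    # -quiet hides certificate details and leaves only the IMAP dialogue.
    printf 'a1 LOGOUT\r\n' \
        | timeout 10 openssl s_client -quiet -connect "${host}:${port}" 2>/dev/null \
        | grep -q '^\* OK'
}

# Turn the two probe results ("ok" or "fail") into a short verdict.
diagnose() {
    local proxy="$1" mailbox="$2"
    case "${proxy}/${mailbox}" in
        ok/ok)     echo "both ports answer" ;;
        fail/ok)   echo "993 fails, 7993 works: look at the proxy side" ;;
        ok/fail)   echo "993 works, 7993 does not answer directly" ;;
        fail/fail) echo "neither port answers" ;;
        *)         echo "unknown input" ;;
    esac
}

# Usage (mail.example.com is a placeholder host):
#   p=fail; probe_imaps mail.example.com 993  && p=ok
#   m=fail; probe_imaps mail.example.com 7993 && m=ok
#   diagnose "$p" "$m"
```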
Hope this description can help someone to better troubleshoot the problem.
Bye
tjessy
Posts: 7
Joined: Tue May 25, 2021 2:14 pm

Re: Sporadic account auth failure / timeout

Post by tjessy »

federgiannini wrote: Okay,
much better now. My first reply was written on the run and I had no time to go into detail.
First of all, my environment:

Code:

    Release 8.8.15.GA.3869.UBUNTU18.64 UBUNTU18_64 FOSS edition, Patch 8.8.15_P21.

Thanks for your insights! Right now all accounts are working fine here, but before the update I'll give this a shot as soon as they start failing... :)
federgiannini
Posts: 6
Joined: Thu May 20, 2021 6:45 pm

Re: Sporadic account auth failure / timeout

Post by federgiannini »

OK, just to validate the problem, I suggest that when it happens again you run:

Code:

    su - zimbra
    zmmemcachedctl stop
    zmmemcachedctl start

As soon as you run the above commands, the user will be able to log in again.
tjessy
Posts: 7
Joined: Tue May 25, 2021 2:14 pm

Re: Sporadic account auth failure / timeout

Post by tjessy »

federgiannini wrote: OK, just to validate the problem, I suggest that when it happens again you run:

Code:

    su - zimbra
    zmmemcachedctl stop
    zmmemcachedctl start

As soon as you run the above commands, the user will be able to log in again.

Totally CONFIRMED! Just tried when a user called stating he is not able to connect. Right after memcached restart he could.
It's pretty odd that this one could affect installations from P8 to P21. :/

Do you think it would be worth filing a bug, or should we investigate further what the problem could be?
As we haven't touched this server for quite a while, at first I was blaming Windows Update - but one of our clients is on Win7 (omg...)
and that was also affected... so it's really strange.
federgiannini
Posts: 6
Joined: Thu May 20, 2021 6:45 pm

Re: Sporadic account auth failure / timeout

Post by federgiannini »

I don't know, I am just a sysadmin. IMHO it is a bug. I'll leave further investigation to the Zimbra gurus. For now I have just stopped the memcached service and my users are happy again :-)
Glad to help.
tjessy
Posts: 7
Joined: Tue May 25, 2021 2:14 pm

Re: Sporadic account auth failure / timeout

Post by tjessy »

The same thing has started on another server... I think I'm gonna add a cronjob to restart memcached every 10 minutes... :/
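
The cron idea above could look roughly like this. Treat it as a sketch under assumptions: the function name `restart_memcached`, the script path in the crontab comment and the log file location are hypothetical, and `/opt/zimbra/bin/zmmemcachedctl` is assumed to be where the control script lives on a default install. It uses the same stop/start pair shown earlier in the thread and is meant to run as the zimbra user.

```shell
#!/bin/bash
# Sketch only: restart_memcached is a hypothetical helper name.
# ZMCTL and RESTART_LOG can be overridden from the environment.
ZMCTL="${ZMCTL:-/opt/zimbra/bin/zmmemcachedctl}"        # assumed default path
RESTART_LOG="${RESTART_LOG:-/tmp/memcached-restart.log}" # hypothetical log file

# Stop and start memcached, then record the outcome with a timestamp.
restart_memcached() {
    "$ZMCTL" stop && "$ZMCTL" start
    local rc=$?
    echo "$(date '+%F %T') memcached restart exit=${rc}" >> "$RESTART_LOG"
    return "$rc"
}

# To use it from cron, end the script with a call to restart_memcached
# and add a line like this to the zimbra user's crontab
# (the script path is a placeholder):
#   */10 * * * * /opt/zimbra/local/restart-memcached.sh
```

This is a workaround rather than a fix: it only hides the symptom, so the log file is there to show how often the restart runs and whether it ever fails.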