[Solved] Upgrade and New Install 8.8.5 Web Stops Responding

verticon
Posts: 16
Joined: Sat Sep 13, 2014 3:30 am
Location: Toronto

Post by verticon »

I am running two 8.8.5 servers: one is an upgrade and one is a new installation. Both show a high CPU load, and then the webmail login screen slows down and eventually times out with this error:
HTTP ERROR 504
Problem accessing ZCS upstream server. Reason: Cannot connect to the ZCS upstream server. Connection timeout.
Possible reasons:

upstream server is blocked by a firewall
upstream server is failing to send back the response in time
upstream server is down
Please contact your ZCS administrator to fix the problem.

Powered by Nginx-Zimbra://
When I restart all the services, the zimbra webapp process takes a long time to shut down, and then everything continues as normal. I don't see any errors in the logs and I am at a loss. Customers are not happy and I wish I could roll back to 8.7. Let me know what information would help with troubleshooting.
Last edited by verticon on Thu Jan 04, 2018 6:30 pm, edited 1 time in total.
msquadrat
Advanced member
Posts: 183
Joined: Mon Oct 14, 2013 10:09 am

Re: Upgrade and New Install 8.8.5 Web Stops Responding

Post by msquadrat »

The very nature of the issue you describe makes it hard to pin down the root cause. If I understand you correctly, these are two single-server installations? That makes network issues between nginx and mailboxd less likely, though there might still be an issue with local port exhaustion or DNS. Also, you say that the login screen starts to fail, from which I gather that existing webmail sessions keep working?

Things I'd look into are:
  • Which process is causing the high CPU load?
  • Is the system swapping?
  • Is the system maybe underpowered and you need a RAM or CPU upgrade?
  • If you restart just nginx (zmproxyctl restart), mailboxd (zmmailboxdctl restart) or even LDAP (ldap restart) when this happens, does any one of these restarts mitigate the symptoms as well?
  • Are there really no fishy-looking entries in the relevant log files, such as nginx.log, mailbox.log, zmmailboxd.out, myslow.log, mysql_error.log and gc.log?
  • How many TCP connections do you see?
  • Judging by nginx.access.log, access_log, sync.log and ews.log, do you maybe have a client which is running amok?
  • If you try to connect to the mailboxd backend directly, does that work?
  • Do you see different behaviour when you try to curl the HTTP endpoints from the host nginx is running on vs. your client?
  • Any network issues like DNS and is /etc/hosts properly set up?
  • Does the OS complain about any issues like a broken disk or a resyncing RAID?
That's just how I'd start; if I saw something interesting I'd drill down from there :-)
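
A few of the checks in the list above (CPU, swapping, TCP connections) can be sketched as shell commands. This is only a rough sketch, not a Zimbra-provided tool: `tcp_state_counts` is a hypothetical helper name, and it assumes `ss -tan` style input where the first line is a header and the first column is the connection state.

```shell
#!/bin/sh
# Hypothetical helper: summarise TCP connection states.
# Reads `ss -tan` style output on stdin (header line first, state in column 1)
# and prints one "STATE COUNT" line per state, sorted by state name.
tcp_state_counts() {
    awk 'NR > 1 { count[$1]++ } END { for (s in count) print s, count[s] }' | sort
}

# Typical use on the affected host (commented out so the file can be sourced safely):
#   ps -eo pid,pcpu,pmem,comm --sort=-pcpu | head -n 5   # which process is burning CPU?
#   vmstat 1 5                                           # non-zero si/so columns suggest swapping
#   ss -tan | tcp_state_counts                           # how many connections, in which states?
```

A large and growing number of connections in a single state would be a reason to look more closely at the proxy-to-mailboxd path or at a misbehaving client.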
nikonaum
Posts: 6
Joined: Wed Jan 03, 2018 6:48 pm

Re: Upgrade and New Install 8.8.5 Web Stops Responding

Post by nikonaum »

Does it occur for all accounts or just some of them? Check /opt/zimbra/log/mailbox.log for Java exceptions and mailbox locks (not account locks). I have had this problem since updating from the latest 8.7 release to 8.8.5. It happens to accounts that use IMAP heavily (lots of folders to sync) and have a lot of CalDAV task shares (DAVdroid and the K-9 Android mail client). Check the performance tuning guide here: https://wiki.zimbra.com/wiki/Performanc ... eployments
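
To see whether the problem is concentrated on a few accounts, the log check can be sketched roughly as follows. `lines_per_account` is a hypothetical helper, and it assumes that the relevant mailbox.log lines carry a `[name=<account>;...]` context field; the search pattern is whatever you choose to look for, not a fixed Zimbra message.

```shell
#!/bin/sh
# Hypothetical helper: count log lines matching PATTERN per account.
# Usage: lines_per_account PATTERN FILE
# Assumes matching lines contain a "[name=<account>;" context field.
# Lines without that field (e.g. stack trace frames) are ignored.
lines_per_account() {
    pattern=$1
    file=$2
    grep -i -- "$pattern" "$file" \
        | sed -n 's/.*\[name=\([^;]*\);.*/\1/p' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $1, $2 }'
}

# Example (commented out):
#   lines_per_account 'exception' /opt/zimbra/log/mailbox.log
```

The output is one `COUNT ACCOUNT` line per account, busiest first, which makes it easier to spot a single mailbox that dominates the errors.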
verticon
Posts: 16
Joined: Sat Sep 13, 2014 3:30 am
Location: Toronto

Re: Upgrade and New Install 8.8.5 Web Stops Responding

Post by verticon »

Thanks for your response. That is correct, they are both standalone servers. I am using Zextras as well.

This is the process that is causing the high load:
/opt/zimbra/common/bin/java -Dfile.encoding=UTF-8 -server -Dhttps.protocols=TLSv1,TLSv1.1,TLSv1.2 -Djdk.tls.client.protocols=TLSv1,TLSv1.1,TLSv1.2 -Djava.awt.headless=true -Dsun.net.inetaddr.ttl=60 -Dorg.apache.jasper.compiler.disablejsr199=true -XX:+UseConcMarkSweepGC -XX:SoftRefLRUPolicyMSPerMB=1 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -XX:-OmitStackTraceInFastThrow -Xloggc:/opt/zimbra/log/gc.log -XX:-UseGCLogFileRotation -XX:NumberOfGCLogFiles=20 -XX:GCLogFileSize=4096K -Djava.net.preferIPv4Stack=true -Xss256k -Xms6389m -Xmx6389m -Xmn3194m -Djava.io.tmpdir=/opt/zimbra/mailboxd/work -Djava.library.path=/opt/zimbra/lib -Djava.endorsed.dirs=/opt/zimbra/mailboxd/common/endorsed -Dzimbra.config=/opt/zimbra/conf/localconfig.xml -Djetty.base=/opt/zimbra/mailboxd -Djetty.home=/opt/zimbra/common/jetty_home -DSTART=/opt/zimbra/mailboxd/etc/start.config -jar /opt/zimbra/common/jetty_home/start.jar --module=zimbra,server,servlet,servlets,jsp,jstl,jmx,resources,websocket,ext,plus,rewrite,monitor,continuation,webapp,setuid jetty.home=/opt/zimbra/common/jetty_home jetty.base=/opt/zimbra/mailboxd /opt/zimbra/mailboxd/etc/jetty.xml

The server has 32 GB of RAM and the processor is an Intel(R) Xeon(R) CPU E3-1225 V2 @ 3.20GHz.

No RAID issues are reported and the system has 32 GB of swap available.

zmproxyctl restart - no change in load
zmmailboxdctl restart - killed the process above and restarted these services (according to the email notifications): zimbraAdmin, mailboxd, mailbox, service, zimlet, zimbra

I think I may have tracked it down to a single mailbox by reading zmmailboxd.out. It is using IMAP, and it threw the error below, which seemed to trigger other errors related to locking.

2018-01-03 01:19:27,793 INFO [ImapSSLServer-1] [name=sales@xxx.ca;mid=6;ip=192.99.xxx.xxx;oip=192.99.xxx.xxx3;via=192.99.xxx.xxx(nginx/1.7.1);ua=ZimbraImapDataSource/8.8.5_GA_1894;] imap - selected folder INBOX/Stxxxx
at com.zimbra.soap.SoapServlet.doPost(SoapServlet.java:213)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
at com.zimbra.cs.servlet.ZimbraServlet.service(ZimbraServlet.java:206)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:821)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1685)
at com.zimbra.cs.servlet.CsrfFilter.doFilter(CsrfFilter.java:169)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
at com.zimbra.cs.servlet.RequestStringFilter.doFilter(RequestStringFilter.java:54)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
at com.zimbra.cs.servlet.SetHeaderFilter.doFilter(SetHeaderFilter.java:59)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
at com.zimbra.cs.servlet.ETagHeaderFilter.doFilter(ETagHeaderFilter.java:47)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
at com.zimbra.cs.servlet.ContextPathBasedThreadPoolBalancerFilter.doFilter(ContextPathBasedThreadPoolBalancerFilter.java:107)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
at com.zimbra.cs.servlet.ZimbraQoSFilter.doFilter(ZimbraQoSFilter.java:116)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
at com.zimbra.cs.servlet.ZimbraInvalidLoginFilter.doFilter(ZimbraInvalidLoginFilter.java:117)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
at org.eclipse.jetty.servlets.DoSFilter.doFilterChain(DoSFilter.java:473)
at org.eclipse.jetty.servlets.DoSFilter.doFilter(DoSFilter.java:318)
at org.eclipse.jetty.servlets.DoSFilter.doFilter(DoSFilter.java:288)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1158)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1090)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:109)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:119)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:318)
at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:437)
at org.eclipse.jetty.server.handler.DebugHandler.handle(DebugHandler.java:84)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:119)
at org.eclipse.jetty.server.Server.handle(Server.java:517)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:306)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:242)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:261)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:192)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:261)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:75)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:213)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:147)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
at java.lang.Thread.run(Thread.java:748)
Lock Waiter - qtp998351292-4769:https:https://mail.xxx.com/service/soap/SearchRequest prio=5 id=4769 state=TIMED_WAITING
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.tryLock(ReentrantReadWriteLock.java:871)
at com.zimbra.cs.mailbox.MailboxLock.tryLockWithTimeout(MailboxLock.java:116)
at com.zimbra.cs.mailbox.MailboxLock.lock(MailboxLock.java:194)
at com.zimbra.cs.mailbox.Mailbox.beginTransaction(Mailbox.java:1759)
at com.zimbra.cs.mailbox.Mailbox.beginReadTransaction(Mailbox.java:1735)
at com.zimbra.cs.mailbox.Mailbox.getItemById(Mailbox.java:2864)
at com.zimbra.cs.mailbox.Mailbox.getItemById(Mailbox.java:2856)
at com.zimbra.cs.mailbox.Mailbox.getFolderById(Mailbox.java:4092)
at com.zimbra.cs.index.query.InQuery.create(InQuery.java:68)
at com.zimbra.cs.index.query.parser.QueryParser.createQuery(QueryParser.java:577)
at com.zimbra.cs.index.query.parser.QueryParser.toTerm(QueryParser.java:427)
at com.zimbra.cs.index.query.parser.QueryParser.toTextClause(QueryParser.java:399)
at com.zimbra.cs.index.query.parser.QueryParser.toClause(QueryParser.java:363)
at com.zimbra.cs.index.query.parser.QueryParser.toQuery(QueryParser.java:342)
at com.zimbra.cs.index.query.parser.QueryParser.parse(QueryParser.java:305)
at com.zimbra.cs.index.ZimbraQuery.<init>(ZimbraQuery.java:378)
at com.zimbra.cs.mailbox.MailboxIndex.search(MailboxIndex.java:166)
at com.zimbra.cs.service.mail.Search.handle(Search.java:111)
at com.zimbra.soap.SoapEngine.dispatchRequest(SoapEngine.java:607)
at com.zimbra.soap.SoapEngine.dispatch(SoapEngine.java:460)
at com.zimbra.soap.SoapEngine.dispatch(SoapEngine.java:273)
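
To get an overview of which requests are stuck waiting for a mailbox lock, the `Lock Waiter` lines in a dump like the one above can be summarised. This is a minimal sketch: `lock_waiter_summary` is a hypothetical helper, and it assumes the line format shown above (`Lock Waiter - <thread>:<url> prio=... id=... state=...`). Waiter lines without a request URL are skipped.

```shell
#!/bin/sh
# Hypothetical helper: summarise "Lock Waiter" thread lines in a file.
# Usage: lock_waiter_summary FILE
# Prints "COUNT REQUEST STATE", most frequent first. REQUEST is the last
# path component of the URL (e.g. SearchRequest).
lock_waiter_summary() {
    grep 'Lock Waiter - ' "$1" \
        | sed -n 's#.*/\([^/ ]*\) prio=.* state=\([A-Z_]*\).*#\1 \2#p' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $1, $2, $3 }'
}

# Example (commented out):
#   lock_waiter_summary /opt/zimbra/log/zmmailboxd.out
```

Many waiters on the same request type would support the suspicion that one mailbox lock is being held for too long.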
msquadrat
Advanced member
Posts: 183
Joined: Mon Oct 14, 2013 10:09 am

Re: Upgrade and New Install 8.8.5 Web Stops Responding

Post by msquadrat »

Hmm… I've already seen locking issues on some largish 8.7.x installations (multi server with about 1000 mailboxes per server) with similar workloads as nikonaum described as well. Especially if Apple macOS products were involved. I wasn't able to track down the root cause yet though. There is an undocumented localconfig option to increase the possible number of parallel locks which just shifted the issue somewhere else. I forgot the details after the new year vacation, would have to look it up. I *think* these locking issues shouldn't be able to bring down a whole server, only a single mailbox though. But I might be wrong. Has anybody else seen something similar yet by any chance?

It doesn't sound like you have more than a few hundred mailboxes so as a rough ballpark number your server specs sound fine so far. Mixing in some SSD storage often helps with performance issues but I don't really think yours is a performance issue per se.

One question due to some oddities in the stack trace you posted: do you use (i.e. did you install) the new separate IMAPD service?
Last edited by msquadrat on Wed Jan 03, 2018 8:47 pm, edited 1 time in total.
verticon
Posts: 16
Joined: Sat Sep 13, 2014 3:30 am
Location: Toronto

Re: Upgrade and New Install 8.8.5 Web Stops Responding

Post by verticon »

I have about 60 mailboxes across approximately 10 domains. I have also increased the IMAP threads on the server. Only one user uses IMAP; the rest use ActiveSync from Zextras.

I am not using the beta IMAP service.

I have also set the virtualhostname and virtualipaddress to match the server for each domain, in case that was the cause.

I looked at the posts by moebius and they are not the same errors.
msquadrat
Advanced member
Posts: 183
Joined: Mon Oct 14, 2013 10:09 am

Re: Upgrade and New Install 8.8.5 Web Stops Responding

Post by msquadrat »

Oh, I just saw the following post by moebius; do you see similar issues in your nginx.log?
viewtopic.php?f=13&t=63353

And here's another user (mntwinsfan) with these locking issues:
viewtopic.php?f=13&t=63339
Last edited by msquadrat on Wed Jan 03, 2018 8:46 pm, edited 1 time in total.
verticon
Posts: 16
Joined: Sat Sep 13, 2014 3:30 am
Location: Toronto

Re: Upgrade and New Install 8.8.5 Web Stops Responding

Post by verticon »

nikonaum wrote: Does it occur for all accounts or just some of them? Check /opt/zimbra/log/mailbox.log for Java exceptions and mailbox locks (not account locks). I have had this problem since updating from the latest 8.7 release to 8.8.5. It happens to accounts that use IMAP heavily (lots of folders to sync) and have a lot of CalDAV task shares (DAVdroid and the K-9 Android mail client). Check the performance tuning guide here: https://wiki.zimbra.com/wiki/Performanc ... eployments
I increased the IMAP services. It affects all accounts, as the HTTPS page doesn't load. I am able to use the Jetty service on port 8443 as a workaround but would rather use the nginx server.
nikonaum
Posts: 6
Joined: Wed Jan 03, 2018 6:48 pm

Re: Upgrade and New Install 8.8.5 Web Stops Responding

Post by nikonaum »

As written in the Zimbra upgrade notes, the new IMAPD is beta, i.e. it will not be enabled after install. Nevertheless, I chose not to install it at all.
My server is a Supermicro with 4 AMD processors with 12 threads each, a Samsung EVO pro SSD and 64 GB RAM, and I had no such issues on 8.7.11. My DAVdroid and K-9 mail clients are always up to date, and two weeks ago I had no such Java exception messages in my logs. They all appeared after the upgrade. I even increased the maximum number of IMAP sessions, which helped at first, but then all the errors came back.
The interesting thing is that the server is locking my own account and that of a colleague with whom I shared 10 task lists. Maybe it has something to do with IMAP and CalDAV shares and the number of HTTP sessions; I can't quite figure it out. I added these settings and it was OK for 2 days, but then the lock happened again:

zmprov ms `zmhostname` zimbraImapNumThreads 500
zmprov ms `zmhostname` zimbraHttpNumThreads 600
zmprov mcf zimbraLmtpNumThreads 40

I have around 1200 mailboxes!
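
After changing settings like the ones above, it can be worth reading the values back to confirm they took effect. A small sketch, assuming that `zmprov gs` and `zmprov gcf` print attributes as `name: value` lines; `zmprov_attr_value` is a hypothetical helper that only parses that output.

```shell
#!/bin/sh
# Hypothetical helper: print the value of ATTR from "name: value" lines on stdin.
# Usage: <command printing attributes> | zmprov_attr_value ATTR
zmprov_attr_value() {
    awk -v attr="$1" -F': ' '$1 == attr { print $2 }'
}

# Example, run as the zimbra user (commented out):
#   zmprov gs "$(zmhostname)" zimbraImapNumThreads | zmprov_attr_value zimbraImapNumThreads
#   zmprov gs "$(zmhostname)" zimbraHttpNumThreads | zmprov_attr_value zimbraHttpNumThreads
#   zmprov gcf zimbraLmtpNumThreads | zmprov_attr_value zimbraLmtpNumThreads
```

The first two values are read from the server entry and the third from the global config, matching where the commands above set them.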
nikonaum
Posts: 6
Joined: Wed Jan 03, 2018 6:48 pm

Re: Upgrade and New Install 8.8.5 Web Stops Responding

Post by nikonaum »

verticon wrote: I increased the Imap services. It affects all accounts as the https page doesn't spawn. I am able to use the Jetty service 8443 as a workaround but would rather use the nginx server.
So you have problems with your Zimbra proxy settings.
Take a look at this:
https://wiki.zimbra.com/wiki/Zimbra_Proxy_Guide
https://wiki.zimbra.com/wiki/Zimbra_Pro ... mbra_Proxy