I spent all of yesterday working on this problem with help from the #zimbra IRC channel, without reaching a resolution. We made sure the proxy was properly configured, but it seems to intermittently have issues connecting to mailboxd. I also thought that zimbra-proxy was required for 8.7+, but it seems that it only has to be installed, not enabled.
This bug is what is driving the requirement in 8.7+, but it doesn't actually say why it is required - does anyone know why?
I've researched this problem quite extensively and cannot find any more debug information for why it was happening. There's not much in the error logs, just this:
/opt/zimbra/log/nginx.log:
2016/08/29 13:50:12 [error] 28997#0: *3892 no live upstreams while connecting to upstream, client: 192.168.1.157, server: zimbra.example.com.default, request: "POST /service/soap/ConvActionRequest HTTP/1.1", host: "zimbra.example.com", referrer: "https://zimbra.example.com/zimbra/"
2016/08/29 13:50:23 [error] 28997#0: *3892 no live upstreams while connecting to upstream, client: 192.168.1.157, server: zimbra.example.com.default, request: "POST /service/soap/NoOpRequest HTTP/1.1", host: "zimbra.example.com", referrer: "https://zimbra.example.com/zimbra/"
2016/08/29 13:50:24 [error] 28997#0: *3892 no live upstreams while connecting to upstream, client: 192.168.1.157, server: zimbra.example.com.default, request: "POST /service/soap/SearchRequest HTTP/1.1", host: "zimbra.example.com", referrer: "https://zimbra.example.com/zimbra/"
/opt/zimbra/log/mailbox.log:
This page explains what the 502 error means with regard to communication between the proxy and mailboxd:
2. 502 Bad Gateway: The server was acting as a gateway or proxy and received an invalid response from the upstream server.
This is a single-node instance, so the proxy and mailboxd are on the same server. The same action will fail with the 502 error, but then if you try it again (e.g. click on the same folder or message again), it will then work. This type of intermittent behavior makes me think it's some kind of throttling or timeout problem rather than a complete misconfiguration (which I would expect to work completely or not at all).
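One way I could test the timeout theory is to check whether the "no live upstreams" errors arrive in short bursts. As far as I understand nginx, it logs this message when every server in the upstream group is currently marked unavailable, so bursts separated by quiet periods would be consistent with mailboxd being temporarily marked as failed. Here is a rough sketch of that check. The function names, the 60-second window, and the regular expression are my own assumptions based on the three log lines above, not anything provided by Zimbra:

```python
import re
from datetime import datetime

# Assumed line shape, taken from the nginx.log excerpt in this post.
LINE_RE = re.compile(
    r'^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[error\] .*?'
    r'no live upstreams while connecting to upstream.*?'
    r'request: "(?P<method>\S+) (?P<path>\S+)'
)

def parse_no_live_upstreams(lines):
    """Return (timestamp, request path) for each 'no live upstreams' error."""
    events = []
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            ts = datetime.strptime(m.group("ts"), "%Y/%m/%d %H:%M:%S")
            events.append((ts, m.group("path")))
    return events

def burst_sizes(events, window_seconds=60):
    """Count errors per burst; a new burst starts after a gap longer than the window."""
    bursts = []
    last = None
    for ts, _path in sorted(events):
        if last is None or (ts - last).total_seconds() > window_seconds:
            bursts.append(0)
        bursts[-1] += 1
        last = ts
    return bursts

# The three lines from the excerpt above, used as sample input.
SAMPLE = [
    '2016/08/29 13:50:12 [error] 28997#0: *3892 no live upstreams while connecting to upstream, client: 192.168.1.157, server: zimbra.example.com.default, request: "POST /service/soap/ConvActionRequest HTTP/1.1", host: "zimbra.example.com"',
    '2016/08/29 13:50:23 [error] 28997#0: *3892 no live upstreams while connecting to upstream, client: 192.168.1.157, server: zimbra.example.com.default, request: "POST /service/soap/NoOpRequest HTTP/1.1", host: "zimbra.example.com"',
    '2016/08/29 13:50:24 [error] 28997#0: *3892 no live upstreams while connecting to upstream, client: 192.168.1.157, server: zimbra.example.com.default, request: "POST /service/soap/SearchRequest HTTP/1.1", host: "zimbra.example.com"',
]

events = parse_no_live_upstreams(SAMPLE)
print(burst_sizes(events))
```

To run it against the real log, I would pass `open("/opt/zimbra/log/nginx.log")` to `parse_no_live_upstreams` instead of `SAMPLE`. It only summarizes timing and would not explain why the upstream is being marked down.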
Based on this, what would you recommend as the next debugging step?