JAVA not reaping unused LMTP sessions

Discuss your pilot or production implementation with other Zimbra admins or our engineers.
linkfan
Posts: 2
Joined: Sun Jul 10, 2022 9:36 am

Re: JAVA not reaping unused LMTP sessions

Post by linkfan »

For me:

Code: Select all

zmlocalconfig -e postfix_lmtp_destination_concurrency_limit=20
fixed the problem, 24h no huge load.

Regards
BradC
Outstanding Member
Outstanding Member
Posts: 270
Joined: Tue May 03, 2016 1:39 am

Re: JAVA not reaping unused LMTP sessions

Post by BradC »

linkfan wrote:For me:

Code: Select all

zmlocalconfig -e postfix_lmtp_destination_concurrency_limit=20
fixed the problem, 24h no huge load.
Now the question is what got broken that required that workaround?
User avatar
L. Mark Stone
Ambassador
Ambassador
Posts: 2802
Joined: Wed Oct 09, 2013 11:35 am
Location: Portland, Maine, US
ZCS/ZD Version: 10.0.7 Network Edition
Contact:

Re: JAVA not reaping unused LMTP sessions

Post by L. Mark Stone »

linkfan wrote:For me:

Code: Select all

zmlocalconfig -e postfix_lmtp_destination_concurrency_limit=20
fixed the problem, 24h no huge load.

Regards
Good to hear! Support is watching this thread, so your reporting will help.

All the best,
Mark
___________________________________
L. Mark Stone
Mission Critical Email - Zimbra VAR/BSP/Training Partner https://www.missioncriticalemail.com/
AWS Certified Solutions Architect-Associate
User avatar
L. Mark Stone
Ambassador
Ambassador
Posts: 2802
Joined: Wed Oct 09, 2013 11:35 am
Location: Portland, Maine, US
ZCS/ZD Version: 10.0.7 Network Edition
Contact:

Re: JAVA not reaping unused LMTP sessions

Post by L. Mark Stone »

BradC wrote: Now the question is what got broken that required that workaround?
No idea what changed, but lots of updates in Patch 32 as we know...

If I hear specifics from Support I'll post.
___________________________________
L. Mark Stone
Mission Critical Email - Zimbra VAR/BSP/Training Partner https://www.missioncriticalemail.com/
AWS Certified Solutions Architect-Associate
BradC
Outstanding Member
Outstanding Member
Posts: 270
Joined: Tue May 03, 2016 1:39 am

Re: JAVA not reaping unused LMTP sessions

Post by BradC »

L. Mark Stone wrote:No idea what changed, but lots of updates in Patch 32 as we know...
I'm more wondering about what breakage is going to crawl out of the woodwork next. This patch set has been the gift that keeps on giving.
mgarbin
Posts: 35
Joined: Wed Jun 26, 2019 11:00 am

Re: JAVA not reaping unused LMTP sessions

Post by mgarbin »

L. Mark Stone wrote:
BradC wrote: Now the question is what got broken that required that workaround?
No idea what changed, but lots of updates in Patch 32 as we know...

If I hear specifics from Support I'll post.
I written it in this post http://forums.zimbra.org/viewtopic.php? ... bd#p305797 .....
Now they pushed a new fix .... https://github.com/Zimbra/zm-mailbox/pull/1317 :roll:
stasouv
Advanced member
Advanced member
Posts: 63
Joined: Sat Sep 13, 2014 2:25 am
ZCS/ZD Version: 8.8.15_GA_3869.RHEL7_64_20190917004

Re: JAVA not reaping unused LMTP sessions

Post by stasouv »

[This message has been edited with new findings]

We haven't restarted the mailbox server yet to pick the LMTP concurrency variable, but we did have periods of "Peace", of variable lengths.

The longest was about 38hours. Notwithstanding this, the problem persists, even after that tranquility.

Following up, regardless of the concurrency setting, we seem to have found the (or one) triggering mechanism for this behaviour.

Below is the analysis of a specific incoming message, from /var/log/zimbra.log:

Code: Select all

Jul 13 02:18:40 HOSTNAME postfix/smtpd[8042]: 7F25B88CBF56: client=EXTERNAL_RELAY.EXAMPLE.COM[XX.XX.XX.10]
Jul 13 02:18:40 HOSTNAME postfix/cleanup[14887]: 7F25B88CBF56: message-id=<1450444841.381643.1657667898512@SOME.MAIL.DOMAIN>
Jul 13 02:18:40 HOSTNAME postfix/qmgr[16714]: 7F25B88CBF56: from=<SENDER@DOMAIN.COM>, size=10815912, nrcpt=1 (queue active)
Jul 13 02:18:46 HOSTNAME postfix/dkimmilter/smtpd[5192]: 6BBEB88CBFDA: client=localhost[127.0.0.1]
Jul 13 02:18:46 HOSTNAME postfix/cleanup[14887]: 6BBEB88CBFDA: message-id=<1450444841.381643.1657667898512@SOME.MAIL.DOMAIN>
Jul 13 02:18:46 HOSTNAME opendkim[16013]: 6BBEB88CBFDA: no signing table match for 'SENDER@DOMAIN.COM'
Jul 13 02:18:46 HOSTNAME amavis[8078]: (08078-10) TkSEpdwjZ8Mg FWD from <SENDER@DOMAIN.COM> -> <RECEIVER@EXAMPLE.COM>, BODY=7BIT 250 2.0.0 from MTA(smtp:[127.0.0.1]:10030): 250 2.0.0 Ok: queued as 6BBEB88CBFDA
Jul 13 02:18:46 HOSTNAME postfix/qmgr[16714]: 6BBEB88CBFDA: from=<SENDER@DOMAIN.COM>, size=10816468, nrcpt=1 (queue active)
Jul 13 02:18:46 HOSTNAME amavis[8078]: (08078-10) Passed CLEAN {RelayedInternal}, ORIGINATING/MYNETS LOCAL [XX.XX.XX.10]:47294 [74.6.132.42] <SENDER@DOMAIN.COM> -> <RECEIVER@EXAMPLE.COM>, Queue-ID: 7F25B88CBF56, Message-ID: <1450444841.381643.1657667898512@SOME.MAIL.DOMAIN>, mail_id: TkSEpdwjZ8Mg, Hits: -, size: 10815908, queued_as: 6BBEB88CBFDA, dkim_sd=a2048:aol.com, 5922 ms
Jul 13 02:18:46 HOSTNAME postfix/smtp[13982]: 7F25B88CBF56: to=<RECEIVER@EXAMPLE.COM>, relay=127.0.0.1[127.0.0.1]:10026, delay=6.2, delays=0.29/0/0/5.9, dsn=2.0.0, status=sent (250 2.0.0 from MTA(smtp:[127.0.0.1]:10030): 250 2.0.0 Ok: queued as 6BBEB88CBFDA)
Jul 13 02:18:46 HOSTNAME postfix/qmgr[16714]: 7F25B88CBF56: removed
Jul 13 02:18:46 HOSTNAME postfix/amavisd/smtpd[13435]: BFC5488CBE7C: client=localhost[127.0.0.1]
Jul 13 02:18:46 HOSTNAME postfix/cleanup[25437]: BFC5488CBE7C: message-id=<1450444841.381643.1657667898512@SOME.MAIL.DOMAIN>
Jul 13 02:18:47 HOSTNAME postfix/qmgr[16714]: BFC5488CBE7C: from=<SENDER@DOMAIN.COM>, size=10816613, nrcpt=1 (queue active)
Jul 13 02:18:47 HOSTNAME postfix/smtp[13433]: 6BBEB88CBFDA: to=<RECEIVER@EXAMPLE.COM>, relay=127.0.0.1[127.0.0.1]:10025, delay=0.65, delays=0.27/0.01/0.04/0.33, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as BFC5488CBE7C)
Jul 13 02:18:47 HOSTNAME postfix/qmgr[16714]: 6BBEB88CBFDA: removed
Jul 13 02:28:47 HOSTNAME postfix/lmtp[21640]: BFC5488CBE7C: to=<RECEIVER@EXAMPLE.COM>, relay=HOSTNAME.EXAMPLE.COM[XX.XX.XX.26]:7025, delay=601, delays=0.31/0.01/0.11/600, dsn=4.4.2, status=deferred (conversation with HOSTNAME.EXAMPLE.COM[XX.XX.XX.26] timed out while sending end of data -- message may be sent more than once)
Jul 13 02:41:06 HOSTNAME postfix/qmgr[16714]: BFC5488CBE7C: from=<SENDER@DOMAIN.COM>, size=10816613, nrcpt=1 (queue active)
Jul 13 02:51:08 HOSTNAME postfix/lmtp[13137]: BFC5488CBE7C: conversation with HOSTNAME.EXAMPLE.COM[XX.XX.XX.26] timed out while sending end of data -- message may be sent more than once
Jul 13 03:01:09 HOSTNAME postfix/lmtp[13137]: BFC5488CBE7C: to=<RECEIVER@EXAMPLE.COM>, relay=HOSTNAME.EXAMPLE.COM[XX.XX.XX.26]:7025, delay=2542, delays=1339/0.9/601/601, dsn=4.4.2, status=deferred (conversation with HOSTNAME.EXAMPLE.COM[XX.XX.XX.26] timed out while sending end of data -- message may be sent more than once)
In this particular case, we see that the delivered Queue number eventually is: BFC5488CBE7C

From the relevant abstract from /opt/zimbra/log/mailbox.log (for that day), we see that, for no apparent reason, the message takes 13 minutes to be delivered on the same server, without any high CPU, RAM or IO readings recorded at that time:

Code: Select all

# zgrep 1450444841.381643.1657667898512  log/mailbox.log.2022-07-14.gz
2022-07-13 02:31:47,878 INFO  [LmtpServer-13697] [ip=XX.XX.XX.26;] lmtp - Delivering message: size=10816613 bytes, nrcpts=1, sender=SENDER@DOMAIN.COM, msgid=<1450444841.381643.1657667898512@SOME.MAIL.DOMAIN>
2022-07-13 02:31:47,929 INFO  [LmtpServer-13697] [name=RECEIVER@EXAMPLE.COM;mid=314;ip=XX.XX.XX.26;] mailop - Adding Message: id=91452, Message-ID=<1450444841.381643.1657667898512@SOME.MAIL.DOMAIN>, parentId=90543, folderId=2, folderName=Inbox acct=66429842-14da-45a1-a808-0f1591b8232e.
2022-07-13 02:54:37,190 INFO  [LmtpServer-13786] [ip=XX.XX.XX.26;] lmtp - Delivering message: size=10816613 bytes, nrcpts=1, sender=SENDER@DOMAIN.COM, msgid=<1450444841.381643.1657667898512@SOME.MAIL.DOMAIN>
2022-07-13 02:54:37,191 INFO  [LmtpServer-13786] [name=RECEIVER@EXAMPLE.COM;mid=314;ip=XX.XX.XX.26;] lmtp - Not delivering message with duplicate Message-ID <1450444841.381643.1657667898512@SOME.MAIL.DOMAIN>
2022-07-13 03:05:18,743 INFO  [LmtpServer-13826] [ip=XX.XX.XX.26;] lmtp - Delivering message: size=10816613 bytes, nrcpts=1, sender=SENDER@DOMAIN.COM, msgid=<1450444841.381643.1657667898512@SOME.MAIL.DOMAIN>
2022-07-13 03:05:18,743 INFO  [LmtpServer-13826] [name=RECEIVER@EXAMPLE.COM;mid=314;ip=XX.XX.XX.26;] lmtp - Not delivering message with duplicate Message-ID <1450444841.381643.1657667898512@SOME.MAIL.DOMAIN>
[... goes on indefinitely ...]

In the meantime, for any reason, 10 minutes after the first delivery, there is a new delivery attempt:

Code: Select all

Jul 13 02:28:47 HOSTNAME postfix/lmtp[21640]: BFC5488CBE7C: to=<RECEIVER@EXAMPLE.COM>, relay=HOSTNAME.EXAMPLE.COM[XX.XX.XX.26]:7025, delay=601, delays=0.31/0.01/0.11/600, dsn=4.4.2, status=deferred (conversation with HOSTNAME.EXAMPLE.COM[XX.XX.XX.26] timed out while sending end of data -- message may be sent more than once)
At that exact time, 02:28:47, an LMTP process starts appearing as flagged CLOSE_WAIT. Could be coincidence? Probably not, exact same thing was found with other failed deliveries, due to "message may be sent more than once".

The message is been retried indefinitely, every 10 minutes, until, eventually, reaching the TTL. However, no new "CLOSE_WAIT" threads appear for subsequent delivery attempts of that specific message.

The above is merely an observation, we wish it could be of any use.

Regards,
Stavros
User avatar
L. Mark Stone
Ambassador
Ambassador
Posts: 2802
Joined: Wed Oct 09, 2013 11:35 am
Location: Portland, Maine, US
ZCS/ZD Version: 10.0.7 Network Edition
Contact:

Re: JAVA not reaping unused LMTP sessions

Post by L. Mark Stone »

stasouv wrote: <snip>
The above is merely an observation, we wish it could be of any use.

Regards,
Stavros
Hi Stavros, and thank you for posting this.

After making the LC change and restarting things, we too experienced about a day and a half of normal operations before this started happening again.

I see the exact same log file entries on the mailbox and MTA servers as you do.

The difference now versus before the LC change is that the issue seems to be somewhat self-clearing, in that after some hours emails do get delivered and the Deferred queue goes back down to normal. But then a few hours later it happens again, and customer (rightly so) don't want (nor should they have to) wait for email to be delivered with a delay of several hours.

I have uploaded fresh zmdiaglog outputs to Zimbra Support, and also pointed them to your post.

Thanks again,
Mark
___________________________________
L. Mark Stone
Mission Critical Email - Zimbra VAR/BSP/Training Partner https://www.missioncriticalemail.com/
AWS Certified Solutions Architect-Associate
User avatar
L. Mark Stone
Ambassador
Ambassador
Posts: 2802
Joined: Wed Oct 09, 2013 11:35 am
Location: Portland, Maine, US
ZCS/ZD Version: 10.0.7 Network Edition
Contact:

Re: JAVA not reaping unused LMTP sessions

Post by L. Mark Stone »

Zimbra Support have let me know this issue is covered by ZBUG-2901 and that a fix will be available with patch 26/33.
___________________________________
L. Mark Stone
Mission Critical Email - Zimbra VAR/BSP/Training Partner https://www.missioncriticalemail.com/
AWS Certified Solutions Architect-Associate
chad.lewis
Posts: 1
Joined: Thu Jul 21, 2022 6:33 pm

Re: JAVA not reaping unused LMTP sessions

Post by chad.lewis »

This thread has been tremendously helpful. Thank you. We have been battling this bug all week. We attempted the setting

Code: Select all

zmlocalconfig -e postfix_lmtp_destination_concurrency_limit=20

to no avail.

Fortunately, Zimbra does have a hotfix for this. If you are still experiencing the issue, request the hotfix from Zimbra Support for ZBUG-2901.

So far that hotfix is working great.
Post Reply