Crashed mailbox after move it between Zimbra servers using zxsuite hsm doMailboxMove command

Forum to discuss, report issues, or provide feedback about Zimbra HSM Plus
Post Reply
User avatar
rcardozo1987
Posts: 23
Joined: Tue Sep 10, 2019 9:14 pm
ZCS/ZD Version: NETWORK edition, Patch 8.8.15_P11

Crashed mailbox after move it between Zimbra servers using zxsuite hsm doMailboxMove command

Post by rcardozo1987 »

Gentlemen,
We noticed Zimbra 8.8.15 P4 has some doMailboxMove command issue. We were also able to reproduce this issue in Zimbra 8.8.15 P11.

So far we get this:

The damage:
After run a doMailboxMove command to move a mailbox from server A to server B, the mailbox moved gets at server B missing a lot of Blobs. Sometimes even almost empty.

The issue:
Purge operations were started at server B while a mailbox was beeing moved from server A to server B.
It seems Zimbra didn't get the whole picture and identified this mailbox as an orphan at server B. So, this mailbox was removed by the doPurgeMailboxes process at Server B.
Next, the doMailboxMove process created the mailbox again (at Server B), but moved only the server A reamaining data.

Side effects:
Blob erros show up because part of the mailbox was deleted.

Evidences:
  • Log signature: unlike many others mailboxes, the affected ones were created twice, getting 2 mailboxID (mid) during the move process.
    zmb-serverB:

    Code: Select all

    root@zmb-serverB:/opt/zimbra/log# grep -Ei 'Creating mailbox' mailbox.log
    ...
    2020-07-29 16:28:41,723 INFO  [ZxLink Handler Thread] [] mailbox - Creating mailbox with id 4792 and group id 92 for firstname.secondname18@example.com.
    2020-07-29 17:07:59,210 INFO  [ZxLink Handler Thread] [] mailbox - Creating mailbox with id 4793 and group id 93 for firstname.secondname19@example.com.
    2020-07-29 17:22:00,774 INFO  [ZxLink Handler Thread] [] mailbox - Creating mailbox with id 4794 and group id 94 for firstname.secondname20@example.com.
    2020-07-29 17:45:30,064 INFO  [ZxLink Handler Thread] [tid=2142672;account=firstname.secondname20@example.com;accountId=62f6bbe6-d524-4c6b-a4d1-9c3829751dff;stage=MailboxMoveStoreIndex:zmb-serverA->zmb-serverB;] mailbox - Creating mailbox with id 4795 and group id 95 for firstname.secondname20@example.com.
    2020-07-29 17:52:16,416 INFO  [ZxLink Handler Thread] [] mailbox - Creating mailbox with id 4796 and group id 96 for firstname.secondname21@example.com.
    2020-07-29 19:30:10,346 INFO  [ZxLink Handler Thread] [] mailbox - Creating mailbox with id 4797 and group id 97 for firstname.secondname22@example.com.
    ...
    
  • Exclusion of first MID created during the move process:
    zmb-serverB:

    Code: Select all

    root@zmb-serverB:/opt/zimbra/log# zgrep ' 4794 ' op_doPurgeMailboxes_*
    op_doPurgeMailboxes_c7c22cfe-a006-4e4d-9f43-352f035bd26e.log.gz:2020-07-29 17:44:28,050 INFO Mailbox 4794 deleted
    
  • Reason for MID exclusion:
    zmb-serverB:

    Code: Select all

    root@zmb-serverB:/opt/zimbra/log# grep 'Purge Orphaned Mailboxes' mailbox.log | grep '4794 deleted'
    2020-07-29 17:44:28,050 INFO  [Purge Orphaned Mailboxes] [tid=2135243;operationId=c7c22cfe-a006-4e4d-9f43-352f035bd26e;module=ZxPowerstore;operation=doPurgeMailboxes;] mailbox - Mailbox 4794 deleted
    
Pay attention to:
  • 2020-07-29 17:22:00,774 INFO [ZxLink Handler Thread] [] mailbox - Creating mailbox with id 4794 and group id 94 for firstname.secondname20@example.com.
  • 2020-07-29 17:44:28,050 INFO Mailbox 4794 deleted
  • 2020-07-29 17:44:28,050 INFO [Purge Orphaned Mailboxes] [tid=2135243;operationId=c7c22cfe-a006-4e4d-9f43-352f035bd26e;module=ZxPowerstore;operation=doPurgeMailboxes;] mailbox - Mailbox 4794 deleted
  • 2020-07-29 17:45:30,064 INFO [ZxLink Handler Thread] [tid=2142672;account=firstname.secondname20@example.com;accountId=62f6bbe6-d524-4c6b-a4d1-9c3829751dff;stage=MailboxMoveStoreIndex:zmb-serverA->zmb-serverB;] mailbox - Creating mailbox with id 4795 and group id 95 for firstname.secondname20@example.com.
The major problem was that we ran the purge operation at 2020-07-29 early in the morning in server B, but it was scheduled somehow. The command started to run, in fact, only near 5PM! :o

Does anyone know anything else about this problem? Thanks.
Post Reply