Upgrading with a clear backout path

Baylink
Outstanding Member
Posts: 381
Joined: Fri Sep 12, 2014 11:42 pm

Post by Baylink »

So... I'm *finally* licensed (CFOs can be such a pain), and running 5.0.9, and pretty clearly, it's time to upgrade now to .15.
The question is: how?
More specifically: how do I do an upgrade without a) setting up a whole new machine, or b) overwriting the current program directory, not to mention restructuring any databases that I won't be able to restructure back -- and let's not forget that if I have to back-out, I need to be able to *reprocess* any incoming mail (either from the outside world or from users) that arrives while I'm on the new system.
I'm *really* big on back-out plans, partially from reading Tom Limoncelli's excellent The Practice Of System And Network Administration, which you should run right out and buy, and partially because it's *email* -- the bosses get cranky when email breaks; that's why I'm moving to Zimbra in the first place.
Anyone else out there who's as paranoid as me, and has a good answer here?
L. Mark Stone
Ambassador
Posts: 2802
Joined: Wed Oct 09, 2013 11:35 am
Location: Portland, Maine, US
ZCS/ZD Version: 10.0.7 Network Edition

Post by L. Mark Stone »

[quote]So... I'm *finally* licensed (CFOs can be such a pain), and running 5.0.9, and pretty clearly, it's time to upgrade now to .15.
The question is: how?
More specifically: how do I do an upgrade without a) setting up a whole new machine, or b) overwriting the current program directory, not to mention restructuring any databases that I won't be able to restructure back -- and let's not forget that if I have to back-out, I need to be able to *reprocess* any incoming mail (either from the outside world or from users) that arrives while I'm on the new system.
I'm *really* big on back-out plans, partially from reading Tom Limoncelli's excellent The Practice Of System And Network Administration, which you should run right out and buy, and partially because it's *email* -- the bosses get cranky when email breaks; that's why I'm moving to Zimbra in the first place.
Anyone else out there who's as paranoid as me, and has a good answer here?[/quote]
The ultimate backout/rollback plan with a failed upgrade of Zimbra involves reinstalling the old RPMs and copying back the entire /opt/zimbra tree, as Zimbra upgrades are done in-place on the same server.
The Zimbra installer that does the upgrade, depending upon the version, tends to update MySQL databases, LDAP schemas, and other core items that just won't work if you simply reinstall the old RPMs.
Migrating to a new server can be a pain, and the in-place upgrades are generally reliable, so that's what we have always done.
So, at the risk of oversimplifying, here's what we do:



Have a backup MX somewhere else to handle inbound emails while your Zimbra server is down. There's a script on the wiki to extract all the valid email addresses on your Zimbra system. We have a plain-Jane Postfix box at a second data center as a backup MX for all of our hosted domains. Not having to worry about bouncing inbound email during the upgrade is a good thing.

Depending upon how big the /opt/zimbra tree is, either a few days or a few hours before your planned upgrade, with Zimbra still running, rsync the entire /opt/zimbra tree somewhere safe. You can run this rsync again, say, 30 minutes before your planned maintenance window begins. The command we use is: rsync -avzH --delete /opt/zimbra root@destination_server:/opt_zimbra_backup

Once you hit the maintenance window and folks expect Zimbra to be down, block all inbound ports at the firewall to prevent Zimbra from accepting any new emails and users from accessing Zimbra during the upgrade process.

Run zmbackup -f -a all to get a full Zimbra backup.

Run zmcontrol shutdown (stops the Zimbra management processes), and check there are no Zimbra processes running by using top, ps, or whatever else you like.

Run that rsync job one last time, now that Zimbra is stopped. This gives you the clean copy of /opt/zimbra you'll need to rsync back if the upgrade truly fails and a restore from the Zimbra backup doesn't work either.

Run the installer from the new version, and if there are no errors, take a new full Zimbra backup and then open up the ports on the firewall.

You are done!
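
As a rough sketch, the pre-upgrade steps above could be strung together like this. It is an illustration under assumptions, not a tested procedure: the destination is the placeholder from the rsync command above, the `run` helper and the `DRY_RUN` switch are made up for the example, the Zimbra commands are assumed to run as the zimbra user via su, and the firewall step is only a comment because it depends entirely on your own firewall. With `DRY_RUN=1` (the default here) it only prints what it would do.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set.
# DEST and run() are illustrative placeholders, not part of Zimbra.
DRY_RUN="${DRY_RUN:-1}"
DEST="${DEST:-root@destination_server:/opt_zimbra_backup}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

sync_tree() {
    # Mirror the whole /opt/zimbra tree; --delete removes files from the
    # copy that no longer exist in the source.
    run rsync -avzH --delete /opt/zimbra "$DEST"
}

pre_upgrade() {
    sync_tree                                  # last pass with Zimbra still up
    # Block inbound ports at the firewall here (site-specific, not shown).
    run su - zimbra -c "zmbackup -f -a all"    # full Zimbra backup
    run su - zimbra -c "zmcontrol shutdown"    # stop Zimbra
    sync_tree                                  # clean copy with Zimbra stopped
}

pre_upgrade
```

Checking by hand that no Zimbra processes are left (top, ps) still belongs between the shutdown and the final rsync; the sketch leaves that to the operator.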


FWIW, our experience has been that if you:



Shut down Zimbra manually, instead of letting the installer do it;

Read the release notes carefully for any potential "gotchas" that require manual work before the actual upgrade;

Let the installer take whatever time it needs to run whatever checks it wants during the upgrade process; and

Let the installer complete even when it looks like it's doing nothing;



then your upgrades will be pretty seamless.
Last tip: Since we do almost all of our upgrades remotely via ssh, we typically like to have a second ssh window open running top to keep our blood pressure down during the upgrade. The Zimbra installer scripts have a lot of sleep commands in there, and it can be very nerve-wracking watching the installer do nothing for a while. There have been several posts here where users aborted the installer because they thought it had hung. We have never seen that, but it helps to be patient (and maybe scarf a Valium beforehand if you are the nervous type...)!
We have also never had an upgrade fail, and we have been running NE since 4.0.3. I expect someday we will; these things eventually happen despite everyone's best intentions. But the plan above gives us two bites at the apple for rolling back, and we think that's a good amount of insurance.
Hope that helps,

Mark
___________________________________
L. Mark Stone
Mission Critical Email - Zimbra VAR/BSP/Training Partner https://www.missioncriticalemail.com/
AWS Certified Solutions Architect-Associate

Post by Baylink »

That covers me for failures *during* the upgrade ... but it doesn't let me back-out afterwards, if I find some nassty bug that eats me precioussesss. :-)
I guess I'll have to cook up some way to tee off incoming mail on the way in so I have it to replay if I need to.
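
For instance, if the box accepting the mail runs Postfix, its `always_bcc` parameter sends a copy of every message it handles to one extra address. A rough sketch only: the archive address is made up, `bcc_setting` is a throwaway helper, and it assumes Zimbra won't overwrite a setting applied this way when it rewrites its own Postfix config, which would need checking first. It prints the command instead of running it.

```shell
#!/bin/sh
# Sketch only: ARCHIVE_ADDR is a made-up address on a box outside Zimbra,
# and bcc_setting() exists purely for illustration.
ARCHIVE_ADDR="${ARCHIVE_ADDR:-mailarchive@backup.example.com}"

bcc_setting() {
    # always_bcc is a standard Postfix main.cf parameter: every message
    # Postfix handles is also delivered to this one address.
    echo "always_bcc = $1"
}

# Print the command rather than running it, so nothing changes by accident.
echo "postconf -e '$(bcc_setting "$ARCHIVE_ADDR")' && postfix reload"
```

That only captures the copies; replaying them into the old system after a backout is a separate problem.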
Outgoing mail sent during a posited interregnum between upgrade and backout is even harder to deal with... and may be impossible.
Ok; well, I feel a little bit better now; thanks.

Post by L. Mark Stone »

[quote user="Baylink"]That covers me for failures *during* the upgrade ... but it doesn't let me back-out afterwards, if I find some nassty bug that eats me precioussesss. :-)
I guess I'll have to cook up some way to tee off incoming mail on the way in so I have it to replay if I need to.
Outgoing mail sent during a posited interregnum between upgrade and backout is even harder to deal with... and may be impossible.
Ok; well, I feel a little bit better now; thanks.[/QUOTE]
Jay,
You are right that our process gives you rollback protection during the upgrade but not afterwards.
The way we protect against having to do a rollback hours or days after an upgrade is to avoid that possibility. We don't have a plan to be able to do that, and I'm not sure you can, really, since I don't know how you would record and later play back things like users moving emails from one folder to another or changing their passwords -- especially when many upgrades make fundamental changes to the LDAP schema and the MySQL databases (as well as sometimes updating OpenLDAP and MySQL themselves).
When we are nervous, we do upgrades on some test virtual machines we have, with restored copies of our production stores.
We also watch the forums here, to see if others who have a need to upgrade sooner experience any issues.
If you look at the history of 5.0.12 > 5.0.15, .12 was a highly QA'd release with one nasty "gotcha". .13 through .15 were released in quick succession to address single issues.
FWIW, we stood pat on 5.0.8, avoiding all upgrades until 5.0.15, and, as I said, we have had zero issues with the upgrade process or with running 5.0.15 afterwards.
I'm not going to say you won't have problems with 5.0.15, but if you search the forums for "5.0.15" you will quickly see the challenges others have faced with this version, and then you can evaluate whether your installation will likely face those challenges or not.
But the bottom line is that I have never heard of anyone doing a post-upgrade rollback several days later without the loss of all inbound emails and changes since the upgrade was performed.
All the best,

Mark
gmsmith
Outstanding Member
Posts: 432
Joined: Fri Sep 12, 2014 10:09 pm

Post by gmsmith »

Yeah, let me echo Mark's comments... don't jump on an upgrade the first day it is out. Watch the forums and do your own testing off the production system. We usually upgrade about 30 days after Zimbra releases. This gives us internal QA time and lets others flush out some issues that may arise.
Mark's backup process is outstanding and should be followed; it will protect you during the upgrade. If you are upgrading over a remote ssh connection, I'd also suggest running the installer inside screen, so that losing your connection doesn't interrupt it. And always monitor /tmp/zmsetup.log with tail -f; it keeps the blood pressure down (for the most part).
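
Something along these lines, as a sketch: the session name is arbitrary, the log path is the one mentioned above, and `log_recently_updated` is just a hypothetical helper for checking whether the installer has written to its log lately. A quiet log doesn't prove a hang, since the installer sleeps a lot. The helper relies on find's `-mmin` test, which GNU and BSD find both have.

```shell
#!/bin/sh
# Typical usage (run by hand, not from a script):
#   screen -S zimbra-upgrade     # start a named session; run the installer inside it
#   screen -r zimbra-upgrade     # reattach after a dropped connection
#   tail -f /tmp/zmsetup.log     # in a second window, follow the installer log

# Hypothetical helper: succeeds if FILE was modified within the last MINUTES minutes.
log_recently_updated() {
    file="$1"
    minutes="$2"
    [ -n "$(find "$file" -mmin "-$minutes" 2>/dev/null)" ]
}

if log_recently_updated /tmp/zmsetup.log 10; then
    echo "installer log updated in the last 10 minutes"
else
    echo "no recent log activity (or no log yet); that alone does not mean a hang"
fi
```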

Post by Baylink »

Good points, all. I'm a professional paranoid, but I guess at some point... :-)
Thanks, guys.
bdial
Elite member
Posts: 1633
Joined: Fri Sep 12, 2014 10:39 pm

Post by bdial »

this is one of the reasons i am absolutely in love with vmware. by snapshotting a server, you can do whatever you want to it and then if you really hose things up just revert to the snapshot within seconds. it will even snapshot the running memory so when you revert your server returns to the exact running point it was at.
currently our zimbra mta & ldap servers are virtual. i always snapshot them both before upgrade in case something goes wrong with either upgrade. of course my mailbox servers are still physical and they're usually the most problematic so i still sweat a bit there! one of my goals this year is to get them virtualized.