Script: (Thunderbird) mbox tree to (Zimbra Desktop) importable .eml tree

fbongartz
Posts: 15
Joined: Sat Sep 13, 2014 1:21 am


Post by fbongartz »

Our Thunderbird clients have a lot of local folders that we'd like to migrate to Zimbra Desktop local folders. Thunderbird stores local folders in mbox files that can be structured in subdirectories. Zimbra Desktop can import tgz files that contain a directory tree with .eml files, where each .eml file is one message.
After playing with mb2md.pl and a bash script, I gave up and started to write my own Python script, which I'm sharing here:


'''
Created on 02.11.2011
@version 0.9

@author: Fabrice Bongartz (fabrice (at) fabrice d.o.t. me)

@copyright: (C) 2011 Fabrice Bongartz
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the
Free Software Foundation; either version 3 of the License, or (at your option)
any later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
details. You should have received a copy of the GNU General Public License
along with this program; if not, see <http://www.gnu.org/licenses/>.
'''

import sys, os, os.path, time, email.utils, re, argparse, tarfile
from argparse import ArgumentTypeError

# An mbox message starts with a "From " separator line
re_from = re.compile("^From ")
re_date = re.compile("^Date: +")
# Strips "From ", an optional "user@" part and the sender token,
# so that only the date remains
re_strip_from = re.compile(r"From ([^@]+@)*(\(.+\)|\S+) *")

def get_date_from_fromline(from_line):
    return re_strip_from.sub("", from_line)

def get_date_from_dateline(date_line):
    return re_date.sub("", date_line)

def finish_message(cur_file, cur_path, change_mtime, prefer_dateheader,
                   date_from, date_date):
    # Close the finished .eml file and optionally set its atime/mtime
    if (cur_file.closed == False): cur_file.close()
    if (change_mtime):
        if (prefer_dateheader and date_date != None): date = date_date
        else: date = date_from
        # parsedate() returns None for a date it can't parse; in that case
        # leave the file's timestamps alone instead of crashing in mktime()
        if (date != None):
            mtime = time.mktime(date)
            os.utime(cur_path, (mtime, mtime))

def mbox2eml(mbox_file, dest_dir, change_mtime, prefer_dateheader):
    mbox_file.seek(0) # Jump back to the beginning of the file
    previous_was_newline = True
    nr = 0
    in_headers = False
    date_from = None # To store dates from the "From " header.
    date_date = None # To store dates from the "Date: " header.
    cur_path = None
    cur_file = None
    bof = True

    for line in mbox_file:
        if (previous_was_newline): # True for the first iteration
            if (re_from.match(line) != None):
                # We're at the beginning of a new message
                if (bof == False): # False only for the first message
                    # Finish the last message
                    finish_message(cur_file, cur_path, change_mtime,
                                   prefer_dateheader, date_from, date_date)

                # Prepare for a new message
                cur_path = os.path.join(dest_dir, str(nr) + ".eml")
                cur_file = open(cur_path, "w")
                date_from = email.utils.parsedate(get_date_from_fromline(line))
                date_date = None
                in_headers = True
                bof = False
                nr += 1
            elif (in_headers): in_headers = False # First line after the headers
        if (prefer_dateheader and in_headers and re_date.match(line) != None):
            date_date = email.utils.parsedate(get_date_from_dateline(line))

        # Write the current line
        cur_file.write(line)

        # Determine if we're at a newline
        if line.replace("\r\n", "\n") == "\n": previous_was_newline = True
        else: previous_was_newline = False

    # Treat the last remaining message
    if (bof == False): # bof is still True here only if the file was empty
        finish_message(cur_file, cur_path, change_mtime,
                       prefer_dateheader, date_from, date_date)

    return True

def is_mbox_file(f):
    """
    Determine if the given file object is an mbox file. This simply checks
    if the file's first line has the string "From " at its start.
    """
    if (re_from.match(f.readline()) != None):
        return True
    return False

def recurse_mbox(mbox_start_dir, tmp_dir, change_mtime = True,
                 prefer_dateheader = True):

    for root, dirs, files in os.walk(mbox_start_dir, True, None, False):
        for f_str in files:
            # Skip Thunderbird's .msf index files
            if (not f_str.endswith(".msf")):
                f = open(os.path.join(root, f_str), "r")
                if (is_mbox_file(f)):
                    print("Treating mbox " + os.path.join(root, f_str))
                    rel_path = os.path.relpath(root, mbox_start_dir)
                    if (rel_path == "."): rel_path = ""
                    # Subfolders live in "<name>.sbd" directories; drop the suffix
                    dest_dir = os.path.join(tmp_dir, rel_path, f_str).replace(".sbd", "")
                    if (os.path.isdir(dest_dir) == False): os.makedirs(dest_dir)
                    mbox2eml(f, dest_dir, change_mtime, prefer_dateheader)
                f.close()

def create_tgz(dir, tgz_path):
    tgz = tarfile.open(tgz_path, "w:gz")
    for root, dirnames, filenames in os.walk(dir):
        for f in filenames:
            filepath = os.path.join(root, f)
            relpath = os.path.relpath(filepath, dir)
            print("Adding to targz: " + filepath)
            tgz.add(filepath, relpath)
    tgz.close()

def initialize_options():
    ap = argparse.ArgumentParser()
    ap.add_argument("-s", "--mbox-start-dir", required = True,
                    dest = "mbox_start_dir", help = "A source directory "
                    + "that contains a tree of mbox files.")
    ap.add_argument("-d", "--destination", required = True,
                    dest = "dest_dir", help = "Directory where the "
                    + "eml-file tree should be created.")
    ap.add_argument("-z", "--tgz", dest = "tgz", help = "Create a gzipped "
                    + "tar archive that contains the directory tree "
                    + "specified with -d/--destination at the given "
                    + "path. This is optional. Note that this might be "
                    + "faster using an optimized commandline tool like "
                    + "GNU tar.")
    ap.add_argument("-m", "--dont-change-mtimes", dest = "dont_change_mtimes",
                    action = "store_true", default = False, help = "By "
                    + "default, the mtime and atime of the created eml "
                    + "files will be changed to a date found in each email "
                    + "header. This option disables changing mtime/atime.")
    ap.add_argument("-i", "--ignore-date-header", default = False,
                    action = "store_true", dest = "ignore_date_header",
                    help = "By default, and if -m/--dont-change-mtimes "
                    + "wasn't specified, in order to change the "
                    + "atime+mtime of each eml file, the program will look "
                    + "for a Date: line in the email headers. If no Date: "
                    + "line was found, the date from the \"From \" line at "
                    + "the beginning of the message will be used. This "
                    + "option disables looking for the Date: line so that "
                    + "the \"From \" line will always be used.")
    return ap.parse_args()

if __name__ == '__main__':
    opts = initialize_options()

    # Check args
    if (not os.path.isdir(opts.mbox_start_dir)):
        raise ArgumentTypeError("The given mbox start dir is not a directory")
    if (not os.path.isdir(opts.dest_dir)):
        raise ArgumentTypeError("The given destination dir is not a directory")

    # Walk the mboxes and create the destination eml file structure
    recurse_mbox(os.path.abspath(opts.mbox_start_dir),
                 os.path.abspath(opts.dest_dir),
                 (opts.dont_change_mtimes == False),
                 (opts.ignore_date_header == False))

    # Optionally create a gzipped tar archive
    if (opts.tgz): create_tgz(opts.dest_dir, opts.tgz)
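
To make the date handling in the script easier to follow, here is a minimal standalone sketch (written for Python 3) of how a timestamp can be pulled out of an mbox "From " separator line. The sample lines and the helper name fromline_to_struct are made up for illustration, and the regular expression is a reading of the script's re_strip_from pattern, not a verified copy of it.

```python
import email.utils
import re

# Same idea as re_strip_from in the script: drop "From ", an optional
# "user@" part and the following sender token, leaving only the date text.
RE_STRIP_FROM = re.compile(r"From ([^@]+@)*(\(.+\)|\S+) *")


def fromline_to_struct(from_line):
    """Return a time tuple for an mbox "From " line, or None if unparseable."""
    date_text = RE_STRIP_FROM.sub("", from_line, count=1)
    return email.utils.parsedate(date_text)


if __name__ == "__main__":
    # Hypothetical separator lines: Thunderbird style and classic mbox style.
    samples = (
        "From - Wed Nov 02 10:15:00 2011\n",
        "From user@example.com Wed Nov 02 10:15:00 2011\n",
    )
    for sample in samples:
        print(fromline_to_struct(sample)[:6])
```

The tuple returned by email.utils.parsedate can be passed to time.mktime, which is what the script does before calling os.utime. A line whose date cannot be parsed yields None, so callers should check for that.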


Save the script above somewhere as mbox2zeml.py. For example: /home/user/migrate/mbox2zeml.py
To get commandline help, execute the script with the -h option like this:

python mbox2zeml.py -h
So, in order to migrate Thunderbird's local folders to a Zimbra Desktop installation, follow these steps:



1. Make sure you read the script's commandline help. Ideally also read the source code so you understand what the script does.

2. Make sure you have Python 2 installed. There should be no other dependencies. (Tested with Python 2.7 on Arch Linux.)

3. Copy the local folders from Thunderbird to the machine where you'll be executing the script. If you're using the script on the PC that holds the Thunderbird profile, you may skip this step.

4. Create a destination directory. For example: /home/user/migrate/dest. Make sure the user that will execute the script has write privileges in that directory.

5. Launch the script. Example:

python /home/user/migrate/mbox2zeml.py -s thunderbird_dir -d /home/user/migrate/dest -z /home/user/migrate/importme.tgz

The directory tree with eml files will be created in the dest directory. After that, dest's contents will be tar/gzipped to importme.tgz.

6. Open Zimbra Desktop, go to Preferences -> Local Folders -> Import / Export and import the tgz file.
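
Before importing, it can help to check that the archive really contains the expected folder tree. The following is a minimal Python 3 sketch, not part of the original script; the function name summarize_tgz is hypothetical. It counts the .eml files per folder inside a gzipped tar archive.

```python
import tarfile


def summarize_tgz(tgz_path):
    """Return a dict mapping each folder in the archive to its .eml count."""
    counts = {}
    with tarfile.open(tgz_path, "r:gz") as tgz:
        for member in tgz.getmembers():
            if member.isfile() and member.name.lower().endswith(".eml"):
                # Tar member names always use "/" as the separator.
                folder = member.name.rpartition("/")[0] or "."
                counts[folder] = counts.get(folder, 0) + 1
    return counts


# Example call (the path is the one used in the steps above):
# for folder, count in sorted(summarize_tgz("/home/user/migrate/importme.tgz").items()):
#     print(count, folder)
```

If a Thunderbird folder is missing from the result, the corresponding mbox file was probably skipped because its first line did not start with "From ".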


Any improvements to the script are welcome. For example, I didn't implement any error handling or logging. There could be more comments and so on.
Edit: I noticed that Zimbra Desktop doesn't seem to like tgz files over 2 GB in size on Windows (can someone confirm this?), so I had to split the local folders for some big accounts.
I hope this can be useful for someone.
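
For the large accounts mentioned in the edit, the splitting could be automated. Below is a rough Python 3 sketch, under several assumptions: the 2 GB limit is the unconfirmed observation from this post, the uncompressed size is used as a conservative stand-in for the archive size, the split happens only between top-level entries of the destination directory, and the names dir_size, plan_batches and write_archives are hypothetical.

```python
import os
import tarfile


def dir_size(path):
    """Total size in bytes of all files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def plan_batches(sizes, limit):
    """Greedily group (name, size) pairs so that each batch stays within limit.

    An item that is larger than limit on its own still gets its own batch.
    """
    batches, current, current_size = [], [], 0
    for name, size in sizes:
        if current and current_size + size > limit:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches


def write_archives(dest_dir, out_prefix, limit=2 * 1024 ** 3):
    """Write out_prefix-1.tgz, out_prefix-2.tgz, ... and return their paths."""
    sizes = []
    for entry in sorted(os.listdir(dest_dir)):
        path = os.path.join(dest_dir, entry)
        size = dir_size(path) if os.path.isdir(path) else os.path.getsize(path)
        sizes.append((entry, size))
    paths = []
    for number, batch in enumerate(plan_batches(sizes, limit), start=1):
        tgz_path = "%s-%d.tgz" % (out_prefix, number)
        with tarfile.open(tgz_path, "w:gz") as tgz:
            for name in batch:
                # Directories are added recursively, keeping relative paths.
                tgz.add(os.path.join(dest_dir, name), arcname=name)
        paths.append(tgz_path)
    return paths
```

Each resulting archive would then be imported separately. A single folder that exceeds the limit by itself is not split further by this sketch and would still need manual handling.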
Service
Elite member
Posts: 1023
Joined: Tue Apr 14, 2009 2:44 pm


Post by Service »

Hello fbongartz,
THANKS A LOT for this script!!!!

In the past, I switched from Outlook to Thunderbird easily. But today, I have to go back to Outlook. For the past couple of days I have searched for and tested a lot of tools that do not work.

I used your script to convert my mbox files to eml while keeping the whole structure (folders/subfolders).

I appreciate your work, and I can be sure it has no virus, unlike when testing .exe files...

Now that I have the eml structure, I just have to do the rest by importing the eml files into Outlook.

Thanks again.
fbongartz
Posts: 15
Joined: Sat Sep 13, 2014 1:21 am


Post by fbongartz »

[quote]hello fbongartz,

THANKS A LOT for this script !!!!
[/quote]
You're welcome. I'm glad you could use it for something :-)
ggrussenmeyer
Posts: 2
Joined: Sat Sep 13, 2014 2:58 am


Post by ggrussenmeyer »

Thanks for this efficient script!
For future users, it is important to note that the compressed file's extension must be ".tgz" for Zimbra Desktop to accept it as a valid import file.
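
Going by this report, a small guard on the archive name may save a failed import. This is a minimal Python 3 sketch; the helper name with_tgz_extension is hypothetical, and the claim that only ".tgz" is accepted comes from the post above, not from Zimbra documentation.

```python
def with_tgz_extension(path):
    """Return path with a ".tgz" ending, converting ".tar.gz" if present."""
    lower = path.lower()
    if lower.endswith(".tgz"):
        return path
    if lower.endswith(".tar.gz"):
        return path[:-len(".tar.gz")] + ".tgz"
    return path + ".tgz"
```

The result could be used as the value passed to the script's tgz option, or to rename an archive that was created with another tool.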