zm_build FOSS Upgrade from 8.8.15_P43 to 10.0.5 on Rocky Linux 8 fails

liverpoolfcfan
Elite member
Posts: 1225
Joined: Sat Sep 13, 2014 12:47 am

Post by liverpoolfcfan »

I followed the zm_build instructions to build 10.0.5 on a Rocky Linux 8 development system using

Code: Select all

ENV_CACHE_CLEAR_FLAG=true ./build.pl --ant-options -DskipTests=true --git-default-tag=10.0.5,10.0.4,10.0.3,10.0.2,10.0.1,10.0.0-GA,10.0.0 --build-release-no=10.0.5 --build-type=FOSS --build-release=DAFFODIL --build-release-candidate=GA --build-thirdparty-server=files.zimbra.com --build-no=1234 --no-interactive
The build completed.

On my test installation, I tried upgrading 8.8.15_P43 using the installer. All went fine until it got to updating MySQL:

Code: Select all

Restoring existing configuration file from /opt/zimbra/.saveconfig/localconfig.xml...done
Operations logged to /tmp/zmsetup.20231023-170834.log
Adding /opt/zimbra/conf/ca/ca.pem to cacerts
Upgrading from 8.8.15_GA_4362 to 10.0.5_GA_1234
Stopping zimbra services...done.
This appears to be 8.8.15_GA
Starting mysql...done.
Checking ldap status...not running.
Checking ldap status...not running.
Starting ldap...done.
Running mysql_upgrade...done.
Schema upgrade required from version 111 to 118.
Running /opt/zimbra/libexec/scripts/migrate20210506-BriefcaseApi.pl
Mon Oct 23 17:09:43 2023: Verified schema version 111.
Mon Oct 23 17:13:25 2023: Verified schema version 111.
Mon Oct 23 17:13:25 2023: Updating DB schema version from 111 to 112.
Running /opt/zimbra/libexec/scripts/migrate20200625-MobileDevices.pl
Mon Oct 23 17:13:34 2023: Verified schema version 112.
Mon Oct 23 17:13:34 2023: Adding mobile_operator column to ZIMBRA.MOBILE_DEVICES table.
ERROR 1050 (42S01) at line 1: Table 'zimbra/#sql-ib276659-1384999336' already exists
Mon Oct 23 17:13:35 2023: Error while running '/opt/zimbra/bin/mysql --user=zimbra --password=REMOVED --database=zimbra --batch --skip-column-names'.
Script failed with code 256:  - exiting
UPGRADE FAILED - exiting.

Have I done something incorrectly in the build process?

How can I fix this?

[EDIT] Updated title to include "Rocky Linux 8"
Last edited by liverpoolfcfan on Thu Oct 26, 2023 10:47 am, edited 1 time in total.
liverpoolfcfan
Elite member
Posts: 1225
Joined: Sat Sep 13, 2014 12:47 am

Re: zm_build FOSS Upgrade from 8.8.15_P43 to 10.0.5 fails

Post by liverpoolfcfan »

Replying to my own thread to share what I learned with others.

It appears that this particular issue may have been caused by a temp table not being cleared quickly enough in MySQL on my test build. I restarted the whole upgrade process a couple of times and ran into the same error each time.

Finally, I decided to see what would happen if I simply re-ran the zcs-10... installer a second time without first reverting to Zimbra 8.8.15. This time the installer went through the basic re-install again, recognised that the schema had already been updated to 112, and continued from there. The temp table had evidently been cleared by then, so it did not cause a problem.

The installation completed up to the point where it could not re-install my ECDSA commercial certificate, due to the bug in zmcertmgr that prevents it from handling non-RSA certs. This is a known issue, and I hope it will be fixed by the time 10.0.6 is released. See this thread for details: viewtopic.php?t=72284

I applied the patch to zmcertmgr and re-deployed my commercial certificate. All went well.

However, upon starting the services, mailbox and the related webapps failed to start.

Looking in /opt/zimbra/log/zmmailboxd.out I found the problem

Code: Select all

2023-10-25 17:16:29.432:INFO::main: Logging initialized @5114ms to org.eclipse.jetty.util.log.StdErrLog
2023-10-25 17:16:30.419:WARN:oejx.XmlParser:main: FATAL@null line:989 col:13 : org.xml.sax.SAXParseException; lineNumber: 989; columnNumber: 13; The string "--" is not permitted within comments.
2023-10-25 17:16:30.420:WARN:oejx.XmlConfiguration:main: 
java.security.PrivilegedActionException: org.xml.sax.SAXParseException; lineNumber: 989; columnNumber: 13; The string "--" is not permitted within comments.
	at java.base/java.security.AccessController.doPrivileged(AccessController.java:573)
	at org.eclipse.jetty.xml.XmlConfiguration.main(XmlConfiguration.java:1857)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.eclipse.jetty.start.Main.invokeMain(Main.java:218)
	at org.eclipse.jetty.start.Main.start(Main.java:491)
	at org.eclipse.jetty.start.Main.main(Main.java:77)
Caused by: 
org.xml.sax.SAXParseException; lineNumber: 989; columnNumber: 13; The string "--" is not permitted within comments.
It doesn't specifically call out the offending XML file, but I started with what I thought would be the obvious one: /opt/zimbra/jetty_base/etc/jetty.xml

Sure enough, jetty.xml has "--" on line 989 at column 13.
Line 989:

Code: Select all

        <!-- Modern UI uses build time compression -->
The problem arises because this comment sits inside a larger comment block, the one that comments out the HTTPCOMPRESSION section:

Code: Select all

    
    <!-- HTTPCOMPRESSIONBEGIN 
    <Get id="next" name="handler" />
    <Set name="handler">
      <New id="GzipHandler" class="org.eclipse.jetty.server.handler.gzip.GzipHandler">
        <Set name="handler"><Ref refid="next" /></Set>
        <Set name="minGzipSize"><Property name="jetty.gzip.minGzipSize" deprecated="gzip.minGzipSize" default="2048"/></Set>
        <Set name="checkGzExists"><Property name="jetty.gzip.checkGzExists" deprecated="gzip.checkGzExists" default="false"/></Set>
        <Set name="compressionLevel"><Property name="jetty.gzip.compressionLevel" deprecated="gzip.compressionLevel" default="-1"/></Set>
        <Set name="excludedAgentPatterns">
          <Array type="String">
            <Item><Property name="jetty.gzip.excludedUserAgent" deprecated="gzip.excludedUserAgent" default=".*MSIE.6\.0.*"/></Item>
          </Array>
        </Set>

        <Set name="includedMethods">
          <Array type="String">
            <Item>GET</Item>
            <Item>POST</Item>
          </Array>
        </Set>

        <!-- Modern UI uses build time compression -->
        <Set name="excludedPaths">
          <Array type="String">
            <Item>/modern/*</Item>
          </Array>
        </Set>
      </New>
    </Set>
     HTTPCOMPRESSIONEND -->
    
The presence of comment marks nested inside a comment block breaks the XML parser rules. See the screenshot of the code editor: the nested comment ends the larger comment block early.
[Screenshot: ErrantCommentAtLine989ofJettyXML.png]

Code: Select all

XML Comments Rules
Following rules should be followed for XML comments −

Comments cannot appear before XML declaration.
Comments may appear anywhere in a document.
Comments must not appear within attribute values.
Comments cannot be nested inside the other comments.
Deleting the nested comment from line 989 and saving the file allowed mailbox and the webapps to start.
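
To make the rule above concrete, here is a minimal Python sketch that scans a file for "--" appearing inside a comment body, which is what the parser objected to. The function name is hypothetical and this is not part of Zimbra or Jetty; it only illustrates the check and could be pointed at jetty.xml.

```python
import sys


def find_illegal_double_hyphens(xml_text):
    """Return 1-based (line, column) pairs where "--" occurs inside a comment.

    Inside an XML comment, "--" is only allowed as part of the closing "-->".
    A nested "<!--" therefore counts as a violation at its first hyphen.
    """
    findings = []
    pos = 0
    while True:
        start = xml_text.find("<!--", pos)
        if start == -1:
            return findings
        cursor = start + 4  # first character of the comment body
        while True:
            dash = xml_text.find("--", cursor)
            if dash == -1:
                # Unterminated comment: nothing more to scan.
                return findings
            if xml_text.startswith("-->", dash):
                pos = dash + 3  # legal end of this comment
                break
            line = xml_text.count("\n", 0, dash) + 1
            column = dash - (xml_text.rfind("\n", 0, dash) + 1) + 1
            findings.append((line, column))
            cursor = dash + 2


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python check_comments.py /opt/zimbra/jetty_base/etc/jetty.xml
    with open(sys.argv[1], encoding="utf-8") as handle:
        for line, column in find_illegal_double_hyphens(handle.read()):
            print(f"line {line}, column {column}: '--' inside a comment")
```

The column reported here is that of the first hyphen, so it may differ slightly from the column a parser prints for the same line.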

It looks like this jetty.xml fix will not survive a restart though - I assume the file is getting exploded from a WAR file periodically. This needs Synacor to fix it.

In the meantime, if anyone can tell me where the XML file comes from it would be appreciated. UPDATE: It looks like zmconfigd periodically re-writes jetty.xml from jetty.xml.in in the same directory, with local configuration values added. Deleting line 976 of jetty.xml.in allows the fix to persist (until the next build). It still needs a Synacor fix.

Other than these two small issues (zmcertmgr and jetty.xml comment) the whole process went quite well.

As noted elsewhere in the forum (perhaps Ian's thread), you need at least 4GB of RAM in your build system to build zcs-10... If you try with less, the build will randomly stop, and in my experience it is an all-or-nothing build: it either runs all the way through from start to finish in one go, or it fails terminally. In my environment I could not restart a failed build and have it complete; I had to delete the build folder and start the process again.
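
Since a failed build cannot be resumed, it may be worth checking memory before starting one. Below is a minimal Python sketch that reads MemTotal from /proc/meminfo on Linux. The function names and the 3.5 GiB default threshold are assumptions of mine (a machine sold as 4GB usually reports somewhat less than 4 GiB in MemTotal), not anything zm_build checks itself.

```python
def mem_total_gib(meminfo_text):
    """Extract MemTotal from /proc/meminfo content and convert it to GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])  # the value is reported in kB (KiB)
            return kib / (1024 * 1024)
    raise ValueError("MemTotal not found")


def has_enough_ram(meminfo_text, minimum_gib=3.5):
    """True if the machine reports at least minimum_gib of RAM.

    minimum_gib defaults to 3.5 as an assumed allowance for a nominal
    4GB machine; adjust it to taste.
    """
    return mem_total_gib(meminfo_text) >= minimum_gib


if __name__ == "__main__":
    with open("/proc/meminfo", encoding="ascii") as handle:
        text = handle.read()
    print(f"MemTotal: {mem_total_gib(text):.2f} GiB, enough: {has_enough_ram(text)}")
```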
liverpoolfcfan
Elite member
Posts: 1225
Joined: Sat Sep 13, 2014 12:47 am

Re: zm_build FOSS Upgrade from 8.8.15_P43 to 10.0.5 on Rocky Linux 8 fails

Post by liverpoolfcfan »

Additional observations:

Help links may not work - presumably they would work if I had created the build with a build number exactly matching that of a NETWORK build. I entered build number 1005 when building, and the resulting URL below shows utm_content=10.0.5_GA_1005, so whatever you enter will be used when looking for help. Clicking on my username in the top-right and selecting "Help Central Online" (should that be Centre?), I get the URL

Code: Select all

https://help.zimbra.com/?utm_source=mail&utm_medium=zcs&utm_content=10.0.5_GA_1005&utm_campaign=help
and a "page could not be found" error.
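
To see which version string a given help link carries, the utm_content parameter can be pulled apart with Python's standard library. This is only an illustrative sketch; the function name is hypothetical, and the version/release/build split is assumed from the single URL above.

```python
from urllib.parse import parse_qs, urlsplit


def help_link_release(url):
    """Split the utm_content value of a help URL into (version, release, build_no)."""
    params = parse_qs(urlsplit(url).query)
    version, release, build_no = params["utm_content"][0].split("_")
    return version, release, build_no


url = ("https://help.zimbra.com/?utm_source=mail&utm_medium=zcs"
       "&utm_content=10.0.5_GA_1005&utm_campaign=help")
print(help_link_release(url))
```

The last element is the build number that was passed at build time.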
liverpoolfcfan
Elite member
Posts: 1225
Joined: Sat Sep 13, 2014 12:47 am

Re: zm_build FOSS Upgrade from 8.8.15_P43 to 10.0.5 on Rocky Linux 8 fails

Post by liverpoolfcfan »

Service shutdown for the first webapp looks to be considerably slower on 10.0.5 than on 8.8.15.

On 8.8.15 Stopping zimlet webapp...took 145 second(s)...Done.
On 10.0.5 Stopping zimlet webapp...took 665 second(s)...Done.

This could be an issue for FOSS users who shut down nightly to perform a backup.

Does anybody have a suggestion as to why this would be?

Code: Select all

[zimbra@mail ~]$ zmcontrol stop
Host 8.8.15
        Stopping zmconfigd...took 1 second(s)...Done.
        Stopping zimlet webapp...took 145 second(s)...Done.
        Stopping zimbraAdmin webapp...took 0 second(s)...Done.
        Stopping zimbra webapp...took 0 second(s)...Done.
        Stopping service webapp...took 0 second(s)...Done.
        Stopping stats...took 1 second(s)...Done.
        Stopping mta...took 12 second(s)...Done.
        Stopping spell...took 0 second(s)...Done.
        Stopping snmp...took 1 second(s)...Done.
        Stopping cbpolicyd...took 0 second(s)...Done.
        Stopping archiving...took 0 second(s)...Done.
        Stopping opendkim...took 5 second(s)...Done.
        Stopping amavis...took 0 second(s)...Done.
        Stopping antivirus...took 3 second(s)...Done.
        Stopping antispam...took 3 second(s)...Done.
        Stopping proxy...took 2 second(s)...Done.
        Stopping memcached...took 1 second(s)...Done.
        Stopping mailbox...took 0 second(s)...Done.
        Stopping logger...took 2 second(s)...Done.
        Stopping dnscache...took 5 second(s)...Done.
        Stopping ldap...took 2 second(s)...Done.
[zimbra@mail ~]$ exit

Code: Select all

[zimbra@mail ~]$ zmcontrol stop
Host 10.0.5
        Stopping zmconfigd...took 2 second(s)...Done.
        Stopping zimlet webapp...took 665 second(s)...Done.
        Stopping zimbraAdmin webapp...took 0 second(s)...Done.
        Stopping zimbra webapp...took 0 second(s)...Done.
        Stopping service webapp...took 1 second(s)...Done.
        Stopping stats...took 1 second(s)...Done.
        Stopping mta...took 12 second(s)...Done.
        Stopping onlyoffice...took 0 second(s)...Done.
        Stopping spell...took 1 second(s)...Done.
        Stopping snmp...took 1 second(s)...Done.
        Stopping cbpolicyd...took 0 second(s)...Done.
        Stopping archiving...took 0 second(s)...Done.
        Stopping opendkim...took 5 second(s)...Done.
        Stopping amavis...took 0 second(s)...Done.
        Stopping antivirus...took 5 second(s)...Done.
        Stopping antispam...took 7 second(s)...Done.
        Stopping proxy...took 1 second(s)...Done.
        Stopping memcached...took 1 second(s)...Done.
        Stopping mailbox...took 1 second(s)...Done.
        Stopping logger...took 2 second(s)...Done.
        Stopping dnscache...took 5 second(s)...Done.
        Stopping ldap...took 1 second(s)...Done.
[zimbra@mail ~]$ exit
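
For anyone wanting to compare their own shutdown timings, here is a rough Python sketch that parses zmcontrol stop output in the format shown above and reports the per-service difference. The function names are hypothetical and the regular expression assumes the exact "took N second(s)" wording from these listings.

```python
import re

# Matches lines such as: "Stopping zimlet webapp...took 145 second(s)...Done."
STOP_LINE = re.compile(
    r"Stopping (?P<service>.+?)\.\.\.took (?P<seconds>\d+) second\(s\)\.\.\.Done\."
)


def parse_stop_times(output):
    """Map each service name to the number of seconds it took to stop."""
    return {m["service"]: int(m["seconds"]) for m in STOP_LINE.finditer(output)}


def compare_stop_times(old, new):
    """Seconds gained (positive) or saved (negative) for services present in both runs."""
    return {name: new[name] - old[name] for name in old if name in new}


old_run = parse_stop_times("""
        Stopping zmconfigd...took 1 second(s)...Done.
        Stopping zimlet webapp...took 145 second(s)...Done.
        Stopping mta...took 12 second(s)...Done.
""")
new_run = parse_stop_times("""
        Stopping zmconfigd...took 2 second(s)...Done.
        Stopping zimlet webapp...took 665 second(s)...Done.
        Stopping mta...took 12 second(s)...Done.
        Stopping onlyoffice...took 0 second(s)...Done.
""")
for name, delta in compare_stop_times(old_run, new_run).items():
    print(f"{name}: {delta:+d} s")
```

Services that exist in only one of the two runs (onlyoffice here) are left out of the comparison.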
zmcontrol
Advanced member
Posts: 70
Joined: Fri Jul 24, 2020 12:43 am

Re: zm_build FOSS Upgrade from 8.8.15_P43 to 10.0.5 on Rocky Linux 8 fails

Post by zmcontrol »

liverpoolfcfan wrote: Fri Oct 27, 2023 2:49 pm
Service shutdown for the first webapp looks to be considerably slower on 10.0.5 than 8.8.15

On 8.8.15 Stopping zimlet webapp...took 145 second(s)...Done.
On 10.0.5 Stopping zimlet webapp...took 665 second(s)...Done.

This could be an issue for FOSS users who shutdown nightly to perform a backup

Anybody have a suggestion as to why this would be?

liverpoolfcfan,

I'm not sure if it's the same with v10 but this has been ongoing for some v8.8 users.
While shutting down, zmstorectl uses a 600 second delay waiting for the number of dirty pages in MySQL to become 0.

viewtopic.php?p=285807#p285807
https://bugzilla.zimbra.com/show_bug.cgi?id=107669

I noticed this on a system that was upgraded from v8.0 to v8.8, where zmstorectl always hangs on shutdown.
I edited zmstorectl to use a 10 second delay instead, without any issues.
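
The behaviour described above is a poll-with-timeout loop. The Python sketch below shows the general shape of such a wait; it is not the actual zmstorectl code, and read_dirty_pages is a hypothetical callable standing in for however the count is obtained (MySQL exposes it as the Innodb_buffer_pool_pages_dirty status variable, which I assume is the counter meant here).

```python
import time


def wait_for_clean_pages(read_dirty_pages, timeout_s=600, poll_s=1, sleep=time.sleep):
    """Poll until the dirty page count reaches 0 or timeout_s has elapsed.

    read_dirty_pages: hypothetical callable returning the current count.
    Returns (flushed, seconds_waited).
    """
    waited = 0
    while True:
        if read_dirty_pages() == 0:
            return True, waited
        if waited >= timeout_s:
            return False, waited
        sleep(poll_s)
        waited += poll_s


# Simulated readings: the count drains to zero after two polls.
readings = iter([5, 3, 0])
print(wait_for_clean_pages(lambda: next(readings), sleep=lambda s: None))

# A count that never drains makes the loop run for the full timeout.
print(wait_for_clean_pages(lambda: 7, timeout_s=10, sleep=lambda s: None))
```

If the count never reaches zero, the loop always waits the full timeout, which is why shortening the delay shortens the shutdown.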