Solved: DANGER: 8.8.15p20 broke working machine openssl

JDunphy
Outstanding Member
Posts: 889
Joined: Fri Sep 12, 2014 11:18 pm
Location: Victoria, BC
ZCS/ZD Version: 9.0.0_P39 NETWORK Edition

Solved: DANGER: 8.8.15p20 broke working machine openssl

Post by JDunphy »

I have a fully patched RedHat 6 machine running 8.8.15 that was working flawlessly. I saw the critical update, and the comments from people who had it working, so I thought I would get a head start on the weekend and apply it on my home machine. After the yum update, I observed the following during the stop phase of zmcontrol restart:

Code: Select all

	Stopping proxy...Failed.
Stopping proxy...nginx: [emerg] SSL_CTX_new() failed (SSL: error:2406C06E:random number generator:RAND_DRBG_instantiate:error retrieving entropy error:2406B072:random number generator:RAND_DRBG_generate:in error state error:2406C06E:random number generator:RAND_DRBG_instantiate:error retrieving entropy error:2406C06E:random number generator:RAND_DRBG_instantiate:error retrieving entropy error:2406B072:random number generator:RAND_DRBG_generate:in error state error:2406C06E:random number generator:RAND_DRBG_instantiate:error retrieving entropy error:2406B072:random number generator:RAND_DRBG_generate:in error state error:2406C06E:random number generator:RAND_DRBG_instantiate:error retrieving entropy error:2406B072:random number generator:RAND_DRBG_generate:in error state error:2406C06E:random number generator:RAND_DRBG_instantiate:error retrieving entropy error:2406C06E:random number generator:RAND_DRBG_instantiate:error retrieving entropy error:2406B072:random number generator:RAND_DRBG_generate:in error state err:2406c06e:24:6c:6e err:240)
/opt/zimbra/bin/zmproxyctl: line 54:  9238 Segmentation fault      sudo /opt/zimbra/common/sbin/nginx -c /opt/zimbra/conf/nginx.conf -s stop
failed.

I then attempted another zmcontrol restart. The stop phase ran to completion for all the daemons, and then I got this:

Code: Select all

Host mail.example.com
	Starting ldap...Done.
Failed.
Failed to start slapd.  Attempting debug start to determine error.
main: TLS init def ctx failed: -1

I traced the certs and attempted to use openssl via zmcertmgr to re-push my certs, and saw the exact same message from that command:

Code: Select all

** Copying '/opt/zimbra/ssl/zimbra/commercial/commercial.crt' to '/opt/zimbra/conf/imapd.crt'
** Copying '/opt/zimbra/ssl/zimbra/commercial/commercial.key' to '/opt/zimbra/conf/imapd.key'
** Creating file '/opt/zimbra/ssl/zimbra/jetty.pkcs12'
ERROR: openssl pkcs12 export to '/opt/zimbra/ssl/zimbra/jetty.pkcs12' failed(1):
140076806330112:error:2406C06E:random number generator:RAND_DRBG_instantiate:error retrieving entropy:crypto/rand/drbg_lib.c:340:
140076806330112:error:2406C06E:random number generator:RAND_DRBG_instantiate:error retrieving entropy:crypto/rand/drbg_lib.c:340:
140076806330112:error:2406B072:random number generator:RAND_DRBG_generate:in error state:crypto/rand/drbg_lib.c:593:
140076806330112:error:2406C06E:random number generator:RAND_DRBG_instantiate:error retrieving entropy:crypto/rand/drbg_lib.c:340:
140076806330112:error:2406C06E:random number generator:RAND_DRBG_instantiate:error retrieving entropy:crypto/rand/drbg_lib.c:340:
140076806330112:error:2406B072:random number generator:RAND_DRBG_generate:in error state:crypto/rand/drbg_lib.c:593:
140076806330112:error:2406C06E:random number generator:RAND_DRBG_instantiate:error retrieving entropy:crypto/rand/drbg_lib.c:340:
140076806330112:error:2406B072:random number generator:RAND_DRBG_generate:in error state:crypto/rand/drbg_lib.c:593:
140076806330112:error:23073041:PKCS12 routines:PKCS12_pack_p7encdata:malloc failure:crypto/pkcs12/p12_add.c:110:
[Wed Mar 31 18:26:25 PDT 2021] Error deploy for domain:mail.example.com
[Wed Mar 31 18:26:25 PDT 2021] Deploy error.

I verified permissions again and they all look good. My current guess, given that message about RAND_DRBG, is that they have a version of openssl that won't work with the kernel we are running (5.10.13-x86_64-linode). I opened a ticket with support, but my email address is down, so that isn't going to fly.
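
For anyone reading these errors, the hex code at the front of each message can be decoded by hand. This is a minimal sketch, assuming the OpenSSL 1.1.1 error packing (library in the top 8 bits, then 12 bits of function, then 12 bits of reason). The helper name decode_openssl_err is made up for illustration and is not part of any tool.

```shell
#!/bin/sh
# Decode a packed OpenSSL 1.1.1 error code (8 hex digits, no 0x prefix).
# Assumed layout: library = top 8 bits, function = next 12, reason = low 12.
# decode_openssl_err is a hypothetical helper name, not an openssl command.
decode_openssl_err() {
    code=$(( 0x$1 ))
    lib=$(( (code >> 24) & 0xFF ))
    func=$(( (code >> 12) & 0xFFF ))
    reason=$(( code & 0xFFF ))
    printf 'lib=%x func=%x reason=%x\n' "$lib" "$func" "$reason"
}

# The two codes that repeat in the output above.
decode_openssl_err 2406C06E
decode_openssl_err 2406B072
```

Library 0x24 is the RAND library, and the first code splits into 24, 6c and 6e, which matches the 24:6c:6e at the tail of the nginx message.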

Any ideas? I am tempted to install -s to get something back up and running but wanted to check if anyone else is seeing this. My concern is that a later patch from 8.8.15 will have changed some database tables so it isn't possible to actually do that.

Is there any way to update only up to an earlier patch, or do I have to go full patch 20 or nothing with this?

Thanks for any help or ideas

Jim
Last edited by JDunphy on Thu Apr 01, 2021 4:41 pm, edited 1 time in total.
vavai
Advanced member
Posts: 174
Joined: Thu Nov 14, 2013 2:41 pm
Location: Indonesia
ZCS/ZD Version: 0

Re: DANGER: 8.8.15p20 broke working machine openssl

Post by vavai »

I had a similar experience with Zimbra 8.8.15 patch 20 on Ubuntu 14.04 64-bit. It took hours with no solution. I attempted to revert to the previous patch, but the packages depend on one another, and downgrading the zimbra-openssl package leads to the removal of other packages.

As I have a backup system and also a cloned backup, I finally solved it by doing a release upgrade from 14.04 to 16.04.

It seems that Ubuntu 14.04, CentOS 6 and/or RHEL 6 will have a similar problem, as they ship with similar openssl versions.

As all of the above OS versions will soon be EOL for Zimbra, I decided on the dist upgrade, but be sure to have a full backup to guard against any unwanted result.

Vavai

Re: DANGER: 8.8.15p20 broke working machine openssl

Post by JDunphy »

I also punted for now, but I took a snapshot of the broken zimbra instance and created a new VPS from that broken image to continue debugging on.
I then restored our production box to its previous working state from before the patch... so the pressure is off, and I can get my emails and work with support on this.

Thanks for your comment, as that is good information to know.

Jim
zimico
Outstanding Member
Posts: 225
Joined: Mon Nov 14, 2016 8:03 am
Location: Vietnam
ZCS/ZD Version: 8.8.15 P3

Re: DANGER: 8.8.15p20 broke working machine openssl

Post by zimico »

Hi Jim and everyone,
How are you?
We were planning to carry out this update this weekend, but we should postpone it a bit :)
BTW, our system is quite large (3TB mailstore servers). We have NGBackup. Normally we take a VMware snapshot before doing any update. However, the data on the VM is now quite big, and creating/removing a snapshot takes a very long time. I see that you use install -s to do a backup, and I am not very clear about this. My question is: what is the best way to back up/restore when an update has problems?

My warmest regards,
Minh.
axslingr
Outstanding Member
Posts: 256
Joined: Sat Sep 13, 2014 2:20 am
ZCS/ZD Version: 8.8.15.GA.3869.UBUNTU18.64 UBUNTU18

Re: DANGER: 8.8.15p20 broke working machine openssl

Post by axslingr »

Does disabling TLS for LDAP allow it to start?

Code: Select all

su - zimbra 
zmlocalconfig -e ssl_allow_untrusted_certs=true 
zmlocalconfig -e ldap_starttls_supported=0
zmlocalconfig -e ldap_starttls_required=false
zmlocalconfig -e ldap_common_require_tls=0
zmcontrol restart

Re: DANGER: 8.8.15p20 broke working machine openssl

Post by JDunphy »

This is resolved... the problem was the latest kernel. With kernels 4.8 and later, this openssl fails to retrieve entropy, according to the change notes and various others discussing the problem.
OpenSSL is full of surprises...

CHANGES says:
Linux kernels 4.8 and later, don't have a reliable way to detect
that /dev/urandom has been properly seeded, so a failure is raised
for this case (i.e. the getentropy(2) call has already failed).
Observe:

Code: Select all

[zimbra@mail ~]$  cat /proc/sys/kernel/random/entropy_avail 
425
[zimbra@mail ~]$ uname -r
4.7.3-x86_64-linode73

That older kernel resolved everything, and 8.8.15P20 has no issue. I am sure that when Zimbra tested this patch, they ran it against the 2.6+ kernel from RedHat, probably didn't see an issue, and called it a day. Next, I am going to investigate whether a simple recompile with a different seed option can also resolve this. Note: Patch 20 includes openssl-libs-1.1.1h-1, but I believe 1.1.1k fixed the recent DoS bug against it, so we will probably be right back here next month.
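
If you want to check a box before patching, here is a rough sketch of the same comparison in script form. It only looks at the major.minor part of uname -r and treats 4.8 as the cutoff purely because that is what the CHANGES text above says. The function name kernel_at_least is made up for illustration.

```shell
#!/bin/sh
# Return 0 (true) when kernel release $1 is at least version $2.$3.
# kernel_at_least is a hypothetical helper; it parses strings like
# "4.7.3-x86_64-linode73" and compares only major and minor.
kernel_at_least() {
    release=$1
    major=${release%%.*}        # "4.7.3-x86_64-linode73" -> "4"
    rest=${release#*.}          # -> "7.3-x86_64-linode73"
    minor=${rest%%[!0-9]*}      # -> "7"
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

if kernel_at_least "$(uname -r)" 4 8; then
    echo "kernel $(uname -r): 4.8 or later, the range discussed in this thread"
else
    echo "kernel $(uname -r): older than 4.8"
fi

# Same file as in the observation above; it only exists on Linux.
if [ -r /proc/sys/kernel/random/entropy_avail ]; then
    echo "entropy_avail: $(cat /proc/sys/kernel/random/entropy_avail)"
fi
```

A "4.8 or later" result does not prove the patch will fail on your machine. It only tells you that you are in the range where I hit the problem, so test on a snapshot first.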

This platform is end of life for us, so we will be moving to a more modern version once I can figure out what that will be, now that CentOS 8 was abruptly pulled and relaunched as Stream.

Jim

Re: DANGER: 8.8.15p20 broke working machine openssl

Post by JDunphy »

axslingr wrote: Does disabling TLS for LDAP allow it to start?

Code: Select all

su - zimbra 
zmlocalconfig -e ssl_allow_untrusted_certs=true 
zmlocalconfig -e ldap_starttls_supported=0
zmlocalconfig -e ldap_starttls_required=false
zmlocalconfig -e ldap_common_require_tls=0
zmcontrol restart

For this bug, it would not work: the library itself was causing the abort, and you need a functioning library to generate and install certificates (because of verification), etc.

Re: DANGER: 8.8.15p20 broke working machine openssl

Post by JDunphy »

zimico wrote: Hi Jim and everyone,
How are you?
We were planning to carry out this update this weekend, but we should postpone it a bit :)
BTW, our system is quite large (3TB mailstore servers). We have NGBackup. Normally we take a VMware snapshot before doing any update. However, the data on the VM is now quite big, and creating/removing a snapshot takes a very long time. I see that you use install -s to do a backup, and I am not very clear about this. My question is: what is the best way to back up/restore when an update has problems?

My warmest regards,
Minh.

Hi Minh,

The install -s has no effect, as they are pulling the latest from the repositories. Unless your kernel is really new, like the one we were running, I suspect you will probably not have an issue. As a precaution, don't do what I did :-) ;-) ... take a snapshot first if you can, then test on that snapshot. I was so sure this was a non-issue that I did it in reverse.

I'll update my ticket with Zimbra now that I have one resolution.

Jim

Re: Solved: DANGER: 8.8.15p20 broke working machine openssl

Post by JDunphy »

I found another solution to this problem that works with kernel 4.8+. Here is the recipe to recompile openssl with the following options:

Code: Select all

% mkdir openssl
% cd openssl
% wget https://www.openssl.org/source/old/1.1.1/openssl-1.1.1h.tar.gz
% tar zxvf openssl-1.1.1h.tar.gz
% cd openssl-1.1.1h
% ./config --prefix=/opt/zimbra/common --with-rand-seed=devrandom,rdcpu,os,getrandom
% make
% su -
# chown zimbra /opt/zimbra/common/lib
# chown zimbra:zimbra /opt/zimbra/common/lib/libcryp*
# su - zimbra
% cd wherever-you-put-the-compiled-openssl-1.1.1h
% make install
% openssl version
% exit
# chown root /opt/zimbra/common/lib
# chown root:root /opt/zimbra/common/lib/libcryp*

This isn't extensively tested, but it appears to work, provided you first attempt patch20 and it fails, since we are building on that work.

We are then overlaying the binaries and libraries from patch20 with the same version compiled ourselves, with getrandom added to the seed sources, which avoids the need to look in the kernel for the function. I don't know how Zimbra compiled this, but it appears they may not have added a list like --with-rand-seed to cover more cases.

Please excuse the chown: I wanted to leave as much as possible installed as zimbra, and this was the fastest way to see what needed root vs. zimbra ownership. There were only 2 files and a directory, which we change back after the install. The make install will abort when it comes time to install the header files (for developers), which we don't need, so that is fine. You can run make -n install first if you want to see what it would do without actually installing.

This method should allow openssl to work across kernels, as I also tested it with a 4.7 kernel. It seems to work perfectly with both newer and older kernels.
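
To confirm the rebuilt binary can actually seed its RNG, a quick smoke test along these lines may help. It is only a sketch: check_openssl_rng is a made-up helper name, and the default path is an assumption based on the --prefix=/opt/zimbra/common used in the recipe (set OPENSSL_BIN to point somewhere else).

```shell
#!/bin/sh
# Smoke-test the random number generator of a given openssl binary.
# check_openssl_rng is a hypothetical helper; usage:
#   check_openssl_rng /path/to/openssl
check_openssl_rng() {
    bin=$1
    if [ ! -x "$bin" ]; then
        echo "not executable: $bin" >&2
        return 2
    fi
    "$bin" version || return 1
    # 'openssl rand' has to seed the DRBG before it can return bytes, so a
    # binary with the fault from this thread is expected to fail here.
    if out=$("$bin" rand -hex 16 2>&1); then
        echo "RNG ok: $out"
    else
        echo "RNG FAILED: $out" >&2
        return 1
    fi
}

# Assumed install location, derived from the --prefix in the recipe.
check_openssl_rng "${OPENSSL_BIN:-/opt/zimbra/common/bin/openssl}" || echo "check failed"
```

Running it once before and once after the make install, on both the old and new kernel, gives a simple before/after comparison without having to restart Zimbra each time.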

Jim
jhurley
Zimbra Employee
Posts: 34
Joined: Wed Apr 27, 2016 7:04 pm

Re: Solved: DANGER: 8.8.15p20 broke working machine openssl

Post by jhurley »

Zimbra Support, working with Zimbra QE, verified that there was an issue with kernel versions 4.8 and 4.9.
Zimbra Development addressed this and updated the repository.
If anyone runs into this issue, please let us know the kernel version for further investigation.

Sincerely
Zimbra Support