Operational Defect Database

BugZero updated this defect 52 days ago.

VMware | 94664

NSX Manager/NSX Global-Manager UI Not Accessible After Replacing CBM_* Certificates in 4.1.1

Last update date:

3/28/2024

Affected products:

NSX-T

Affected releases:

No affected releases provided.

Fixed releases:

No fixed releases provided.

Description:

Symptoms

Certificate Name and Service Type Mapping

NSX Manager Certificate Documentation For Reference: https://docs.vmware.com/en/VMware-NSX/4.1/administration/GUID-3DD19193-770C-47F3-A0F3-7B7703F274C8.html

Certificate Name                    Service Type
API-Corfu Client                    service_type=CBM_API
AR-Corfu Client                     service_type=CBM_AR
CCP-Corfu Client                    service_type=CBM_CCP
Cluster Manager-Corfu               service_type=CBM_CLUSTER_MANAGER
CM Inventory-Corfu Client           service_type=CBM_CM_INVENTORY
Corfu Server                        service_type=CBM_CORFU
IDPS reporting-Corfu Client         service_type=CBM_IDPS_REPORTING
Messaging Manager-Corfu Client      service_type=CBM_MESSAGING_MANAGER
Monitoring-Corfu Client             service_type=CBM_MONITORING
MP-Corfu Client                     service_type=CBM_MP
Site Manager-Corfu Client           service_type=CBM_SITE_MANAGER
Upgrade Coordinator-Corfu Client    service_type=CBM_UPGRADE_COORDINATOR
GM-Corfu Client                     service_type=CBM_GM

After replacing some CBM_* certificates on the NSX manager or NSX global-manager nodes, UI is NOT accessible with any of the manager node IPs.

After replacing CBM_MP or CBM_AR or CBM_GM certificate for all the 3 NSX manager or NSX global-manager nodes, the corresponding service is DOWN on all the 3 NSX manager or NSX global-manager nodes.

For the example below, this was the "get cluster status" CLI output after replacing the CBM_MP certificate on all 3 NSX manager nodes.
Cluster status

Group Type: MANAGER
Group Status: UNAVAILABLE

Members:
    UUID           FQDN           IP           IPv6    STATUS
    <UUID_MGR1>    <FQDN_MGR1>    <IP_MGR1>    -       DOWN
    <UUID_MGR2>    <FQDN_MGR2>    <IP_MGR2>    -       DOWN
    <UUID_MGR3>    <FQDN_MGR3>    <IP_MGR3>    -       DOWN

Group Type: HTTPS
Group Status: UNAVAILABLE

Members:
    UUID           FQDN           IP           IPv6    STATUS
    <UUID_MGR1>    <FQDN_MGR1>    <IP_MGR1>    -       DOWN
    <UUID_MGR2>    <FQDN_MGR2>    <IP_MGR2>    -       DOWN
    <UUID_MGR3>    <FQDN_MGR3>    <IP_MGR3>    -       DOWN

/var/log/cbm/cbm.log shows that the certificate replacement operation failed while replacing the private key due to FileNotFoundException, as shown in the following example:

2023-09-16T19:10:58.975Z WARN pool-18-thread-4 Step 83803 - [nsx@6876 comp="nsx-manager" level="WARNING" subcomp="cbm"] javax.net.ssl.SSLException: java.io.FileNotFoundException: File '/config/cluster-manager/mp/private/keystore.password' does not exist
    at com.vmware.nsx.cbm.cert.CertUtils.readFromFile(CertUtils.java:73)
    at com.vmware.nsx.cbm.cert.impl.SelfSignedTrustArtifactory.replaceCertificatesOnDisk(SelfSignedTrustArtifactory.java:180)
    at com.vmware.nsx.cbm.tasks.impl.ReplaceCertificatesTask$ReplaceCertificatesOnDisk.executeStep(ReplaceCertificatesTask.java:261)
    at com.vmware.nsx.cbm.tasks.Task.executeTask(Task.java:329)
    at com.vmware.nsx.cbm.tasks.Task.executeTaskWithCheck(Task.java:300)
    at com.vmware.nsx.cbm.tasks.Task.call(Task.java:280)
    at com.vmware.nsx.cbm.tasks.Task.call(Task.java:46)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.FileNotFoundException: File '/config/cluster-manager/mp/private/keystore.password' does not exist
    at org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:2368)
    at org.apache.commons.io.FileUtils.readFileToString(FileUtils.java:2486)
    at com.vmware.nsx.cbm.cert.CertUtils.readFromFile(CertUtils.java:71)
    ... 12 more
2023-09-16T19:10:58.975Z ERROR pool-18-thread-4 Task 83803 - [nsx@6876 comp="nsx-manager" errorCode="CBM41" level="ERROR" subcomp="cbm"] Step ReplaceCertificatesOnDisk (6/7) failed for Task com.vmware.nsx.cbm.tasks.impl.ReplaceCertificatesTask: java.io.FileNotFoundException: File '/config/cluster-manager/mp/private/keystore.password' does not exist
2023-09-16T19:10:58.975Z ERROR pool-18-thread-4 Task 83803 - [nsx@6876 comp="nsx-manager" errorCode="CBM411" level="ERROR" subcomp="cbm"] [CBM411] Error occurred while replacing certificates in private keyStores.
javax.net.ssl.SSLException: java.io.FileNotFoundException: File '/config/cluster-manager/mp/private/keystore.password' does not exist
2023-09-16T19:10:59.074Z ERROR CertificateStreamListener-1-1 CertificateStreamListener 83803 - [nsx@6876 comp="nsx-manager" errorCode="CBM100" level="ERROR" subcomp="cbm"] ReplaceCertificatesTask error: Optional[[CBM411] Error occurred while replacing certificates in private keyStores.], task status: FAILED.

Checking the permissions on the filesystem of all the NSX Manager nodes for /config/cluster-manager/<service>/private/ shows that the permissions are not set to 770 (-rwxrwx---) as needed.
Bad State:

# ls -l /config/cluster-manager/*/private/

/config/cluster-manager/api/private/:
total 8
-rw------- 1 uproxy uproxy 2053 May 4 2021 keystore.jks
-rw------- 1 uproxy uproxy 44 May 4 2021 keystore.password

/config/cluster-manager/ar/private/:
total 8
-rw------- 1 nsx-replicator nsx-replicator 2051 May 4 2021 keystore.jks
-rw------- 1 nsx-replicator nsx-replicator 44 May 4 2021 keystore.password

/config/cluster-manager/ccp/private/:
total 8
-rw------- 1 nsx nsx 2050 May 4 2021 keystore.jks
-rw------- 1 nsx nsx 44 May 4 2021 keystore.password

/config/cluster-manager/cluster-manager/private/:
total 8
-rw------- 1 nsx-cbm nsx-cbm 2076 May 4 2021 keystore.jks
-rw------- 1 nsx-cbm nsx-cbm 44 May 4 2021 keystore.password

/config/cluster-manager/cm-inventory/private/:
total 8
-rw------- 1 ucminv ucminv 2071 Jul 27 2022 keystore.jks
-rw------- 1 ucminv ucminv 44 Jul 27 2022 keystore.password

/config/cluster-manager/corfu/private/:
total 8
-rwxrwx--- 1 corfu corfu 2059 May 4 2021 keystore.jks
-rwxrwx--- 1 corfu corfu 44 May 4 2021 keystore.password

/config/cluster-manager/csm/private/:
total 8
-rw------- 1 uproton uproton 2055 May 4 2021 keystore.jks
-rw------- 1 uproton uproton 44 May 4 2021 keystore.password

/config/cluster-manager/gm/private/:
total 8
-rw------- 1 uproton uproton 2050 May 4 2021 keystore.jks
-rw------- 1 uproton uproton 44 May 4 2021 keystore.password

/config/cluster-manager/idps-reporting/private/:
total 8
-rw------- 1 nsx-idps nsx-idps 2077 May 4 2021 keystore.jks
-rw------- 1 nsx-idps nsx-idps 44 May 4 2021 keystore.password

/config/cluster-manager/messaging-manager/private/:
total 8
-rw------- 1 nsx-messaging nsx-messaging 2079 Jul 27 2022 keystore.jks
-rw------- 1 nsx-messaging nsx-messaging 44 Jul 27 2022 keystore.password

/config/cluster-manager/monitoring/private/:
total 8
-rw------- 1 uphc uphc 2067 May 4 2021 keystore.jks
-rw------- 1 uphc uphc 44 May 4 2021 keystore.password

/config/cluster-manager/mp/private/:
total 8
-rw------- 1 uproton uproton 2052 May 4 2021 keystore.jks
-rw------- 1 uproton uproton 44 May 4 2021 keystore.password

/config/cluster-manager/policy/private/:
total 8
-rw------- 1 uproton uproton 2059 May 4 2021 keystore.jks
-rw------- 1 uproton uproton 44 May 4 2021 keystore.password

/config/cluster-manager/site-manager/private/:
total 8
-rwxrwx--- 1 nsx-sm nsx-sm 2073 Sep 16 15:20 keystore.jks
-rwxrwx--- 1 nsx-sm nsx-sm 44 Sep 16 15:20 keystore.password

/config/cluster-manager/upgrade-coordinator/private/:
total 8
-rw------- 1 uuc uuc 2085 Jul 27 2022 keystore.jks
-rw------- 1 uuc uuc 44 Jul 27 2022 keystore.password

/config/cluster-manager/vmc/private/:
total 8
-rw------- 1 uproton uproton 2053 May 4 2021 keystore.jks
-rw------- 1 uproton uproton 44 May 4 2021 keystore.password
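
The manual "ls -l" inspection above could be automated with a small read-only script. The following is a minimal sketch, not a VMware-provided tool: the names audit_keystores and EXPECTED_MODE are hypothetical, and it assumes only the directory layout and the 770 mode stated in this article. It reads file metadata and changes nothing on disk.

```python
import stat
from pathlib import Path

# Assumption taken from this article: keystore files need mode 770 (-rwxrwx---).
EXPECTED_MODE = 0o770


def audit_keystores(root):
    """Return (path, mode string) for each keystore file whose mode is not 770.

    Hypothetical helper: only inspects <root>/<service>/private/keystore.*
    and never modifies permissions.
    """
    findings = []
    for path in sorted(Path(root).glob("*/private/keystore.*")):
        st_mode = path.stat().st_mode
        if stat.S_IMODE(st_mode) != EXPECTED_MODE:
            # stat.filemode renders the mode the same way "ls -l" does.
            findings.append((str(path), stat.filemode(st_mode)))
    return findings
```

Calling audit_keystores("/config/cluster-manager") on a manager node would list the files that resemble the "Bad State" entries; an empty list would mean every keystore file found already has mode 770. Any actual permission change should still go through the workaround process described in this article.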

Purpose

This article provides the steps to bring up the UI on one NSX Manager node. Once the UI is accessible and the file permissions are fixed, a new certificate can be generated via the UI and the expired certificate can be replaced via the Apply Certificate API.

Cause

After an upgrade from 3.2.x to 4.1.1, the "nsx-cbm" Linux user should be a member of each service's Linux group and should have write permission on that service's private keystore files. Due to a bug in CBM's init script, the file permissions were not modified during the upgrade from 3.2.x to 4.1.1. Because the permissions on the private keystore files were not updated, CBM fails to replace the private key of a service's CBM_* certificate in 4.1.1.
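
The condition described above (the user is in the file's group, and the group has write permission) can be expressed as a short check. This is a sketch under stated assumptions: can_write_via_group and check_keystore are hypothetical names, the account name "nsx-cbm" comes from this article, and the check models only the group-based access path, ignoring owner and root access.

```python
import os
import pwd
import stat


def can_write_via_group(mode, file_gid, user_gids):
    """Pure check: the user is in the file's group and the group write bit is set."""
    return file_gid in user_gids and bool(mode & stat.S_IWGRP)


def check_keystore(path, user="nsx-cbm"):
    """Apply the check to a real file for a real account (Unix only).

    Hypothetical helper; raises KeyError if the account does not exist.
    """
    entry = pwd.getpwnam(user)
    # Primary group plus all supplementary groups of the account.
    user_gids = set(os.getgrouplist(user, entry.pw_gid))
    st = os.stat(path)
    return can_write_via_group(st.st_mode, st.st_gid, user_gids)
```

For a file like the mp keystore in the "Bad State" listing (mode -rw-------), can_write_via_group returns False regardless of group membership, because the group write bit is not set.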

Impact / Risks

A CBM_<service> whose certificate has been replaced may be unable to connect to CorfuDB. The impact varies; if the CBM_MP certificate was replaced before the permissions were fixed, the UI/API may become inaccessible.

This issue has been found in environments upgraded to 4.1.1 from 3.2.x. Greenfield environments deployed with NSX 4.1.1, and brownfield 4.0.x environments upgraded to 4.1.1, should not be impacted.

Found In: NSX 4.1.1

Resolution

Resolved in NSX release 4.1.2 and above.

Workaround

Please contact VMware NSX GSS by opening a service request and referencing this KB article.
