Operational Defect Database

BugZero updated this defect 32 days ago.

VMware | 96624

After replacing Managers or while running Upgrade prechecks, Repo_Sync is Failed

Last update date:

4/17/2024

Affected products:

NSX

Affected releases:

4.1

Fixed releases:

No fixed releases provided.

Description:

Symptoms

NSX 4.1.x

After one or more NSX Managers are redeployed, REPO_SYNC is in a Failed state.

NSX Manager log /var/log/proton/nsxapi.log:

2024-02-24T12:00:26.882Z INFO RepoSyncThread-1707748646882 RepoSyncServiceImpl 4841 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Starting Repo sync thread RepoSyncThread-12345678964321
2024-02-24T12:00:32.208Z INFO RepoSyncThread-1707748646882 RepoSyncFileHelper 4841 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Command to get server info for https://192.168.1.1:443/repository/4.1.1.0.0.22224312/HostComponents/rhel77_x86_64_baremetal_server/upgrade.sh returned result CommandResultImpl [commandName=null, pid=2227086, status=SUCCESS, errorCode=0, errorMessage=null, commandOutput=HTTP/1.1 404 Not Found
2024-02-24T12:00:11.583Z INFO RepoSyncThread-1707748646882 RepoSyncFileHelper 4841 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Command to check if remote file exists for https://192.168.1.1:443/repository/4.1.1.0.0.22224312/Manager/vmware-mount/libvixMntapi.so.1 returned result CommandResultImpl [commandName=null, pid=2228965, status=SUCCESS, errorCode=0, errorMessage=null, commandOutput=HTTP/1.1 404 Not Found
2024-02-24T12:00:11.583Z ERROR RepoSyncThread-1707748646882 RepoSyncServiceImpl 4841 SYSTEM [nsx@6876 comp="nsx-manager" errorCode="MP21057" level="ERROR" subcomp="manager"] Unable to start repository sync operation.See logs for more details.

While preparing for an upgrade, the Check Upgrade Readiness UI shows one of the following errors:

"Upgrade-coordinator upgrade failed. Error - Repository Sync status is not success on node <node IP>."
"Repository sync is not complete"

NSX Manager log /var/log/syslog:

2024-02-24T12:00:52.800Z NSX_Manager NSX 98866 SYSTEM [nsx@6876 comp="nsx-manager" errorCode="MP30487" level="ERROR" subcomp="upgrade-coordinator"] Repository sync is not successful on <Managers IPs>. Please ensure Repository Sync Status is successful on all MP cluster nodes.
2024-02-24T12:00:52.800Z NSX_Manager NSX 98866 SYSTEM [nsx@6876 comp="nsx-manager" errorCode="MP30040" level="ERROR" subcomp="upgrade-coordinator"] Error while updating upgrade-coordinator due to error Repository Sync status is not success on node <Managers IPs>. Please ensure Repository Sync status is success on all MP nodes before proceeding.
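
To see which repository paths the sync is failing on, the 404 entries can be pulled out of the log. The following is a minimal sketch, assuming each log entry sits on one line in the format shown in the excerpts above. The function names list_missing_repo_files and missing_repo_versions are hypothetical helpers, not NSX tools.

```shell
#!/bin/sh
# Hypothetical helpers (not part of NSX). They assume the nsxapi.log line
# format shown in the excerpts above, one log entry per line.

# Print each repository URL that RepoSyncFileHelper reported as
# "404 Not Found" in the given log file, de-duplicated.
list_missing_repo_files() {
    log_file="$1"
    grep 'RepoSyncFileHelper' "$log_file" \
        | grep '404 Not Found' \
        | grep -o 'https://[^ ]*/repository/[^ ]*' \
        | LC_ALL=C sort -u
}

# Print the distinct version directories (the path component directly
# under /repository/) that the missing files belong to.
missing_repo_versions() {
    list_missing_repo_files "$1" \
        | sed 's#^.*/repository/\([^/]*\)/.*$#\1#' \
        | LC_ALL=C sort -u
}

# Example invocation on a Manager node:
#   missing_repo_versions /var/log/proton/nsxapi.log
```

The version directory printed by missing_repo_versions is the value that the upgrade bundle chosen in the workaround needs to match.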

Resolution

This is a known issue impacting NSX.

Workaround

Warning: this procedure uses the "rm" command, which irreversibly removes files from the system. Ensure backups are taken and the restore passphrase is known before proceeding.

1) Download the VMware-NSX-upgrade-bundle-<version>.mub MUB file from the Customer Connect portal. The downloaded version should match the version reported as not found in the logs; in this example, 4.1.1.0.0.22224312.

2) Log in to any Manager as admin and run the following admin CLI command to identify the orchestrator node:

nsx-mngr> get service install-upgrade
Service name: install-upgrade
Service state: stopped
Enabled on: 192.168.1.1 <<< orchestrator node

Copy the MUB file to the /image directory of the orchestrator node.

3) Extract the MUB file:

# cd /image
# tar -xf VMware-NSX-upgrade-bundle-<version>.mub

This creates a new file with the same name and a .tar.gz extension.

4) Delete the folder for your current version under /repository. In this example, the system runs 4.1.1:

# rm -rf /repository/4.1.1.0.0.22224312

5) Extract the tar.gz to /repository:

# tar -xzf /image/VMware-NSX-upgrade-bundle-<version>.tar.gz -C /repository

6) Set the proper permissions and ownership of the /repository files by executing the following:

/opt/vmware/proton-tomcat/bin/reposync_helper.sh

7) From the UI, resolve REPO_SYNC on the orchestrator node:

System -> Appliances -> View Details
Click Resolve for REPO_SYNC

Once completed, repeat for each of the other 2 Managers.

8) Clean up the downloaded MUB file and the extracted tar.gz file from /image:

rm -f /image/VMware-NSX-upgrade-bundle-<version>.mub
rm -f /image/VMware-NSX-upgrade-bundle-<version>.tar.gz
rm -f /image/VMware-NSX-upgrade-bundle-<version>.tar.gz.sig

Advanced LB (AVI)

The same issue can occur if NSX ALB files are missing from the repository. This typically happens when NSX ALB was deployed at one time but later removed. If a user manually deletes the ALB files from the repository, for example to free disk space, the sync can fail.

The logs will explicitly refer to ALB files, for example:

2024-03-19T09:41:34.557Z INFO RepoSyncThread-1710841232019 RepoSyncFileHelper 85527 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Command to get server info for https://192.168.1.1:443/repository/22.1.6-9191/Alb_controller/ovf/controller.cert returned result CommandResultImpl [commandName=null, pid=1677285, status=SUCCESS, errorCode=0, errorMessage=null, commandOutput=HTTP/1.1 404 Not Found
2024-03-19T09:42:08.746Z INFO RepoSyncThread-1710841232019 RepoSyncFileHelper 85527 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Command to get server info for https://192.168.1.1:443/repository/22.1.6-9191/Alb_controller/ovf/controller-disk1.vmdk returned result CommandResultImpl [commandName=null, pid=1677876, status=SUCCESS, errorCode=0, errorMessage=null, commandOutput=HTTP/1.1 404 Not Found

1) Identify the NSX ALB version from the repository path in the logs. In the example above it is 22.1.6-9191.

2) Download the NSX ALB Controller OVA from the VMware Customer Connect portal and copy it to the orchestrator node.

3) Create the directory if it does not exist:

# mkdir -p /repository/22.1.6-9191/Alb_controller/ovf

Extract the OVA files into that directory, which is where the logs above look for them:

# tar -xf /image/Controller.ova -C /repository/22.1.6-9191/Alb_controller/ovf

Ensure there are 4 files:

controller.ovf
controller.mf
controller.cert
controller-disk1.vmdk

4) Set the proper permissions and ownership of the /repository files by executing the following:

/opt/vmware/proton-tomcat/bin/reposync_helper.sh

5) From the UI, resolve REPO_SYNC on the orchestrator node:

System -> Appliances -> View Details
Click Resolve for REPO_SYNC

Once completed, repeat for each of the other 2 Managers.
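
The check that all four ALB files are present after extraction can be scripted before running reposync_helper.sh. The following is a minimal sketch, not a vendor-provided tool. The function name check_alb_files is hypothetical, and the directory argument is assumed to be the Alb_controller/ovf folder referenced in the logs.

```shell
#!/bin/sh
# Hypothetical helper (not part of NSX): verify that the four files the
# procedure expects from the NSX ALB Controller OVA exist in a directory.
# Prints one "missing: <path>" line per absent file and returns non-zero
# if any file is absent; prints nothing and returns 0 otherwise.
check_alb_files() {
    ovf_dir="$1"
    missing=0
    for f in controller.ovf controller.mf controller.cert controller-disk1.vmdk; do
        if [ ! -f "$ovf_dir/$f" ]; then
            echo "missing: $ovf_dir/$f"
            missing=1
        fi
    done
    return "$missing"
}

# Example invocation, using the version directory from the log excerpt:
#   check_alb_files /repository/22.1.6-9191/Alb_controller/ovf
```

A non-zero return indicates that the extraction landed somewhere else or the OVA was incomplete, so the REPO_SYNC Resolve action would likely hit the same 404 errors again.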
