Operational Defect Database

Hewlett Packard Enterprise | a00025446en_us

Advisory: (Revision) Red Hat Enterprise Linux 7/8/9 or SUSE Linux Enterprise Server 12/15 - RPM Database Occasionally Becomes Corrupted After The "openibd" Systemd Startup Script From MLNX-OFED Is Executed During Boot

Last update date:

1/15/2024

Affected products:

HPE Synergy 660 Gen10 Compute Module

HPE FDR InfiniBand Adapters

HPE Infiniband QDR/Ethernet 10Gb 2P 544i Adapter

HPE QDR InfiniBand Adapters

HPE ProLiant BL460c Gen10 Server Blade

HPE ProLiant BL460c Gen9 Server Blade

HPE ProLiant BL660c Gen9 Server Blade

HPE ProLiant DL120 Gen9 Server

HPE ProLiant DL180 Gen9 Server

HPE ProLiant DL360 Gen10 server

HPE ProLiant DL360 Gen9 Server

HPE ProLiant DL380 Gen10 server

Affected releases:

No affected releases provided.

Fixed releases:

No fixed releases provided.

Description:

Info

Document Version | Release Date | Details
3 | January 15, 2024 | Added Red Hat Enterprise Linux 8, Red Hat Enterprise Linux 9, and SUSE Linux Enterprise Server 15 as affected. Updated Resolution section with Mellanox OFED InfiniBand and Ethernet Driver [ConnectX-4 and above] version 23.07-0.5.0.0 (or later).
2 | January 23, 2018 | Updated Resolution section with permanent fix, updated version of the Mellanox InfiniBand and Ethernet Driver for Linux.
1 | September 01, 2017 | Original Document Release.

On an HPE System running Red Hat Enterprise Linux 7, Red Hat Enterprise Linux 8, Red Hat Enterprise Linux 9, SUSE Linux Enterprise Server 12, or SUSE Linux Enterprise Server 15, the RPM database may occasionally become corrupted after the "openibd" startup script from the MLNX-OFED package is executed at boot. As a result, RPM commands will not function.

For example, the following command fails:

    rpm -qa

and creates a core file, as shown below:

    Bus error (core dumped)

In addition, restarting the openibd service does not load the Mellanox kernel modules, and displays messages similar to the following:

    service openibd restart
    Module mlx4_core does not belong to MLNX_OFED, skipping...
    Module mlx4_ib does not belong to MLNX_OFED, skipping...
    ...

The root cause is unsafe systemd ordering used by openibd.service (After=wickedd.service wickedd-nanny.service local-fs.target). In the transition from local-fs.target to sysinit.target, systemd runs a service unit that cleans temporary files from the local filesystem (systemd-tmpfiles-setup.service). One of the activities of that service is to remove the files /var/lib/rpm/__db* at boot time. A race can occur between the "openibd" script running the rpm command and systemd-tmpfiles-setup.service, because the /var/lib/rpm/__db* files may be removed while the rpm command is attempting to use them. This can leave a zero-length __db.002 or __db.003 file in /var/lib/rpm, which causes all subsequent rpm commands to core dump.
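The symptom named in the root cause, a zero-length __db.002 or __db.003 file in /var/lib/rpm, can be looked for directly. The following is a minimal sketch rather than an HPE-provided tool: the function name find_empty_rpm_db_files is hypothetical, and it assumes a find implementation that supports -maxdepth (as GNU find does).

```shell
#!/bin/sh
# Hypothetical helper (not part of MLNX-OFED or the advisory):
# list zero-length __db.* files in an RPM database directory.
# The default directory, /var/lib/rpm, is the one named in the advisory.
find_empty_rpm_db_files() {
    rpm_db_dir="${1:-/var/lib/rpm}"
    # -size 0 matches only empty files; -maxdepth 1 stays in the top-level directory.
    find "$rpm_db_dir" -maxdepth 1 -type f -name '__db.*' -size 0
}
```

Calling find_empty_rpm_db_files with no argument inspects /var/lib/rpm. Any path it prints is a zero-length region file of the kind described above, although an empty result does not by itself prove the database is healthy.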

Scope

Any HPE System running Red Hat Enterprise Linux 7/8/9 or SUSE Linux Enterprise Server 12/15 and any of the following MLNX-OFED versions:

MLNX-OFED LTS package versions 4.1-1.0.2.0 (or earlier)

MLNX-OFED package 5.x, all versions

MLNX-OFED package version 23.04-1.1.3.0
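The version list above can be encoded as a small check. This is a sketch of a literal reading of the Scope section only: the function name mlnx_ofed_is_listed_affected is hypothetical, the version string is supplied by the caller, and the comparison relies on the -V (version sort) option found in GNU sort. Versions the advisory does not mention are reported as not listed, which is not the same as confirmed unaffected.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when a MLNX-OFED version string
# matches one of the affected versions listed in the Scope section.
mlnx_ofed_is_listed_affected() {
    version="$1"
    [ -n "$version" ] || return 2          # no version supplied

    case "$version" in
        5.*)           return 0 ;;         # "5.x all versions"
        23.04-1.1.3.0) return 0 ;;         # the single 23.04 release listed
    esac

    # "4.1-1.0.2.0 (or earlier)": listed when the version sorts at or
    # before the boundary, so the boundary is the highest of the pair.
    boundary="4.1-1.0.2.0"
    highest=$(printf '%s\n%s\n' "$version" "$boundary" | sort -V | tail -n 1)
    [ "$highest" = "$boundary" ]
}
```

For example, mlnx_ofed_is_listed_affected "4.2-1.0.0.0" returns a non-zero status, matching the Resolution section's statement that 4.2-1.0.0.0 includes the fix.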

Resolution

The MLNX-OFED LTS version 4.2-1.0.0.0 includes a fix to ensure that the "openibd" script will not run until sysinit.target (or later) has been reached (i.e., "After=wickedd.service wickedd-nanny.service local-fs.target" is replaced with "After=wickedd.service wickedd-nanny.service sysinit.target" in openibd.service). However, this fix was not carried forward to the subsequent 5.x and 23.04 releases, so these versions from the newer MLNX-OFED branch are also affected. Starting with MLNX_OFED 23.07-0.5.0.0, openibd no longer uses 'rpm -qf', so the reported issue does not occur in versions 23.07-0.5.0.0 (or later).

The latest version of the Mellanox InfiniBand and Ethernet Driver for Linux is available as follows:

1. Click the following link: https://support.hpe.com/hpesc/public/home
2. Enter a product name (e.g., "545FLR-QSFP") in the text search field and wait for a list of products to populate.
3. From the products displayed, identify the desired product and click on the Drivers & software icon to the right of the product.
4. From the Drivers & software dropdown menus on the left side of the page:
   Select the Software Type (e.g., Driver).
   Select the Software Sub Type (e.g., Driver - Network).
   For further filtering if needed, select the specific Linux Operating System from the Operating Environment.
5. Select the latest release of the Mellanox InfiniBand and Ethernet Driver for Linux version 4.2-1.0.0.0 (or later) OR Mellanox OFED InfiniBand and Ethernet Driver [ConnectX-4 and above] version 23.07-0.5.0.0 (or later), based on the requirement.
   Note: To ensure the latest version will be downloaded, click on the Revision History tab to check if a new version of the firmware/driver is available.
6. Click Download.
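The ordering change described in this section (local-fs.target replaced by sysinit.target in the After= line of openibd.service) can be checked on an installed unit file. The sketch below is an illustration under assumptions: the function name openibd_ordering_is_unsafe is hypothetical, the caller must supply the path to the unit file because the advisory does not state where it is installed, and the check looks only at After= lines. Releases 23.07-0.5.0.0 and later avoid the issue by other means, so this check is meaningful only for the older releases.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when a unit file orders itself
# after local-fs.target without also ordering after sysinit.target,
# which is the unsafe ordering described in the advisory.
openibd_ordering_is_unsafe() {
    unit_file="$1"
    # Collect every After= line; no such line means nothing to flag.
    after_lines=$(grep -E '^[[:space:]]*After=' "$unit_file") || return 1

    case "$after_lines" in
        *sysinit.target*) return 1 ;;   # fixed ordering is present
    esac
    case "$after_lines" in
        *local-fs.target*) return 0 ;;  # unsafe ordering from the advisory
    esac
    return 1
}
```

A caller would pass the location of openibd.service on the system in question and treat a zero exit status as a sign that the unit still uses the ordering the fix replaced.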
As a workaround, use the following commands to fix the corrupt RPM database:

    rm -rf /var/lib/rpm/__db*
    db_verify /var/lib/rpm/Packages
    rpm --rebuilddb

After the RPM database is rebuilt, clean the YUM caches and verify that RPM commands run and the openibd script starts successfully, as shown below:

    yum clean all
    rpm -qa
    theora-tools-1.1.1-8.el7.x86_64
    tracker-1.2.6-3.el7.x86_64
    postgresql-libs-9.2.13-1.el7_1.x86_64
    jdom-1.1.3-6.el7.noarch
    binutils-2.23.52.0.1-55.el7.x86_64
    ...
    service openibd restart
    Unloading HCA driver: [ OK ]
    Loading HCA driver and Access Layer: [ OK ]
