BugZero found this defect 83 days ago.
2/27/2024
HPE Cray Supercomputing EX
HPE Cray supercomputers
HPE Slingshot for HPC Clusters
No affected releases provided.
No fixed releases provided.
A kernel panic was discovered on platforms running HPE Cray Slingshot Host Software 2.1.1 on SLES15 SP5, RHEL 8.8, or COS 3.0. Applications that perform a high number of memory registrations may encounter the defect. On SLES15 SP5 and above or RHEL 8.8 and above, the initialization of the iova domain misses a call to iova_domain_init_rcaches(). When an application makes a high number of registrations, the kernel iova subsystem then accesses an uninitialized variable, which leads to the kernel panic.
This advisory applies to all HPE Cray EX Supercomputer and HPE Cray Supercomputer systems running SLES15 SP5, RHEL 8.8, or COS 3.0 on which Slingshot Host Software (SHS) 2.1.1 is installed or will be installed.
The problem is fixed by making a separate call to iova_domain_init_rcaches() during initialization of the iova domain. This defect is resolved in Slingshot Host Software (SHS) versions 2.1.2 and higher. HPE advises customers not to update to SLES15 SP5, RHEL 8.8, or COS 3.0 with SHS 2.1.1. If a system running SLES15 SP5, RHEL 8.8, or COS 3.0 has already been updated to SHS 2.1.1, HPE recommends contacting customer support.
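The affected combinations above can be written as a small check, which may be useful when auditing an inventory of nodes. This is a minimal sketch, not an HPE tool: the function names, the platform strings, and the assumption that the SHS version is available as a dotted string are all hypothetical. It flags only the exact platform and version pairs this advisory names.

```python
# Hypothetical helper: encodes only the combinations named in this advisory.
# Platform strings are assumed to match the advisory's spelling exactly.
AFFECTED_PLATFORMS = {"SLES15 SP5", "RHEL 8.8", "COS 3.0"}
AFFECTED_SHS_VERSION = (2, 1, 1)  # resolved in 2.1.2 and higher


def parse_version(text):
    """Turn a dotted version string such as '2.1.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_affected(platform, shs_version):
    """Return True when the platform and SHS version match the advisory."""
    return (
        platform in AFFECTED_PLATFORMS
        and parse_version(shs_version) == AFFECTED_SHS_VERSION
    )


if __name__ == "__main__":
    # Example inventory; the node names are made up for illustration.
    inventory = {
        "node-a": ("SLES15 SP5", "2.1.1"),
        "node-b": ("SLES15 SP5", "2.1.2"),
    }
    for node, (platform, version) in inventory.items():
        status = "affected" if is_affected(platform, version) else "not listed"
        print(f"{node}: {status}")
```

The check deliberately matches SHS 2.1.1 exactly rather than treating every version below 2.1.2 as affected, because the advisory names only 2.1.1.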