Hewlett Packard Enterprise | a00138085en_us

Advisory: HPE Cray Slingshot 2.1.1 Host Software - Kernel Panic in SLES15 SP5, RHEL 8.8, or COS 3.0

Last update date:

2/27/2024

Affected products:

HPE Cray Supercomputing EX

HPE Cray supercomputers

HPE Slingshot for HPC Clusters

Affected releases:

No affected releases provided.

Fixed releases:

No fixed releases provided.

Description:

Info

A kernel panic was discovered on platforms running HPE Cray Slingshot 2.1.1 Host Software on SLES15 SP5, RHEL 8.8, or COS 3.0. Applications that perform a high number of memory registrations may encounter the defect. On SLES15 SP5 and above or RHEL 8.8 and above, a missed call to iova_domain_init_rcaches() during initialization of the iova domain can lead to a kernel panic. The panic was encountered when applications performing high registration counts caused the kernel iova subsystem to access an uninitialized variable.

Scope

This advisory applies to all HPE Cray EX Supercomputer and HPE Cray Supercomputer systems running SLES15 SP5, RHEL 8.8, or COS 3.0 that have installed, or will be installing, Slingshot Host Software (SHS) 2.1.1.

Resolution

The problem is fixed by making a separate call to iova_domain_init_rcaches() during initialization of the iova domain. We advise customers not to update to SLES15 SP5, RHEL 8.8, or COS 3.0 with Slingshot Host Software (SHS) 2.1.1 installed. If systems running SLES15 SP5, RHEL 8.8, or COS 3.0 have already been updated to Slingshot Host Software (SHS) 2.1.1, HPE recommends contacting customer support. This defect is resolved in Slingshot Host Software versions 2.1.2 and higher.
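
The affected combination described in this advisory can be expressed as a simple check, for example when auditing an inventory of nodes. The following is a minimal sketch, not HPE tooling: the function names, the OS release strings, and the dotted version format are all assumptions made for illustration. It flags only the exact combination the advisory names (SHS 2.1.1 on SLES15 SP5, RHEL 8.8, or COS 3.0) and makes no claim about other releases.

```python
# Hypothetical helper for illustration only; not part of any HPE tool.
# The OS strings and the "X.Y.Z" version format are assumptions.

AFFECTED_OS = {"SLES15 SP5", "RHEL 8.8", "COS 3.0"}  # from the advisory scope
AFFECTED_SHS = (2, 1, 1)  # SHS release named in the advisory
FIXED_SHS = (2, 1, 2)     # first release the advisory lists as resolved


def parse_version(text):
    """Turn a dotted version string such as '2.1.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_affected(os_release, shs_version):
    """True only for the OS and SHS combination named in the advisory."""
    return os_release in AFFECTED_OS and parse_version(shs_version) == AFFECTED_SHS


def has_fix(shs_version):
    """True if the SHS version is 2.1.2 or higher."""
    return parse_version(shs_version) >= FIXED_SHS
```

A node for which is_affected() returns True falls under the guidance above to contact customer support; has_fix() only reports whether the installed SHS version is at or past 2.1.2.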
