Hewlett Packard Enterprise | a00118666en_us

Advisory: HPE B-Series Switches - Switches Running FOS 9.0.1b May Reboot When Attempting To Collect Support Data

Last update date:

2/28/2024

Affected products:

HPE Storage Fibre Channel Switch B-series SN3600B

HPE Storage Fibre Channel Switch B-series SN6600B

HPE Storage Fibre Channel Switch B-series SN6650B

HPE Storage Fibre Channel Switch B-series SN6700B

HPE Storage SAN Director Switch

HPE Storage SAN Extension Switch B-series SN2600B

Brocade 32Gb Fibre Channel SAN Switch for HPE Synergy

Affected releases:

No affected releases provided.

Fixed releases:

No fixed releases provided.

Description:

Info

A memory overrun error may occur when attempting to capture supportSave, supportShow, or femDump output on HPE B-series switches running Fabric OS (FOS) version 9.0.1b, or versions built on 9.0.1b prior to 9.0.1b4. This issue may cause an HPE B-series switch class product to reboot (cold boot), or a director class product to fail over to the standby CP.

When capturing a supportSave, one of the steps performed is an internal data table dump of ASIC memory. This ASIC memory contains Flow Vision initiator-target pair flow statistics for the flows being monitored, and does not contain any customer data. The number of entries in the ASIC memory typically increases with the number of devices connected to any ASIC, including any NPIV virtual devices (Access Gateway [AG], for example). Due to an error within the supportSave code, the ASIC memory capture can become too large to fit within the allocated memory block. When the ASIC data is copied into the predefined supportSave memory location, a memory overrun may occur and corrupt adjacent memory areas.

Switches operating as an Access Gateway (AG) are more likely to encounter this issue because the number of flows monitored per ASIC is typically higher. Switches with a higher number of monitored Flow Vision flows are also at greater risk. The failure depends on the number of initiator-target flows being monitored within an ASIC. While many switches may never encounter this issue, it is more likely to occur on Gen 6 switches with more than 350 flows and Gen 7 switches with more than 700 flows.

The total number of flows monitored at a switch level can be displayed with the following CLI command:

flow --show sys_flow_monitor -resource

This command provides the total number of flows being monitored on the switch or director.
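The flow-count guidance above can be turned into a quick triage check. The following is a minimal Python sketch, not an HPE or Brocade tool: the function name and the generation labels are hypothetical, and the monitored-flow total is assumed to be read by hand from the CLI output, since the output format is not given in this advisory. The 350 and 700 figures are the advisory's "more likely" indicators at switch level, not hard per-ASIC limits.

```python
# Switch-level flow counts above which the advisory says the issue is
# "more likely". These are risk indicators, not guaranteed failure points:
# the actual trigger is the flow count on a single ASIC.
RISK_THRESHOLDS = {"gen6": 350, "gen7": 700}


def flow_count_at_risk(generation: str, monitored_flows: int) -> bool:
    """Return True when the monitored-flow total exceeds the advisory's figure.

    `generation` is a hypothetical label such as "Gen 6" or "gen7";
    `monitored_flows` is the total read manually from the switch.
    """
    key = generation.strip().lower().replace(" ", "")
    if key not in RISK_THRESHOLDS:
        raise ValueError(f"unknown switch generation: {generation!r}")
    return monitored_flows > RISK_THRESHOLDS[key]


print(flow_count_at_risk("Gen 6", 400))  # True: above the Gen 6 figure of 350
print(flow_count_at_risk("Gen 7", 400))  # False: below the Gen 7 figure of 700
```

A result of False does not rule out the failure, because flows may be unevenly distributed across ASICs.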
The failure is triggered when the flow count monitored by a single ASIC exceeds the limit of the allocation size. While it can be assumed that the total number of flows on a switch is equally distributed among all ASICs, this may not always be the case, depending on how devices are attached and zoned with other devices on the switch.

When the condition is experienced, a switch can panic and perform a cold reboot, and a director may observe an HA fail-over, at the time of or shortly after executing a supportSave, supportShow, or femDump operation. The following message will be observed in the logs:

BUG: Bad page map in process asicswdump

The switch may report an "out of memory" failure after executing a supportSave, supportShow, or femDump operation.
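To confirm after the fact whether a reboot or fail-over matches this advisory, exported log text can be searched for the message quoted above. This is a minimal Python sketch under stated assumptions: the log is assumed to be available as plain text lines, the function name is hypothetical, and the sample lines (including their prefixes) are made up for illustration and do not reflect the real FOS log format.

```python
# Signature string quoted verbatim from the advisory.
PANIC_SIGNATURE = "BUG: Bad page map in process asicswdump"


def find_overrun_signature(log_lines):
    """Return (line_number, text) pairs for lines containing the signature."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(log_lines, start=1)
        if PANIC_SIGNATURE in line
    ]


# Hypothetical sample input; only the signature text comes from the advisory.
sample_log = [
    "example-prefix: supportSave started\n",
    "example-prefix: BUG: Bad page map in process asicswdump\n",
    "example-prefix: unrelated message\n",
]

for number, text in find_overrun_signature(sample_log):
    print(f"line {number}: {text}")
```

For a real log file, the same function can be given an open file object, since iterating a text file yields lines.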

Scope

Only HPE B-series switches running FOS v9.0.1b (including the b1-b3 CCE patches) are susceptible to the memory allocation overrun during a supportSave ASIC data capture. Switches running FOS v9.0.1a or earlier versions, including all HPE B-series FOS v8.x and v7.x releases, do not experience this issue.

FICON flows will not contribute to the total count of monitored flows. Systems that run FICON traffic with a minimal number of non-FICON initiator-target flows would be much less susceptible to this issue.

This issue only occurs during a supportSave, supportShow, or femDump capture action. These operations can be initiated directly by CLI command, from a management application such as SANnav, or through the REST scripting interface.
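The version scope above can be expressed as a small check for use against a switch inventory. This is a minimal Python sketch, assuming version strings take the form "v9.0.1b" or "v9.0.1b3" (with or without the leading "v"); the function name is hypothetical, and any other string format is treated as not affected, which should be verified against how versions are reported in a given environment.

```python
import re

# Matches 9.0.1b with an optional numeric patch suffix (b1, b2, ...).
_FOS_9_0_1B = re.compile(r"^v?9\.0\.1b(\d*)$", re.IGNORECASE)


def is_susceptible(fos_version: str) -> bool:
    """Return True for FOS v9.0.1b and its b1-b3 patches, per the advisory.

    v9.0.1b4 and later, v9.0.1a and earlier, and all v8.x and v7.x
    releases return False.
    """
    match = _FOS_9_0_1B.match(fos_version.strip())
    if match is None:
        return False
    patch = int(match.group(1)) if match.group(1) else 0
    return patch <= 3


for version in ("v9.0.1b", "v9.0.1b3", "v9.0.1b4", "v9.0.1a", "v8.2.3"):
    print(version, is_susceptible(version))
```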

Resolution

This issue is fixed in HPE B-series FOS v9.0.1b4 and later.

Workaround

Option 1: Disable the system automatic monitoring of flows. Customers who do not require automatic system monitoring of flows can disable it and avoid encountering this issue during a supportSave, supportShow, or femDump operation by using the following command:

flow -deact sys_flow_monitor

Option 2: Limit the system automatic monitoring of flows to only critical or important flows. Customers who do not require all system flows to be monitored can reduce the total number of initiator-target flows that are monitored. Reducing the scope of monitoring to only critical flows, in order to bring the total number of monitored flows below the failure condition threshold, can be done with the following commands:

flow -deact sys_flow_monitor
flow -modify sys_flow_monitor -port <specific F port or logical group of F ports>
flow -act sys_flow_monitor

RECEIVE PROACTIVE UPDATES: Receive support alerts (such as Customer Advisories), as well as updates on drivers, software, firmware, and customer replaceable components, proactively in your e-mail through HPE Support Alerts. Sign up for Support Alerts at the following URL: Proactive Updates Subscription Form
