Operational Defect Database

BugZero found this defect 3174 days ago.

Hewlett Packard Enterprise | c04462689

Advisory: (Revision) HP Virtual Connect - Non-Volatile RAM Storage May Contain Unusable Information and VC Profile Operations May Fail Due to Sudden Power Interruption

Last update date:

2/29/2024

Affected products:

HPE ProLiant BL420c Gen8 Server Blade

HPE ProLiant BL460c Gen8 Server Blade

HPE ProLiant BL460c Gen9 Server Blade

HPE ProLiant BL465c Gen8 Server Blade

HPE ProLiant BL660c Gen8 Server Blade

HPE Integrated Lights-Out 4 (iLO 4)

HPE Virtual Connect Enterprise Manager Software

HPE BladeSystem c3000 Enclosures

HPE BladeSystem c7000 Enclosures

HPE Onboard Administrator

HPE Virtual Connect 8Gb 20-port Fibre Channel Module for c-Class BladeSystem

HPE Virtual Connect 8Gb 24-port Fibre Channel Module for c-Class BladeSystem

Affected releases:

No affected releases provided.

Fixed releases:

No fixed releases provided.

Description:

Info

Revision history (Document Version, Release Date, Details):

3, 09/11/2015: Added details about the server possibly becoming inaccessible if the iLO "Cold Boot" option is used to reboot the server after making profile changes, or assigning or unassigning network connections, while the server is powered on, and added the step to resolve it if it occurs.
2, 05/08/2015: Added other conditions that may cause the issue, indications, and a possible fix for ProLiant BL460c or ProLiant BL660c Gen9 Server Blades.
1, 09/25/2014: Original Document Release.

If power is suddenly removed, the power button is momentarily pressed, or the iLO virtual power button is pressed on an HP ProLiant c-Class server blade while the server is busy writing to the Non-Volatile RAM (NVRAM) store, the stored information may be reported incorrectly and become unusable for assigned settings, such as Virtual Connect (VC) profile operations or serial numbers.

The NVRAM store preserves environmental information between server boots. It is used exclusively by Integrated Lights-Out 4 (iLO 4), BIOS and Virtual Connect Manager (VCM) to store environmental information such as Boot Order, Virtual Serial Number, Virtual UUID, Boot configuration, IPL devices, etc.

NVRAM store information may become unusable if the power is suddenly removed during any of the following:

  While using the iLO 4 user interface
  While using the Onboard Administrator (OA) user interface
  While using the VC user interface
  Manually pressing the power button on the front of the server
  Pressing the iLO Virtual power button
  Removing the power source from the server
  Removing the power source from the chassis
  Hot-plugging the server from the enclosure server bay

When this issue occurs, the profile operations to assign, re-assign or un-assign will fail on every attempt.

The incorrect NVRAM store information will be detected in different ways for different server generations:

ProLiant Gen8 (or earlier) server blades: The server startup console log will report the following message when the BIOS restores the system defaults after it detects the NVRAM store issue:

  "System currently defaulted to typical configurations settings. Please run RBSU to modify default settings."

ProLiant Gen9 server blades: The Integrated Lights-Out 4 (iLO 4) web interface may report the following error in the Integrated Management Log (IML). In some instances, this error may not be reported:

  "266-Non-Volatile Memory Corruption Detected. Configuration settings restored to defaults. If enabled, Secure Boot security settings may be lost. Action: Restore desired configuration settings. Contact HP if issue persists."

Note: This error will also be posted to the startup log on the iLO 4 remote console when the server is booting.

In Virtual Connect 4.30 (or later), a mechanism has been added to detect this situation and report it in the System Log. The following messages are logged in the System Log when VC encounters a problem writing to the NVRAM store:

  2014-07-30T13:30:15-05:00 VCEFXTW2049000J vcmd: [PRO:P1:6012:Critical] Profile state FAILED : Bay 2 : Profile state failed , because VCM could not configure environment settings on the server.
  2014-07-30T13:30:15-05:00 VCEFXTW2049000J vcmd: [PRO:P1:6012:Critical] Profile state FAILED : Bay 2 : To correct this issue reseat the server physically or issue the OA CLI command reset server #. Then, re-apply the profile. If issue still persists, clear the server BIOS settings, and re-apply the profile.

Note: The System Log can be accessed either through the "Domain Settings->System Log" menu in the VCM web interface or by executing the "show systemlog" command in the VCM CLI.

For VC Version 4.20 (or earlier), this issue will be indicated by the following: The "System Log" will have a record that a mechanism was not available to detect this situation. Detecting unavailable profile operations or inaccurate NVRAM store information are the only ways to identify this issue.

In the Virtual Connect Enterprise Manager (VCEM) environment, when applying a profile, VCEM may report the following messages:

  VCEM received an unexpected error from VC Manager. Verify that VC Manager is working properly and perform the operation again. The exception received from VC API: ERROR-UNDEFINED

The VCM (Version 4.31) reports:

  Bay # : Profile state failed, because VCM could not configure environment settings on the server

Note: After making profile changes, or assigning or unassigning network connections, while the server is powered on, and then using the iLO "Cold Boot" option to reboot the server, the server may become inaccessible after the reboot. This issue will occur when Virtual Connect (VC) detects invalid data in the server's NVRAM. The changes will fail, causing the downlink ports of the server to be disabled. (The fix for the issue is in Step 7 of the Resolution below.)
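
As an illustration of how the VC 4.30 (or later) System Log entries quoted above could be checked in bulk, the following is a minimal Python sketch. It assumes the System Log has been saved to plain text with one entry per line in the same layout as the sample messages in this advisory; the function name, the pattern name and the sample variable are hypothetical and are not part of any HP tool. Actual log output may be wrapped or formatted differently, in which case the pattern would need adjusting.

```python
import re

# Matches the critical entry quoted in this advisory, for example:
# "<timestamp> <host> vcmd: [PRO:P1:6012:Critical] Profile state FAILED : Bay 2 : ..."
# Assumption: one log entry per line, laid out like the advisory's sample.
PROFILE_FAILED = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<host>\S+)\s+vcmd:\s+"
    r"\[PRO:P1:6012:Critical\]\s+Profile state FAILED\s*:\s*"
    r"Bay\s+(?P<bay>\d+)\s*:"
)


def bays_with_failed_profiles(log_text):
    """Return the sorted bay numbers that have a 6012 critical entry."""
    bays = set()
    for line in log_text.splitlines():
        match = PROFILE_FAILED.match(line.strip())
        if match:
            bays.add(int(match.group("bay")))
    return sorted(bays)


# Sample text copied from the messages shown in this advisory.
SAMPLE_LOG = (
    "2014-07-30T13:30:15-05:00 VCEFXTW2049000J vcmd: "
    "[PRO:P1:6012:Critical] Profile state FAILED : Bay 2 : Profile state "
    "failed , because VCM could not configure environment settings on the "
    "server.\n"
    "2014-07-30T13:30:15-05:00 VCEFXTW2049000J vcmd: "
    "[PRO:P1:6012:Critical] Profile state FAILED : Bay 2 : To correct this "
    "issue reseat the server physically or issue the OA CLI command reset "
    "server #. Then, re-apply the profile.\n"
)

if __name__ == "__main__":
    print(bays_with_failed_profiles(SAMPLE_LOG))
```

Any bay returned by this check would be a candidate for the recovery steps in the Resolution section. An empty result does not rule the issue out on VC 4.20 (or earlier), where these messages are not logged.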

Scope

Any HP ProLiant Gen8 or Gen9 server blade running any Virtual Connect (VC) firmware version with all of the following:

  HP Integrated Lights-Out 4 (iLO 4)
  HP BladeSystem c-Class Onboard Administrator (OA)
  HP Virtual Connect Manager (VCM)
  HP Virtual Connect Enterprise Manager (VCEM)

Resolution

An unexpected power-off event may be initiated by a person or other force, so it is impossible to predict when NVRAM stored information may be lost. If this issue does occur, follow the detailed actions below.

If a ProLiant BL460c Gen9 Server Blade exhibits this issue, try the following steps first:

  Install the System ROM Version 1.32, dated 03-06-2015 (or later), available at: https://www.hp.com/swpublishing/MTX-5cd63a96849549c8ac1f04e43c
  OR
  Install HP Service Pack for ProLiant (SPP) Version 2015.04.0 (or later) and check the server.

A similar fix for the HP ProLiant BL660c Gen9 server blade will be available in a future release of HP System ROM. For more information about this issue and the fix for the ProLiant Gen9 Server Blades, refer to the following Customer Advisory, c04619425: https://support.hpe.com/hpesc/public/docDisplay?docId=c04619425

If a ProLiant Gen9 server blade is already updated and running System ROM Version 1.32 and this issue still exists on the server, OR this issue exists on a ProLiant Gen8 server blade, follow these recovery steps:

  Reseat the server either physically or by issuing the OA CLI command "reset server #". The profile assignment will occur automatically when the server is reinserted.
  Review the profile assignment status in the VC interface; it should be Green.
  For VC 4.30 (and later), review the System Log to make sure the following messages are not reported for the latest profile operation:
    Profile state failed, because VCM could not configure environment settings on the server.
    To correct this issue, reseat the server physically or issue the OA CLI command reset server #. Then, re-apply the profile. If issue still persists, clear the server BIOS settings, and re-apply the profile.

Note: The above error messages are not reported in VC version 4.20 (or earlier).

The following steps are necessary only if the above steps are not successful:

1. Power the server ON.
Note: There are some cases in which power on will be denied due to the previous failure of the profile assignment operation. This can be verified by referring to the Status tab of the device bay in the OA interface.

The power-on denial can be released by either of the following methods:

  Physically remove the server. Un-assign the profile if there is a profile currently assigned to the server bay. Otherwise, assign a new profile to the empty bay and then un-assign it. Insert the server blade into the bay and power the server on.
  OR
  Move the server to a different unassigned (no profile assigned) bay in the same or a different enclosure and power the server on.

2. Clear the server BIOS settings to factory defaults via the "Restore Default System Settings" option.
[Figures: ProLiant Gen9 server blades; ProLiant Gen8 (or earlier) server blades]

3. Power down the server using the "Momentary Press" option.

4. Assign the profile to the server bay after restoring the BIOS settings to factory defaults. This will clear out the profile configuration on the server. If the profile was assigned already, then reassign it to the server.

5. Review the profile assignment status in the VC interface. The status should appear "Green."

6. For VC 4.30 (or later), review the "System Log" to ensure that the following messages are not reported for the latest profile operation:
    Profile state failed, because VCM could not configure environment settings on the server.
    To correct this issue, reseat the server physically or issue the OA CLI command reset server #. Then, re-apply the profile. If issue still persists, clear the server BIOS settings, and re-apply the profile.
Note: The above error messages do not exist in VC version 4.20 (or earlier).

7. If there are pending profile changes and the server is inaccessible with disabled downlink ports after using a "Cold Boot," power down the server and reapply the profile to enable the links.

To prevent this issue:

Normally, NVRAM store accesses are frequent when a server is in any of the following stages:

  During Power-On Self-Test (POST)
  When a Bare Metal (no local or network boot is configured) server sits in either a PXE boot loop or the UEFI Shell, depending on the generation:
    Gen9 configured in Legacy Mode, and Gen8 and earlier generations: Normally a Bare Metal server will stay in a PXE boot loop if there is no local or network OS to boot.
    Gen9 configured in UEFI Mode: Normally a Bare Metal server will stay in the UEFI shell if there is no local or network OS to boot.

Removing the server power when a server is in any of the above stages increases the chance of corrupting the NVRAM store.

Recommendations to avoid this issue:

Never use any of the following "forced" methods to power off the server:

  Using the "Press and Hold" option in the iLO 4, OA, VC, VCEM and OneView web interfaces.
  Specifying the "force" option when using the iLO 4, OA, VC, VCEM and OneView command line interfaces to power off the server. The "force" option is equivalent to "Press and Hold."
  Continuously pressing the power button on the front of the server.

Always use the "Momentary Press" option to power off the server when using the iLO 4, OA, VC, VCEM and OneView web interfaces. The "Momentary Press" option is safer for powering off the server, because it holds the server power until the operation is complete. The "Momentary Press" may fail if it occurs very early in the boot process or if the server has stopped responding. The server will remain powered on if the "Momentary Press" fails. In this case, retry the "Momentary Press" option until the server is actually powered off, or use the cold boot option if the desired behavior is a server restart.

Note: "Momentary Press" can also cause an NVRAM store corruption on ProLiant Gen8 or earlier servers, but the probability is much lower than with the "Press and Hold" option. The "Momentary Press" will not cause an NVRAM store corruption issue on ProLiant Gen9 server blades.

[Figures: Figure 1 VC Screen; Figure 2 OA Screen; Figure 3 iLO 4 Screen]

Do not use the "force" option when using the iLO 4, OA or VC command line interfaces to power off a server. The command will be considered a "Momentary Press" if the "force" option is not specified.

Do not use the "Cold Boot" option when using the iLO 4, OA or VC command line interfaces to power cycle a server after making server profile changes, such as a "Network" assignment on the Flex ports, during a powered-on state.

Apply the profile to an empty bay before inserting the server. Inserting a server into an empty bay will automatically power on the server, but VC requires a server to be in the power-off state to apply the profile. Powering off the server when it is in any of these states (POST loop, PXE boot loop or UEFI shell, depending on the configuration) will likely cause an NVRAM store corruption. Applying a profile before inserting the server eliminates this situation: the profile is applied automatically when the server is inserted, and no additional power off is required.

Keep unassigned (no profile assigned) Bare Metal servers powered off. Normally, Bare Metal servers without a local or network boot will always remain in either a PXE boot loop or the UEFI shell. Since there is a high chance of an NVRAM store corruption if the server is powered off while in those states, it is recommended to keep Bare Metal servers without assigned profiles powered off. To prevent the server from being powered on automatically after being inserted, set the "Always Remain Off" option for the server power by using the iLO 4 interface.
This option is used mostly for servers without assigned profiles.
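
The recommendation to retry the "Momentary Press" option until the server is actually powered off, rather than escalating to a forced power-off, can be sketched as a small retry loop. The Python example below is only an illustration under stated assumptions: `press_momentary` and `is_powered_on` are hypothetical callables that the operator would have to supply (for example, wrappers around whichever management interface is in use), and the attempt count and wait time are arbitrary placeholder values, not figures from this advisory.

```python
import time


def momentary_press_until_off(press_momentary, is_powered_on,
                              max_attempts=5, wait_seconds=30.0,
                              sleep=time.sleep):
    """Retry a graceful power-off; never escalate to a forced power-off.

    press_momentary: hypothetical callable that issues one "Momentary Press".
    is_powered_on:   hypothetical callable returning True while the server
                     still reports a powered-on state.
    Returns True once the server reports powered off, otherwise False after
    max_attempts presses so that an operator can decide what to do next.
    """
    for _ in range(max_attempts):
        if not is_powered_on():
            return True
        press_momentary()
        # Give the server time to complete its shutdown before checking again.
        sleep(wait_seconds)
    return not is_powered_on()


class FakeServer:
    """Stand-in used only to demonstrate the loop without real hardware."""

    def __init__(self, presses_ignored):
        self.powered = True
        self.presses = 0
        self.presses_ignored = presses_ignored

    def press(self):
        self.presses += 1
        if self.presses > self.presses_ignored:
            self.powered = False

    def is_on(self):
        return self.powered
```

The loop deliberately returns False instead of falling back to "Press and Hold" or a "force" option, since those are the methods this advisory recommends avoiding.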
