HGX H100 - PCIe and Stand-Down SOP
This document covers PCIe AOC (add-on card) installation on the Supermicro AS-8125GS-TNHT barebone system and how to power the system down properly.
SOP for PCIe Installation:
If the system is powered on, first follow the Stand-Down SOP below.
Remove the PCIe tray first. [Pictures to be added]
Slide the motherboard tray out, but do not remove it completely. Let it rest at the click-lock position. [Pictures to be added]
DO NOT touch the GPU trays unless clearly instructed by Supermicro.
Install the AOC onto the PCIe tray. [Pictures to be added]
Reinstall the motherboard tray and screw it in properly. [Pictures to be added]
Reinstall the PCIe tray last. [Pictures to be added]
Stand-Down SOP:
If IPMI is available, power off through the IPMI GUI using either the graceful or the immediate power-off option. [Pictures to be added]
Note: The fans will rev up to high speed and then settle down to resting speed. [Pictures and noise levels in dB to be added]
Note: Another indication that the system has stood down properly is the IPMI GUI power menu, which then offers only the Power On option. [Pictures to be added]
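For sites that script the stand-down instead of clicking through the IPMI GUI, the same steps can be sketched in Python as shown here. This is a minimal sketch, not a Supermicro procedure. It assumes the ipmitool CLI is installed on a management host and that the BMC accepts IPMI-over-LAN (lanplus). The host, user, and password arguments and all function names are hypothetical placeholders. It uses "chassis power soft" for a graceful shutdown, "chassis power off" for an immediate one, and "chassis power status" to confirm the result.

```python
import subprocess
import time


def ipmi_command(host, user, password, *args):
    """Build an ipmitool command line for a remote BMC (lanplus interface).

    host, user and password are placeholders supplied by the caller;
    none of them come from this SOP.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-P", password, *args]


def is_powered_off(status_output):
    """Interpret the text printed by 'ipmitool chassis power status'.

    ipmitool prints 'Chassis Power is off' or 'Chassis Power is on'.
    """
    return status_output.strip().lower().endswith("is off")


def stand_down(host, user, password, graceful=True):
    """Request a graceful (soft) or immediate (off) power-down."""
    action = "soft" if graceful else "off"
    subprocess.run(
        ipmi_command(host, user, password, "chassis", "power", action),
        check=True,
    )


def power_is_off(host, user, password):
    """Query the BMC once and report whether chassis power is off."""
    result = subprocess.run(
        ipmi_command(host, user, password, "chassis", "power", "status"),
        capture_output=True,
        text=True,
        check=True,
    )
    return is_powered_off(result.stdout)


def wait_until_off(host, user, password, timeout_s=300, poll_s=10):
    """Poll the power status until it reports off or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if power_is_off(host, user, password):
            return True
        time.sleep(poll_s)
    return False
```

A typical call would be stand_down(host, user, password) followed by wait_until_off(host, user, password). Start the PCIe tray removal only once the second call returns True. The timeout and polling interval are arbitrary defaults and should be adjusted to the site.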
Ref. Email: RE: Super Micro AS -8125GS 8U failures F.A.E. Help.
Ref. Kevin Zhu (FAE) on-site troubleshooting visit, April 26, 2024.
Ref. Support cases:
SN: S520596X4313597
Case number: SM2404223006
Issue: error code 62
Resolution: Reseat CPU 1.
SN: S520596X3505842
Case number: SM2404223021
Issue: Only 8 of the 10 installed SSDs are detected.
Resolution: Swap MCIO cable 25/26 and MCIO cable 17/18; after that, all 12 bays work.
SN: S520596X4313596
Case number: SM2404181913
Issue: Slots 9 and 10 on the rear I/O panel do not recognize any NIC.
Troubleshooting: Tried other NICs in slots 9 and 10; the issue remains. Swapped the whole AOM tray; the issue persists.
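For drive-detection cases like SM2404223021 above, a quick count of detected drives before and after a cable swap can be sketched in Python as follows. This is only an illustration under stated assumptions: the host runs Linux with the lsblk utility, and the bay drives are NVMe devices whose names start with "nvme". Neither is confirmed by the case notes. The function names are hypothetical. If the boot drive is also NVMe, it is counted too, so adjust the expected number accordingly.

```python
import subprocess


def parse_disk_names(lsblk_output):
    """Return whole-disk device names from 'lsblk -d -n -o NAME,TYPE' output."""
    names = []
    for line in lsblk_output.splitlines():
        fields = line.split()
        # Keep only rows whose TYPE column is 'disk' (skips loop, rom, etc.).
        if len(fields) == 2 and fields[1] == "disk":
            names.append(fields[0])
    return names


def missing_drive_count(expected, detected_names, prefix="nvme"):
    """Return how many expected drives are not detected.

    prefix is an assumption about how the bay drives are named by the OS.
    """
    detected = [name for name in detected_names if name.startswith(prefix)]
    return max(expected - len(detected), 0)


def list_disks():
    """Run lsblk on the local host and return the detected disk names."""
    result = subprocess.run(
        ["lsblk", "-d", "-n", "-o", "NAME,TYPE"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_disk_names(result.stdout)
```

For the case above, missing_drive_count(10, list_disks()) would be expected to report 2 before the cable swap and 0 afterwards, assuming the naming assumption holds.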