HGX H100 - PCI-e AOC Installation and Stand-Down SOP

This document covers PCI-e AOC (add-on card) installation on the Supermicro AS-8125GS-TNHT barebone system and how to power the system down properly.

 

SOP for PCI-e AOC Installation:

  1. If the system is powered on, first follow the Stand-Down SOP below.

  2. Remove the PCI-e tray first. [Pictures to be added]

  3. Slide the motherboard tray out, but do not remove it completely. Let it rest at the click-lock position. [Pictures to be added]

  4. DO NOT touch the GPU trays unless clearly instructed by Supermicro.

  5. Install the AOC onto the PCI-e tray. [Pictures to be added]

  6. Slide the motherboard tray back in and screw it in properly. [Pictures to be added]

  7. Reinstall the PCI-e tray last. [Pictures to be added]

SOP for Proper Stand-Down:

  1. If IPMI is available, power off through the IPMI GUI using either the graceful or the immediate power-off option. [Pictures to be added]

  2. Additional note: The fans will rev up to high speed and then drop back to resting speed. [Pictures and noise levels in dB to be added]

  3. Additional note: Another indication that the system has stood down properly is the power menu in the IPMI GUI, which then offers only the Power On option. [Pictures to be added]
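
The stand-down steps above use the IPMI GUI. Where the BMC is reachable over the network and the ipmitool CLI is installed, the same power-off and confirmation could be scripted. The following is a minimal sketch under those assumptions, not a procedure validated on this system. The function names, the host and user values, and the timeout values are hypothetical. It assumes the BMC password is supplied through the IPMI_PASSWORD environment variable (ipmitool's -E option).

```python
import subprocess
import time


def build_ipmitool_cmd(host, user, *args):
    """Build an ipmitool command line for a remote BMC.

    -E makes ipmitool read the password from the IPMI_PASSWORD
    environment variable, so it does not appear in the process list.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E", *args]


def power_is_off(status_output):
    """Return True if 'chassis power status' output reports power off."""
    return status_output.strip().lower() == "chassis power is off"


def stand_down(host, user, graceful=True, timeout_s=300, poll_s=10,
               run=subprocess.run, sleep=time.sleep):
    """Request power-off, then poll until the chassis reports off.

    graceful=True sends 'chassis power soft' (graceful shutdown request);
    graceful=False sends 'chassis power off' (immediate power off).
    Returns True once power is reported off, False if the timeout expires.
    The timeout and poll interval are assumed values, not from this SOP.
    """
    action = "soft" if graceful else "off"
    run(build_ipmitool_cmd(host, user, "chassis", "power", action),
        check=True, capture_output=True, text=True)

    waited = 0
    while waited <= timeout_s:
        result = run(build_ipmitool_cmd(host, user, "chassis", "power", "status"),
                     check=True, capture_output=True, text=True)
        if power_is_off(result.stdout):
            return True
        sleep(poll_s)
        waited += poll_s
    return False


# Hypothetical usage (host and user are placeholders):
#   export IPMI_PASSWORD=...   # set in the shell beforehand
#   stand_down("bmc-host.example", "ADMIN", graceful=True)
```

A True result corresponds to the GUI check in step 3: the chassis reports power off, so only the Power On action remains. Do not start removing trays until that state is confirmed.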

 

Ref. Email: RE: Super Micro AS -8125GS 8U failures F.A.E. Help.

Ref. Kevin Zhu (FAE) on-site troubleshooting visit on April 26, 2024.

Ref.
SN: S520596X4313597

Case number: SM2404223006

Issue: error code 62

Resolution: Reseat CPU 1.

 

SN: S520596X3505842

Case number: SM2404223021

Issue: Only 8 of the 10 installed SSDs are detected.

Resolution: Swap MCIO cable 25/26 and swap MCIO cable 17/18; after that, all 12 bays work.

 

SN: S520596X4313596

Case number: SM2404181913

Issue: Slots 9 and 10 on the rear I/O panel do not recognize any NIC.

Troubleshooting: Tried other NICs in slots 9 and 10; the issue remains. Swapped the whole AOM tray; the issue persists.