[INTERNAL USE]
...
HOW TO INSTALL TOOL
Tool File Name:
Hopper (H100 GPU): 629-24287-XXXX-FLD-38780.tgz
Ampere (A100 GPU): 629-23587-XX86-FLD-38782.tgz
Download Location (INTERNAL QA SERVER):
Hopper (H100 GPU): scp root@172.25.10.35:/root/629-24287-XXXX-FLD-38780.tgz .
Ampere (A100 GPU): scp root@172.25.10.35:/root/629-23587-XX86-FLD-38782.tgz .
The tool is expected to be placed in the /var/diags folder. Create this folder if it does not exist.
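The placement step above can be sketched as a small shell helper. This is a minimal sketch, not part of the tool: `install_diag` is a hypothetical function name, and it assumes the archive unpacks into a folder named after itself, as the Hopper listing on this page suggests.

```shell
# Sketch only: install_diag is a hypothetical helper, not shipped with the tool.
# Creates the target folder if it is missing, then unpacks the archive into it.
install_diag() {
    local archive="$1"              # e.g. 629-24287-XXXX-FLD-38780.tgz
    local dest="${2:-/var/diags}"   # folder named on this page; override for testing

    # Fail early if the archive was not downloaded.
    [ -f "$archive" ] || { echo "archive not found: $archive" >&2; return 1; }

    # mkdir -p is a no-op when the folder already exists.
    mkdir -p "$dest" || return 1

    # Extract the gzip-compressed tarball inside the target folder.
    tar -xzf "$archive" -C "$dest"
}
```

Example use, after downloading the Hopper archive to the current directory: `install_diag 629-24287-XXXX-FLD-38780.tgz`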
Extracted folder (Hopper) content:
Code Block |
---|
root@rdlab:/var/diags/629-24287-XXXX-FLD-38780# ll
total 473124
drwxr-xr-x 2 root root      4096 Oct  4 10:55 ./
drwxr-xr-x 3 exx  exx       4096 Oct  4 10:55 ../
-rwxr-xr-x 1 exx  exx      17895 Sep 11 09:47 fdmain.sh*
-rwxr-xr-x 1 exx  exx      32888 Sep 11 09:47 fieldiag.sh*
-r-xr-xr-x 1 exx  exx   11194648 Sep 11 09:47 nvflash*
-rwxr-xr-x 1 exx  exx  473142559 Sep 11 09:47 onediagfield.r6.252.tgz*
-r-xr-xr-x 1 exx  exx       2906 Sep 11 09:47 README.txt*
-r-xr-xr-x 1 exx  exx       1702 Sep 11 09:47 relnotes.txt*
-r-xr-xr-x 1 exx  exx      18541 Sep 11 09:47 sku_hopper-hgx-8-gpu.json*
-r-xr-xr-x 1 exx  exx      18477 Sep 11 09:47 sku_hopper-hgx-8-gpu_tpol.json*
-rw-rw-r-- 1 exx  exx       3428 Sep 11 09:47 spec_hopper-hgx-8-gpu_level1_field.json
-rw-rw-r-- 1 exx  exx       3428 Sep 11 09:47 spec_hopper-hgx-8-gpu_level2_field.json
-rw-rw-r-- 1 exx  exx       2312 Sep 11 09:47 spec_hopper-hgx-8-gpu_sit_field.json
-r-xr-xr-x 1 exx  exx       6832 Sep 11 09:47 testargs_hopper-hgx-8-gpu.json* |
PROBLEM SITUATION
Supermicro provided this file to diagnose HGX H100 GPU issues. Related to ZD-6179 / SMC CRM Case: SM2310022368.
...
TOOL USAGE
Review the README.txt
...
INVESTIGATION DETAILS
Refer to Tool Installation for details on usage and options.
If the following error is encountered when running fieldiag.sh, uninstall the existing NVIDIA driver on the system. The existing NVIDIA driver conflicts with the tool.
Code Block |
---|
root@rdlab:/home/exx/smc_fieldiag/629-24287-XXXX-FLD-38780# ./fieldiag.sh
Unpacking onediag...
Could not determine HGX baseboard SKU |
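Before rerunning the tool, it may help to confirm whether an NVIDIA kernel module is still loaded. The following is a minimal sketch under stated assumptions: `nvidia_module_loaded` is a hypothetical helper, and it treats a loaded `nvidia*` kernel module as a sign that an existing driver is present. How to uninstall the driver depends on how it was installed (distribution package or NVIDIA installer), so that step is deliberately left out.

```shell
# Sketch only: nvidia_module_loaded is a hypothetical helper, not part of the tool.
# Returns 0 if a module named nvidia or nvidia_<suffix> is listed, non-zero otherwise.
nvidia_module_loaded() {
    # /proc/modules lists loaded kernel modules, one per line, name first.
    # The optional argument exists only so the check can be tested on a sample file.
    local modules_file="${1:-/proc/modules}"

    # Match "nvidia" itself and companions such as nvidia_uvm or nvidia_drm.
    grep -Eq '^nvidia(_[a-z0-9_]+)? ' "$modules_file"
}
```

Example use: `nvidia_module_loaded && echo "NVIDIA module still loaded; remove the driver before running fieldiag.sh"`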