[INTERNAL USE]
...
Tool File Name:
Hopper (H100 GPU): 629-24287-XXXX-FLD-38780.tgz
Ampere (A100 GPU): 629-23587-XX86-FLD-38782.tgz
Download Location (INTERNAL QA SERVER):
Hopper (H100 GPU): scp root@172.25.10.35:/root/HGX_Tool/629-24287-XXXX-FLD-38780.tgz .
Ampere (A100 GPU): scp root@172.25.10.35:/root/HGX_Tool/629-23587-XX86-FLD-38782.tgz .
Unload NVIDIA driver script: scp root@172.25.10.35:/root/HGX_Tool/unload_nvidia_driver.sh .
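The download commands above can be wrapped in a small helper so the right package is picked per GPU family. This is a minimal sketch, not part of the diag package: the function names `hgx_pkg_name` and `hgx_fetch` are hypothetical, and the default destination `/var/diags` is an assumption taken from the shell prompt in the folder listings. Whether the archive unpacks into its own top-level folder is not stated on this page.

```shell
# Sketch only: helper names are hypothetical, not shipped with the tool.
QA_SERVER="root@172.25.10.35"   # internal QA server from this page
QA_DIR="/root/HGX_Tool"         # remote directory from this page

# Map a GPU family to the package name listed under "Tool File Name".
hgx_pkg_name() {
    case "$1" in
        hopper) echo "629-24287-XXXX-FLD-38780.tgz" ;;
        ampere) echo "629-23587-XX86-FLD-38782.tgz" ;;
        *) echo "unknown GPU family: $1" >&2; return 1 ;;
    esac
}

# Copy the package from the QA server, then unpack it under a destination
# directory (assumed default: /var/diags).
hgx_fetch() {
    local pkg dest="${2:-/var/diags}"
    pkg="$(hgx_pkg_name "$1")" || return 1
    scp "${QA_SERVER}:${QA_DIR}/${pkg}" . || return 1
    mkdir -p "$dest" && tar -xzf "$pkg" -C "$dest"
}
```

Usage would be `hgx_fetch hopper` or `hgx_fetch ampere /var/diags`. Keeping the name lookup separate from the copy step makes it possible to check the mapping without touching the network.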
...
    root@rdlab:/var/diags/629-24287-XXXX-FLD-38780# ll
    total 473124
    drwxr-xr-x 2 root root      4096 Oct  4 10:55 ./
    drwxr-xr-x 3 exx  exx       4096 Oct  4 10:55 ../
    -rwxr-xr-x 1 exx  exx      17895 Sep 11 09:47 fdmain.sh*
    -rwxr-xr-x 1 exx  exx      32888 Sep 11 09:47 fieldiag.sh*
    -r-xr-xr-x 1 exx  exx   11194648 Sep 11 09:47 nvflash*
    -rwxr-xr-x 1 exx  exx  473142559 Sep 11 09:47 onediagfield.r6.252.tgz*
    -r-xr-xr-x 1 exx  exx       2906 Sep 11 09:47 README.txt*
    -r-xr-xr-x 1 exx  exx       1702 Sep 11 09:47 relnotes.txt*
    -r-xr-xr-x 1 exx  exx      18541 Sep 11 09:47 sku_hopper-hgx-8-gpu.json*
    -r-xr-xr-x 1 exx  exx      18477 Sep 11 09:47 sku_hopper-hgx-8-gpu_tpol.json*
    -rw-rw-r-- 1 exx  exx       3428 Sep 11 09:47 spec_hopper-hgx-8-gpu_level1_field.json
    -rw-rw-r-- 1 exx  exx       3428 Sep 11 09:47 spec_hopper-hgx-8-gpu_level2_field.json
    -rw-rw-r-- 1 exx  exx       2312 Sep 11 09:47 spec_hopper-hgx-8-gpu_sit_field.json
    -r-xr-xr-x 1 exx  exx       6832 Sep 11 09:47 testargs_hopper-hgx-8-gpu.json*
Extracted folder (Ampere) content:
    root@rdlab:/var/diags/629-23587-XX86-FLD-38782# ll
    total 243940
    drwxr-xr-x 4 root root      4096 Mar  4 22:55 ./
    drwxr-xr-x 4 root root      4096 Mar  5 17:03 ../
    drwxr-xr-x 8 root root      4096 Mar  4 22:55 dgx/
    -rw-r--r-- 1 root root         0 Mar  4 22:55 dgx_log_creation_lock
    -rw-r--r-- 1 root root         0 Mar  4 22:55 dgx_unpack_package_lock
    -rw-r--r-- 1 root root     26360 Mar  5 00:43 fieldiag.log
    -rwxr-xr-x 1 exx  exx      31232 Sep 19 20:53 fieldiag.sh*
    -rwxr-xr-x 1 exx  exx  238456629 Sep 19 20:53 hgxfieldiag.r3.102*
    drwxr-xr-x 2 root root      4096 Mar  5 00:43 logs/
    -r-xr-xr-x 1 exx  exx   11104504 Sep 19 20:53 nvflash*
    -rwxr-xr-x 1 exx  exx       2773 Sep 19 20:53 README.txt*
    -rwxr-xr-x 1 exx  exx       4497 Sep 19 20:53 relnotes.txt*
    -rwxr-xr-x 1 exx  exx       1823 Sep 19 20:53 sku_hgx-a100-8-gpu_40g_aircooled.json*
    -rwxr-xr-x 1 exx  exx       1482 Sep 19 20:53 sku_hgx-a100-8-gpu_40g_hybrid.json*
    -rwxr-xr-x 1 exx  exx       3787 Sep 19 20:53 sku_hgx-a100-8-gpu_80g_aircooled.json*
    -rwxr-xr-x 1 exx  exx       3611 Sep 19 20:53 sku_hgx-a100-8-gpu_80g_hybrid.json*
    -rwxr-xr-x 1 exx  exx      25076 Sep 19 20:53 testargs_hgx-a100-8-gpu_2tray.json*
    -rwxr-xr-x 1 exx  exx      24734 Sep 19 20:53 testargs_hgx-a100-8-gpu_d00_2tray.json*
    -rwxr-xr-x 1 exx  exx      14310 Sep 19 20:53 testargs_hgx-a100-8-gpu_d00.json*
    -rwxr-xr-x 1 exx  exx      14652 Sep 19 20:53 testargs_hgx-a100-8-gpu.json*
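Before running anything, it can help to confirm that extraction produced a usable folder. The sketch below checks only for the four files that appear in both the Hopper and Ampere listings (`fieldiag.sh`, `nvflash`, `README.txt`, `relnotes.txt`) and that `fieldiag.sh` carries the execute bit, as it does in both listings. The helper name `hgx_check_extracted` is hypothetical, and the file list is an assumption based on these two listings, not an official manifest.

```shell
# Sketch only: hgx_check_extracted is a hypothetical helper name.
# Reports files that are missing from an extracted diag folder and
# returns non-zero if anything is wrong.
hgx_check_extracted() {
    local dir="$1" bad=0 f
    for f in fieldiag.sh nvflash README.txt relnotes.txt; do
        if [ ! -e "$dir/$f" ]; then
            echo "missing: $f"
            bad=1
        fi
    done
    # fieldiag.sh has the execute bit set in both listings on this page.
    if [ -e "$dir/fieldiag.sh" ] && [ ! -x "$dir/fieldiag.sh" ]; then
        echo "not executable: fieldiag.sh"
        bad=1
    fi
    return "$bad"
}
```

For example, `hgx_check_extracted /var/diags/629-23587-XX86-FLD-38782` prints nothing and returns 0 when all four files are present.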
PROBLEM SITUATION
...
Example of running Field Test showing logs output location.
...
Failure example from ZD-12288: fieldiag.log
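When a failure needs to be reported, the log output can be bundled into one archive. The Ampere folder listing on this page shows a `fieldiag.log` file and a `logs/` folder inside the extracted directory, which appear to be written by a run. The sketch below packs whichever of the two exist. The helper name `hgx_collect_logs` is hypothetical, and the assumption that these two paths hold all relevant output is not confirmed by this page.

```shell
# Sketch only: hgx_collect_logs is a hypothetical helper name.
# Packs fieldiag.log and logs/ from an extracted diag folder into a .tgz,
# for example to attach to a support ticket.
hgx_collect_logs() {
    local dir="$1" out="$2" items="" f
    for f in fieldiag.log logs; do
        if [ -e "$dir/$f" ]; then
            items="$items $f"
        fi
    done
    if [ -z "$items" ]; then
        echo "no fieldiag output found in $dir" >&2
        return 1
    fi
    # $items is left unquoted on purpose: it is a space-separated list
    # built only from the two fixed names above.
    tar -czf "$out" -C "$dir" $items
}
```

Usage would look like `hgx_collect_logs /var/diags/629-23587-XX86-FLD-38782 /tmp/fieldiag_logs.tgz`. Paths inside the archive are stored relative to the diag folder, so the bundle unpacks cleanly anywhere.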