How to use the GPU validation test. Tested with NVIDIA cards.
Pre-requisites:
- NVIDIA drivers installed (run 'nvidia-smi' and confirm that it lists the NVIDIA hardware devices)
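The prerequisite check can also be scripted. This is a minimal sketch, assuming a bash-compatible shell; check_gpu_driver is a hypothetical helper name and is not part of the validation package. It relies only on 'nvidia-smi -L', which prints one line per detected GPU.

```shell
#!/usr/bin/env bash
# Sketch only: check_gpu_driver is a hypothetical helper, not shipped with the test.
# It confirms that the driver tool is on PATH and can list the GPUs.
check_gpu_driver() {
    local tool="${1:-nvidia-smi}"   # tool name can be overridden, e.g. for testing
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "ERROR: $tool not found; install the NVIDIA driver first" >&2
        return 1
    fi
    # 'nvidia-smi -L' prints one line per detected GPU
    if ! "$tool" -L; then
        echo "ERROR: $tool could not query the GPUs" >&2
        return 1
    fi
}
```

Call check_gpu_driver with no arguments before downloading the test. A nonzero return status means the driver is missing or not responding.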
Instructions
Download/unpack files into root directory
wget https://exxact-support.s3.us-west-1.amazonaws.com/Test+Folder/Stand_Alone_Validation_v4.2.1.tar.gz --no-check-certificate
tar -xvzf Stand_Alone_Validation_v4.2.1.tar.gz
Change directory to unpacked folder
cd Stand_Alone_Validation
Test duration varies depending on the GPUs being used. If the system has a smaller GPU used only for display, remove that GPU and run the test from a text-only terminal or over SSH.
Run the test in the background (run as root)
nohup ./run_test.x &
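When stdout is a terminal, nohup appends the command's output to nohup.out in the current directory. If you would rather have a named log and keep the process ID for later checks, something like the following sketch may help. start_in_background and the log name run_test.log are assumptions for illustration and are not part of the package.

```shell
#!/usr/bin/env bash
# Sketch only: start_in_background is a hypothetical helper.
# Usage: start_in_background LOGFILE COMMAND [ARGS...]
# Runs COMMAND under nohup, sends stdout and stderr to LOGFILE,
# and prints the PID of the background process.
start_in_background() {
    local log="$1"
    shift
    nohup "$@" >"$log" 2>&1 &
    echo "$!"
}
```

For example, 'pid=$(start_in_background run_test.log ./run_test.x)' starts the test, and 'kill -0 "$pid"' later reports whether that process is still alive.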
Monitor GPU temperatures by opening another terminal and running 'nvidia-smi -l'. Once the 'standalone-test.bin' process no longer appears in the 'nvidia-smi' output, check the logs to confirm that the configured number of cycles completed.
exx@ubuntu:~/Stand_Alone_Validation$ nvidia-smi -l
Tue Jan 15 17:35:14 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.78       Driver Version: 410.78       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    On   | 00000000:05:00.0  On |                  N/A |
| 78%   86C    P2   149W / 180W |   4767MiB /  8118MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1080    On   | 00000000:06:00.0 Off |                  N/A |
| 77%   86C    P2   155W / 180W |   4569MiB /  8119MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 1080    On   | 00000000:09:00.0 Off |                  N/A |
| 72%   86C    P2   124W / 180W |   4569MiB /  8119MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 1080    On   | 00000000:0A:00.0 Off |                  N/A |
| 59%   83C    P2   134W / 180W |   4569MiB /  8119MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1910      G   /usr/lib/xorg/Xorg                           157MiB |
|    0      2889      G   compiz                                        40MiB |
|    0      5848      C   ../standalone-test.bin                      4557MiB |
|    1      5849      C   ../standalone-test.bin                      4557MiB |
|    2      5850      C   ../standalone-test.bin                      4557MiB |
|    3      5851      C   ../standalone-test.bin                      4557MiB |
+-----------------------------------------------------------------------------+
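Instead of watching the 'nvidia-smi' output by hand, you can poll until the test process is gone. The sketch below uses pgrep in place of nvidia-smi, and wait_for_test is a hypothetical helper name. Linux truncates process names to 15 characters, so the default pattern is 'standalone-test' rather than the full binary name. The 60-second interval is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Sketch only: wait_for_test is a hypothetical helper.
# Usage: wait_for_test [PATTERN] [INTERVAL_SECONDS]
# Polls with pgrep until no process whose name matches PATTERN remains.
wait_for_test() {
    local pattern="${1:-standalone-test}"   # process names are truncated to 15 chars
    local interval="${2:-60}"               # seconds between checks (assumed value)
    while pgrep "$pattern" >/dev/null 2>&1; do
        sleep "$interval"
    done
    echo "No '$pattern' process found; check the logs."
}
```

Run wait_for_test in a second terminal after starting the test. It returns once the process has exited, which by itself does not tell you whether all cycles completed, so the logs still need to be checked.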
I have not yet measured the time per cycle for the small, large, or xlarge tests. With the 5/5/2 cycles, I estimate the full run will complete in 6-8 hours.
Checking results
View the output logs in the 'Stand_Alone_Validation' directory and make sure the results match for each cycle. In this example, I ran only 5 small tests on 4 GPUs. The large and xlarge tests write their own files per GPU (GPU_x).
Example:
exx@ubuntu:~/Stand_Alone_Validation$ ./exx-getgpu-validation.sh
The test results will be saved in /tmp/<hostname>_Standard_GPU_validation.txt. View the file and copy the results to the Support Ticket if applicable.
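To locate the results file on the current machine, here is a small sketch. It assumes that <hostname> in the path above is the output of the 'hostname' command, which I have not confirmed against the script itself. result_file is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: result_file is a hypothetical helper.
# Prints the expected results path for the given host name,
# defaulting to the output of 'hostname' (an assumption about the naming).
result_file() {
    echo "/tmp/${1:-$(hostname)}_Standard_GPU_validation.txt"
}
```

For example, 'cat "$(result_file)"' prints the results so they can be copied into the ticket.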