Comment: Added another no-display troubleshooting suggestion from ESSD-6581

Table of Contents

...


Hardware failure details decision tree

This is not formatted as a typical decision tree because it was built with basic macros (expand) in the pre-2020 Confluence editor.

System does not power on when pressing the power button

POWER SUPPLY - Are any LEDs lit on the power supplies while the system is powered off?

Yes

What color are they?

Green

Is it blinking or solid?

Blinking - (system powered off)

  • Power supply is working and is on standby

Blinking - (system powered on; TYAN systems)

  • Power supply is working and is on standby for redundancy

Does the system power on after checking the power supplies?

(END) Yes

  • Move on to the next troubleshooting tree if necessary


No

Does the system power on after re-seating all memory DIMMs?

(END) Yes

  • Check topology in BIOS to make sure all installed memory is identified


No

CPU/Memory/Motherboard - Does the system power on when it is brought down to 1x CPU and 1x memory DIMM (on the primary/first CPU/memory slot)?

  • If the customer cannot perform this troubleshooting, they will need to ship the system back to Exxact for further troubleshooting; issue a System RMA
  • Marked in red because this is the point where hardware diagnostics become more involved/invasive, and customers may damage internal components if they are not handled properly
  • See the other option (POWER BUTTON) to quickly check whether the power button/ribbon cable is the root cause
Yes - swap the CPU to see if the issue persists; does the system power on after swapping the CPU?

(END) - Yes - Defective motherboard/slot; re-install memory and check topology in BIOS for CPU1 to make sure all installed memory is identified

  • Could be a bad CPU pin/slot on the motherboard's secondary CPU slot
  • Ask if they are okay with performing an RMA on the chassis+motherboard (if they got this far, they can most likely swap the barebone)


No - swap memory DIMMs; does the system power on after swapping through all memory DIMMs that were uninstalled when CPU2 was removed?

  • Could still be bad memory; try another memory DIMM to see if the issue persists
(END) Yes - Defective memory DIMM; issue Component RMA for memory

  • Try to have them re-create the issue by re-installing the suspected DIMM to see if the system fails to power on/POST
  • Repopulate CPU2 and memory in pairs to ensure the rest of the memory DIMMs allow the system to POST
  • Check topology in BIOS to make sure all installed memory is identified


(END) No - Defective CPU; issue Component RMA for CPU

  • Most likely confirmed to be a bad CPU since:
    • CPU1 slot works
    • Installing either of the CPUs into the secondary CPU slot does not allow the system to power on
  • Have them swap the memory DIMMs that were previously installed in CPU2's row into CPU1's row to see if all memory is working properly
  • The system should still be able to power on with 1x CPU and its DIMMs, but it may lose half the PCI-e slots on certain systems (typically older ones using 2011-v3/v4 CPUs)




(END) No - Defective motherboard; issue System RMA for confirmation of issue and repairs

  • Could be a bad primary CPU1 slot or a bad motherboard entirely
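The CPU/Memory/Motherboard branch above can be summarized as a small lookup. This is only a sketch for quick reference: the function name `isolate_cpu_memory` and its parameters are hypothetical, and the returned strings paraphrase the tree's end states rather than replace the notes.

```python
from typing import Optional


def isolate_cpu_memory(
    boots_minimal: bool,
    boots_after_cpu_swap: Optional[bool] = None,
    boots_after_dimm_swap: Optional[bool] = None,
) -> str:
    """Return the next step or end state for the CPU/Memory/Motherboard branch.

    boots_minimal: system powers on with 1x CPU and 1x DIMM in the primary slots.
    boots_after_cpu_swap: result of retrying after swapping the CPU (None = not tried).
    boots_after_dimm_swap: result of retrying after swapping DIMMs (None = not tried).
    """
    if not boots_minimal:
        return "Defective motherboard: issue System RMA"
    if boots_after_cpu_swap is None:
        return "Next step: swap the CPU and retry"
    if boots_after_cpu_swap:
        return "Defective motherboard/slot: re-install memory and check BIOS topology"
    if boots_after_dimm_swap is None:
        return "Next step: swap through the removed memory DIMMs and retry"
    if boots_after_dimm_swap:
        return "Defective memory DIMM: issue Component RMA for memory"
    return "Defective CPU: issue Component RMA for CPU"
```

Passing `None` for a step that has not been tried yet returns the next action instead of a diagnosis, which mirrors how the tree is walked one question at a time.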


POWER BUTTON - Does pushing the power button fail to power on the system when all of the board and PSU LEDs are on?

  • Marked in red because this is the point where hardware diagnostics become more involved/invasive, and customers may damage internal components if they are not handled properly
Does the system power on after removing the ribbon cable and manually jumping the power pins?

  • If they are unable to do this, issue a System RMA

(END) Yes - Defective power button

  • Have them re-seat the ribbon cable and try again; we can try to RMA the power button assembly if:
    • They agree to perform the labor
    • The barebone makes the assembly accessible enough that we can provide a short guide (usually we don't replace this; we would just have the MFR send us the assembly)
  • Ask if they are okay with swapping the barebone, or make a judgement call on whether we should issue a System RMA (do you trust them to perform the labor?)
  • Make sure CPU/memory are all identified properly in BIOS


No

  • Go back and have them try the following tree:

    CPU/Memory/Motherboard - Does the system power on when it is brought down to 1x CPU and 1x memory DIMM (on the primary/first CPU/memory slot)?












(END) - Solid

  • The LED should not be solid green while the system is powered off; power drain the system and see if the issue persists




(END) - Amber - (system powered off, and all PSUs are amber)

  • Older systems use amber for standby while the system is powered off; try the power button and see if the LEDs turn solid green


(END) - One is green, the other(s) is off / amber / yellow / different (Defective PSU)

  • Most likely one of the PSUs is bad; try re-seating it and swapping locations with another PSU module. If the fault follows the PSU, the PSU needs a Component RMA. If it follows the slot/insert, the barebone needs a Component RMA (or a System RMA if we need to swap components for the customer and re-validate hardware)




(END) - No - (all PSUs off/no lights)

  • Try different power cables, check outlets, and re-seat the PSU module; if there are still no lights/activity, the PDB (or barebone) needs a Component RMA (or a System RMA if we need to swap components for the customer and re-validate hardware)
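The PSU LED outcomes above (system powered off) can be sketched as a lookup for quick triage. The function names and the state strings ("blinking green", "solid green", "amber", "off") are hypothetical labels chosen for this example, and treating any mismatch between PSUs as a suspect PSU is a simplification of the "one is green, the other(s) different" case.

```python
from typing import Iterable


def triage_psu_leds(leds: Iterable[str]) -> str:
    """Map the LED state of each PSU (system powered off) to a suggested action."""
    states = {s.strip().lower() for s in leds}
    if not states or states == {"off"}:
        return "No lights: try other cables/outlets, re-seat PSU; else RMA PDB or barebone"
    if len(states) > 1:
        # Simplification: any mismatch between PSUs is treated as a suspect PSU.
        return "Mismatch: re-seat and swap PSU positions to see what the fault follows"
    if states == {"blinking green"}:
        return "Standby: power supplies are working"
    if states == {"solid green"}:
        return "Unexpected while powered off: power drain the system and re-check"
    if states == {"amber"}:
        return "Older-system standby: press power button and look for solid green"
    return "Unrecognized state: check the manufacturer manual"


def psu_swap_result(fault_follows_psu: bool) -> str:
    """Interpret the PSU swap test described in the mixed-LED case."""
    if fault_follows_psu:
        return "Component RMA for PSU"
    return "Component RMA for barebone (or System RMA)"
```

For example, `triage_psu_leds(["solid green", "amber"])` lands in the mismatch case, after which `psu_swap_result` records whether the fault followed the PSU or stayed with the slot.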



...

System powers on when pressing the power button, but there is no display

Make sure you go over the following display topics:

  • Provide the system info flier
  • State which display port(s) or graphics card is used for primary display
  • Have them double-check that the monitor is powered on and using the correct display source (VGA/DVI/HDMI/HDMI1/HDMI2/etc.)
  • Try other cables/adapters; we highly recommend NOT using adapters

Display issues are common for newly received systems or for a new user of an older system. If customers fail to troubleshoot properly, make it very clear that they will need to pay for shipping if we deem the system NPF (No Problem Found).

Is the correct display port/GPU being used?

Yes

Does the display work after the system was power drained completely and powered on with ONLY the correct video display cable/port being used, as noted in the system info flier?

(END) Yes - display works, move on to next tree


No - is there display after checking monitor settings and other cables?

  • Have them double-check that the monitor is powered on and using the correct display source (VGA/DVI/HDMI/HDMI1/HDMI2/etc.)
  • Try other cables/adapters that are known to work with other systems at their site; we highly recommend NOT using adapters
  • Try other GPU display ports and cables of different display interface types
(END) Yes - display works, move on to next tree


No - (if offboard display channel) is there display on the onboard channel (typically VGA or Motherboard display ports)?

Yes - does motherboard splash screen display? (offboard channel)

Yes - does the system show any codes unrelated to the display being pushed to the offboard channel? (see notes below)

  • Check for any POST code displayed in addition to the board manufacturer logo
    • Commonly observed codes
      • Supermicro - 91 - display pushed to offboard channel
      • Tyan - E3/AB - display pushed to offboard channel
      • ASUS (on workstation motherboards or back of 2U chassis) - OS loaded properly
      • B7/B9 (or B codes) typically indicate a motherboard component (mostly memory) preventing the system from completing POST
      • Refer to the MFR manual for other uncommon codes
      • The most important POST code is the one where the system gets stuck
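The commonly observed codes listed above can be kept as a small lookup when triaging a ticket. This is a sketch only: `POST_CODE_NOTES` and `describe_post_code` are hypothetical names, the table contains just the Supermicro and Tyan codes named in the notes, and anything else falls through to the manufacturer manual.

```python
# Codes taken from the notes above; this is not a complete vendor table.
POST_CODE_NOTES = {
    ("supermicro", "91"): "display pushed to offboard channel",
    ("tyan", "E3"): "display pushed to offboard channel",
    ("tyan", "AB"): "display pushed to offboard channel",
}


def describe_post_code(vendor: str, code: str) -> str:
    """Return the note for a POST code, or a fallback for unlisted codes."""
    normalized = code.strip().upper()
    note = POST_CODE_NOTES.get((vendor.strip().lower(), normalized))
    if note is not None:
        return note
    if normalized.startswith("B"):
        # B7/B9 and other B codes: typically a motherboard component, mostly memory.
        return "possible motherboard component issue (mostly memory); POST may not complete"
    return "refer to the manufacturer manual"
```

Whatever the lookup returns, the code the system is stuck on is still the one to record in the ticket.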
(END) Yes - display works, move on to the next tree, which is more related to systems that do not complete POST or boot to OS (offboard channel)


(END) No - they are using ports meant for onboard display; make sure any display cables connected to the motherboard (onboard channel) are unplugged, then reboot the system



No - does monitor appear to be receiving activity when system boots up?

Yes - Is there display after either of the following? (see notes)

  • (mostly single GPU systems) re-seating the GPU(s)
  • (multi-GPU servers) swapping the GPUs to see if the one being used for display is not working properly
(END) Yes - display works, move on to next tree; issue Component RMA for GPU if necessary

  • Address the GPU if their system boots to OS; run commands to ensure it is identified by the drivers/system; issue a Component RMA for the GPU if necessary
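One way to check whether every GPU is identified by the system is to count the display-class devices on the PCI bus. The sketch below assumes the customer runs Linux and can paste the output of `lspci`; the helper names are hypothetical, and it only confirms PCI enumeration, not that the driver has loaded.

```python
import re
from typing import List

# PCI class names that lspci prints for graphics devices.
DISPLAY_CLASS = re.compile(
    r"VGA compatible controller|3D controller|Display controller",
    re.IGNORECASE,
)


def list_display_devices(lspci_output: str) -> List[str]:
    """Return the lspci lines that describe display-class devices."""
    return [
        line.strip()
        for line in lspci_output.splitlines()
        if DISPLAY_CLASS.search(line)
    ]


def missing_gpus(lspci_output: str, expected_count: int) -> int:
    """Return how many expected display devices are not enumerated."""
    return max(0, expected_count - len(list_display_devices(lspci_output)))
```

Onboard graphics (for example a BMC's VGA controller) also appear in this list, so the expected count should include them. A nonzero result from `missing_gpus` points at a GPU to re-seat or swap before issuing a Component RMA.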


No - is there a known working GPU, for display, to test with the system?

Yes - does the display work after removing all GPUs and using only the known-working GPU for display?

(END) Yes - (multi-GPU servers) install the other GPUs and run commands to ensure they are identified by the drivers/system; a GPU that is not identified is possibly defective

(END) Yes - (single GPU systems) issue Component RMA for defective GPU



(END) No - issue System RMA







(END) No - Issue system RMA

  • If we're at this point, please be sure they have tried everything above...
    • Cables
    • Monitors
    • Re-seating display ports/connectors
  • Send fair warning they will need to pay shipping if we deem system NPF



(END) No - (if onboard display channel) issue system RMA

  • If we're at this point, please be sure they have tried everything above...
    • Cables
    • Monitors
    • Re-seating display ports/connectors
  • Send fair warning they will need to pay shipping if we deem system NPF






No - make sure the correct port is used; see the other options in case CMOS/BIOS was reset/updated and the system was initially set to the offboard channel in BIOS



...