Comment: system constant reboot (hardware troubleshooting)

...


System powers on when pressing the power button, and displays, but does not boot to OS


Is the system display stuck at the splash screen?


Yes - does the system have any codes unrelated to display being pushed to offboard channel? (see notes below)

  • Check for any POST code displayed in addition to the board manufacturer logo
    • Commonly observed codes
      • Supermicro - 91 - display pushed to offboard channel
      • Tyan - E3/AB - display pushed to offboard channel
      • ASUS (on workstation motherboards or back of 2U chassis) - OS loaded properly
      • B7/B9 (or other B codes) - typically a motherboard component (mostly memory) preventing the system from completing POST
      • Refer to the manufacturer's manual for other uncommon codes
      • The most important POST code is the one the system gets stuck at
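
The commonly observed codes above can be kept in a small lookup table for quick triage. The following is only a sketch: the table name, the function name, and the reading of "E3/AB" as two separate Tyan codes are assumptions, and the ASUS entry is left out because the list does not give its code.

```python
# Hypothetical lookup built only from the codes listed in this tree.
POST_CODE_NOTES = {
    ("supermicro", "91"): "display pushed to offboard channel",
    ("tyan", "E3"): "display pushed to offboard channel",  # assumes E3 and AB are separate codes
    ("tyan", "AB"): "display pushed to offboard channel",
}


def describe_post_code(vendor: str, code: str) -> str:
    """Return the note from this tree that matches a vendor/POST code pair."""
    key = (vendor.strip().lower(), code.strip().upper())
    if key in POST_CODE_NOTES:
        return POST_CODE_NOTES[key]
    # B7/B9 (or other B codes): typically memory keeping the system from completing POST.
    if key[1].startswith("B"):
        return "likely a motherboard component (mostly memory) preventing POST"
    return "uncommon code: refer to the manufacturer's manual"
```

For example, `describe_post_code("Supermicro", "91")` returns the offboard-display note, while an unlisted code falls through to the manual reference.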

Yes - please continue to see if system fails to POST due to CPU/Memory/Motherboard

Does the system POST after re-seating all memory DIMMs?


(END) Yes

  • Check the topology in BIOS to make sure all installed memory is identified



No


CPU/Memory/Motherboard - Does it POST when system is brought down to 1x CPU and 1x Memory DIMM (on primary/first CPU/memory slot)?

  • If they cannot perform this troubleshooting, they will need to ship this system back to Exxact for further troubleshooting; issue System RMA
  • In red, because this is the point where the hardware diagnostics become more involved/invasive, and customers may damage internal components if they are not handled properly
  • See other option to see if they can quickly check if the power button/ribbon cable is the root cause

Yes - swap CPU to see if issue persists; does system power on after swapping CPU?


(END) - Yes - Defective motherboard/slot; re-install memory and check topology in BIOS for CPU1 to make sure all installed memory are identified

  • Could be bad CPU pin/slot on the motherboard on secondary CPU slot
  • Ask if they are okay with performing RMA on the chassis+motherboard (honestly if they got this far, I'm sure they can swap the barebone)


No - swap memory DIMMs; does the system power on after swapping through all memory DIMMs that were uninstalled when CPU2 was removed?

  • Still could be bad memory, try another memory DIMM to see if issue persists

(END) Yes - Defective memory DIMM; issue component RMA for Memory

  • Try to have them re-create the issue by re-installing the suspected DIMM to see if the system fails to power on/POST
  • Repopulate CPU2 and memory in pairs to ensure the rest of the memory DIMMs allow the system to POST
  • Check the topology in BIOS to make sure all installed memory is identified



(END) No - Defective CPU; issue component RMA for CPU

  • Most likely confirmed to be bad CPU since:
    • CPU1 slot works
    • Installing either of the CPUs into the secondary CPU slot does not allow the system to power on
  • Have them swap the memory DIMMs that were previously installed in CPU2's row into CPU1's row to see if all memory is working properly
  • System should still be able to power on with 1x CPU and its DIMMs, but they may lose half the PCI-e slots on certain systems (typically older ones using 2011-v3/v4 CPUs)





(END) No - Defective motherboard; issue System RMA for confirmation of issue and repairs

  • Could be bad primary CPU1 slot or bad motherboard entirely; issue System RMA for confirmation of issue and repairs
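
The CPU/memory/motherboard branch above can be sketched as a small function for anyone scripting the triage. This is a minimal sketch: the function and parameter names are hypothetical, and the branch nesting is read from the flattened page, so it should be checked against the live tree before being relied on.

```python
from typing import Optional


def isolate_post_failure(
    posts_minimal: bool,
    posts_after_cpu_swap: Optional[bool] = None,
    posts_after_dimm_swap: Optional[bool] = None,
) -> str:
    """Walk the minimal-configuration branch and return its (END) outcome.

    posts_minimal: does the system POST with 1x CPU and 1x DIMM in the primary slots?
    posts_after_cpu_swap: does it power on after swapping the CPU?
    posts_after_dimm_swap: does it power on after swapping through the removed DIMMs?
    """
    if not posts_minimal:
        return "Defective motherboard; issue System RMA"
    if posts_after_cpu_swap is None:
        raise ValueError("CPU swap result is required once the minimal config POSTs")
    if posts_after_cpu_swap:
        return "Defective motherboard/slot; re-install memory and check topology in BIOS"
    if posts_after_dimm_swap is None:
        raise ValueError("DIMM swap result is required when the CPU swap does not help")
    if posts_after_dimm_swap:
        return "Defective memory DIMM; issue component RMA for Memory"
    return "Defective CPU; issue component RMA for CPU"
```

Answers are passed in the order the questions are asked, so a caller can stop as soon as an (END) outcome is returned.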









(END) No - they are using ports meant for onboard display; make sure any display cables installed to the motherboard (onboard channel) are unplugged and reboot system




No - can BIOS be accessed using the 'del' key while system powers on?


Yes - does system boot to OS after verifying boot priority is set correctly to first scan the disks containing the OS?

  • make sure to check all installed drives are identified in the 'boot' tab in BIOS

(END) Yes - Cool



No - does OS boot after physically re-seating the drives?


(END) Yes - Cool


No - (if system was set to use offboard originally) can you get to the OS if you change the primary display setting to 'onboard'?


(END) Yes - boots to OS, see next tree related to OS issues



No - Does system boot up using another (separate) boot drive installed?


(END) Yes - Corrupted boot drive/OS; possible Component RMA, or escalation (see notes)

  • Multiple boot drives set up as RAID1 - it is unlikely that both drives were corrupted at the same time unless the OS/kernel was altered
    • escalate to management to have them review the scenario or to quote options for drive/OS/SW
  • Single boot drive - issue Component RMA
    • we need to pre-load the OS/SW at a loss



(END) No - (unlikely) Motherboard/SATA port issue with board; issue System RMA
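
The two end states of the separate-boot-drive check, together with the RAID1 note, reduce to a short rule. The sketch below uses hypothetical function and parameter names and only restates the outcomes listed above.

```python
def boot_drive_outcome(boots_from_other_drive: bool, raid1_boot_pair: bool) -> str:
    """Map the separate-boot-drive test to the action listed in this tree."""
    if not boots_from_other_drive:
        # (unlikely) motherboard/SATA port issue with the board
        return "System RMA"
    if raid1_boot_pair:
        # Both RAID1 members corrupting together suggests the OS/kernel was altered.
        return "Escalate to management"
    return "Component RMA"
```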







No - are you receiving any activity lights on the keyboard while system powers on?

  • Strike the 'caps lock' or 'scroll lock' key to see if the keyboard LED (if applicable) reacts

Yes - can system reboot by using 'ctrl+alt+del'?


Yes - (if stuck at a blank screen after splash screens load) can the OS/kernel selection screen be accessed by using 'up+down arrow keys' after splash screen passes during boot?

  • Proceed to OS issues tree if able to get to OS/kernel selection screen

(END) Yes - Proceed to OS issues tree if able to get to OS/kernel selection screen



No - Does system boot up using another (separate) boot drive installed?


(END) Yes - Corrupted boot drive/OS; possible Component RMA, or escalation (see notes)

  • Multiple boot drives set up as RAID1 - it is unlikely that both drives were corrupted at the same time unless the OS/kernel was altered
    • escalate to management to have them review the scenario or to quote options for drive/OS/SW
  • Single boot drive - issue Component RMA
    • we need to pre-load the OS/SW at a loss



(END) No - (unlikely) Motherboard/SATA port issue with board; issue System RMA




(END) No - Proceed to OS issues tree if able to get to OS/kernel selection screen





No - can you get to BIOS using another keyboard and/or USB port after restarting the system?


Yes - go back to "No - can BIOS be accessed using the 'del' key while system powers on?"



No - (see notes) reboot system and try to get BIOS or OS/kernel selection screen; can you access either?

  • If the OS boots to a certain point, and display driver or core packages are corrupted, you may lose all keyboard/mouse activity; rebooting and trying to boot from a different point/kernel may help proceed with troubleshooting
  • (If cannot get to BIOS) Proceed to OS issues tree if able to get to OS/kernel selection screen

Yes - if BIOS, go back up to "No - can BIOS be accessed using the 'del' key while system powers on?"



No - can you get to BIOS by removing all drives and then trying the 'del' key again?

  • Make sure system is powered off before removing/installing drives

Yes - are onboard/offboard settings correct?

  • This can impact display for the OS, and possibly a sanity check to ensure they are using the correct display configuration to access OS

Yes - Does inserting the drive back in cause the same issue?

  • This can impact display for the OS, and possibly a sanity check to ensure they are using the correct display configuration to access OS

Yes - Does system boot up using another (separate) boot drive installed?


(END) Yes - Corrupted boot drive/OS; possible Component RMA, or escalation (see notes)

  • Multiple boot drives set up as RAID1 - it is unlikely that both drives were corrupted at the same time unless the OS/kernel was altered
    • escalate to management to have them review the scenario or to quote options for drive/OS/SW
  • Single boot drive - issue Component RMA
    • we need to pre-load the OS/SW at a loss



(END) No - (unlikely) Motherboard/SATA port issue with board; issue System RMA




(END) No - (unlikely) Motherboard/SATA port issue with board; issue System RMA



No - (if system was set to use offboard originally) can you get to the OS if you change the primary display setting to 'onboard'?


(END) Yes - boots to OS, see next tree related to OS issues



No - Does system boot up using another (separate) boot drive installed?


(END) Yes - Corrupted boot drive/OS; possible Component RMA, or escalation (see notes)

  • Multiple boot drives set up as RAID1 - it is unlikely that both drives were corrupted at the same time unless the OS/kernel was altered
    • escalate to management to have them review the scenario or to quote options for drive/OS/SW
  • Single boot drive - issue Component RMA
    • we need to pre-load the OS/SW at a loss



(END) No - (unlikely) Motherboard/SATA port issue with board; issue System RMA














Need to add

  • Constant reboot (for server chassis)
    1. Reseat power components (redundant, cables)
    2. Reduce to minimal POST config
    3. If none of the above resolves, most likely PDB or MB
      1. Annoying toss-up

Basic power/display issue for workstations (Dev Box or systems with single-display option)

...

/wiki/spaces/WAR/pages/789384865
/wiki/spaces/WAR/pages/789417550

System hangs during POST

Will elaborate later, but this is to generally troubleshoot BMC issues with server motherboards.

Issue: System hangs at BIOS:

IF:

- PSUs are properly plugged in and known-working (correct LED activity)

- All pre-installed [from Vendor] MB components required for POST have been reseated


THEN:

- Configure the system for a minimal POST configuration [dual-CPU systems reduced to a single CPU installed in the first CPU MB socket; a single memory DIMM installed in the first DIMM slot for that first CPU MB socket]

- If dual-CPU system, swap the CPU and DIMM to confirm whether the root cause is the CPU or the DIMM [need to cycle both CPUs; it is less likely that 2x DIMMs fail simultaneously]


IF MINIMAL POST CONFIGURATION STILL DOES NOT ALLOW POST, THEN:

- Reset BMC via IPMI [if accessible]

- Reset BMC via CMOS battery [power drain system, remove the CMOS battery, press+hold power button for 30 seconds]
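
For the IPMI route, one common way to reset the BMC is the `ipmitool mc reset` command. The wrapper below is a rough sketch, not a prescribed procedure: the function names are hypothetical, the choice of a cold reset and of the `lanplus` interface for remote access are assumptions, and the host and credentials are whatever the site uses.

```python
import shutil
import subprocess
from typing import List, Optional


def bmc_reset_command(
    host: Optional[str] = None,
    user: Optional[str] = None,
    password: Optional[str] = None,
    cold: bool = True,
) -> List[str]:
    """Build the ipmitool command line for a BMC reset (local if no host is given)."""
    cmd = ["ipmitool"]
    if host is not None:
        # Remote access over the network; lanplus is an assumption.
        cmd += ["-I", "lanplus", "-H", host]
        if user is not None:
            cmd += ["-U", user]
        if password is not None:
            cmd += ["-P", password]
    cmd += ["mc", "reset", "cold" if cold else "warm"]
    return cmd


def reset_bmc(**kwargs) -> int:
    """Run the reset and return ipmitool's exit code."""
    if shutil.which("ipmitool") is None:
        raise RuntimeError("ipmitool not found on PATH")
    return subprocess.run(bmc_reset_command(**kwargs), check=False).returncode
```

If the BMC is not reachable this way, the CMOS battery procedure above remains the fallback.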