Physically inspecting damaged systems and parts

Document Scope

This document lists server/HPC system types and hardware component types, and the physical damage to look out for on each.

System Types

Servers

Server exterior (barebone/chassis)

Server chassis are pretty durable. However, if one takes a heavy shock, or enough weight to push the box out of square, the components that are usually damaged are as follows:

  • Chassis, obviously
  • FOR SUPERMICRO 4028/4029 ONLY, look out for their power button and front information LED assembly
    • If this is damaged, the system pretty much needs a whole barebone replacement. We cannot fish the ribbon cable through the chassis; that is done by Supermicro. The only other option for powering on the system is to jump the JB1 pins directly on the board
    •  Supermicro side power button and LED information assembly

      Tyan builds the power button into the front of the chassis rather than the rack ears. Good on them; the same goes for ASUS barebones.

  • Motherboard
    • Mainly the PCI-e slots, in case they have shifted or are no longer flush with the board because the plastic guiding sleeve for those slots has started coming loose
  • Edge-most redundant power supplies
    • Tyan B7105/B7109 and Supermicro 7048/7049 series have their redundant power supplies on ONE side. If the physical damage is on THAT side, assume the capacitors' solder points took a hit and the power supplies SHOULD NOT be powered on.
      •  Example of PSUs on one side of the barebone

    • Tyan B7119 and Supermicro 4028/4029 series have 4 PSUs across the bottom of the back; the outer ones are the most likely to be damaged
      •  Example of PSUs spread evenly across the back

  • PCI-e devices
    • The Tyan PCI-e mount frame is much stronger than Supermicro's, but severe concave/convex bending of the port dividers could indicate that heavy weight was stacked onto the server
      •  Example of what I'm trying to point out here

        The difference is definitely not because of the image sizes. Supermicro uses larger perforated holes, which weaken the overall integrity of the PCI-e frame (marked with red lines). If the frame is slightly bent and everything works, it is cosmetic damage. If it is bent, ESPECIALLY without a 0.5 lid, AND the PCI-e devices are not operating properly, something heavy was stacked on or struck the top of the chassis.

        For Tyan (blue lines), there is a built-in buffer before stacking damage reaches the GPUs.
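
The bent-frame rule of thumb above can be written down as a small lookup, which may help when logging inspection results consistently. This is only a sketch: the names `FrameFinding` and `triage_pcie_frame` are made up for illustration, and the two cases where the frame is not bent are assumptions, since the note above only covers bent frames.

```python
from enum import Enum


class FrameFinding(Enum):
    """Possible outcomes of the PCI-e frame check (labels are illustrative)."""
    NO_FRAME_DAMAGE = "frame straight, devices working"
    COSMETIC = "cosmetic damage only"
    TOP_IMPACT = "something heavy was likely stacked on or struck the top"
    NOT_COVERED = "devices faulty but frame straight; troubleshoot separately"


def triage_pcie_frame(frame_bent: bool, devices_working: bool) -> FrameFinding:
    """Apply the bent-frame rule of thumb to one chassis.

    frame_bent:      port dividers show concave/convex bending
    devices_working: all PCI-e devices operate properly after testing
    """
    if frame_bent and not devices_working:
        # Bent frame plus misbehaving devices points to weight or a strike from above.
        return FrameFinding.TOP_IMPACT
    if frame_bent:
        # Bent but everything works: treated as cosmetic.
        return FrameFinding.COSMETIC
    if not devices_working:
        # Assumption: the rule of thumb does not cover this case.
        return FrameFinding.NOT_COVERED
    # Assumption: nothing to report when the frame is straight and devices work.
    return FrameFinding.NO_FRAME_DAMAGE


if __name__ == "__main__":
    print(triage_pcie_frame(frame_bent=True, devices_working=False).value)
```

A missing lid makes the top-impact case more likely but does not change the outcome, so it is left out of the function's inputs.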

Server interior parts

Inspection points are color-coded by whether the Vendor is likely to approve or reject a Repair OR Replace RMA for the corresponding part.

Blue = Cosmetic damage; the Vendor can use it as an excuse to reject for physical damage, but this depends on the Vendor

Red = Internal/Severe damage, Vendor will most likely reject

Grey = Questionable whether Vendor will approve or reject

Type | Main inspection points
CPU/Heatsink
  • Dents on or in-between heat sink blades
  • CPU corners for bending/tear (in case something dislodged inside of the barebone and struck the heatsink from the side)
  • Gold contacts/connection that goes into the slot; check whether the center capacitors are missing
Motherboard (mainboard or riser)
  • Bent/slanted slots where the components plug into
  • CPU pins
  • Capacitors/chips missing near edge of motherboard
Memory DIMM's
  • Memory DIMM clips on the motherboard
  • Scratches on the memory DIMM chips on either side, and dents on the corners of the board/stick
  • Missing memory chips on the board/stick
  • Gold contacts/connection that goes into the slot
Hard Drive / Solid State Drive
  • Wear/scratches on exterior
  • Punctures or scratches on the sticker
  • Plastic guide for the SATA interface on the drive
  • Not really the drive, but check the SATA cable connections to the backplane AND motherboard slot while you're here
  • Gold contacts/connection that goes into the slot
Graphics Card
  • Plastic shell scratches/dents
  • Guiding frame/bezel (where the ports are located)
  • Fan blades, whether they are bent or missing (cards with more than one blower fan are very prone to this damage)
  • Missing capacitors/chips, scratched board 
  • Gold contacts/connection that goes into the slot (the GPU could STILL work in either case below, but of course requires testing them individually)
    • Dark/black stains, a sign of plastic rubbing off from the slot
    • Powering the system on with the card improperly seated can short out the card, or worse, and is one cause of those black plastic or soot marks
    • Scratches on gold contacts/connection
Network Card
  • Damage to the internal contacts of any port, any loose or irregular pins
  • Guiding frame/bezel (where the ports are located)
  • Dents/slants to heatsink, if it has one
  • Missing capacitors/chips, scratched board
  • Gold contacts/connection that goes into the slot
RAID Controller
  • Damage to the internal mini-SAS ports facing inwards towards the system
  • Guiding frame/bezel (where the ports are located)
  • Dents/slants to heatsink or BBU, if it has one
  • Missing capacitors/chips, scratched board
  • Gold contacts/connection that goes into the slot
Redundant Power Supply
  • "The metal 'wings' that guide the power supply contacts to the PDB"...
    • Cosmetic to us, but Vendors will use this as an excuse to reject, so best to bend those straight since you can do it without leaving a trace
  • Gold contacts/connection that goes into the slot

DevBox/Cube

Standard Corsair 540 chassis packaging is not optimal for shipping, as I am learning the hard way. What follows is just my guess. The hard styrofoam that comes by default with the Corsair 540 packaging transfers shock instead of absorbing it. The 19x19x16" box only gives maybe ~0.5-1.0" of foam/clearance on all sides, so even with better foam the cushion gap would still be really small. In addition to the default foam packaging, I think the Exxact plain boxes MIGHT be smaller, and the black sleeve combined with that dense styrofoam generates a noticeable amount of static, which is never great for any computer hardware, especially memory.

Exxact recommends shipping DevBoxes, and ships them, with the window panel facing up, which lets the CPU, memory, and GPUs rest ON TOP of the board where they slot in most naturally. Standing the Corsair 540 chassis upright already puts stress on the PCI-e slot, more so on the clips that loosely anchor the GPU at the center of the board; the GPUs are much more secure at the end where they are held down by screws fixed to the chassis.

Screws > a plastic clip holding the back-end corner of a ~2mm board's connection that protrudes past the rest of the assembly. My GUESS, again, is that wobbling in transit, or even just a 2" drop of the box, will easily jostle the cheap plastic-clip side of the PCI-e connector. That results in customers claiming they received these DevBoxes with no graphics or with the graphics card not fully seated, especially since the motherboard's plastic clips are the first to give when the chassis is stood upright. It could also be that the chassis outer material is really thin and compounds the shock damage to anything anchored to it.

 Example of packaging

Sorry in advance for the watermark, I am not in the office to take my own picture for this...


DevBox/Cube exterior tangent parts

If there are reports of exterior damage to any DevBox/Cube HPC, these parts should automatically be replaced:

  • Chassis, especially for the windowed panel
  • Power Supply
    • If the system box took ANY shock damage (let's just say an 8" drop to the floor because the box somehow slipped out of someone's hands), the power supply is going to backfire due to some internal damage; this hasn't been proven wrong yet
  • Liquid CPU cooler (water pump)
    • At least in my experience, the Enermax CPU liquid cooler's water pump fails first; when it does, CPU overheating issues down the road are guaranteed

Second down the list, these are the internal parts that are commonly damaged:

  • Motherboard - PCI-e clips
    • For ASUS, for example, the repair charge for broken PCI-e clips is more than the cost of the board itself
  • Any drive(s) resting at the bottom 2 SATA ports of the chassis
    • At best, if these drives are not recognized out of the box, they just need to be re-seated or pushed in for a solid connection

DevBox/Cube interior parts

Part | Main inspection points
CPU
  • CPU corners for bending/tear
  • Gold contacts/connection that goes into the slot; check whether the center capacitors are missing
Motherboard

I've seen motherboards get denied RMA for less. Customers should be told this in some form. If unsure whether we can get a Repair/Replacement, submit to the Manufacturer/Vendor anyway.

  • All interfaces/connections for dents/damages
  • Hold-down/lock clips (PCI-e/Memory)
  • Capacitors
Memory
  • Memory DIMM clips on the motherboard
  • Scratches on the memory DIMM chips on either side, and dents on the corners of the board/stick
  • Missing memory chips on the board/stick
  • Gold contacts/connection that goes into the slot
Hard Drive / Solid State Drive
  • Wear/scratches on exterior
  • Punctures or scratches on the sticker
  • Plastic guide for the SATA interface on the drive
  • Not really the drive, but check the SATA cable connections to the backplane AND motherboard slot while you're here
  • Gold contacts/connection that goes into the slot
Graphics Card
  • Plastic shell scratches/dents
  • Guiding frame/bezel (where the ports are located)
  • Fan blades, whether they are bent or missing (cards with more than one blower fan are very prone to this damage)
  • Missing capacitors/chips, scratched board 
  • Gold contacts/connection that goes into the slot (the GPU could STILL work in either case below, but of course requires testing them individually)
    • Dark/black stains, a sign of plastic rubbing off from the slot
    • Powering the system on with the card improperly seated can short out the card, or worse, and is one cause of those black plastic or soot marks
    • Scratches on gold contacts/connection
Network Card
  • Damage to the internal contacts of any port, any loose or irregular pins
  • Guiding frame/bezel (where the ports are located)
  • Dents/slants to heatsink, if it has one
  • Missing capacitors/chips, scratched board
  • Gold contacts/connection that goes into the slot
Power Supply

It is hit or miss with PSU vendors; Enermax will accept nearly everything, while Sparkle will deny for the smallest of things. Since we cannot fully inspect a power supply, nor should we, just submit it to the Vendor for further review.

  • Usually internal damage that cannot be seen without dismantling the power supply; we don't dismantle it, because that breaks the Manufacturer's warranty seal/sticker
  • Missing fan blades
  • Rails for burn damage (in case someone tried powering this on with improper seating)
  • Both ends of cables for burns, especially the 24-pin that plugs into the motherboard