RAID Best Practices

Document Change History

Version Date Comment
Current Version (v. 1) Mar 09, 2021 01:18 Joshua DeRush (Unlicensed)

Document Scope & Audience

Document Scope

This document covers best practices for RAID integration and usage: what RAID is, how to use it, and how to get the most out of it. No matter how big or small your organization is, and even for home use, protecting data is important. There are two main ways this can be accomplished:

  1. Backups
    1. Offloading data to another system.
    2. This addresses: Total System Failure, Viruses, Corruption, and more
  2. RAID 
    1. Protects data from drive failure 
    2. Can increase performance depending on configuration. 

Document Audience

All customers and employees, as a reference for better understanding.

What exactly is RAID?

Redundant Array of Inexpensive/Independent Disks (RAID) is a storage structure that unifies two or more disks so they can be used as one logical device. Spreading data across multiple disks can protect against the failure of any one disk, increase performance, or both.

The 3 Basic RAID configurations

  1. Striping (RAID 0)
    1. Data is written across multiple drives. This reduces read, write, and access times since more than one disk can service a request, which in turn increases I/O performance.
  2. Mirroring (RAID 1)
    1. Replicates the same data on two or more drives. This provides data redundancy and can help prevent data loss should any one drive fail.
  3. Parity (RAID 5 & 6)
    1. This configuration provides fault tolerance by calculating parity from the stored data. For example, with three drives, the data stored on 2 drives is examined and the result is stored on the third. Should any one disk fail, there is enough information on the remaining drives to rebuild the missing data onto the replacement disk. RAID 6 stores a second set of parity so that two disks can fail.
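
The parity idea above can be illustrated with a minimal sketch. It assumes single parity computed with XOR, which is the usual approach for RAID 5; the block contents and the helper name xor_blocks are made up for illustration, and the second parity used by RAID 6 relies on a different calculation that is not shown here.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Two data blocks, as they might sit on two data drives (illustrative values).
disk_a = b"RAID"
disk_b = b"5678"

# The parity block is the XOR of the data blocks and is stored on a third drive.
parity = xor_blocks([disk_a, disk_b])

# Simulate losing disk_b: XOR the surviving data block with the parity block.
rebuilt_b = xor_blocks([disk_a, parity])
print(rebuilt_b == disk_b)  # True
```

The same call rebuilds disk_a from disk_b and the parity block, which is why any single drive in the set can be lost.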

NOTE: It is possible to combine features of the configurations above into a single array, such as RAID 10, 50, or 60. The RAID controller handles the combining of drives into these configurations to balance performance, capacity, redundancy, and cost to suit the application at hand.

Hardware RAID vs. Software RAID

RAID can be implemented with a dedicated hardware controller or in software. Each has its pros and cons, which we will examine below.

Hardware RAID

Hardware RAID is accomplished with a dedicated RAID controller, most often a PCIe controller card or in some cases a RAID-on-Chip (ROC). The RAID controller has its own processor and memory dedicated to the task. This keeps the extra storage load off the system CPU, leaving it free for the operating system and applications.

HW Pros:

  • Better performance when compared to software RAID
  • Controller cards can be swapped out for replacements or upgrades
    • Exception: a ROC is part of the motherboard, so replacing it means swapping out the entire motherboard, which is much more invasive and time consuming

HW Cons:

  • More hardware means more cost

Software RAID

Instead of having a dedicated controller, the storage workload is added to the system CPU and is driven within the OS. 

SW Pros

  • Lower Cost since the additional hardware is not needed

SW Cons

  • Lower RAID performance, as this workload is added to the CPU, which is already processing the operating system and applications.

How Does RAID Work?

The RAID system combines the individual drives into one logical disk. The OS treats that logical disk like any other drive present on the system; it cannot tell the difference between a single disk and a RAID volume presented by the controller. There are some differences between HW and SW mechanics, highlighted below.
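
To make the idea of one logical disk more concrete, here is a minimal sketch of how a logical block number could be mapped to a physical location in a simple striped layout. The function name locate_block and the one-block stripe unit are assumptions for illustration; real controllers use larger stripe sizes and their own on-disk layouts.

```python
def locate_block(logical_block, num_disks):
    """Map a logical block number to (disk index, block offset on that disk).

    Assumes a simple striped layout with a stripe unit of one block.
    """
    disk_index = logical_block % num_disks
    offset = logical_block // num_disks
    return disk_index, offset


# The OS only sees logical blocks 0..5; the array spreads them over 3 disks.
for block in range(6):
    print(block, locate_block(block, num_disks=3))
```

Consecutive logical blocks land on different disks, which is what lets several drives work on one request at the same time.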

HW Mechanics

The RAID card is either directly attached to the disks destined for the RAID or connected to a hot-swap backplane allowing the dedicated resources to manage the disks and present them as a logical volume to the OS. 

SW Mechanics

Software RAID is an application or driver on the host that loads with the OS and presents the selected drives attached to the system as the logical volume. This happens once the system has booted far enough to load the driver software.

Given the server hardware being used today it is always best to use a dedicated hardware RAID controller to take advantage of the increase in performance and the flexibility it provides. 

Hybrid RAID

A typical RAID is built entirely from spinning disks. It could also be built entirely from SSDs for better performance, but that comes with a significant price tag. One clever solution is a hybrid RAID that uses both spinning disks and SSDs to achieve better performance while keeping the cost lower than the all-SSD alternative. Essentially, write operations are performed to both the HDDs and the SSDs, but read operations are served from the SSDs. This increases IOPS and reduces latency, allowing each server to host more users and perform more transactions, which in turn reduces the number of servers needed to support any given workload.

This can be used for simple mirrors in workstations all the way up to data center applications allowing for greater capacity in servers and faster booting of those systems. If this is of interest, I suggest researching the topic in greater detail. 
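
The hybrid read/write behavior described above can be modeled with a toy sketch. The class name HybridMirror and the dictionary-backed "devices" are purely illustrative assumptions; this is not how any particular controller implements hybrid RAID.

```python
class HybridMirror:
    """Toy model of a hybrid mirror: one HDD copy and one SSD copy of every block."""

    def __init__(self):
        self.hdd = {}
        self.ssd = {}

    def write(self, block, data):
        # Writes go to both devices so either one can serve as the surviving copy.
        self.hdd[block] = data
        self.ssd[block] = data

    def read(self, block):
        # Reads are served from the SSD copy; fall back to the HDD if the SSD copy is gone.
        if block in self.ssd:
            return self.ssd[block], "ssd"
        return self.hdd[block], "hdd"


mirror = HybridMirror()
mirror.write(0, b"boot sector")
print(mirror.read(0))
```

Because every block exists on both device types, losing the SSD costs read speed but not data.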

Who Should Use RAID?

If the system in question requires constant uptime, RAID would make sense. With the technology being so readily available nowadays, I would suggest any system contain a RAID for the crucial data mounts. At the very minimum any critical data and the OS should be on a RAID to allow the users and system admins to sleep easier at night. 

Even if the system is backed up regularly, recovering from a failed drive without RAID risks losing any data written since the last backup. More importantly, the time to fix the issue will be greater than if the failed disk had been part of a RAID.

When RAID is in place, a failed disk can most often be hot-swapped for a replacement. The controller then rebuilds the missing information onto the new disk with little to no impact on the use of the server and a much smaller investment of time than restoring from backups.

Ideally, implementing a RAID and then making regular backups of that RAID to an off-system source will cover most failure scenarios the system will eventually encounter.

Choosing the Correct RAID Level

As mentioned above, there are several different RAID configurations that can be set up on a system (RAID0, RAID1, RAID5, and RAID6), each with its own pros and cons. Combined configurations build multiple simpler RAIDs into one working volume (RAID10, RAID50, and RAID60). The levels differ significantly from one another, so it is important to consider when and where to use each.

The factors to consider when picking your RAID include:

  • Capacity
  • Performance
  • Redundancy
  • Price

Unfortunately, there is no one-size-fits-all configuration, as improving any one attribute often comes at the cost of another in the list above. For example, a RAID that focuses on performance often does so at the cost of redundancy. A large, fast, highly redundant array will be EXPENSIVE. On the other hand, a small, average-speed array will not cost much at all, especially nowadays, but it will not be as fast as the previous example.

Below are the different RAID configurations in more detail.

Each RAID level is listed with its common name, a description, and its pros and cons.

RAID0 (Striping)

RAID0 is the simplest RAID configuration to understand: all drives are combined into one large logical volume that is presented to the system. This is great for performance, as 2 or more drives share the work of the volume, allowing more heads and spindles to read or write the workload at hand. The tradeoff is that there is absolutely no redundancy. If a single disk in this configuration is lost, the entire volume is lost, including the data on the disks that did not fail. Given the great risk of total data loss, and the fact that SSDs are now quite affordable, RAID0 is NOT RECOMMENDED. The threat of losing all data on the volume outweighs any performance gains it might provide.

Pros:

  • Fast and inexpensive
  • All drive capacity is used
  • Quick to set up
  • All drives sharing the data load makes it the fastest of all arrays

Cons:

  • NO DATA PROTECTION AT ALL
  • If one drive fails all data is lost with NO POSSIBILITY of recovery

RAID1 (Mirror)

This configuration duplicates the data across multiple drives. Whether there are 2 or 100 disks, all disks contain the same data while still being presented as one. This configuration is all about protection, not performance or capacity.

Pros:

  • HIGHLY REDUNDANT - Each drive is an exact copy of the others in this RAID.
  • If a drive fails, there is no loss in system performance at all unless it is the last drive in the configuration

Cons:

  • Performance is not much better than a single drive

RAID1E (Striped Mirror)

This configuration combines striping and mirroring in one array.

Pros:

  • Redundancy with better performance. Essentially, it can be thought of as a mirror with an odd number of drives.

Cons:

  • High cost, since only 50% of the available drive space is usable.

RAID5 (Striped w/ Parity)

Typically referred to as the best "all-around" RAID configuration, RAID5 stripes data blocks and parity across all the available drives. A minimum of 3 drives is needed, and an array can go all the way up to 32. Should a drive fail, the information needed to rebuild it is present on the remaining drives, allowing the replacement disk to be filled with the data the original drive contained. Performance is rather close to RAID0, but there are more operations to complete since parity must be written along with the data.

Pros:

  • Good value and "all-around" performance.
  • Capacity is (total drive capacity - one disk's capacity)

Cons:

  • Tolerates only one failed drive at any one time.
  • Should 2 drives fail, all data is lost.

RAID6 (Striped w/ Dual Parity)

Similar to RAID5 in design and performance; however, parity is written in two places. This allows the array to survive 2 failed drives. This extra security comes at the price of another disk's worth of space being lost to parity. The minimum number of drives is 4 and the maximum is 32.

Pros:

  • Pretty good value for the required investment
  • Can withstand 2 drives failing at any one time.

Cons:

  • More expensive than RAID5 due to another drive being used for parity
  • Slightly slower than RAID5 in most applications.

RAID10 (Striping w/ Mirroring)

Also called RAID1+0, this configuration consists of multiple mirrored pairs striped together into one logical volume. It offers good performance and data protection while eliminating parity calculations. It can be set up on any even number of drives and can be expanded by adding drives in mirrored pairs, referred to as legs. For example, it can be set up on 10 drives consisting of 5 pairs, offering 5 drives' worth of storage.

Pros:

  • Fast and redundant

Cons:

  • Expensive since it requires at least 4 drives but only offers the storage capacity of 2
  • Not the best for large capacities because of the related cost
  • Not as fast as RAID5 in most streaming environments.

RAID50 (Striping w/ Parity)

Also referred to as RAID5+0, this configuration combines multiple RAID5 sets into a RAID0. This allows larger volumes to be created, and each RAID5 set (leg) can withstand one drive failure before data loss occurs. It also has faster rebuild times than a traditional RAID5 array. Although it can be created on as few as 6 drives, it really should be used with a minimum of 16 drives. Usable space depends on the number of drives and legs in the array, anywhere between 67% and 94%. For example, an array of 24 drives could be 2 legs of 12 drives each, with only 2 drives used for parity, OR 3 legs of 8 drives each, with 3 drives used for parity. The latter option offers less overall storage but would rebuild faster, as a rebuild only involves the drives in the affected leg.

Pros:

  • Reasonable value for the cost
  • Very good all-around performance, especially for streaming and large storage capacities

Cons:

  • Requires a lot of drives
  • 1 drive in each leg is lost to parity
  • More expensive than RAID5

RAID60 (Striping w/ Dual Parity)

Also referred to as RAID6+0, this configuration combines multiple RAID6 sets into a RAID0. It is very similar to RAID50 but offers 2 disks of parity per leg, allowing 2 drives per leg to fail before data loss occurs. A minimum of 8 drives is needed, but 16 is recommended. Usable space is roughly between 50% and 88%, depending on how the legs are configured. For example, 36 drives could be configured as a RAID with 2 legs of 18 drives each or as a RAID with 3 legs of 12 drives each.

Pros:

  • Can sustain 2 drive failures per leg
  • Very large and a relatively good value, since it is only used for large configurations.

Cons:

  • Needs LOTS of drives
  • More expensive than RAID50 due to the extra level of parity
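
The capacity figures for the levels above can be checked with a short sketch. It assumes all drives are the same size and counts capacity in whole drives; the function name usable_drives and the legs parameter are illustrative only, and RAID1E is left out for brevity.

```python
def usable_drives(level, drives, legs=1):
    """Return how many drives' worth of capacity is usable, assuming equal-size drives.

    `legs` only matters for RAID50 and RAID60, where it is the number of
    RAID5 or RAID6 sets striped together.
    """
    if level == "RAID0":
        return drives            # no redundancy, everything is usable
    if level == "RAID1":
        return 1                 # every drive holds the same copy
    if level == "RAID5":
        return drives - 1        # one drive's worth of parity
    if level == "RAID6":
        return drives - 2        # two drives' worth of parity
    if level == "RAID10":
        return drives // 2       # half the drives are mirror copies
    if level == "RAID50":
        return drives - legs     # one parity drive per leg
    if level == "RAID60":
        return drives - 2 * legs # two parity drives per leg
    raise ValueError(f"unknown RAID level: {level}")


examples = [
    ("RAID10", 10, 1),
    ("RAID50", 24, 2),
    ("RAID50", 24, 3),
    ("RAID60", 36, 2),
    ("RAID60", 36, 3),
]
for level, drives, legs in examples:
    print(f"{level}: {drives} drives, {legs} leg(s) -> {usable_drives(level, drives, legs)} usable")
```

The examples reuse the drive counts from the descriptions above: 10 drives in RAID10 give 5 drives of storage, and 24 drives in RAID50 give 22 or 21 depending on whether 2 or 3 legs are used.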

When to use Which RAID Level

Data can be roughly categorized into two classifications: Streaming and Random. RAID levels can likewise be placed into two groups: Non-Parity (RAID1 and RAID10) and Parity (RAID5, RAID6, RAID50, and RAID60).

Random data loads tend to be small in nature, while Streaming loads are often larger. Although all systems will experience loads of both kinds, it is best to design for the average workload a system will experience. If the system will handle a lot of Random loads, a Non-Parity configuration is the better option; for systems that will mostly be used for Streaming, a Parity configuration is the better choice.

Drive Size Performance

Even though HDDs are getting larger, their speed has not really changed for some time. For example, two disks of different capacities (1TB and 6TB) offer vastly different storage space, but they spin at the same speed. A larger disk therefore takes longer to search for the information needed. It also directly affects rebuild times, since the entire disk needs to be written even if the space is empty. The general rule of thumb is to build a RAID from more, smaller drives rather than achieving the same amount of storage with fewer, larger disks. For example, a RAID with three 6TB drives will be outperformed by a RAID with six 3TB drives, since the workload is spread out across more heads and platters.
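
The comparison above can be put into rough numbers. This is an idealized sketch: the 150 MB/s per-drive figure is an assumed placeholder, not a measurement, and real arrays do not scale perfectly with drive count. The function name array_estimate is made up for illustration.

```python
def array_estimate(drive_count, drive_tb, per_drive_mb_s=150):
    """Return (raw capacity in TB, combined sequential throughput in MB/s).

    per_drive_mb_s is an assumed figure for one HDD, used only for comparison.
    Throughput is modeled as scaling linearly with the number of drives.
    """
    return drive_count * drive_tb, drive_count * per_drive_mb_s


few_large = array_estimate(3, 6)   # three 6TB drives
many_small = array_estimate(6, 3)  # six 3TB drives
print(few_large)   # (18, 450)
print(many_small)  # (18, 900)
```

Both layouts offer the same raw capacity, but the six-drive layout has twice as many spindles sharing the work.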

This is not necessarily true for SSDs, as larger devices are typically faster than smaller ones. This means that SSDs need to be reviewed with extra scrutiny to ensure the specs of the purchased equipment meet your needs. Additionally, since SSDs are completely different from their spinning counterparts, the general rule of thumb for SSD RAID is to achieve the storage level you need with as few devices as possible. The larger devices have greater throughput than the smaller options, and the difference will be seen in system performance.

Size of Array vs. Size of Drives

Something that is often overlooked when creating RAIDs is that an array does not have to use the entire disk. Often, when configuring the RAID, a portion of each drive can be used, leaving the remaining space available for another array if needed. For example, a RAID10 can be spread across all the drives using a small portion of each to host the OS. This leaves the majority of the disk space untouched, which can then be used in a RAID5 for data or other uses.
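
As a worked example of splitting drives between two arrays, the sketch below computes the usable space of a small RAID10 slice for the OS and a RAID5 built from the remainder of the same drives. The drive sizes and the function name sliced_layout are assumptions chosen for illustration.

```python
def sliced_layout(drive_count, drive_gb, os_slice_gb):
    """Usable GB for a RAID10 OS volume and a RAID5 data volume sharing the same drives."""
    if drive_count < 4 or drive_count % 2:
        raise ValueError("RAID10 needs an even number of drives, at least 4")
    data_slice_gb = drive_gb - os_slice_gb
    # RAID10: half of the slices hold mirror copies.
    os_usable = (drive_count // 2) * os_slice_gb
    # RAID5: one slice's worth of space goes to parity.
    data_usable = (drive_count - 1) * data_slice_gb
    return os_usable, data_usable


# Four 4000 GB drives with a 100 GB slice of each reserved for the OS array.
print(sliced_layout(drive_count=4, drive_gb=4000, os_slice_gb=100))  # (200, 11700)
```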

Rebuild Times and Large Arrays

As mentioned above, rebuild times can be affected by multiple factors. Unfortunately, the more drives in an array and the bigger the disks, the longer the rebuild will take whenever a disk needs to be replaced. Even though it was mentioned that a RAID5 can have up to 32 disks in the array, this becomes rather impractical with spinning media due to the increase in rebuild times. In contrast, a RAID50 would be a better fit, since it could be composed of 2 legs of 16 drives each. When a disk needs to be rebuilt, only the 15 good disks and the 1 replacement disk in that leg are involved. The other 16 drives are untouched and perform as if nothing was going on with the server, and rebuild times are better as well.
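
The arithmetic behind the 32-drive example can be sketched as follows. The function name rebuild_scope is illustrative, and the model simply assumes that a rebuild involves every drive in the affected leg and no others.

```python
def rebuild_scope(total_drives, legs):
    """Return (drives involved in a rebuild, drives left untouched) when one drive fails."""
    if total_drives % legs:
        raise ValueError("drives must divide evenly into legs")
    drives_per_leg = total_drives // legs
    return drives_per_leg, total_drives - drives_per_leg


print(rebuild_scope(32, legs=1))  # single RAID5: (32, 0)
print(rebuild_scope(32, legs=2))  # RAID50 with 2 legs: (16, 16)
```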

NOTE: If drives are 6TB or larger, rebuild times can exceed 24 hours, and that is assuming there is no additional load on the server. If the server is still in use during the rebuild, the time will increase even further.
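
A rough feel for these numbers comes from dividing drive capacity by the sustained rebuild rate. The rates below are assumed values for illustration only; actual rebuild speed depends on the controller, the RAID level, and how busy the server is. The function name rebuild_hours is hypothetical.

```python
def rebuild_hours(drive_tb, rebuild_mb_s):
    """Estimate hours to rewrite one whole drive at a sustained rebuild rate."""
    drive_mb = drive_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return drive_mb / rebuild_mb_s / 3600


# Assumed sustained rebuild rates in MB/s, not measured values.
for rate in (50, 100, 200):
    print(f"6 TB at {rate} MB/s: {rebuild_hours(6, rate):.1f} hours")
```

At the slower assumed rate, a 6TB drive takes well over a day to rebuild, and any production load that slows the rebuild pushes the time up further.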

With regard to SSDs, rebuild times will be much faster, since SSDs are often much smaller than their spinning counterparts and their throughput is much greater.

SUMMARY

In the current landscape of computers and servers, RAID is invaluable and should be implemented anyplace that DATA is important. It will save you at some point, no matter how good your hardware is or how careful you are. However, it should be implemented correctly for your needs and used together with data backups to minimize any potential issues that may surface. Depending on your needs, system use, and budget, several decisions need to be made. If your budget allows for a hardware controller, it is always the better choice. The next decision is based on use: how much space do you need, and what kind of data is going to be hosted? Again, if the budget allows, SSDs are superior. If spinning disks are to be used, it is always a better idea to use more, smaller-capacity HDDs to reach the determined storage goal. This provides more heads to distribute the workload and less space to search on any given drive, resulting in faster search times and rebuild times.

The key takeaways are that you should be using both RAID and BACKUPS to avoid losing data and to make your life easier, as a failure WILL happen. By using both, the chance of data loss and downtime is dramatically reduced, if not eliminated. Furthermore, the CORRECT RAID level for your use case needs to be configured and actively managed. Unfortunately, RAID and BACKUPS will not save you from user error, so it is always better to measure twice and cut once.