3.9 The next generation RAID techniques

RAID technology came into use roughly 20 years ago, and it saw no significant advances until the last five or six years. Ever-larger disk drives and slow disk performance lead to longer rebuild times when a disk fails. As a result, the older RAID technologies are poorly suited to businesses and IT services that need uninterrupted operation. With the arrival of cloud computing and big data, new RAID techniques have become a necessity.

To address these next-generation IT demands, several interesting and important new RAID techniques have evolved, including popular implementations of parallel and distributed RAID. Some of the more widely used ones in the current IT industry are described below.

1) IBM XIV and RAID-X

IBM XIV is a storage array that implements a form of RAID 1 known as RAID-X. RAID-X is a turbo-charged, object-based, distributed RAID. Like other mirroring schemes, it is susceptible to double drive failures and allocates 50% of its capacity to protection.

Object-based means that RAID protection is applied to objects rather than to entire drives. These objects are 1 MB extents known as partitions. Each volume on an XIV is made up of multiple 1 MB extents, and it is these extents that RAID-X protects. Because RAID-X is object based, the 1 MB extents that make up a volume can be spread across all the drives of the XIV array. This wide spreading of volumes across the entire backend leads to massively parallel reads, writes and reprotect operations.


RAID-X offers a parallel reprotect operation that restores redundancy for the extents affected by a failure. If a drive in an XIV fails, RAID-X does not immediately attempt to rebuild the failed drive; instead, it starts reprotecting the affected data. To do this, XIV reads the surviving copy of each affected extent from drives across the backend and writes new mirror copies to other drives, again spread across the entire backend and in parallel. RAID-X also ensures that the primary and mirror copies of any 1 MB extent are never stored on the same drive, and it goes further by keeping them on separate nodes or shelves.

This reprotect operation runs in parallel and completes in just a few minutes. Its speed is vital to the viability of RAID-X at such large scale, because it greatly reduces the window of vulnerability to a second simultaneous drive failure. The parallel nature of the operation also avoids the heavy load that a normal RAID rebuild places on every disk in the RAID set.

Because RAID-X is object based, it reprotects only actual data. Non-object-based RAID techniques often spend time rebuilding tracks that hold no data, so reprotecting only data speeds up the operation. Once the data is reprotected, RAID-X starts rebuilding the failed drive, which is a many-to-one operation (data comes from many drives to a single drive).
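
The placement and reprotect behaviour described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not IBM's implementation: the function names `place_extents` and `reprotect`, the random placement policy, and the drive-to-node map are all hypothetical. It shows two properties taken from the text: the two copies of an extent never share a node, and after a failure the copy work is spread over many source and target drives.

```python
import random
from collections import defaultdict

def place_extents(num_extents, drives, seed=0):
    """Assign each extent a (primary, mirror) drive pair on different nodes.

    drives maps a drive id to the id of the node that holds it
    (a hypothetical layout, not the real XIV distribution algorithm).
    """
    rng = random.Random(seed)
    drive_ids = sorted(drives)
    layout = {}
    for extent in range(num_extents):
        primary = rng.choice(drive_ids)
        # The mirror must live on another node, which also rules out
        # the primary drive itself.
        others = [d for d in drive_ids if drives[d] != drives[primary]]
        layout[extent] = (primary, rng.choice(others))
    return layout

def reprotect(layout, drives, failed, seed=1):
    """Re-mirror every extent that lost a copy on the failed drive.

    Returns two counters: how many extents each surviving drive had to
    read (sources) and how many new copies each drive received (targets).
    """
    rng = random.Random(seed)
    sources, targets = defaultdict(int), defaultdict(int)
    for extent, (primary, mirror) in layout.items():
        if failed not in (primary, mirror):
            continue  # unaffected extents are not touched at all
        survivor = mirror if primary == failed else primary
        candidates = [d for d in sorted(drives)
                      if d != failed and drives[d] != drives[survivor]]
        new_copy = rng.choice(candidates)
        layout[extent] = (survivor, new_copy)
        sources[survivor] += 1
        targets[new_copy] += 1
    return sources, targets
```

For example, with `drives = {d: d // 4 for d in range(12)}` (three nodes of four drives each), calling `place_extents(1000, drives)` and then `reprotect(layout, drives, failed=0)` touches only the extents that had a copy on drive 0, and both returned counters contain several drives rather than one.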

Other storage arrays, such as HP 3PAR, Dell Compellent and XIO, also take an object-based approach to RAID, but they use object-based RAID with parity and double parity rather than object-based mirroring.

2) ZFS and RAID-Z

RAID-Z is a parity-based RAID technique that is tightly integrated with the ZFS filesystem. It offers single, double and triple parity options and uses a dynamic stripe width. This variable stripe width is powerful: it effectively makes every RAID-Z write a full-stripe write, with the exception of small writes, which are usually mirrored. RAID-Z performs very well and rarely suffers from the read-modify-write penalty that traditional parity-based RAID techniques suffer when performing small-block writes.

RAID-Z rebuilds are significantly more complex than typical parity-based rebuilds, where simple XOR calculations are performed against each RAID stripe. Because RAID-Z stripes vary in size, RAID-Z needs to query the ZFS filesystem to obtain information about the RAID-Z layout. This can cause longer rebuild times if the pool is near capacity or busy. On the other hand, because RAID-Z and the ZFS filesystem talk to each other, rebuild operations rebuild only actual data and do not waste time rebuilding empty blocks.
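
To make the idea of a variable-width, full-stripe write more concrete, here is a minimal single-parity sketch. It is not the ZFS on-disk format: `make_stripe`, `rebuild_block` and the 4-byte block size are illustrative assumptions, and double or triple parity and the mirroring of small writes are left out. The point is that the stripe is as wide as the write needs, parity is computed once over the whole stripe, and a rebuild needs to know each stripe's width, which real RAID-Z learns from filesystem metadata.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def make_stripe(payload, block_size=4):
    """Split one write into data blocks and append a single parity block.

    The stripe width follows the size of the write, so every write is a
    full-stripe write and no existing parity has to be read and modified.
    """
    data = [payload[i:i + block_size].ljust(block_size, b"\0")
            for i in range(0, len(payload), block_size)]
    return data + [xor_blocks(data)]

def rebuild_block(stripe, lost):
    """Recover the block at index `lost` by XOR-ing all surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost]
    return xor_blocks(survivors)
```

An 8-byte write produces a stripe of two data blocks plus parity, while a 16-byte write produces four data blocks plus parity. In both cases any single missing block can be recomputed from the others, but only if the rebuild knows where each stripe begins and ends.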

3) RAID-TM

RAID-TM is a triple-mirror-based RAID. Instead of keeping two copies of data as in RAID 1 mirror sets, RAID-TM keeps three copies. As such, it gives up two-thirds of its raw capacity to protection but provides good performance and excellent protection.

4) RAID 7

The idea behind RAID 7 is to take RAID 6 one step further by adding a third set of parity to protect data on ever-larger drives.
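
The capacity figures quoted in this section follow from simple arithmetic, sketched below. The helper names are hypothetical, and the eight-drive group in the usage note is an assumed example, not a figure from any product.

```python
from fractions import Fraction

def mirror_overhead(copies):
    """Fraction of raw capacity spent on protection when keeping `copies` copies."""
    return 1 - Fraction(1, copies)

def parity_overhead(total_drives, parity_drives):
    """Fraction of raw capacity spent on parity in a group of `total_drives`."""
    return Fraction(parity_drives, total_drives)
```

`mirror_overhead(2)` gives 1/2, matching the 50% quoted for RAID-X, and `mirror_overhead(3)` gives 2/3 for RAID-TM. For a hypothetical group of eight drives, `parity_overhead(8, 2)` is 1/4 for double parity and `parity_overhead(8, 3)` is 3/8 once a third set of parity is added.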


5) RAIN

RAIN is an acronym for redundant/reliable array of inexpensive nodes. It is a form of network-based fault tolerance, where nodes are the basic unit of protection rather than drives or extents. RAIN-based approaches to fault tolerance are increasingly popular in scale-out filesystems, object-based storage, and cloud-based technologies.

6) Erasure Codes Technique

Erasure codes work slightly differently from parity. While parity keeps the correction codes separate from the data, erasure codes expand the data blocks so that they include both real data and correction codes. Similar to RAID, erasure codes offer varying levels of protection, each of which has similar trade-offs between protection, performance, and usable capacity.
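
As a rough illustration of how an erasure code lets any sufficiently large subset of fragments recover the data, here is a toy Reed-Solomon-style sketch based on polynomial interpolation over the integers modulo 257. It is an assumption-laden teaching example, not the scheme used by any particular storage product: real systems typically work in GF(2^8) on whole blocks, and the names `encode` and `decode` are hypothetical. In this variant the first k fragments happen to equal the original bytes, and the remaining fragments are extra evaluations of the same polynomial.

```python
P = 257  # small prime modulus; data symbols are byte values 0..255

def _interpolate(points, x):
    """Evaluate at x the unique polynomial through `points`, modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data, n):
    """Expand k data bytes into n fragments, each a pair (index, value)."""
    points = list(enumerate(data))
    return [(x, _interpolate(points, x)) for x in range(n)]

def decode(fragments, k):
    """Recover the k original bytes from any k surviving fragments."""
    points = fragments[:k]
    return bytes(_interpolate(points, x) for x in range(k))
```

For instance, `encode(b"RAID", 6)` produces six fragments from four data bytes, and `decode` returns the original four bytes from any four of them, so this layout tolerates the loss of any two fragments at a capacity cost of one third.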
