Sunday 12 October 2014

What is RAID? - Storage Basics-4

RAID stands for Redundant Array of Independent Disk. It is a method used to store the data efficiently onto the hard disk. It gives the method that not only distribute the data on various disks but also provide the mechanism which helps in providing the redundancy.

To understand the RAID methods we must understand the below terms:-

A. Stripping: - Stripping is the way to divide the incoming data by the controller and then distribute it to the various back end available hard disks. It provides load balancing of data to different hard disks.
B. Redundancy: - Redundancy enables us to restore the data in case of hard disk failure. Before writing the data to hard disk, controller creates two copies of it and then stores the redundant to different disk. It is also known as mirroring.
C. Parity:- Controller collect the data from the servers and then run an algorithm which results a value, similar to checksum ,called parity  and store it in a particular disk. In case of disk failure, data can be recovered with help of parity and remaining disk data.

  1. RAID0:- Incoming data from the server to controllers are being stripped by the controllers and then distributes the data to various hard disks. 
  
   In RAID0, there is no mechanism to provide redundancy as mirroring is not involved. It doesn't provide any fault tolerance and data will be lost in case of hard disk failure.




Advantage:-
  • Data is load balanced between the hard disk and hence both the hard disks are equally loaded.
  • High performance - Controller can write and read from the hard disk simultaneously which increases both the read and write performance.
  • Less number of hard disk required to store the data as all the data is shared between the hard disk.
Disadvantage:-
  • No fault tolerance mechanism is involved. 
  2. RAID1:-In RAID 1, data will be mirrored by the controller and then store in both the hard disk. Data can restored in case of hard disk failure.
   

   Advantage:-
  • Mirroring is involved to provide fault tolerance.
   Disadvantage:-
  • Wastage of memory - It requires double space of the actual data size as one disk is require to store the redundant data.
  • Less performance: - As the data has to be replicated and write to another disk, it is less efficient. It decreases the write speed whereas read speed is increases.
  • IO channel utilization: - Twice the original data is transferred by IO channel which will increase the IO channel utilization.
  3. RAID01:- Mirrored and then stripping.

In RAID01, controller creates two virtual disk consists of hard disks. Both the virtual disks are again virtualized as one and hence server will see only one hard disk.

Data received from server is first mirrored to the two virtual disks and then stripped between the physical hard disks.



Advantage:-
  • Read performance is high as both the virtual disk can handle the read requests.
Disadvantage:-
  • Failure of one hard disk will result in failure of one virtual disk. It is also very expensive to recreate the physical hard disk and then the virtual disk. Many of the storage doesn't even capable of recreating the lost data.
4. RAID10:- Stripped and then Mirrored.

In RAID10, data received by controller are first distributed on the virtual disk like in stripping and then mirrored to the individual hard disks.


Advantage:-
  • Fault tolerance - There is no need to recreate whole virtual disk in case of one hard disk failure. Data will be lost only if both the mirrored disk got failed.
  • It is quite inexpensive as compared to RAID 0+1 to recreate the failed hard disk.
Disadvantage:-
  • Less Read performance:- As only one virtual disk has the data, all the read request will be handle by the particular disk.
5.RAID4:- In RAID4, data is first processed by the controller to calculate the parity value and then the data is stripped between the hard disks whereas the parity disk is written on the separate disk.

In the below example, data is written on the four disk sequentially like RAID0 and the parity bit is stored on Disk5 which is dedicated for parity value only.


Advantage:-
  • Less storage disk are required as there is no mirroring of the data.
  • Fault tolerance - Data can be recovered using party stored in the separate hard disk and other present hard disk.       
Disadvantage:-
  • Since controller has to calculate the parity and then write the party to separate hard disk that requires extra time for the process and even there is only one party disk where all the parity will be stored. As write is only come to a single disk which makes it as a point of congestion.
6.RAID5:- RAID 5 is very similar to RAID 4 where the data is striped between all the hard disk and parties is generated and then write on the disk. But instead of the separate disk, parity will be written on all the disks as shown below



Advantage:-
  • It provides all the advantages of RAID4.
  • Also since parity is being written on all the disks therefore there is no bottleneck as well.
Disadvantage:-
  • Similar to RAID 4, Parity is calculated and written on the disk which consume time ( known as write cost).



No comments:

Post a Comment