Sunday, 12 October 2014

Fibre channel Protocol stack - Storage Basics-8

 There are five layers in the FC protocol stack, which are loosely similar to the layers of the OSI model.



1. FC0:- It is equivalent to the physical layer in the OSI model. It defines the cables and connectors used for FC traffic and sends the data sequentially as bits, “0” and “1”.

FC hubs work at the FC0 layer.

2. FC1:- It is responsible for encoding and decoding the data. The encoding helps the receiver detect transmission errors so that they can be handled.
It also performs link creation and maintenance. It is roughly equivalent to the data link layer of the OSI model.

8b/10b encoding is used in the 1Gig, 2Gig, 4Gig and 8Gig standards, whereas the 10Gig and 16Gig standards use 64b/66b encoding.
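
To see why the newer standards moved to 64b/66b, it helps to compare how much of the line rate carries real data under each encoding. The following is a minimal sketch of that arithmetic; the function name `encoding_efficiency` is only an illustration and is not part of any FC library.

```python
from fractions import Fraction

def encoding_efficiency(data_bits: int, line_bits: int) -> Fraction:
    """Fraction of transmitted bits that carry payload data."""
    return Fraction(data_bits, line_bits)

# 8b/10b sends 10 line bits for every 8 data bits.
eff_8b10b = encoding_efficiency(8, 10)
# 64b/66b sends 66 line bits for every 64 data bits.
eff_64b66b = encoding_efficiency(64, 66)

print(f"8b/10b efficiency:  {float(eff_8b10b):.1%}")
print(f"64b/66b efficiency: {float(eff_64b66b):.1%}")
```

With 8b/10b a fifth of the line rate is encoding overhead, while with 64b/66b the overhead is about 3%.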

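3. FC2:- It is the most important layer in the FC protocol stack and performs several critical functions. It is roughly equivalent to the network layer of the OSI model and is defined in the FC-FS (Framing and Signaling) standards, whereas FC-PI-2 covers the physical interface.
FC switches work at the FC2 layer.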

The FC2 layer performs the following functions:-

  A. Data block size handling: - It defines how large a data payload can be sent over the network. Below are a few key terms:-
  • Exchange:- An Exchange is the session built between end machines to transmit and receive data. There can be multiple Exchanges between the hosts.
  • Sequence: - Each Exchange is made up of larger data units called Sequences. A Sequence is a series of related frames sent in one direction, and it ensures that the data is delivered in the correct order.
  • Frames: - Each link can carry only a limited amount of data at a time, so the large data blocks are broken down into smaller chunks known as Frames. A Frame can carry up to 2112 bytes of payload. If one frame is lost, the entire sequence has to be retransmitted.


FC2 not only ensures that each frame has been received successfully at the receiver end, it also makes sure that the frames are sent and received in sequence.
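
The relationship between a Sequence and its Frames can be sketched as follows. This is a simplified illustration, not the real FC frame format: the helper names `split_into_frames` and `reassemble` and the use of a plain counter per frame are assumptions made for the example. Only the 2112-byte payload limit comes from the text above.

```python
MAX_PAYLOAD = 2112  # maximum payload bytes per frame, as stated above

def split_into_frames(sequence_data: bytes, max_payload: int = MAX_PAYLOAD):
    """Break one sequence into (frame_counter, payload) tuples."""
    offsets = range(0, len(sequence_data), max_payload)
    return [(count, sequence_data[off:off + max_payload])
            for count, off in enumerate(offsets)]

def reassemble(frames):
    """Rebuild the sequence; fail if any frame is missing."""
    ordered = sorted(frames, key=lambda frame: frame[0])
    counters = [count for count, _ in ordered]
    if counters != list(range(len(ordered))):
        # A gap means a frame was lost, so the whole sequence is unusable.
        raise ValueError("frame missing: sequence must be retransmitted")
    return b"".join(payload for _, payload in ordered)

data = bytes(5000)
frames = split_into_frames(data)
print([len(payload) for _, payload in frames])
```

Frames that arrive out of order are put back in place by their counter, and a missing counter causes the whole sequence to be rejected, which mirrors the retransmission rule described above.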

B. Flow control: - It provides flow control to avoid congestion at the receiver's end, using a mechanism known as credits. It makes sure that the transmitter and receiver are in sync and that the transmitter will not overload the receiver. It performs two types of flow control:-
  • Buffer-to-buffer credit: - It is also known as link flow control. The two ports at either end of a link agree on how many frames can be sent before the receiver has to signal that a buffer is free again.
  • End-to-End flow control:- In End-to-End flow control, the credit is negotiated between the original transmitter and the final receiver rather than between neighbouring ports.
C. Addressing:-  Each device in the fabric has its own unique WWNN (World Wide Node Name) and each port gets a 64-bit address called a WWPN (World Wide Port Name). As soon as a port comes up, the neighbouring switch assigns it a 24-bit value called the FCID.
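
The buffer-to-buffer credit idea from point B can be sketched with a small model. This is a toy illustration under simplifying assumptions: the class name `CreditedLink` and its methods are made up for the example, and a real port exchanges credit information at login and returns credits with the R_RDY primitive.

```python
class CreditedLink:
    """Toy model of one direction of a link with buffer-to-buffer credits."""

    def __init__(self, bb_credit: int):
        self.credits = bb_credit      # frames the sender may still transmit
        self.receiver_buffers = []    # frames waiting at the receiver

    def send(self, frame) -> bool:
        """Transmit a frame if a credit is available; otherwise hold it."""
        if self.credits == 0:
            return False              # sender must wait, receiver is full
        self.credits -= 1
        self.receiver_buffers.append(frame)
        return True

    def receiver_frees_buffer(self):
        """Receiver processes one frame and returns a credit (like R_RDY)."""
        frame = self.receiver_buffers.pop(0)
        self.credits += 1
        return frame

link = CreditedLink(bb_credit=2)
print(link.send("frame-a"), link.send("frame-b"), link.send("frame-c"))
```

The third frame is refused until the receiver frees a buffer, which is how the transmitter is prevented from overloading the receiver.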

4. FC3:- It is intended to provide the common services listed below. It is not used by FC protocols today; the services could be provided by additional software, but this has not been implemented yet.
  • Encryption
  • Mirroring or RAID
  • Compression
5. FC4:- It maps upper-layer protocol data onto the layers below. It encapsulates the data units and sends them to FC2, which performs the lower-layer functions.

What is Mirroring? - Storage Basics-7

Mirroring is a process that provides redundancy. Data is mirrored from one disk to another to provide fault tolerance in case of a disk failure.

There are two types of mirroring, as described below:-

1. Instant Copying: - It mirrors the data locally to a different hard disk in the same enclosure. It provides redundancy only to a limited extent, as it cannot prevent data loss when the complete enclosure fails.
2. Remote Mirror: - In remote mirroring, the data is copied to a hard disk in a different enclosure, which may be located in a different data center.

·       Synchronous Remote Mirroring: - In synchronous mirroring, the server sends the data to Disk-1, which then mirrors the data to Disk-2. Disk-1 sends the acknowledgement to the server only after the data has been completely mirrored.


·       Asynchronous Remote Mirroring:- In asynchronous mirroring, Disk-1 sends the acknowledgement to the server as soon as all the data is written to it, and mirrors the data to Disk-2 in the background. Disk-1 does not wait for the acknowledgement from Disk-2 before acknowledging the server.
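
The difference between the two modes is only the point at which the server receives its acknowledgement. The sketch below records the order of events for each mode; the function names and the `pending` queue are hypothetical and only stand in for whatever the storage system does internally.

```python
def write_synchronous(data, disk1, disk2, events):
    """Acknowledge the server only after Disk-2 holds the mirrored data."""
    disk1.append(data)
    events.append("written to Disk-1")
    disk2.append(data)
    events.append("mirrored to Disk-2")
    events.append("ack to server")

def write_asynchronous(data, disk1, pending, events):
    """Acknowledge the server right after Disk-1; mirror later."""
    disk1.append(data)
    events.append("written to Disk-1")
    events.append("ack to server")
    pending.append(data)  # still has to be copied to Disk-2

def drain_pending(pending, disk2, events):
    """Background step that copies queued writes to Disk-2."""
    while pending:
        disk2.append(pending.pop(0))
        events.append("mirrored to Disk-2")

sync_events = []
write_synchronous(b"block", [], [], sync_events)
print(sync_events)
```

In the asynchronous case there is a window in which the server has its acknowledgement but Disk-2 does not yet hold the data, which is the trade-off for lower write latency.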

What is LUN masking? - Storage Basics-6

The controller abstracts all the physical disks into a single virtual disk, and hence, without any restriction, every server would be able to see every hard disk. An error generated by one server could then impact the complete disk subsystem. LUN masking is the method by which each server gets access only to the space it is allowed to use. A server cannot read or write a LUN allocated to another server.

LUN masking is a kind of filter which restricts access between servers and storage. There are two types of LUN masking:-

1.Port based LUN masking:- In port-based LUN masking, a server sees all the disks attached to the storage port. It is not recommended, as there is no restriction between the server and the hard disks.

2.Server based LUN Masking:- The server sees only its own disks. It is not allowed to see or access the other disks.
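
Server-based LUN masking can be pictured as a lookup table that the storage consults for every initiator. The sketch below is only an illustration: the WWPN values, the table layout and the function name `visible_luns` are made up and do not reflect any particular vendor's configuration.

```python
# Hypothetical masking table: initiator WWPN -> set of LUN numbers it may use.
MASKING_TABLE = {
    "10:00:00:00:c9:aa:aa:01": {0, 1},
    "10:00:00:00:c9:bb:bb:02": {2},
}

def visible_luns(initiator_wwpn, all_luns, table=MASKING_TABLE):
    """Return only the LUNs this server is allowed to see."""
    allowed = table.get(initiator_wwpn, set())  # unknown servers see nothing
    return sorted(lun for lun in all_luns if lun in allowed)

all_luns = [0, 1, 2, 3]
print(visible_luns("10:00:00:00:c9:aa:aa:01", all_luns))
print(visible_luns("10:00:00:00:c9:bb:bb:02", all_luns))
```

Port-based masking corresponds to skipping the filter entirely, so that every server connected to the storage port would receive the full `all_luns` list.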

What is JBOD? - Storage Basics-5

JBOD stands for Just a Bunch of Disks: all the disks are installed in one enclosure with a common power supply. There is no controller present in a JBOD and hence it cannot provide any RAID capability. Generally it has either 8 or 16 disks. The server sees each disk as an independent disk, and each disk requires its own address.


Standard I/O techniques like SCSI and FC arbitrated loop can be used to connect to a JBOD. As shown below, all the disks are connected in a ring to provide resiliency.





It is similar to a hub topology, where all the devices are connected in half duplex. Only one device can transmit at a time and hence the total bandwidth is shared between all devices.

What is RAID? - Storage Basics-4

RAID stands for Redundant Array of Independent Disks. It is a method used to store data efficiently across hard disks. It not only distributes the data over several disks but also provides mechanisms for redundancy.

To understand the RAID methods we must understand the terms below:-

A. Striping: - Striping is the way the controller divides the incoming data and distributes it across the available back-end hard disks. It load-balances the data over the different hard disks.
B. Redundancy: - Redundancy enables us to restore the data in case of a hard disk failure. Before writing the data to a hard disk, the controller creates two copies of it and stores the redundant copy on a different disk. This is also known as mirroring.
C. Parity:- The controller collects the data from the servers and runs an algorithm which produces a value, similar to a checksum, called parity, and stores it on a particular disk. In case of a disk failure, the data can be recovered with the help of the parity and the data on the remaining disks.
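
The recovery property of parity can be shown with a short example. The text above does not name the algorithm; the sketch assumes bytewise XOR, which is the usual choice for single-parity RAID levels. The function name `xor_blocks` is only an illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equally sized byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks, one per data disk.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second disk and rebuild it from the rest plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

Because XOR is its own inverse, combining the surviving blocks with the parity yields exactly the missing block. This only works when a single block per stripe is lost.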

  1. RAID0:- The controller stripes the data coming from the server and distributes it across the hard disks.
  
   In RAID0 there is no mechanism to provide redundancy, as no mirroring is involved. It doesn't provide any fault tolerance and data will be lost in case of a hard disk failure.




Advantage:-
  • Data is load balanced between the hard disks and hence all the hard disks are equally loaded.
  • High performance - The controller can write to and read from the hard disks simultaneously, which increases both read and write performance.
  • Fewer hard disks are required to store the data, as the data is shared between the hard disks and no redundant copy is stored.
Disadvantage:-
  • No fault tolerance mechanism is involved. 
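
The RAID0 data distribution can be sketched as a simple round-robin over the disks. The chunk size and the function names `stripe` and `unstripe` are assumptions made for this illustration; real controllers use much larger stripe units.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int):
    """Distribute fixed-size chunks across the disks in round-robin order."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        disks[chunk_index % num_disks].append(data[offset:offset + chunk_size])
    return disks

def unstripe(disks):
    """Read the chunks back row by row to rebuild the original data."""
    rows = max(len(disk) for disk in disks)
    chunks = [disk[row] for row in range(rows) for disk in disks if row < len(disk)]
    return b"".join(chunks)

layout = stripe(b"ABCDEFGH", num_disks=2, chunk_size=2)
print(layout)
```

Each disk holds only part of the data, so reads and writes are spread over all disks, but losing any one disk leaves gaps that cannot be filled.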
  2. RAID1:- In RAID 1, the data is mirrored by the controller and stored on both hard disks. The data can be restored in case of a hard disk failure.
   

   Advantage:-
  • Mirroring is involved to provide fault tolerance.
   Disadvantage:-
  • Wasted capacity - It requires double the space of the actual data size, as one disk is required to store the redundant data.
  • Lower write performance: - As the data has to be replicated and written to another disk, writing is less efficient. Write speed decreases, whereas read speed increases.
  • IO channel utilization: - Twice the original data is transferred over the IO channel, which increases the IO channel utilization.
  3. RAID01:- Mirrored and then striped.

In RAID01, the controller creates two virtual disks, each consisting of several hard disks. Both virtual disks are again virtualized as one, and hence the server sees only one hard disk.

Data received from the server is first mirrored to the two virtual disks and then striped across the physical hard disks of each virtual disk.



Advantage:-
  • Read performance is high as both virtual disks can handle read requests.
Disadvantage:-
  • Failure of one hard disk results in the failure of one whole virtual disk. It is also very expensive to rebuild the physical hard disk and then the virtual disk. Many storage systems aren't even capable of recreating the lost data.
4. RAID10:- Striped and then mirrored.

In RAID10, the data received by the controller is first distributed across the virtual disks, as in striping, and then each part is mirrored to the individual hard disks.


Advantage:-
  • Fault tolerance - There is no need to recreate the whole virtual disk in case of one hard disk failure. Data is lost only if both disks of a mirrored pair fail.
  • It is quite inexpensive, compared to RAID 0+1, to rebuild a failed hard disk.
Disadvantage:-
  • Less read performance:- As only one virtual disk holds the data, all the read requests are handled by that particular disk.
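
The fault tolerance difference between RAID01 and RAID10 can be checked by enumerating failures. The sketch below assumes four disks grouped as (0, 1) and (2, 3): in RAID01 each group is one striped virtual disk and the two groups mirror each other, while in RAID10 each group is a mirrored pair. The function names are illustrative only.

```python
from itertools import combinations

GROUPS = [(0, 1), (2, 3)]

def raid01_survives(failed, stripe_sets=GROUPS):
    """RAID01 needs at least one striped virtual disk with no failed member."""
    return any(not (set(group) & failed) for group in stripe_sets)

def raid10_survives(failed, mirror_pairs=GROUPS):
    """RAID10 needs at least one working disk in every mirrored pair."""
    return all(set(pair) - failed for pair in mirror_pairs)

two_disk_failures = [set(combo) for combo in combinations(range(4), 2)]
raid01_ok = sum(raid01_survives(f) for f in two_disk_failures)
raid10_ok = sum(raid10_survives(f) for f in two_disk_failures)
print(f"RAID01 survives {raid01_ok} of {len(two_disk_failures)} double failures")
print(f"RAID10 survives {raid10_ok} of {len(two_disk_failures)} double failures")
```

Both layouts survive any single disk failure, but under these assumptions RAID10 tolerates more of the possible double failures.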
5.RAID4:- In RAID4, the data is first processed by the controller to calculate the parity value. The data is then striped across the hard disks, whereas the parity is written to a separate disk.

In the below example, the data is written to the four disks sequentially, like RAID0, and the parity is stored on Disk5, which is dedicated to parity only.


Advantage:-
  • Fewer storage disks are required as there is no mirroring of the data.
  • Fault tolerance - Data can be recovered using the parity stored on the separate hard disk together with the remaining hard disks.
Disadvantage:-
  • The controller has to calculate the parity and then write it to a separate hard disk, which takes extra time. There is also only one parity disk, where all the parity is stored. As every write goes to this single disk, it becomes a point of congestion.
6.RAID5:- RAID 5 is very similar to RAID 4: the data is striped across all the hard disks and the parity is generated and written to disk. But instead of a separate disk, the parity is distributed over all the disks, as shown below.



Advantage:-
  • It provides all the advantages of RAID4.
  • Also, since the parity is written across all the disks, there is no single-disk bottleneck.
Disadvantage:-
  • Similar to RAID 4, parity is calculated and written to disk, which consumes time (known as write cost).
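
The difference in parity placement between RAID 4 and RAID 5 can be sketched as a small mapping from stripe number to parity disk. The rotation rule used here is just one possible scheme chosen for illustration; actual controllers may rotate parity differently, and the function name `parity_disk` is hypothetical.

```python
def parity_disk(stripe_number: int, num_disks: int, rotate: bool = True) -> int:
    """Return the index of the disk that holds parity for a given stripe."""
    if not rotate:
        return num_disks - 1  # RAID 4: always the dedicated last disk
    # RAID 5 (assumed rotation): move the parity one disk left per stripe.
    return (num_disks - 1 - stripe_number) % num_disks

raid4_layout = [parity_disk(s, 5, rotate=False) for s in range(5)]
raid5_layout = [parity_disk(s, 5) for s in range(5)]
print("RAID 4 parity disks:", raid4_layout)
print("RAID 5 parity disks:", raid5_layout)
```

With rotation, parity writes for consecutive stripes land on different disks, which is why the dedicated parity disk of RAID 4 stops being a hot spot.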



Monday, 6 October 2014

Disk Subsystem - Storage Basics-2

In storage, directly connected small disks are replaced by large storage subsystems connected via a storage network. This gives the flexibility to assign hard disks to servers as per availability. Servers are connected to the disk subsystem either directly or indirectly via a SAN (Storage Area Network).

The advantages of using a storage system are listed below.

1. High Availability:- Data remains available if any of the disks fails. With the help of a RAID configuration we can provide data redundancy and fault tolerance.
2. High performance:-  All the disks are available to the servers, and servers can get hard disk space as and when they need it.
3. Instant Copy: - Controllers are responsible for performing parallel writes of data to multiple disks to increase the write speed.
4. Remote mirroring:- RAID controllers provide high fault tolerance by copying the data to multiple disks.

Disk Subsystem:-  It consists of storage devices, such as hard disks and tapes, and controllers. There can also be a disk subsystem where no controller is present.

 JBOD (Just a Bunch of Disks) is a low-cost storage subsystem in which all the disks are placed in a single enclosure with a common power supply. There is no controller present in a JBOD. It is used for small deployments.

Components of Disk Subsystem-
  •   Storage Devices: - Hard disks and tapes are used to store data. To increase fault tolerance, a controller may be needed to manage the individual disks.
  •     Controllers: - The controller is like the brain of the complete disk subsystem and presents the entire cluster of small disks as one big virtual disk. RAID (Redundant Array of Independent Disks) controllers control the disk subsystem and provide instant copy and remote mirroring features for high fault tolerance.
It is the responsibility of the controller to store data on the hard disks.

Sunday, 5 October 2014

IT Architecture - Storage Basics-1

IT Architecture

IT architecture describes the way servers and hosts access storage. Storage consists of various disks which are shared between the hosts.

It is deployed in environments where servers require shared storage, e.g. shared drives. The storage disks appear to the servers as directly attached disks.

There are two types of IT architecture, described below.

1.Server-Centric IT Architecture:-

It is a traditional or legacy design where hosts access the storage disks via dedicated servers. They cannot access the storage directly.

Servers are connected to hosts via a traditional LAN and to the storage disks via SCSI cables. There is no dedicated storage network for the hosts.





Advantages:-

1. Easy Deployment: - Deployment of a server-centric IT architecture is simple and easy. It is still in use for small deployments.
2. Less Expensive:- There are no expensive storage devices and no dedicated storage network, which saves a lot of expense. It saves not only the device expense but also the expense of the skills required to deploy and maintain the infrastructure.

Disadvantage:-

1. Less Scalable :- 

  •   Each server can support only a limited number of I/O cards, which restricts scalability.
  •   Since a SCSI cable is limited to 25m in length, servers and disks cannot be connected beyond 25m.
  •    Storage and servers are deployed locally as per the requirement, since they cannot be more than 25m apart.
2. Less Reliable: - There are usually one or two front-end servers, which gives very limited fault tolerance to the design. Failure of a server can lead to a major outage.
3. Less Efficient: - Suppose a storage disk, i.e. storage-2, connected to server-3 is full; there is no way for server-3 to get extra storage even if a large amount of space is available on storage-1.
4. Security issues: - Since the deployment is scattered, there is always a risk of unauthorized access.
5. Environment issues: - Due to the scattered deployment, it is very difficult to maintain the temperature in the various data closets.
6. Complex design:- It is not suitable for large deployments, as it leads to a very complex design.


2.Storage-Centric IT Architecture:-  

An IT architecture with a dedicated storage network is called a storage-centric design. All the SCSI cables present in legacy designs are replaced by a separate storage network. Small disks are also replaced by big storage boxes called disk subsystems.


Like the server-centric design, it has its own advantages and disadvantages, which are discussed below.

Advantages:-

1. Efficient: - Any server can now use whatever storage it needs, so free capacity is no longer tied to a single server.
2. Scalability- Since there is a separate storage network and large storage devices, it is capable of handling a large number of hosts. It is highly recommended for large data centers where the number of servers and hosts is high.
3. Secure: - As the storage is located centrally, it is very easy to restrict unauthorized access.
4.  Environment: - Because there are fewer data centers, it is quite simple to maintain the temperature and monitor other environmental parameters.
5.  Simple design:- Storage-centric designs are quite easy to understand, deploy and maintain.

Disadvantage:-

1. Expensive: - The requirement for a separate storage network adds extra cost to the deployment. It also requires storage-specific skills to deploy and manage the infrastructure.
2. Suitable for large and medium deployments only.