
Storage Area Networks (SAN) Interview Questions and Answers

Below are some of the frequently asked Storage Area Network (SAN) basic interview questions and answers. Check the Storage Area Networks (SAN) basic & advanced concepts page on this site to learn more SAN basics.

Storage Area Networks (SAN) Basic concepts Interview Questions and Answers


31. How can you check errors in a Brocade switch?
Ans :- Use the errshow command to display the switch error log.

32. Health checks in a Brocade switch?
Ans :- >switchshow
>switchstatusshow
>switchstatuspolicyshow
>sensorshow

33. What is failover, failback?
Ans :- Failover is the process of switching production to a remote site (if the production server fails, operations move to the remote site). Failback is the process of returning production to the primary site once it has been restored.

34. What is 24-bit addressing?
Ans :- A Fibre Channel address consists of 3 octets of 8 bits each: the 1st octet is the domain ID, the 2nd octet is the area (port) ID, and the 3rd octet is the AL-PA (Arbitrated Loop Physical Address).
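For illustration, here is a minimal shell sketch (the example address 0x011F02 is made up) that splits a 24-bit FC address into its three octets:

    fcid=$((0x011F02))                              # hypothetical example address
    echo "Domain ID : $(( (fcid >> 16) & 0xFF ))"   # 1st octet -> 1
    echo "Area/Port : $(( (fcid >> 8) & 0xFF ))"    # 2nd octet -> 31
    echo "AL-PA     : $(( fcid & 0xFF ))"           # 3rd octet -> 2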

35. What is latency?
Ans :- The time delay for data to travel from source to destination.

36. FLOGI, PLOGI, PRLI?
Ans :-
FLOGI (fabric login): an N_Port logs in to the fabric and receives its 24-bit address.
PLOGI (port login): an N_Port logs in to another N_Port to establish a session.
PRLI (process login): establishes the upper-layer (SCSI/FCP) session, through which the host is given access to its LUNs.

37. What are the classes of service?
Ans :- Class 1: acknowledged connection-oriented service
Class 2: acknowledged connectionless service
Class 3: unacknowledged connectionless service
Class 4: fractional-bandwidth connection-oriented service
Class 6: multicast service with acknowledgment
Class F: switch-to-switch service used on ISLs
Classes 2, 3, and F are the ones used in SAN technology.

38. What is FSPF?
Ans:- Fabric Shortest Path First, the link-state routing protocol a fabric uses to select the shortest path to a destination.

39. What is FCIP, FCoE, iFCP?
Ans :- Fibre Channel Protocol (FCP) is a transport protocol (similar to TCP in IP networks) that predominantly transports SCSI commands over Fibre Channel.
Fibre Channel over IP (FCIP or FC/IP, also known as Fibre Channel tunneling or storage tunneling) is an IETF standard for tunneling Fibre Channel frames over IP networks.
Fibre Channel over Ethernet (FCoE) is a network technology that encapsulates Fibre Channel frames over Ethernet networks.
iFCP (Internet Fibre Channel Protocol) is a standard for extending Fibre Channel storage networks across the Internet, using TCP/IP as the transport.

40. What is hop count?
Ans :- The number of switches (hops) a frame must traverse from source to destination.

41. Tell me about FC topologies. What is private and public?
FC topologies are: point-to-point, FC-AL, and switched fabric.
Private loop: an arbitrated loop with no fabric connection.
Public loop: an arbitrated loop attached to a fabric through an FL_Port; a Fibre Channel Arbitrated Loop network can connect up to 126 nodes in a loop topology.

42. What is firmware?
A: Permanent software programmed into read-only memory.

43. What is ASIC?
Ans :- Application-Specific Integrated Circuit; the frame-processing chip in a Brocade switch.

44. What is SDD?
Ans :- Subsystem Device Driver, multipathing software that manages both path failover and preferred-path selection.

45. What is Fabric Watch?
Ans: Fabric Watch is Brocade's health monitor. It tracks a variety of SAN fabric elements, events, and counters, continuously monitoring the end-to-end fabric, switches, ports, and Inter-Switch Links (ISLs), and raises alerts when thresholds are crossed.

46. Tell me about the LED indications on a Brocade switch.
SWITCH BEACON: yellow or red indicates a fault; steady green indicates the switch is working normally.

47. What is NPIV? How do you assign NPIV?
Ans :- N_Port ID Virtualization (NPIV) is a technology that lets multiple virtual servers share a single physical Fibre Channel port, with each virtual N_Port receiving its own port identifier.
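As a hedged illustration (exact command names and syntax vary by Fabric OS version, and the port number is made up), NPIV is typically enabled per port on a Brocade switch roughly like this:

    portcfgnpivport 8 1    # enable NPIV on port 8 (1 = enable, 0 = disable)
    portcfgshow 8          # verify the port's NPIV capability is now ON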

48. What is a node?
A node is any entity or device that connects to the network to access or provide services.

49. Brocade architecture?
Ans :- It consists of an ASIC processor, cache, RAM, and a console port (DB-9).

50. What is a fabric?
Ans :- A collection of interconnected switches, from the same vendor or different vendors, that form a single network.

51. What are the FC protocols?
Ans :- i) FCP
ii) FCIP
iii) iFCP
iv) FCoE

SAN Interview Questions and Answers



1. What is zoning?
A: Zoning is the fabric's access-control mechanism: it partitions the fabric so that access to data is selectively allowed only between designated devices (for example, a host HBA port and a storage port), and it can also help with device load balancing.

2. How do we create the zoning?
A:
1. Identify the WWPNs of the new server HBA. This can be done using QLogic SANsurfer or Emulex HBAnywhere.
2. Create a new alias for the server HBA port cabled to that fabric.
3. For each storage device that the server needs access to on fabric 1 (or possibly just switch 1), create a new zone and include the new server alias and the alias of every relevant storage port on that device. Repeat if you have other storage devices (so two XIVs means two new zones).
4. Put the new zone (or zones) into the active zone set (or a clone of it) and activate it.
5. Repeat on fabric 2.

3. What is WWNN?
A: World Wide Node Name, a 64-bit address that identifies a particular node (for example, the HBA as a whole).

4. What is WWPN?
A: World Wide Port Name, a 64-bit address assigned to each port on an HBA; every port has its own WWPN.

5. What is the difference between WWNN and WWPN?
A: The WWNN is a 64-bit address that identifies the node (the HBA itself), while the WWPN is a 64-bit address that identifies each individual port on that node. A dual-port HBA, for example, has one WWNN and two WWPNs.

6. Explain synchronous and asynchronous.
Synchronous: a synchronous process is invoked by a request/response operation, and the result of the process is returned to the caller immediately via that operation.
Asynchronous: an asynchronous process is invoked by a one-way operation, and the result and any faults are returned by invoking other one-way operations; the process result is returned to the caller via a callback operation.

7. A new host and storage have been added, and zoning may already exist. What will you do on the storage side, and how will you check whether the zoning already exists?
A: Verify physical connectivity, logical connectivity (zoning), and the storage mapping to the particular host. See the sketch below for the commands that reveal existing zoning.
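A minimal sketch of that check on a Brocade switch (output formats vary by Fabric OS version):

    switchshow    # confirm the host and storage ports are logged in to the switch
    nsshow        # confirm their WWPNs are registered in the name server
    cfgshow       # list defined aliases, zones, and zone configurations
    zoneshow      # display zone definitions and the currently effective configuration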

8. Explain fabric services.
A: The fabric login services: FLOGI, PLOGI, and PRLI.

9. Explain zones and zone sets.
A: Zones: a zone groups the devices that are allowed to communicate; zones can be members of different zone sets for different purposes.
Zone set: a zone set is composed of zones, which in turn are composed of members.

10. Fabric ports?
A: The fabric port (F_Port) is a fabric switch port used to connect an N_Port to a switch in a Fibre Channel topology.


11. What is the switch configuration procedure?
A: Connect to the switch through the DB-9 console port from a management host (for example, with HyperTerminal on a Windows host or tip hardwire on a Solaris host).
Then run ipaddrset to set the IP address, subnet mask, and gateway, and finally enable the switch, as sketched below.
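A sketch of that sequence on an older Brocade switch (prompts and defaults differ between Fabric OS versions; the addresses are placeholders):

    switch:admin> ipaddrset
    Ethernet IP Address [10.77.77.77]: 10.1.1.60
    Ethernet Subnetmask [255.255.255.0]: 255.255.255.0
    Gateway IP Address [10.1.1.1]: 10.1.1.1
    switch:admin> switchenable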

12. Explain the well-known login services (FLOGI, PLOGI).
i. FLOGI (fabric login): whenever a host port connects to the fabric, it performs FLOGI and the fabric assigns it an ID (its 24-bit address).
ii. PLOGI (port login): an N_Port logs in to another N_Port (for example, the host port to a storage port) to establish a session before any data can be exchanged.

13. How do you check the status of the switch?
Use switchstatusshow (or switchshow) to check the status of the switch.

14. What is meant by port channeling?
Ans;- Channeling is where you "bond" two physical ports together into a single logical connection for higher bandwidth and redundancy.

15. What is SCN?
State Change Notification, a fabric service that notifies registered devices of changes in the fabric, such as hardware failures or devices going offline.

16. What is trunking?
Ans:- Trunking is the aggregation of ISLs into a single logical link; on Brocade switches, up to 8 ISLs can be aggregated into one trunk. It is used to increase bandwidth and speed.

17. What is Easy Tiering and how does it work?
Answers: Easy Tiering is used to improve IOPS. It works in four steps:
  • I/O monitoring
  • Data placement advisor
  • Data migration planner
  • Data migrator
In this concept, a maximum of 2 TB of data can be migrated from HDDs to SSDs.

18. Step-by-step zoning?
Ans:-
i)  Alias creation (alicreate)
ii) Zone creation (zonecreate)
iii) Config creation (cfgcreate)
iv) Save the config (cfgsave)
v) Enable the config (cfgenable)
The Brocade CLI sequence is sketched below.
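Mapped to the Brocade CLI, the sequence looks roughly like this (the alias, zone, and config names, as well as the WWPNs, are hypothetical):

    alicreate "host1_hba0", "10:00:00:00:c9:12:34:56"
    alicreate "array1_p0", "50:06:01:60:12:34:56:78"
    zonecreate "z_host1_array1", "host1_hba0; array1_p0"
    cfgcreate "fabric1_cfg", "z_host1_array1"   # or cfgadd for an existing config
    cfgsave                                     # save the defined configuration
    cfgenable "fabric1_cfg"                     # activate it across the fabric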

19. How will you check whether the storage is communicating with the host or not?
Check on the storage array whether the mapped LUNs are listing the host's WWPNs or not.

20. Which zoning is secure?
ANS:- Soft (WWN-based) zoning is better and more flexible. It is done with WWNs, so whenever a port fails and the device is connected to another port, there is no need to redo the zoning, because the WWN is unique to the HBA.

21. What is storage partitioning?
ANS :- It is a logical entity consisting of one or more logical drives that are shared by a group of hosts or exclusively accessed by a single host.

22. What is InfiniBand?
Ans:- InfiniBand is a switched-fabric network communications link used in high-performance computing and enterprise data centers.

Also Read: DAS vs SAN vs NAS

23. What is SCSI mapping? SCSI masking?
Ans:- Mapping is allocating a LUN to a particular host without access permissions; masking is providing the access rights so that only that particular host can access the LUN.

24. Firmware update on a Brocade switch?
A: Download the firmware file from a server and update it on the switch (using the firmwaredownload command).

25. FC layers?
FC-4 :- ULP (upper-layer protocols), e.g. FCP, FCIP, iFCP
FC-3 :- Common services
FC-2 :- Framing and flow control
FC-1 :- 8b/10b encoding and decoding
FC-0 :- Physical layer (speeds and feeds)

26. FC topologies?
i) Point-to-point
ii) FC-AL (Fibre Channel Arbitrated Loop)
iii) Switched fabric

27. What is a frame?
Ans:- A frame is the unit of data transfer between source and destination; the maximum Fibre Channel frame size is 2,148 bytes.

28. What is the principal switch?
Ans :- The principal switch is the switch that wins the principal-switch selection in a fabric (based on priority and lowest WWN) and is responsible for assigning domain IDs to the other switches.

29. What will you do before shutting down a switch?
A: Save the configuration (and run configupload if we want a backup copy of the configuration).

30. What is a SAN?
Ans :- A SAN is a high-speed networking technology used to connect hosts to storage, typically over Fibre Channel.

13.5 Information Security Threats and Security Controls Overview

The information made available on a network is exposed to security threats from a variety of sources. Therefore, specific controls must be implemented to secure this information that is stored on an organization’s storage infrastructure. In order to deploy controls, it is important to have a clear understanding of the access paths leading to storage resources.

If each component within the infrastructure is considered a potential access point, the attack surface of all these access points must be analyzed to identify the associated vulnerabilities. To identify the threats that apply to a storage infrastructure, access paths to data storage can be categorized into three security domains.
Image: Information Security Threats and Security Controls Overview (Image credits: EMC)
  • Application Access to the stored data through the storage network. Application access domain may include only those applications that access the data through the file system or a database interface.
  • Management Access to storage and interconnecting devices and to the data residing on those devices. Management access, whether monitoring, provisioning, or managing storage resources, is associated with every device within the storage environment. Most management software supports some form of CLI, system management console, or a web-based interface. Implementing appropriate controls for securing management applications is important because the damage that can be caused by using these applications can be far more extensive.
  • Backup, Replication, and Archive Access is primarily used by storage administrators who configure and manage the environment. Along with the access points in this domain, the backup and replication media also need to be secured.

The Key Security Threats across the Domains

To secure the storage environment, identify the attack surface and existing threats within each of the security domains and classify the threats based on the security goals — availability, confidentiality, and integrity.

Also Read: The next generation IT Data Center Layers
  • Unauthorized access - Unauthorized access is the act of illegally gaining access to the information systems of an organization, including its servers, network, storage, and management servers. An attacker may gain unauthorized access to the organization's application, data, or storage resources in various ways, such as bypassing access controls, exploiting a vulnerability in the operating system, hardware, or application, elevating privileges, spoofing an identity, or stealing a device. 
  • Denial of Service (DoS) - A Denial of Service (DoS) attack prevents legitimate users from accessing resources or services. DoS attacks can be targeted against servers, networks, or storage resources in a storage environment. In all cases, the intent of DoS is to exhaust key resources, such as network bandwidth or CPU cycles, thereby impacting production use. 
  • Distributed DoS (DDoS) attack - A Distributed DoS (DDoS) attack is a variant of DoS attack in which several systems launch a coordinated, simultaneous DoS attack on their target(s), thereby causing denial of service to the users of the targeted system(s). 
  • Data loss - Data loss can occur in a storage environment due to various reasons other than malicious attacks. Some of the causes of data loss may include accidental deletion by an administrator or destruction resulting from natural disasters. In order to prevent data loss, deploying appropriate measures such as data backup or replication can reduce the impact of such events. 
  • Malicious Insiders -  According to Computer Emergency Response Team (CERT), a malicious insider could be an organization’s current or former employee, contractor, or other business partner who has or had authorized access to an organization’s servers, network, or storage. These malicious insiders may intentionally misuse that access in ways that negatively impact the confidentiality, integrity, or availability of the organization’s information or resources. For example, consider a former employee of an organization who had access to the organization’s storage resources. This malicious insider may be aware of security weaknesses in that storage environment. This is a serious threat because the malicious insider may exploit the security weakness. 
  • Account Hijacking - Account hijacking refers to a scenario in which an attacker gains access to an administrator's or user's account(s) using methods such as phishing or installing keystroke-logging malware on the administrator's or user's systems. Phishing is an example of a social engineering attack that is used to deceive users. 
  • Insecure APIs - Application programming interfaces (APIs) are used extensively in software-defined and cloud environments. They integrate with management software to perform activities such as resource provisioning, configuration, monitoring, management, and orchestration. These APIs may be open or proprietary. The security of the storage infrastructure depends upon the security of these APIs. An attacker may exploit a vulnerability in an API to breach a storage infrastructure's perimeter and carry out an attack. Therefore, APIs must be designed and developed following security best practices such as requiring authentication and authorization, validating API inputs, and avoiding buffer overflows. 
  • Shared technology vulnerabilities - Technologies that are used to build today's storage infrastructure provide a multi-tenant environment enabling the sharing of resources. Multi-tenancy is achieved by using controls that provide separation of resources such as memory and storage for each application. Failure of these controls may expose the confidential data of one business unit to users of other business units, raising security risks. 
  • Media Theft - Backups and replications are essential business continuity processes of any data center. However, inadequate security controls may expose an organization's confidential information to an attacker. There is a risk of a backup tape being lost, stolen, or misplaced, and the threat is even more severe if the tapes contain highly confidential information. An attacker may also gain access to an organization's confidential data by spoofing the identity of the DR site. 

The Key Information Security Controls

Any security control should account for three aspects: people, process, and technology, and the relationships among them. Security controls can be administrative or technical. Administrative controls include security and personnel policies or standard procedures to direct the safe execution of various operations. Technical controls are usually implemented through tools or devices deployed on the IT infrastructure. To protect a storage infrastructure, various technical security controls must be deployed at the compute, network, and storage levels.

Also Read: Factors affecting SAN performance

At the server level, security controls are deployed to secure hypervisors and hypervisor management systems, virtual machines, guest operating systems, and applications. Security at the network level commonly includes firewalls, demilitarized zones, intrusion detection and prevention systems, virtual private networks, zoning and iSNS discovery domains, port binding and fabric binding configurations, and VLAN and VSAN. At the storage level, security controls include LUN masking, data shredding, and data encryption. Apart from these security controls, the storage infrastructure also requires identity and access management, role-based access control, and physical security arrangements. The Key Information Security Controls are
  • Physical Security
  • Identity and Access Management
  • Role-based Access Control
  • Network Monitoring and Analysis
  • Firewalls
  • Intrusion Detection and Prevention System
  • Adaptive Security
  • Virtual Private Networks
  • Virtual LAN & Virtual SAN
  • Zoning and ISNS discovery domain
  • Port binding and fabric binding
  • Securing hypervisor and management server
  • VM, OS and Application hardening
  • Malware Protection Software
  • Mobile Device Management
  • LUN Masking
  • Data Encryption
  • Data Shredding


Go To >> Index Page

13.4 Introduction to Information Security

Information is an organization’s most valuable asset. This information, including intellectual property, personal identities, and financial transactions, is regularly processed and stored in storage systems that are accessed through the network. As a result, storage is now more exposed to various security threats that can potentially damage business-critical data and disrupt critical services. To protect this information, organizations deploy security tools across the various infrastructure assets. The commonly used infrastructure assets are
  • Servers (which process information)
  • Storage (which stores information)
  • Network (which carries information) 

As organizations are adopting next generation emerging technologies, in which cloud is a core element, one of the key concerns they have is ‘trust’. Trust depends on the degree of control and visibility available to the information’s owner. Therefore, securing storage infrastructure has become an integral component of the storage management process in modern IT datacenters. It is an intensive and necessary task, essential to manage and protect vital information.

Information Security Overview

Information security includes a set of practices that protect information and information systems from unauthorized disclosure, access, use, destruction, deletion, modification, and disruption. Information security involves implementing various kinds of safeguards or controls in order to lessen the risk that a vulnerability in the information system is exploited, which could otherwise cause a significant impact to the organization's business. From this perspective, security is an ongoing process, not a static one, and requires continuous revalidation and modification. Securing the storage infrastructure begins with understanding the goals of information security.


Image: Information Security Overview (Image source: Belvatech)

Goals of information security

The goal of information security is to provide
  • Confidentiality
  • Integrity
  • Availability
Confidentiality provides the required secrecy of information to ensure that only authorized users have access to data. Integrity ensures that unauthorized changes to information are not allowed. The objective of ensuring integrity is to detect and protect against unauthorized alteration or deletion of information. Availability ensures that authorized users have reliable and timely access to servers, storage, network, application, and data resources. Ensuring confidentiality, integrity, and availability are the primary objectives of any IT security implementation. These goals are supported through the use of authentication, authorization, and auditing processes.
  • Authentication is a process to ensure that ‘users’ or ‘assets’ are who they claim to be by verifying their identity credentials. A user may be authenticated by a single-factor or multi-factor method. Single-factor authentication involves the use of only one factor, such as a password. Multi-factor authentication uses more than one factor to authenticate a user.
  • Authorization refers to the process of determining whether, and in what manner, a user, device, application, or process is allowed to access a particular service or resource. For example, a user with administrator's privileges is authorized to access more services or resources than a user with non-administrator privileges. Authorization should be performed only if authentication is successful. The most common authentication and authorization controls used in a data center environment are Windows Access Control Lists (ACLs), UNIX permissions, Kerberos, and the Challenge-Handshake Authentication Protocol (CHAP). It is essential to verify the effectiveness of deployed security controls with the help of auditing. 
  • Auditing refers to the logging of all transactions for the purpose of assessing the effectiveness of security controls. It helps to validate the behaviour of the infrastructure components, and to perform forensics, debugging, and monitoring activities.

Information Security Considerations

Risk assessment
An organization wants to safeguard its assets from threat agents (attackers) who seek to abuse them. Risk arises when a threat agent is likely to exploit a vulnerability. Therefore, organizations deploy various countermeasures to minimize risk by reducing vulnerabilities.

Also Read: Introduction to Storage Infrastructure Management

Risk assessment is the first step to determine the extent of potential threats and risks in an infrastructure. The process assesses risk and helps to identify appropriate controls to mitigate or eliminate risks. Organizations must apply their basic information security and risk-management policies and standards to their infrastructure. Some of the key security areas that an organization must focus on while building the infrastructure are: authentication, identity and access management, data loss prevention and data breach notification, governance, risk, and compliance (GRC), privacy, network monitoring and analysis, security information and event logging, incident management, and security management. 

Assets and Threats
Information is one of the most important assets for any organization. Other assets include hardware, software, and other infrastructure components required to access the information. To protect these assets, organizations deploy security controls. These security controls have two objectives. The first objective is to ensure that the resources are easily accessible to authorized users. The second objective is to make it difficult for potential attackers to access and compromise the system. The effectiveness of a security control can be measured by two key criteria. One, the cost of implementing the system should be a fraction of the value of the protected data. Two, it should be costly for a potential attacker, in terms of money, effort, and time, to compromise and access the assets.

Also Read: Unified Storage Systems Overview

Threats are the potential attacks that can be carried out on an IT infrastructure. These attacks can be classified as active or passive. Passive attacks are attempts to gain unauthorized access into the system. Passive attacks pose threats to confidentiality of information. Active attacks include data modification, denial of service (DoS), and repudiation attacks. Active attacks pose threats to data integrity, availability, and accountability.

Vulnerability 
Vulnerability is a weakness of any information system that an attacker exploits to carry out an attack. The components that provide a path enabling access to information are vulnerable to potential attacks. It is important to implement adequate security controls at all the access points on these components.

Attack surface, attack vector, and work factor are the three factors to consider when assessing the extent to which an environment is vulnerable to security threats. Attack surface refers to the various entry points that an attacker can use to launch an attack, which includes people, process, and technology. For example, each component of a storage infrastructure is a source of potential vulnerability. An attacker can use all the external interfaces supported by that component, such as the hardware and the management interfaces, to execute various attacks. These interfaces form the attack surface for the attacker. Even unused network services, if enabled, can become a part of the attack surface. An attack vector is a step or a series of steps necessary to complete an attack. For example, an attacker might exploit a bug in the management interface to execute a snoop attack. Work factor refers to the amount of time and effort required to exploit an attack vector.

Also Read: Taking Backup and Archive to Cloud Storage

Having assessed the vulnerability of the environment, organizations can deploy specific control measures. Any control measures should involve all the three aspects of infrastructure: people, process, and technology, and their relationship. To secure people, the first step is to establish and assure their identity. Based on their identity, selective controls can be implemented for their access to data and resources. The effectiveness of any security measure is primarily governed by the process and policies. The processes should be based on a thorough understanding of risks in the environment, should enable recognizing the relative sensitivity of different types of data, and help determine the needs of various stakeholders to access the data. Without an effective process, the deployment of technology is neither cost-effective nor aligned to organizations’ priorities.

Security Controls
Finally, to be effective, the controls that are deployed should ensure compliance with the processes, policies, and people. These security controls are directed at reducing vulnerability by minimizing the attack surface and maximizing the work factor. These controls can be technical or non-technical. Technical controls are usually implemented at the server, network, and storage level, whereas non-technical controls are implemented through administrative and physical controls. Administrative controls include security and personnel policies or standard procedures to direct the safe execution of various operations. Physical controls include setting up physical barriers, such as security guards, fences, or locks. Controls are categorized as preventive, detective, and corrective.
  • Preventive: Avoid problems before they occur
  • Detective: Detect a problem that has occurred
  • Corrective: Correct the problem that has occurred
Organizations should deploy a defense-in-depth strategy when implementing these controls.

Defense in depth
An organization should deploy multiple layers of defense throughout the infrastructure to mitigate the risk of security threats, in case one layer of the defense is compromised. This strategy is referred to as defense-in-depth. It may also be thought of as a "layered approach to security" because there are multiple measures for security at different levels. Defense-in-depth increases the barrier to exploitation (an attacker must breach each layer of defenses to be successful) and thereby provides additional time to detect and respond to an attack. This potentially reduces the scope of a security breach. However, the overall cost of deploying defense-in-depth is often higher compared to single-layered security controls. An example of defense-in-depth could be a virtual firewall installed on a hypervisor when there is already a network-based firewall deployed within the same environment. This provides an additional layer of security, reducing the chance of the hypervisor being compromised if the network-level firewall is breached.

Previous: Factors affecting SAN performance

Go To >> Index Page

13.3 Factors affecting SAN performance

Common factors that can affect storage infrastructure performance include the RAID level configured, whether cache is enabled or disabled, thin LUN provisioning, latency added by network hops, and, in some cases, misconfigured multipathing.

Factors which might affect SAN performance

RAID Configurations
The RAID levels that usually cause the most concern from a performance perspective are RAID 5 and RAID 6, which is why many database administrators request SAN volumes that are not RAID 5 or RAID 6. Parity-based RAID schemes, such as RAID 5 and RAID 6, perform differently than other RAID schemes such as RAID 1 and RAID 10. This is due to a phenomenon known as the write penalty.

This can lead to lower performance, especially in cases with workloads that consist of a lot of random write activity as is often the case with database workloads. The reason the write penalty occurs is that small-block writes require a lot of parity recalculation, resulting in additional I/O on the backend. 

Also Read: Types of RAID Levels

Small-block writes are relatively hard work for RAID 5 and RAID 6 because they require changes to be made within RAID stripes, which forces the system to read the other members of the stripe to be able to recompute the parity. In addition, random small-block write workloads require the R/W heads on the disk to move all over the platter surface, resulting in high seek times. The net result is that lots of small-block random writes with RAID 5 and RAID 6 can be slow. Even so, techniques such as redirect-on-write or write-anywhere filesystems and large caches can go a long way toward masking and mitigating this penalty.
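As a rough worked example (assuming the commonly quoted penalty of 4 backend I/Os per random small-block RAID 5 write), a 1,000 IOPS host workload that is 70 percent reads would generate:

    reads=700; writes=300; raid5_penalty=4
    echo "Backend IOPS: $(( reads + writes * raid5_penalty ))"   # 700 + 1200 = 1900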

Cache
Cache is the magic ingredient that has just about allowed storage arrays based on spinning disk to keep up to speed with the rest of the technology in the data center. If you take DRAM caches and caching algorithms out of the picture, spinning disk-based storage arrays practically grind to a halt. Having enough cache in your system is important in order to speed up average response times. If a read or write I/O can be satisfied from cache (not having to rely on the disks to complete the read or write I/O), it will be dramatically faster than if it has to rely on the disks on the backend.

Also Read: The next generation RAID techniques

However, not all workloads benefit equally from having a cache in front of spinning disks. Some workloads result in a high cache-hit rate, whereas others don't. A cache hit occurs when I/O can be serviced from cache, whereas a cache miss requires access to the backend disks. Even with a large cache in front of your slow spinning disks, there will be some I/Os that result in cache misses and require use of the disks on the backend.

These cache-miss I/Os result in far slower response times than cache hits, meaning that the variance (spread between fastest and slowest response times) can be huge, such as from about 2 ms all the way up to about 100 ms. This is in stark contrast to all-flash arrays, where the variance is usually very small. Most vendors have standard ratios of disk capacity to cache capacity, meaning that you don't need to worry so much about how much cache to put in a system. However, these vendor approaches are one-size-fits-all and may need tuning to your specific requirements.
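A quick back-of-the-envelope illustration of why the cache-hit rate dominates average response time (the 0.5 ms hit and 10 ms miss figures are illustrative):

    awk 'BEGIN { hit = 0.90; t_hit = 0.5; t_miss = 10.0
                 printf "avg response = %.2f ms\n", hit*t_hit + (1-hit)*t_miss }'
    # 90% hits -> 1.45 ms on average; at 70% hits the average jumps to 3.35 ms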

Also Read: Importance of Cache technique in Block Based Storage Systems

Thin LUNs
Thin LUNs work on the concept of allocating space to LUNs and volumes on demand. So on day one when you create a LUN, it has no physical space allocated to it. Only as users and applications write to it is capacity allocated. This allocate-on-demand model can have an impact in two ways:
  • The allocate-on-demand process can add latency.
  • The allocate-on-demand process can result in a fragmented backend layout.
The allocate-on-demand process can theoretically add a small delay to the write process because the system has to identify free extents and allocate them to a volume each time a write request comes into a new area on a thin LUN. However, most solutions are optimized to minimize this impact.

Also Read: Storage Provisioning and Capacity Optimization Techniques

Probably of more concern is the potential for some thin LUNs to end up with a heavily fragmented backend layout because of the pseudo-random nature in which space is allocated to them. This can be particularly noticeable in applications with heavily sequential workloads. If users suspect a performance issue because of the use of thin LUNs, perform representative testing on thin LUNs and thick LUNs and compare the results.
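One hedged way to run such a comparison on a Linux host is with fio (the device path is a placeholder; run the identical job against a thin LUN and a thick LUN and compare the reported bandwidth and latency):

    # WARNING: writing directly to a device destroys any data on it
    fio --name=seq_write_test --filename=/dev/mapper/thin_lun01 \
        --rw=write --bs=1M --direct=1 --runtime=60 --time_based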

Network Hops
Within the network, whether FC SAN or IP, the number of switches that traffic has to traverse has an impact on response time. Hopping across more switches and routers adds latency, often referred to as network-induced latency. This latency is generally higher in IP/Ethernet networks, where store-and-forward switching techniques are used, in addition to the increased potential for traffic crossing routers.

Also Read: The need for a Converged Enhanced Ethernet (CEE) Network

Multipathing
Multipathing (MPIO) provides redundancy: if one path fails, another takes over without the application or user even noticing. However, MPIO can also have a significant impact on performance. For example, balancing all I/O from a host over two HBAs and HBA ports can provide more bandwidth than sending all I/O over a single port. It also makes the queues and CPU processing power of both HBAs available. MPIO can be used to balance I/O across multiple ports on the storage array too. Instead of sending all host I/O to just two ports on a storage array, MPIO can be used to balance the I/O from a single host over multiple array ports, for example, eight ports. This can significantly help to avoid hot spots on the array's front-end ports, similar to the way that wide-striping avoids hot spots on the array's backend.

Standard SAN Performance Tools

Perfmon
Perfmon is a Windows tool that allows administrators to monitor an extremely wide variety of host-based performance counters. From a storage perspective, these counters can be extremely useful, as they give you the picture as viewed from the host. For example, latency experienced from the host will be end-to-end latency, meaning that it will include host-based, network-based, and array-based latency. However, it will give you only a single figure; it won't break the overall latency down into host-induced, network-induced, and array-induced latency.

To launch it, open the Windows Perfmon utility by typing perfmon at a command prompt or in the Run dialog box.
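For instance, the key disk-latency counters can also be sampled from the command line with typeperf (counter names assume an English-language Windows installation):

    typeperf "\LogicalDisk(_Total)\Avg. Disk sec/Read" "\LogicalDisk(_Total)\Avg. Disk sec/Write" -si 5 -sc 12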


iostat
iostat is a common tool used in the Linux world to monitor storage performance. For example, run:
iostat -x 20

The following output shows the I/O statistics over the last 20 seconds:
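An illustrative sample (the device names and figures are made up, and the exact columns vary with the sysstat version):

    Device:  rrqm/s wrqm/s    r/s    w/s   rkB/s   wkB/s avgqu-sz  await  %util
    sda        0.00   2.10  15.30  42.70  812.40 1650.20     0.70  11.80  12.40
    sdb        0.00   0.00 120.50  80.20 6100.00 4200.00     3.20  15.90  81.00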

Previous: Introduction to the key Storage Management Operations

Go To >> Index Page

13.2 Introduction to the key Storage Management Operations

The key storage management operations consist of Storage Monitoring, Storage Alerting, and Storage Reporting. Storage Monitoring provides the performance and availability status of various infrastructure components and services. It also helps to trigger alerts when thresholds are reached, security policies are violated, or service performance deviates from the SLA. These functions are explained below.

Storage Management Operations Overview

1) Storage Monitoring

Monitoring forms the basis for performing management operations. Monitoring provides the performance and availability status of various infrastructure components and services. It also helps to measure the utilization and consumption of various storage infrastructure resources by the services. This measurement facilitates the metering of services, capacity planning, forecasting, and optimal use of these resources. Monitoring events in the storage infrastructure, such as a change in the performance or availability state of a component or a service, may be used to trigger automated routines or recovery procedures. 

Also Read: Using Sub-LUN Auto Tiering techniques in SAN Infrastructure

Such procedures can reduce downtime due to known infrastructure errors and the level of manual intervention needed to recover from them. Further, monitoring helps in generating reports for service usage and trends. Additionally, monitoring of the data center environment parameters such as heating, ventilating, and air-conditioning (HVAC) helps in tracking any anomaly from their normal status. A storage infrastructure is primarily monitored for 
  • Configuration Monitoring
  • Availability Monitoring
  • Capacity Monitoring
  • Performance Monitoring
  • Security Monitoring 
Monitoring Configuration
Monitoring configuration involves tracking configuration changes and deployment of storage infrastructure components and services. It also detects configuration errors, non-compliance with configuration policies, and unauthorized configuration changes.

Monitoring Availability
Availability refers to the ability of a component or a service to perform its desired function during its specified time of operation. Monitoring the availability of hardware components (for example, a port, an HBA, or a storage controller) or software components (for example, a database instance or orchestration software) involves checking their availability status by reviewing the alerts generated from the system. For example, a port failure might result in a chain of availability alerts. A storage infrastructure commonly uses redundant components to avoid a single point of failure. Failure of a component might cause an outage that affects service availability, or it might cause performance degradation even though availability is not compromised. Continuous monitoring for expected availability of each component and reporting any deviation help the administrator to identify failing services and plan corrective action to maintain SLA requirements.

Monitoring Capacity
Capacity refers to the total amount of storage infrastructure resources available. Inadequate capacity leads to degraded performance or even service unavailability. Monitoring capacity involves examining the amount of storage infrastructure resources used and usable, such as the free space available on a file system or a storage pool, the number of ports available on a switch, or the utilization of storage space allocated to a service. Monitoring capacity helps an administrator to ensure uninterrupted data availability and scalability by averting outages before they occur. For example, if 90 percent of the ports are utilized in a particular SAN fabric, this could indicate that a new switch might be required if more servers and storage systems need to be attached to the same fabric. Monitoring usually leverages analytical tools to perform capacity trend analysis. These trends help to understand future resource requirements and provide an estimation of the time required to deploy them.

Also Read: How storage is provisioned in Software Defined Storage Environments

Monitoring Performance
Performance monitoring evaluates how efficiently different storage infrastructure components and services are performing and helps to identify bottlenecks. Performance monitoring measures and analyzes behavior in terms of response time, throughput, and I/O wait time. It identifies whether the behavior of infrastructure components and services meets the acceptable and agreed performance level. It also deals with the utilization of resources, which affects the way resources behave and respond. For example, if a VM is continuously experiencing 80 percent processor utilization, it suggests that the VM may be running out of processing power, which can lead to degraded performance and slower response times. Similarly, if the cache and controllers of a storage system are consistently overutilized, it may lead to performance degradation.

Monitoring Security
Monitoring a storage infrastructure for security includes tracking unauthorized access, whether accidental or malicious, and unauthorized configuration changes. For example, monitoring tracks and reports the initial zoning configuration performed and all the subsequent changes. Another example of monitoring security is to track login failures and unauthorized access to switches for performing administrative changes. IT organizations typically comply with various information security policies that may be specific to government regulations, organizational rules, or deployed services. Monitoring detects all operations and data movement that deviate from predefined security policies. Monitoring also detects unavailability of information and services to authorized users due to security breach. Further, physical security of a storage infrastructure can also be continuously monitored using badge readers, biometric scans, or video cameras.

2) Storage Alerting

An alert is a system-to-user notification that provides information about events, impending threats, or issues. Alerting of events is an integral part of monitoring. Alerting keeps administrators informed about the status of various components and processes. For example, conditions such as the failure of power, storage drives, memory, switches, or an availability zone can impact the availability of services and require immediate administrative attention. Other conditions, such as a file system reaching a capacity threshold, an operation breaching a configuration policy, or a soft media error on storage drives, are considered warning signs and may also require administrative attention.

Monitoring tools enable administrators to define various alerted conditions and assign different severity levels for these conditions based on the impact of the conditions. Whenever a condition with a particular severity level occurs, an alert is sent to the administrator, an orchestrated operation is triggered, or an incident ticket is opened to initiate a corrective action. Alert classifications can range from information alerts to fatal alerts. 
  • Information alerts provide useful information but do not require any intervention by the administrator. The creation of a zone or LUN is an example of an information alert. 
  • Warning alerts require administrative attention so that the alerted condition is contained and does not affect service availability. For example, if an alert indicates that a storage pool is approaching a predefined threshold value, the administrator can decide whether additional storage drives need to be added to the pool. 
  • Fatal alerts require immediate attention because the condition might affect the overall performance or availability. For example, if a service fails, the administrator must ensure that it is restored quickly.

As every IT environment is unique, most monitoring systems require initial set-up and configuration, including defining what types of alerts should be classified as informational, warning, and fatal. Whenever possible, an organization should limit the number of truly critical alerts so that important events are not lost amidst informational messages. Continuous monitoring, with automated alerting, enables administrators to respond to failures quickly and proactively. Alerting provides information that helps administrators prioritize their response to events.

3) Storage Reporting

Like alerting, reporting is also associated with monitoring. Reporting on a storage infrastructure involves keeping track and gathering information from various components and processes that are monitored. The gathered information is compiled to generate reports for trend analysis, capacity planning, chargeback, performance, and security breaches. 
  • Capacity planning reports contain current and historic information about the utilization of storage, file systems, database tablespace, ports, etc. 
  • Configuration and asset management reports include details about device allocation, local or remote replicas, and fabric configuration. This report also lists all the equipment, with details, such as their purchase date, lease status, and maintenance records. 
  • Chargeback reports contain information about the allocation or utilization of storage infrastructure resources by various users or user groups. 
  • Performance reports provide current and historical information about the performance of various storage infrastructure components and services as well as their compliance with agreed service levels. 
  • Security breach reports provide details on security violations, the duration of the breach, and its impact.
Reports are commonly displayed on a digital dashboard, which provides real-time tabular or graphical views of the gathered information. Dashboard reporting helps administrators to make instantaneous and informed decisions on resource procurement, plans for modifications in the existing infrastructure, policy enforcement, and improvements in management processes.

Chargeback Report
Chargeback is the ability to measure storage resource consumption per business unit or user group and charge them back accordingly. It aligns the cost of deployed storage services with the organization's business goals, such as recovering cost, making a profit, justifying new capital spending, influencing consumption behaviors of the business units, and making IT more service-aware, cost-conscious, and accountable.

Also Read: Taking Backup and Archive to Cloud Storage

To perform chargeback, storage usage data is collected by a billing system that generates a chargeback report for each business unit or user group. The billing system is responsible for accurately measuring the number of units of storage used and reporting the cost/charge for the consumed units.

Chargeback reports can be extended to include a pre-established cost of other resources, such as the number of switch ports, HBAs and storage system ports, and service level requested by the users. Chargeback reports enable metering of storage services, providing transparency for both the provider and the consumer of the utilized services.
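As a minimal sketch of the underlying arithmetic (the usage-file format and the $0.10/GB/month rate are assumptions), a simple per-business-unit chargeback report could be produced with:

    # usage.txt lines: <business_unit> <allocated_GB>
    awk '{ gb[$1] += $2 }
         END { for (bu in gb) printf "%-10s %6d GB  $%8.2f\n", bu, gb[bu], gb[bu]*0.10 }' usage.txt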

Previous: Introduction to Storage Infrastructure Management

Go To >> Index Page