Storage Redefined for High Availability (HA)

Why is High Availability important for Business?

Network services today depend heavily on the Internet, and even brief downtime can cause a substantial loss to a business. An outage can lead to lost revenue, disrupted business operations, increased security and fraud-related risks, and inaccessible data. Such an incident can damage a company’s image and overall customer satisfaction. That’s why it’s important to design and run a highly available system, which is the key to avoiding downtime.

What is High Availability?

Availability is the percentage of total time during which a computer system is accessible and working normally. One might assume the optimal availability is 100%, but this is very hard to achieve. HA (High Availability) systems are those with availability in the 99.9% to 99.999% range. A common HA target is 99.999% (“five nines”), which tolerates only about five minutes of downtime in a year.

| Availability % | Downtime per Year | Downtime per Month | Downtime per Week |
|---|---|---|---|
| 90% (“one nine”) | 36.5 days | 72 hours | 16.8 hours |
| 99% (“two nines”) | 3.65 days | 7.20 hours | 1.68 hours |
| 99.9% (“three nines”) | 8.76 hours | 43.2 minutes | 10.1 minutes |
| 99.99% (“four nines”) | 52.56 minutes | 4.32 minutes | 1.01 minutes |
| 99.999% (“five nines”) | 5.26 minutes | 25.9 seconds | 6.05 seconds |
| 99.9999% (“six nines”) | 31.5 seconds | 2.59 seconds | 0.605 seconds |
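The downtime figures follow directly from the availability percentage: the allowed downtime is the unavailable fraction multiplied by the length of the period. The sketch below shows that arithmetic. The function name `downtime_seconds` and the `PERIOD_DAYS` table are illustrative, and the 365-day year and 30-day month are assumptions.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Period lengths in days. The 365-day year and 30-day month are assumptions.
PERIOD_DAYS = {"year": 365, "month": 30, "week": 7}


def downtime_seconds(availability_percent: float, period: str) -> float:
    """Return the downtime allowed in one period, in seconds."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * PERIOD_DAYS[period] * SECONDS_PER_DAY


if __name__ == "__main__":
    for pct in (99.9, 99.99, 99.999):
        minutes_per_year = downtime_seconds(pct, "year") / 60
        print(f"{pct}% -> {minutes_per_year:.2f} minutes of downtime per year")
```

Dividing the result by 60 or 3600 gives the value in minutes or hours.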

HA can be improved through fault tolerance, which is achieved with a hardware and software architecture whose parts work independently of one another. As a result, the failure of any single component does not compromise the entire system.
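One way to see why redundancy raises availability is to assume that redundant components fail independently. The system is then down only when every copy is down at the same time. The sketch below is a simplified model under that independence assumption, which real hardware only approximates; `redundant_availability` is an illustrative name.

```python
def redundant_availability(component_availability: float, copies: int) -> float:
    """Availability of `copies` redundant components when any one suffices.

    Assumes failures are independent, so the system is unavailable only
    when all copies are unavailable at once.
    """
    if not 0.0 <= component_availability <= 1.0:
        raise ValueError("component_availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1.0 - (1.0 - component_availability) ** copies


if __name__ == "__main__":
    for copies in (1, 2):
        combined = redundant_availability(0.999, copies)
        print(f"{copies} component(s) at 99.9% each -> {combined:.4%}")
```

Under this model, two components at 99.9% each combine to roughly 99.9999%. Shared dependencies such as power or firmware reduce that gain in practice.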

Understanding RPO and RTO

RPO (Recovery Point Objective) and RTO (Recovery Time Objective) are two of the most important parameters in a disaster recovery or data protection plan. These objectives guide organizations in choosing a suitable data backup plan.

RTO (Recovery Time Objective) is the maximum length of time an application can be down without causing significant harm to the business. High-priority applications may tolerate only a few seconds of downtime before customers are affected and business is lost. The more mission-critical the application, the shorter its RTO needs to be.

RPO (Recovery Point Objective) is the maximum amount of data loss that can be tolerated. It is expressed as the time that can elapse between the last data backup and the disaster without causing serious business loss. Mission-critical applications typically require an RPO of zero, meaning no data loss at all.
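As a rough illustration of how the two objectives are measured against an actual incident, the sketch below compares the data-loss window (failure time minus last backup) with the RPO, and the outage length (restore time minus failure time) with the RTO. The `RecoveryObjectives` class, the `meets_objectives` function, and the timestamps are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryObjectives:
    rto: timedelta  # longest tolerable outage
    rpo: timedelta  # longest tolerable window of lost data


def meets_objectives(objectives: RecoveryObjectives, last_backup: datetime,
                     failure: datetime, restored: datetime) -> dict:
    """Check one incident against the RPO and RTO."""
    if not last_backup <= failure <= restored:
        raise ValueError("expected last_backup <= failure <= restored")
    data_loss_window = failure - last_backup  # data written here is lost
    outage = restored - failure               # time the service was down
    return {
        "rpo_met": data_loss_window <= objectives.rpo,
        "rto_met": outage <= objectives.rto,
    }


if __name__ == "__main__":
    goals = RecoveryObjectives(rto=timedelta(minutes=5), rpo=timedelta(hours=1))
    result = meets_objectives(
        goals,
        last_backup=datetime(2024, 1, 1, 2, 0),
        failure=datetime(2024, 1, 1, 2, 45),
        restored=datetime(2024, 1, 1, 2, 48),
    )
    print(result)
```

An RPO of zero can be modeled as `timedelta(0)`, which is met only when nothing was written after the last protected copy.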

Requirements for HA Storage

The requirements for HA storage depend on three parameters: availability percentage, RTO (Recovery Time Objective), and RPO (Recovery Point Objective).

| HA Storage Type | Near HA | Native HA | True HA |
|---|---|---|---|
| Availability % (Downtime per Year) | 99.9% (8.76 hours) | 99.999% (5.26 minutes) | 99.9999% (31.5 seconds) |
| RTO (Recovery Time Objective) | < 5 minutes | < 30 seconds | < 30 seconds |
| RPO (Recovery Point Objective) | ≠ 0 | = 0 | = 0 |
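The three tiers can be read as a set of thresholds. The sketch below classifies a system by checking the strictest tier first. It treats the table's availability figures as minimums and reads "RPO ≠ 0" as "some data loss is tolerated"; both readings are assumptions, and `classify_ha` and `TIERS` are illustrative names.

```python
from typing import Optional

# (tier, minimum availability %, RTO must be below this many seconds,
#  whether RPO must be exactly zero). Strictest tier first.
TIERS = [
    ("True HA", 99.9999, 30, True),
    ("Native HA", 99.999, 30, True),
    ("Near HA", 99.9, 300, False),
]


def classify_ha(availability_percent: float, rto_seconds: float,
                rpo_seconds: float) -> Optional[str]:
    """Return the strictest tier the system satisfies, or None."""
    for name, min_availability, rto_limit, needs_zero_rpo in TIERS:
        if availability_percent < min_availability:
            continue
        if rto_seconds >= rto_limit:
            continue
        if needs_zero_rpo and rpo_seconds != 0:
            continue
        return name
    return None


if __name__ == "__main__":
    print(classify_ha(99.999, rto_seconds=20, rpo_seconds=0))
    print(classify_ha(99.9, rto_seconds=120, rpo_seconds=60))
```

A system that misses even the Near HA thresholds returns `None`.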

HA storage is a storage system that runs without interruption and provides at least 99.9% uptime. Redundancy is essential to HA storage because it eliminates SPOFs (Single Points of Failure). An HA storage array needs at least two controllers so that the second can take over if the first fails. The most basic requirements are fault-tolerant, redundant modular components such as PSUs, fan modules, and dual-port disk drive interfaces. Firmware updates with zero system downtime keep the storage in service during maintenance.

For disaster recovery, HA storage needs a redundant storage system that takes over the critical data and applications the business needs when the primary system goes offline. This takeover is called failover. With failover, tasks are automatically rerouted to the secondary system during planned or unplanned outages.

Users can build their HA design around their applications. Services that need higher availability percentages call for more complete mechanisms, which cost more because there is more to design and provision.

Take regular data backup as an example. It may require 99.9% uptime, and an RTO of 5 minutes is acceptable. If data is lost, resending it is also acceptable.
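Failover logic is often built around heartbeats: each node reports in periodically, and I/O is rerouted when the active node goes quiet for too long. The sketch below is a toy model of that idea, not a description of any particular product. The `StoragePair` class, the node names, and the 3-second timeout are all hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
class StoragePair:
    """Toy primary/secondary pair with heartbeat-based failover."""

    def __init__(self, heartbeat_timeout: float = 3.0):
        self.active = "primary"
        self.heartbeat_timeout = heartbeat_timeout  # seconds, assumed value
        self.last_heartbeat = {"primary": 0.0, "secondary": 0.0}

    def heartbeat(self, node: str, now: float) -> None:
        """Record that `node` reported in at time `now` (seconds)."""
        self.last_heartbeat[node] = now

    def is_healthy(self, node: str, now: float) -> bool:
        return now - self.last_heartbeat[node] <= self.heartbeat_timeout

    def route(self, now: float) -> str:
        """Return the node that should serve I/O at time `now`."""
        if not self.is_healthy(self.active, now):
            standby = "secondary" if self.active == "primary" else "primary"
            if not self.is_healthy(standby, now):
                raise RuntimeError("no healthy node available")
            self.active = standby  # failover: reroute to the standby node
        return self.active


if __name__ == "__main__":
    pair = StoragePair()
    print(pair.route(now=1.0))          # primary is still within its timeout
    pair.heartbeat("secondary", now=5.0)
    print(pair.route(now=5.0))          # primary has been silent for 5 seconds
```

This model does not fail back automatically: once the secondary is active, it stays active until it misses its own heartbeats.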

Mission-critical services such as enterprise email or large-scale surveillance require 99.999% uptime and cannot tolerate data loss. If the downtime is too long, the host may give up after too many retries and begin dropping I/O requests. When that happens, important purchase order emails may be lost, or images of critical moments may not be recorded.

For an online nonstop service, the conditions are stricter. An AFA (All-Flash Array) with RAID EE protection and a C2F (Cache-to-Flash) mechanism is suitable for demanding computing workloads and uninterrupted service.
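The link between recovery time and dropped I/O can be sketched as a retry budget: a host that retries a fixed number of times at a fixed interval keeps an I/O request alive only if storage comes back before the retries run out. The function name and the retry values below are hypothetical, and real host timeout settings vary by operating system and driver.

```python
def io_survives_failover(failover_seconds: float,
                         retry_interval_seconds: float,
                         max_retries: int) -> bool:
    """Return True if storage recovers before the host stops retrying."""
    if retry_interval_seconds <= 0 or max_retries < 0:
        raise ValueError("retry interval must be positive, retries non-negative")
    retry_budget = retry_interval_seconds * max_retries
    return failover_seconds < retry_budget


if __name__ == "__main__":
    # Assumed host policy: 5 retries, 6 seconds apart (a 30-second budget).
    for failover in (20, 90):
        survives = io_survives_failover(failover, 6, 5)
        print(f"failover takes {failover}s -> I/O survives: {survives}")
```

With this assumed policy, a 20-second failover stays inside the budget while a 90-second failover does not.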

HA Storage Comparison

Based on the three indicators of HA storage, let’s compare dual-controller storage with a 2-node storage cluster.

Dual Controller Storage vs. 2-Node Storage Cluster

| | Dual Controller Storage | 2-Node Storage Cluster |
|---|---|---|
| Availability % (Downtime per Year) | At least 99.999% (5.26 minutes) | 99.9% (8.76 hours) |
| RTO (Recovery Time Objective) | < 30 seconds | > 1 minute |
| RPO (Recovery Point Objective) | = 0 | ≠ 0 |

Dual-controller (active-active) storage offers at least 99.999% availability, an RTO under 30 seconds, and an RPO of zero (no data loss). A 2-node storage cluster with an active-passive architecture, however, cannot reach RPO = 0 because it lacks C2F, and its RTO may exceed 1 minute. As a result, its overall availability may be only 99.9%.

An active-active controller architecture serves storage requests on both controllers in parallel. It doubles the available host bandwidth and cache-hit rate, so no system resources sit idle. In addition, an all-in-one dual-controller system with dual-port SAS HDDs is cost-effective and easy to deploy compared to a two-node storage cluster.

Both architectures claim to be HA storage. Which would you choose?

Conclusion

Keeping critical applications online lets you keep doing business without losing revenue. A quality HA design builds customer trust by being online and available at all times. To judge whether a storage system is truly HA, check whether it meets the required availability percentage, RTO, and RPO.