
Which High Availability Setup is Best for Your Business: Cloud, On-Premise, or Hybrid?

Updated: Oct 9, 2024

High Availability (HA) Setup: Cloud vs. On-Premise vs. Hybrid

The setup of HA can vary significantly depending on whether it's implemented in the cloud, on-premise, or in a hybrid cloud environment. Each setup presents unique challenges and solutions, which we will explore in this article. 


Today’s digital businesses must ensure their services and products are available for operation and use as committed to or agreed with their customers. To fulfill this commitment, a business must define Key Performance Indicators (KPIs): measurable values that demonstrate how effectively the company is meeting its business commitments.


Modern infrastructure needs high availability to ensure it meets availability KPIs.

The choice comes down to how much availability you need to sustain your business goals, which are measured by these KPIs:

Service Level Agreements (SLAs): The agreement you make with your clients or users on service uptime and downtime.
Service Level Objectives (SLOs): The objectives you must hit to meet that agreement.
Service Level Indicators (SLIs): The real numbers on your performance.
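
To make the relationship between the three concrete, here is a minimal sketch that derives an availability SLI from request counts and compares it with an SLO. The function names and the request-based definition of availability are assumptions made for illustration; your SLI may well be time-based or measured differently.

```python
def availability_sli(successful_requests: int, total_requests: int) -> float:
    """Measured availability (the SLI) as a percentage of successful requests."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return 100.0 * successful_requests / total_requests


def meets_slo(sli_percent: float, slo_percent: float) -> bool:
    """True when the measured SLI is at or above the objective (the SLO)."""
    return sli_percent >= slo_percent


if __name__ == "__main__":
    # Hypothetical numbers: 1,000,000 requests, 50 of which failed.
    sli = availability_sli(999_950, 1_000_000)
    print(f"SLI: {sli:.3f}%  meets 99.99% SLO: {meets_slo(sli, 99.99)}")
```

The SLO is usually set a little stricter than the SLA, so that a missed objective is a warning rather than a breach of the agreement.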

For example (ref: https://uptime.is/), an SLA level of 99.99% uptime/availability results in the following periods of allowed downtime/unavailability:


  • Daily: 8s

  • Weekly: 1m 0s

  • Monthly: 4m 22s

  • Quarterly: 13m 8s

  • Yearly: 52m 35s
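
The figures above can be reproduced with a few lines of arithmetic. This is a sketch under two assumptions that are mine, not taken from the reference: a year is counted as 365.25 days (a month is one twelfth of that), and fractional seconds are dropped when formatting.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365.25  # assumption: average year length including leap years

PERIOD_SECONDS = {
    "Daily": SECONDS_PER_DAY,
    "Weekly": 7 * SECONDS_PER_DAY,
    "Monthly": DAYS_PER_YEAR * SECONDS_PER_DAY / 12,
    "Quarterly": DAYS_PER_YEAR * SECONDS_PER_DAY / 4,
    "Yearly": DAYS_PER_YEAR * SECONDS_PER_DAY,
}


def allowed_downtime_seconds(sla_percent: float, period: str) -> float:
    """Seconds of downtime permitted in one period at the given SLA level."""
    return PERIOD_SECONDS[period] * (100.0 - sla_percent) / 100.0


def format_downtime(seconds: float) -> str:
    """Format as 'Xm Ys' (or 'Ys' under a minute), dropping fractional seconds."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}m {secs}s" if minutes else f"{secs}s"


if __name__ == "__main__":
    for period in PERIOD_SECONDS:
        print(f"{period}: {format_downtime(allowed_downtime_seconds(99.99, period))}")
```

Changing the first argument to 99.9 or 99.999 shows how quickly the downtime budget grows or shrinks with each extra "nine".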


To achieve these KPIs, you need highly available infrastructure where all services and data are actively or passively available across multiple locations.


The High Availability (HA) Models:

Depending on how services and applications communicate between the various layers of infrastructure, the HA models can broadly be categorised as data driven or state driven:

👉 Data Driven: The data and services are actively replicated or mirrored across multiple sites, so that if one site encounters a failure the services are still available.

These can be active/active sites where both serve the load simultaneously, active and hot standby sites where one is primary while the other is an active mirror, or blue/green sites, mainly used for zero-downtime upgrades or deployments.

The choice really depends on what availability you are committing to for your services. Will they always be online (100% H/A), or can they have less than 100% availability?
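
To get a feel for what a second site buys you, here is a back-of-the-envelope sketch of the combined availability of redundant sites. It assumes the sites fail independently and that failover itself is instant and perfect, which real setups never fully achieve, so treat the result as an upper bound. The function name is my own.

```python
from typing import Iterable


def combined_availability(site_availabilities: Iterable[float]) -> float:
    """Availability (0..1) of a service that is up if at least one site is up.

    Assumes independent site failures and perfect, instant failover.
    """
    all_down = 1.0
    for availability in site_availabilities:
        if not 0.0 <= availability <= 1.0:
            raise ValueError("availability must be between 0 and 1")
        all_down *= 1.0 - availability  # probability that this site is down too
    return 1.0 - all_down


if __name__ == "__main__":
    # Hypothetical example: two sites, each available 99% of the time.
    print(f"{combined_availability([0.99, 0.99]):.4%}")
```

Under these assumptions, two sites at 99% each combine to roughly 99.99%, which is why active/active and hot standby designs are the usual route to the higher availability levels.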

👉 State Driven: In this model the stateful services are kept on one site, mostly on-premise, and the stateless services are kept on a cloud.

These kinds of H/A models are mostly seen in organisations where data privacy and localisation are requirements, such as banks, financial institutions or healthcare providers.

Irrespective of which H/A model suits you, one component plays a key role in any availability configuration: load balancers, for both load balancing and load sharing.


Load Balancing & Load Sharing:

You need load balancing and sharing between these multisite applications or clusters so that if one goes down the load switches seamlessly to another without any downtime for customers.

There are two kinds of load balancers available in most clouds:


Network or TCP load balancers: These work at layer 4 (the transport layer), distributing TCP connections based on IP addresses and ports without inspecting the application payload.

Application or HTTP load balancers: These work at layer 7 (the application layer), routing requests based on HTTP attributes such as host, path and headers.

Based on your use cases you will need one or both types.
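
Whichever type you use, the core idea is the same: spread the load over healthy targets and skip the ones that are down. Below is a minimal sketch of health-aware round-robin selection between sites. The `Site` and `RoundRobinBalancer` names are hypothetical, and a real load balancer would set the `healthy` flag from periodic health checks rather than by hand.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Site:
    name: str
    healthy: bool = True  # in practice updated by periodic health checks


class RoundRobinBalancer:
    """Rotates over sites in order, skipping any that are marked unhealthy."""

    def __init__(self, sites: List[Site]) -> None:
        if not sites:
            raise ValueError("at least one site is required")
        self._sites = list(sites)
        self._next = 0

    def pick(self) -> Site:
        # Try each site at most once, starting from the rotation pointer.
        for _ in range(len(self._sites)):
            site = self._sites[self._next]
            self._next = (self._next + 1) % len(self._sites)
            if site.healthy:
                return site
        raise RuntimeError("no healthy site available")


if __name__ == "__main__":
    dc, drc = Site("dc"), Site("drc")
    balancer = RoundRobinBalancer([dc, drc])
    print([balancer.pick().name for _ in range(3)])  # load is shared
    dc.healthy = False                               # simulate a site failure
    print([balancer.pick().name for _ in range(2)])  # load switches to drc
```

With both sites healthy this behaves as load sharing (active/active); once one site is marked unhealthy, all traffic moves to the survivor.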


Cloud vs On-Premise vs Hybrid Cloud


Cloud Availability Zones:

Every cloud provider hosts its physical infrastructure or hardware in multiple data centers within the same city or country. These data centers are referred to as availability zones (AZs), and the city or country they are in is referred to as a region.


The cloud lets you stretch infrastructure across these AZs. For example, you can create a database cluster that has nodes spread across multiple AZs.

For an active/active H/A setup, multi-AZ is a must on cloud. Always use multi-AZ when configuring your network, system or storage services on cloud.
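
To illustrate why the spread matters, here is a small sketch that places cluster nodes across zones round-robin and checks whether the cluster keeps a quorum if any single zone is lost. The zone names, node names and quorum size are made up for the example; managed cloud services normally handle placement for you once multi-AZ is enabled.

```python
from typing import Dict, List


def spread_across_zones(nodes: List[str], zones: List[str]) -> Dict[str, List[str]]:
    """Assign nodes to zones round-robin so that no zone holds far more than another."""
    if not zones:
        raise ValueError("at least one zone is required")
    placement: Dict[str, List[str]] = {zone: [] for zone in zones}
    for index, node in enumerate(nodes):
        placement[zones[index % len(zones)]].append(node)
    return placement


def survives_single_zone_loss(placement: Dict[str, List[str]], quorum: int) -> bool:
    """True if at least `quorum` nodes remain whichever single zone fails."""
    total = sum(len(members) for members in placement.values())
    return all(total - len(members) >= quorum for members in placement.values())


if __name__ == "__main__":
    nodes = ["db-1", "db-2", "db-3"]  # hypothetical three-node database cluster
    three_az = spread_across_zones(nodes, ["az-a", "az-b", "az-c"])
    two_az = spread_across_zones(nodes, ["az-a", "az-b"])
    print(three_az, survives_single_zone_loss(three_az, quorum=2))
    print(two_az, survives_single_zone_loss(two_az, quorum=2))
```

In this example the three-zone layout keeps two of three nodes after any zone failure, while the two-zone layout can be left with a single node and lose quorum.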


On-Premise DC/DRC:

Challenges:
  • Lack of Infrastructure-as-Code (IaC) tools for bare-metal virtualisation platforms, which limits the infrastructure automation needed to achieve the SLA and RTO/RPO benchmarks.

  • Dynamic, on-demand disk allocation is very hard, since storage sits on hardware storage area networks (SANs).

  • Setting up a fully automated containerisation platform on a bare-metal hypervisor.

  • Real-time data replication between two separate data centres.

  • Limited operations support for on-premise systems.


Solutions:
  • Hardware planning and procurement in line with growth projections.

  • Dedicated physical servers for the hypervisor.

  • Network topology with segregated VLANs, client-to-site VPNs and site-to-site VPN tunnels.

  • Infrastructure-as-Code combined with configuration management for on-premise hypervisors, systems and storage platforms.

  • Fully automated, self-managed server clusters with horizontal autoscaling, certificate management, and private DNS on virtual machines.


Hybrid Cloud Designs:

Challenges:

The biggest challenge here is to ensure that all the infrastructure and data are back up within the agreed RTO/RPO benchmarks. That means there has to be real-time data replication between multiple sites, for example a DC (Primary Datacenter) and a DRC (Disaster Recovery Center).

Real-time data replication between sites is very sensitive to network latency. A latency of less than 1 ms is the benchmark to aim for in this kind of inter-site communication, and the lower the better.
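
One simple way to keep an eye on this is to collect latency samples for the inter-site link and compare them with the budget. The sketch below only does the comparison; how you collect the samples (ping, TCP connect timing, your monitoring tool) is up to you, and the function name and report fields are my own invention.

```python
import statistics
from typing import Dict, List, Union


def latency_report(samples_ms: List[float], budget_ms: float = 1.0) -> Dict[str, Union[float, bool]]:
    """Summarise latency samples (in ms) and flag whether all stay under the budget."""
    if not samples_ms:
        raise ValueError("at least one latency sample is required")
    worst = max(samples_ms)
    return {
        "median_ms": statistics.median(samples_ms),
        "max_ms": worst,
        "within_budget": worst < budget_ms,  # strict: every sample must be under budget
    }


if __name__ == "__main__":
    # Hypothetical measurements between a DC and a DRC, in milliseconds.
    print(latency_report([0.4, 0.6, 0.5]))
    print(latency_report([0.4, 1.2]))
```

Checking the worst sample rather than the average is a deliberate choice here: replication stalls are caused by the slow outliers, not by the typical case.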

Solutions:

Use dedicated, direct private links between your clouds or sites for peer-to-peer connectivity and data transmission, especially when you are running your services in a hybrid cloud model. This is also important if you integrate with a large number of third-party APIs.

Another important reason why dedicated links are necessary is to ensure the security of data in transit. In one of my recent projects with a bank, it was necessary to have secured links between the bank's network and third-party networks such as the switching gateway and the central identity registry of the respective country. Hence we had to provision dedicated peer-to-peer links between these locations and the bank’s network.


Comparing the Solutions:

  • Cloud: The cloud excels in offering scalable, easily manageable H/A solutions with built-in redundancy and disaster recovery features. However, it requires careful management to avoid vendor lock-in and address latency issues.

  • On-Premise: On-premise solutions provide the highest level of control and data privacy but come with high costs and complexity in maintenance and scalability. Effective use of virtualisation and clustering can mitigate some challenges.

  • Hybrid Cloud: Hybrid cloud combines the benefits of both cloud and on-premise setups, offering flexibility and resilience. The main challenges lie in integration, data consistency, and security. Advanced management tools and robust networking solutions can significantly enhance hybrid cloud deployments.


Conclusion:


Cloud vs On-Premise vs Hybrid Cloud:

Setting up high availability varies greatly between cloud, on-premise, and hybrid environments. Cloud solutions offer ease of setup and scalability but require careful integration to prevent vendor lock-in. On-premise setups provide control and compliance but at the cost of higher complexity and expense. Hybrid cloud environments offer flexibility and resilience, balancing the strengths and weaknesses of cloud and on-premise setups. Organisations must evaluate their specific needs, resources, and expertise to choose the most suitable H/A strategy.

Key notes:

  • Create an active-active infrastructure across two or more locations.

  • Replicate data across these sites in real time.

  • Load balancing and load sharing are essential for ensuring the availability levels.

  • Dry run your H/A clusters at least once a year to ensure you are meeting the defined SLAs.


If you like this article, do like 👍 and share it in your network, and follow Kamalika Majumder for more.


 

©2024 by Staxa LLP. All Rights Reserved.
