![Horizontal Auto Scaling](https://static.wixstatic.com/media/981170_2f6dd585adc0434c9c5bf77537c3e065~mv2.gif/v1/fill/w_980,h_565,al_c,usm_0.66_1.00_0.01,pstr/981170_2f6dd585adc0434c9c5bf77537c3e065~mv2.gif)
No matter if you are a small startup or a big enterprise, you can't live on a blank cheque. Especially when it comes to your infrastructure spending, everybody needs a budget.
Infrastructure should scale up when demand rises and scale back down when demand falls.
Otherwise, you will end up paying for underutilised resources, which can become pretty expensive, sometimes even costing more than the revenue the service brings in.
And always remember: scaling infrastructure up or out is the easy part. The tricky part is scaling underutilised resources back down, and how hard that is depends on the scaling method you chose. Scaling infrastructure exponentially without optimisation is like building ghost towns where no one wants to live. In recent times, various organisations have had to make the tough call of rearchitecting their entire infrastructure because of overbearing opex. One such instance is Amazon Prime Video's migration from serverless to EC2 and ECS.
But not everyone is that lucky, or has enough resources, to dump their live infrastructure overnight just to save costs. However, such a situation can be prevented if you set up scale-ready infrastructure.
So what should you do?
Let’s first understand the two main types of scaling methods:
Vertical Scaling a.k.a. Scale Up:
Scaling by increasing the size of existing systems, that is, adding more resources such as CPU and memory. With this method you can allocate more resources to your infrastructure as and when needed, and those resources stay reserved by the respective systems.
For instance: upgrading a single VM by adding more RAM, disk or CPU. In most cases it's done on demand, when the system reaches its utilisation threshold.
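To make the threshold idea concrete, here is a minimal Python sketch of on-demand scale-up. The size ladder, the `next_size` helper and the 90% default threshold are all illustrative assumptions, not any cloud provider's catalogue or API.

```python
# Hypothetical size ladder: (name, vCPUs, memory in GiB).
# These names and numbers are made up for illustration.
SIZES = [
    ("small", 2, 4),
    ("medium", 4, 8),
    ("large", 8, 16),
    ("xlarge", 16, 32),
]


def next_size(current: str, cpu_utilisation: float, threshold: float = 0.9) -> str:
    """Return the size the VM should run on.

    Moves one rung up the ladder when utilisation has reached the
    threshold; otherwise keeps the current size.
    """
    names = [name for name, _, _ in SIZES]
    index = names.index(current)
    has_bigger_size = index < len(names) - 1
    if cpu_utilisation >= threshold and has_bigger_size:
        return names[index + 1]
    return current


print(next_size("small", 0.95))   # busy VM on the smallest size
print(next_size("xlarge", 0.99))  # busy VM already on the largest size
```

Notice the ceiling in this sketch: once a machine sits on the largest size, scaling up has nowhere left to go.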
Horizontal Scaling a.k.a. Scale Out:
In this case, instead of adding resources to existing systems, you increase the number of systems, hence the name scaling out or horizontal scaling.
For instance: if you have a Kubernetes cluster that reaches 90% utilisation during peak hours, scale out by increasing the number of worker nodes. You set a minimum and maximum number of nodes and a CPU/memory threshold, so that nodes are added when utilisation crosses the threshold and removed again when it drops back.
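The min/max/threshold setup described above can be sketched in a few lines of Python. This is only an illustration. It borrows the proportional rule that the Kubernetes Horizontal Pod Autoscaler documents for pods, `desired = ceil(current * currentMetric / targetMetric)`, and applies it to node counts, which is an assumption made here for simplicity; real node autoscalers may use other signals, such as pods that cannot be scheduled. The function name and all numbers are hypothetical.

```python
def desired_nodes(current_nodes: int, utilisation_pct: int, target_pct: int,
                  min_nodes: int, max_nodes: int) -> int:
    """Return how many worker nodes to run for the observed utilisation.

    Utilisation and target are whole percentages (e.g. 90 for 90%), so the
    arithmetic stays in integers and the rounding is exact.
    """
    # Integer ceiling division: smallest node count that brings average
    # utilisation down to (or below) the target.
    raw = -(-(current_nodes * utilisation_pct) // target_pct)
    # Never go below the configured minimum or above the maximum.
    return max(min_nodes, min(max_nodes, raw))


# Peak hours: 10 nodes at 90% utilisation, aiming for 60%.
print(desired_nodes(10, 90, 60, min_nodes=3, max_nodes=20))  # scales out
# Quiet hours: the same 10 nodes at 20% utilisation.
print(desired_nodes(10, 20, 60, min_nodes=3, max_nodes=20))  # scales down
```

The same function handles both directions, so scaling down needs no separate code path. The minimum keeps a baseline of capacity running, and the maximum caps the bill.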
Let's compare the two approaches, vertical vs horizontal scaling:
| Vertical Scaling a.k.a. Scale Up | Horizontal Scaling a.k.a. Scale Out |
| --- | --- |
| With vertical scaling (a.k.a. "scaling up"), you add more power to your existing machine. Once allocated, these resources cannot be released or reallocated without decommissioning the system. Scaling up most often requires downtime to prevent data corruption. | With horizontal scaling (a.k.a. "scaling out"), you get the additional resources by adding more machines to your network, sharing the processing and memory workload across multiple devices. The resources are shared across the full cluster. This can be completely automated and causes no downtime for the services running on the systems. |
| The cost of a machine increases as you upgrade its size, and since the resources are dedicated to individual systems, costly resources may end up underutilised. | Cost stays optimised because resource allocation is shared and flexible: the number of nodes can be scaled out and back down on demand. |
| This may create "mutable infrastructure", which can become a bottleneck for future upgrades. | For horizontal autoscaling, systems must be treated as commodity items which can be created and destroyed on demand. |

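
The cost row of the comparison can be illustrated with some rough arithmetic. The Python sketch below compares capacity sized for the daily peak and left running all day with capacity that follows demand hour by hour. Every number in it (the demand profile, node capacity, price) is made up, and it assumes a scaled-up machine costs the same as the equivalent number of small nodes, which will not hold exactly for real price lists.

```python
import math

# Hypothetical demand per hour over one day, in arbitrary "units of work".
HOURLY_DEMAND = [20] * 8 + [60] * 4 + [100] * 4 + [60] * 4 + [20] * 4

NODE_CAPACITY = 25  # units of work one small node can handle (assumed)
NODE_PRICE = 1.0    # price of one small node for one hour (assumed)
MIN_NODES = 1       # never scale below this


def peak_sized_cost(demand, node_capacity, node_price):
    """Cost when capacity is sized for the peak and kept all day."""
    peak_nodes = math.ceil(max(demand) / node_capacity)
    return peak_nodes * node_price * len(demand)


def autoscaled_cost(demand, node_capacity, node_price, min_nodes):
    """Cost when the node count follows demand every hour."""
    return sum(
        max(min_nodes, math.ceil(units / node_capacity)) * node_price
        for units in demand
    )


fixed = peak_sized_cost(HOURLY_DEMAND, NODE_CAPACITY, NODE_PRICE)
scaled = autoscaled_cost(HOURLY_DEMAND, NODE_CAPACITY, NODE_PRICE, MIN_NODES)
print(f"peak-sized: {fixed}, autoscaled: {scaled}")
```

With these invented numbers the peak-sized setup costs 96.0 and the autoscaled one 52.0 for the day. The gap is the capacity that sits idle outside peak hours.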
Summary: Vertical vs Horizontal Scaling
Recommendation: Scale out instead of scaling up.
Use automatic horizontal scaling so you can scale out and scale down based on performance thresholds, especially in cloud-managed compute cluster services.
This will help optimise resource utilisation in modern infrastructure and avoid scenarios of overspending or underutilisation.
The aim of almost every business is to grow. That's why it's baffling to find so many infrastructure services not equipped to scale with that growth.
The 10-Factor Infrastructure is designed for this from the get-go, meaning that however big your business gets, there'll be no need for any expensive, large-scale infrastructure changes later.
If you like this article, I am sure you will find 10-Factor Infrastructure even more useful. It compiles all these tried and tested methodologies, design patterns & best practices into a complete framework for building secure, scalable and resilient modern infrastructure.
Don’t let your best-selling product suffer due to an unstable, vulnerable & mutable infrastructure.
Thanks & Regards
Kamalika Majumder