Immutable Infrastructure For Continuous Delivery

May 22, 2024 · 4 min read

Updated: May 24, 2024

If the recent pandemic has taught us anything, it is that whatever mutates becomes untraceable and unmanageable. The same goes for infrastructure, especially the services used for compute or, to be precise, your app servers. In this article I share a real experience of what can go wrong when immutable infrastructure is missing or misunderstood.

Continuous Delivery in a Bank:

It happened a couple of years back: a bank went live with its digital banking solution on infrastructure that was actually meant as a POC setup for testing a vendor solution. Although it was a containerised platform, everything from services to databases was hosted on the same virtual machines, with stateful data on persistent volumes.

To top it off, these systems were configured manually. This was most common during incident management: in a hurry to fix an issue and get things back up, engineers made the mistake of changing configurations directly on the local filesystems. Then they forgot what had been done, and the server became another "work of art". Such practices often create single points of failure and worsen the organisation's truck factor, because knowledge of how a system really works ends up with only one or two people.

The result was at least a hundred incidents reported daily (and I am not exaggerating) that were attributed to infrastructure, sometimes even when infrastructure was not at fault. Distrust of the infrastructure grew so much that even if the internet went down in the city, people would blame the app servers.

So how did we tackle these challenges and build a resilient, robust banking infrastructure fit for a successful digital transformation? Here are some of the initiatives:

The Initiatives:

The first step was to stabilise the underlying compute layer where the application was running and to segregate the compute and data layers.
Once the compute layer was set up, we took on the other layers supporting continuous integration from dev to prod, to achieve a highly stable and scalable toolchain that would form the backend of the Continuous Delivery platform being built.
We enabled high availability for data storage using network file storage with geo-replication across data-centers.
We redesigned the entire deployment model for the containerisation platform and built a solution that could spin up a containerisation cluster stretched across two data-centers within 20 minutes. This was in 2017, when no cloud was available in that region.
We automated deployment of the containerisation cluster using configuration management tools within a CI pipeline.
We set up robust monitoring and alerting, with reports generated for infrastructure utilisation and service health checks.
We ensured prod readiness by running heavy stress tests on the platform for a week prior to application onboarding. Once the results were confirmed, it was released to production in a day.

The Outcomes:

Always-on Platform as a Service for digital banking products.
Zero downtime for the application and service layers.
Scale-out on demand.
Geo-replication of data across DC and DR.
RTO and RPO of 1 hour for all backend tools.

As a result, we were able to build an always-on PaaS for banking and financial services for that bank.

For successful and sustainable digital transformation, you need immutable systems that can deliver environments on demand for continuous software development and delivery. In fact, this was one of the key driving factors behind the whole DevOps movement.

Modern infrastructure has evolved a great deal since then, in a journey that started with running applications on bare metal, moved to virtual machines and has arrived at present-day containers. However, one mistake that keeps happening is failing to segregate the compute and data layers. I continue to see organisations host both stateful and stateless services on the same system, sometimes even databases, backups and logs too. It is almost like axing the branch you are sitting on.

Compute systems must be treated as commodity items that can be created and destroyed on demand. They must not have any critical data stored on them. Once you make them stateful, they become a "work of art", sensitive to any change or upgrade.
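To make the "create and destroy on demand" idea concrete, here is a minimal Python sketch. It is only an illustration under assumed names: the `Server` class, the `provision` and `roll_out` helpers and the image names such as `app-base-1.0.0` are hypothetical and do not come from the bank's actual platform. The point it shows is that an upgrade builds new servers from a new image version instead of patching the existing ones.

```python
from dataclasses import dataclass
from itertools import count

# Hypothetical ID generator standing in for whatever assigns server identities.
_ids = count(1)


@dataclass(frozen=True)
class Server:
    """A compute node that cannot be modified after it has been created."""
    server_id: int
    image_version: str


def provision(image_version: str) -> Server:
    """Create a fresh server from a versioned machine image."""
    return Server(server_id=next(_ids), image_version=image_version)


def roll_out(fleet: list, new_image_version: str) -> list:
    """Replace every server in the fleet with a new one built from the new image.

    The old servers are discarded, never patched in place.
    """
    return [provision(new_image_version) for _ in fleet]


# Build a small fleet, then upgrade it by replacement.
fleet = [provision("app-base-1.0.0") for _ in range(3)]
upgraded = roll_out(fleet, "app-base-1.1.0")
```

Because the dataclass is frozen, an attempt to "hot fix" a server in place, such as assigning a new `image_version`, raises an error. That is the behaviour you want from a commodity compute node: the only way to change it is to replace it.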

Key Pointers for Immutable Infrastructure:

👉 Standardise the configuration across all environments by ensuring that the compute servers are built from standard, version-controlled machine images.

👉 Keep separate disk partitions for the root operating system and for the apps running on it. This will allow for zero-downtime upgrades and config changes.

👉 Always automate base image creation. You can customise the base system image as required. Store the base machine images in a centralised repository like Nexus or Artifactory, and version them so they can be upgraded later.

👉 Beware of licensing terms: use only official open source images if you have not purchased a license, and check the license terms or EULA before downloading.

👉 Once standardised in size, volume and configuration, these servers can be scaled out and back in as demand rises and falls. This also prevents over- or under-utilised infra resources.

👉 With a standard system package you can build as many virtual machines as you want with the same config. This gets rid of the most common excuse we have all heard: “it works on my machine but not in prod”.

👉 Last but not least, do not host your databases or stateful services on the same systems as the stateless or compute ones. Although various platforms and services market themselves as able to host anything, in reality they are not yet scalable enough to handle data storage. It is better to choose dedicated database solutions when selecting managed services.
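The pointers on standard, versioned images can be sketched in a few lines of Python. This is a rough illustration, not a description of any particular tool: the spec fields, the `image_fingerprint` and `find_drift` helpers and the environment names are all assumptions made for the example. The idea is to derive a deterministic fingerprint from an image specification and to flag any environment whose image does not match the approved one.

```python
import hashlib
import json


def image_fingerprint(spec: dict) -> str:
    """Return a short, deterministic hash of an image specification."""
    # Sorting keys makes the hash independent of dictionary ordering.
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def find_drift(environments: dict, expected: str) -> list:
    """List the environments whose image fingerprint differs from the expected one."""
    return sorted(name for name, fp in environments.items() if fp != expected)


# Hypothetical base image specification kept under version control.
base_spec = {
    "os": "example-linux-8",
    "packages": ["openssl", "python3"],
    "version": "1.1.0",
}
approved = image_fingerprint(base_spec)

# Simulate a prod server whose image was changed outside the pipeline.
patched_spec = dict(base_spec, packages=["openssl", "python3", "debug-tools"])
environments = {
    "dev": approved,
    "staging": approved,
    "prod": image_fingerprint(patched_spec),
}

drifted = find_drift(environments, approved)
print("Environments out of line with the approved image:", drifted)
```

In this sketch only `prod` is reported, because its spec gained an extra package. A check along these lines could run in a CI pipeline so that a manually altered image is caught before it turns into another "work of art".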

If you like this article, I am sure you will find 10-Factor Infrastructure even more useful. It compiles all these tried and tested methodologies, design patterns & best practices into a complete framework for building secure, scalable and resilient modern infrastructure.

Don’t let your best-selling product suffer due to an unstable, vulnerable & mutable infrastructure.

Be fit to launch & scale on a compliance ready cloud from Day 1 with 10factorinfra

Thanks & Regards

Kamalika Majumder
