
Various challenges tend to create bottlenecks in the process of secure, scalable, sustainable, consistent and reliable development & delivery. Most of them are related to the compute systems that we use to build, test and deploy our applications, such as:
Exponentially mutable infrastructure caused by manual changes, making each server a “work of art” in itself. Testing such systems takes forever because of inconsistencies between prod and non-prod.
Then there are stateful compute systems where data gets stored on the local disk. This practice not only increases the probability of data loss but also leads to prolonged downtime for even minor changes like upgrades, backups and restoration, lowering the availability of the entire compute layer.
Scaling such systems is time-consuming and costly, since they can only be scaled vertically up to a certain limit of CPU and RAM.
These issues can easily be avoided if we follow these best practices.
How to build modern systems fit for continuous software development & delivery:
Immutable Systems:
If the recent pandemic has taught us something, it's that anything that mutates becomes untraceable and unmanageable.
Compute servers must be considered commodity items that can be created and destroyed on demand. They must not have any critical data stored on them.
In order to standardise configuration across all environments, ensure that compute servers are built from standard, version-controlled machine images. Keep separate disk partitions for the root operating system and the applications running on it; this allows zero-downtime upgrades and configuration changes. Customise the base system image as required, and always automate base image creation. Store the base machine images in a centralised artifact repository like Nexus or Artifactory, and version them so they can be upgraded later.
Beware of licensing terms: use only official open source images if you have not purchased a license, and check the open source EULA before downloading.
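To make the image-baking step concrete, here is a minimal sketch of creating and version-tagging a base image on AWS with boto3. The instance ID, image name and version scheme are illustrative assumptions, not prescriptions from this article; a tool like Packer achieves the same with more structure.

```python
# Minimal sketch: bake a versioned machine image from a staging instance on AWS.
# Assumes boto3 credentials are already configured; the instance ID and naming
# scheme below are illustrative placeholders.
import boto3

ec2 = boto3.client("ec2")

def bake_base_image(instance_id: str, version: str) -> str:
    """Create an AMI from a prepared staging instance and tag it with a version."""
    response = ec2.create_image(
        InstanceId=instance_id,
        Name=f"base-centos-{version}",
        Description=f"Standard base image, version {version}",
    )
    image_id = response["ImageId"]
    # Version tags let every environment reference exactly the same image.
    ec2.create_tags(
        Resources=[image_id],
        Tags=[
            {"Key": "Version", "Value": version},
            {"Key": "ManagedBy", "Value": "image-pipeline"},
        ],
    )
    return image_id

if __name__ == "__main__":
    print(bake_base_image("i-0123456789abcdef0", "1.4.0"))
```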
Once standardised in size, volume and configuration, these servers can be scaled out and back in as demand rises and falls. This also prevents over- or under-utilised infrastructure resources.
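As a rough illustration of on-demand scaling with such standardised servers, the sketch below adjusts the desired capacity of an assumed AWS Auto Scaling group; the group name and capacity numbers are placeholders, not values from this article.

```python
# Minimal sketch: scale a fleet of identical servers out and back in using an
# AWS Auto Scaling group built from the standard machine image.
import boto3

autoscaling = boto3.client("autoscaling")

def scale_fleet(group_name: str, desired: int) -> None:
    """Adjust the number of identical instances in the group."""
    autoscaling.set_desired_capacity(
        AutoScalingGroupName=group_name,
        DesiredCapacity=desired,
        HonorCooldown=False,  # apply immediately; respect cooldowns in production
    )

# Scale out for peak demand, then back in once traffic drops.
scale_fleet("web-fleet", desired=10)
scale_fleet("web-fleet", desired=3)
```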
With a standard system package you can build as many virtual machines as you want with the same config. This gets rid of the most common excuse we have all heard:
“it works on my machine but not in prod”.
But wait, we are not done yet. Like constructing a house, we have only built the rooms; they still need the interior that turns a house into a home, and this interior design of systems is called “Configuration Management”.
Configuration Management:
We need to customise the software on these systems, for example:
OS Upgrades:
Let’s say you have a CentOS 8.1 system image and around 50 virtual machines created from it to run apps and services. Some months later a new security patch needs to be applied. It is much more efficient to use version-controlled configuration management to apply the patch on top of the running systems, keeping track of what changed in the state data.
Likewise, activities such as CPU and memory upgrades or disk extensions can also be done through the configuration management system without recreating the whole server, preventing downtime and saving time and effort. The patching process can be triggered on demand if required.
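As one hedged example, the sketch below pushes a security-only update to a small fleet over SSH. The host names are placeholders, and a real setup would normally drive this through a configuration management tool such as Ansible, Chef or Puppet rather than raw SSH.

```python
# Minimal sketch: apply security patches across a fleet without rebuilding the
# servers, in the spirit of version-controlled configuration management.
# The inventory and SSH access are assumptions for illustration only.
import subprocess

HOSTS = ["app-01.internal", "app-02.internal"]  # illustrative inventory

def apply_security_updates(host: str) -> int:
    """Run only security-related updates on a CentOS/RHEL 8 host via SSH."""
    result = subprocess.run(
        ["ssh", host, "sudo dnf upgrade --security -y"],
        capture_output=True,
        text=True,
    )
    print(f"{host}: exit code {result.returncode}")
    return result.returncode

if __name__ == "__main__":
    for host in HOSTS:
        apply_security_updates(host)
```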
FinOps:
Compute systems are the first layer where you will start noticing operational expenses, since they are the first chargeable service you will encounter when building cloud infrastructure today (a basic network setup on cloud is negligible).
Every business in today's digitally transformed world aims to become the bestseller in its domain by delivering quality apps and services. However, none of them runs without a budget.
Choose a subscription/prepaid/commitment billing model instead of the default pay-as-you-go plans when provisioning compute instances, even if it's only for a month or so of testing. Trust me, you will see the difference within a month.
Clouds usually provide a wide variety of machine types and sizes, so much so that you may be spoilt for choice. For optimised operations, define a standard instance family with T-shirt sizing and create machines based on the performance they demand. For instance, you can classify them as CPU intensive, memory intensive, I/O intensive and so on, and select only one family for each requirement. That way both FinOps and operations stay manageable.
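Here is a minimal sketch of both ideas: a fixed, T-shirt-sized catalogue of instance families, and a rough pay-as-you-go versus commitment comparison. All instance names, rates and discount figures below are hypothetical.

```python
# Minimal sketch of the two FinOps ideas above: a T-shirt-sized catalogue of
# approved instance families, and a rough comparison of pay-as-you-go versus a
# committed/prepaid rate. All prices and instance names are hypothetical.

# One approved family per workload profile keeps both operations and billing simple.
INSTANCE_CATALOGUE = {
    "cpu-intensive":    {"S": "c5.large", "M": "c5.xlarge", "L": "c5.2xlarge"},
    "memory-intensive": {"S": "r5.large", "M": "r5.xlarge", "L": "r5.2xlarge"},
    "io-intensive":     {"S": "i3.large", "M": "i3.xlarge", "L": "i3.2xlarge"},
}

def monthly_cost(hourly_rate: float, hours: int = 730) -> float:
    """Approximate monthly cost for an always-on instance."""
    return hourly_rate * hours

# Hypothetical hourly rates: on-demand vs. an assumed ~40% commitment discount.
on_demand = monthly_cost(0.17)
committed = monthly_cost(0.17 * 0.60)

print(f"On-demand : ${on_demand:.2f}/month")
print(f"Committed : ${committed:.2f}/month")
print(f"Saving    : ${on_demand - committed:.2f}/month per instance")
```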
Hardening & Patching:
Images, OSes and platforms must come from a trusted vendor or approved repository and must be hardened (no third-party downloads). All servers, appliances and devices must enforce a complex password policy, and they must be scanned monthly for vulnerabilities.
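By way of illustration, the sketch below performs one small hardening check, validating the password-ageing policy in /etc/login.defs on a Linux server. The thresholds are assumed baseline values, and a monthly vulnerability scan would of course use a dedicated scanner rather than a script like this.

```python
# Minimal sketch: a basic hardening check for password policy settings in
# /etc/login.defs. Threshold values are illustrative assumptions.
from pathlib import Path

POLICY = {
    "PASS_MAX_DAYS": ("max", 90),  # passwords must expire within 90 days
    "PASS_MIN_LEN": ("min", 14),   # passwords must be at least 14 characters
}

def check_password_policy(path="/etc/login.defs"):
    """Return a list of policy violations found in login.defs."""
    settings = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            parts = line.split()
            if len(parts) >= 2:
                settings[parts[0]] = parts[1]

    violations = []
    for key, (kind, limit) in POLICY.items():
        raw = settings.get(key)
        if raw is None or not raw.isdigit():
            violations.append(f"{key}: missing or non-numeric (found {raw})")
        elif kind == "max" and int(raw) > limit:
            violations.append(f"{key}: {raw} exceeds maximum {limit}")
        elif kind == "min" and int(raw) < limit:
            violations.append(f"{key}: {raw} below minimum {limit}")
    return violations

if __name__ == "__main__":
    for violation in check_password_policy():
        print("FAIL:", violation)
```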
Summary: Systems For Scale
Compute systems that are easy to scale out on demand during peak hours.
Servers that are secure enough to host public-facing applications.
No overspending or underspending on infrastructure.
These can only be achieved if you have:
Immutable systems and stateless compute servers, built from version-controlled machine images standardised across all environments and enhanced with automated horizontal scaling on demand.
Regular validation of system hardening, patching and upgrades through configuration management systems.
Prepaid hosting plan for servers.
If you like this article, I am sure you will find 10-Factor Infrastructure even more useful. It compiles all these tried and tested methodologies, design patterns & best practices into a complete framework for building secure, scalable and resilient modern infrastructure.
Don’t let your best-selling product suffer due to an unstable, vulnerable & mutable infrastructure.
Get Compliance Ready Cloud From Day 1
For Startups & Enterprises
Thanks & Regards
Kamalika Majumder