![Codified Cloud Migration of A Digital Bank](https://static.wixstatic.com/media/981170_b437956bec4d4415994f36e209e301cc~mv2.gif/v1/fill/w_980,h_551,al_c,usm_0.66_1.00_0.01,pstr/981170_b437956bec4d4415994f36e209e301cc~mv2.gif)
One of my clients, an Indonesia-based banking company, wanted to build a fully digital mobile banking application that would help them become a strong tech-based bank embedded in Indonesia’s digital ecosystem.
They wanted a fully automated, secure, scalable and sustainable cloud-native infrastructure for this tech-based, life-centric finance and banking application.
This was during Q4 2019, and the major cloud providers had yet to launch in Indonesia. Our only option was a newly launched cloud provider with limited managed service options, so the client was compelled to go live on this cloud around mid-2020. As it happened, soon after that first launch, one of the leading cloud providers started operations in Indonesia and launched its first multi-zonal region. It offered a better (discounted) pricing model and advanced features that were needed for the further scale and sustainability of the digital banking and financial services business.
So it was decided that the live digital banking and financial services applications would be migrated from Cloud A to Cloud G, with the objectives of infrastructure upgrade, consolidation and cost optimisation.
The goal was to build a parallel infrastructure on Cloud G and migrate the environments in a phased manner, while continuing business as usual on Cloud A.
Since this meant paying for infrastructure on both clouds while the data kept growing, the timeline was capped at 4 months. Even at the planning stage we knew there were several challenges and blockers ahead, as follows:
The Challenges:
A constantly growing data size. When we started building the parallel infrastructure, the data was in the hundreds of GBs, and by the day of migration it had grown to a few TBs, which was practically impossible to copy over a standard internet connection. On top of that, it came from a variety of sources: databases, object storage, VM images, videos and more.
The migration had a very tight schedule of 4 months, to avoid overspending on parallel cloud infrastructure.
There were some significant differences between the two clouds, especially in their physical network topology, cloud administration and API access control.
Because of this, we had to design an architecture that would be logically the same on both clouds in terms of secure and seamless connectivity between services.
In banking and financial services, some operations need specific software running on specific systems. That is why there was a mix of cloud instances running both Windows and Linux operating systems.
The RTO was defined as 4 hours, meaning the production migration of several terabytes of stateful assets (files, images, videos) across clouds had to be completed within a 4-hour maintenance window.
Codified Cloud Migration:
Effective and Optimised Time Utilisation:
Looking at the sheer scale and the limited timeline, for both the infrastructure (4 projects with 4 environments each, within 4 months) and the production data (3 TB in 4 hours), we knew that nothing could be achieved with manual processes. So it was decided that everything had to be codified, even a simple file or folder copy, since the same steps would be repeated and reused across all projects.
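To put the data constraint in numbers, here is a back-of-the-envelope sketch of the sustained throughput needed to move 3 TB in 4 hours. It assumes decimal units (1 TB = 10^12 bytes), and `required_throughput` is a hypothetical helper name used only for illustration. A real transfer would also need headroom for retries and verification.

```python
def required_throughput(total_bytes: float, window_seconds: float) -> dict:
    """Return the sustained rate needed to move total_bytes within window_seconds."""
    bytes_per_second = total_bytes / window_seconds
    return {
        "MB_per_s": bytes_per_second / 1e6,
        "Gbit_per_s": bytes_per_second * 8 / 1e9,
    }


TB = 1e12  # assumption: decimal terabytes
rate = required_throughput(3 * TB, 4 * 3600)
print(f"{rate['MB_per_s']:.0f} MB/s, {rate['Gbit_per_s']:.2f} Gbit/s")
# prints: 208 MB/s, 1.67 Gbit/s
```

Sustaining that rate for the whole window, with no interruption, is not something a hand-run copy can be trusted to do.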
Enforce traceability and auditability through pipeline-as-code
Being a bank, the client had to pass various compliance and regulatory checks, and since this was a change of infrastructure, every step and process had to be documented and audit-ready so that nothing was missed. In fact, there had to be a DR and rollback plan in case of any emergency.
Codifying every step involved, from infrastructure creation and configuration to service deployment and data migration, provided the auditability and traceability that had to be proven to the auditors. In fact, it was as easy as triggering the pipelines in stages and screen-recording or screenshotting the process.
We could even go back and capture the pipeline data and history if it needed to be presented at a later stage.
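As an illustration of what a codified audit trail might look like, the sketch below records one entry per pipeline stage and exports the history as JSON Lines. The function names, the stage names and the `banking-prod` pipeline name are all hypothetical. In practice the CI/CD tool's own run history would usually play this role.

```python
import json
from datetime import datetime, timezone


def record_stage(log: list, pipeline: str, stage: str, status: str) -> dict:
    """Append one audit entry describing a pipeline stage run."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "pipeline": pipeline,
        "stage": stage,
        "status": status,
    }
    log.append(entry)
    return entry


def export_audit_log(log: list) -> str:
    """Serialise the history as JSON Lines, one entry per line, for later review."""
    return "\n".join(json.dumps(entry, sort_keys=True) for entry in log)


audit_log = []
# Hypothetical stage names, recorded in the order they ran.
for stage in ("provision_infra", "deploy_services", "migrate_data"):
    record_stage(audit_log, "banking-prod", stage, "succeeded")

print(export_audit_log(audit_log))
```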
Idempotent Setup - One design multiple implementations
A bank can have multiple project systems, such as banking, finance, teller, ETL and so on.
With modular infrastructure-as-code and an infrastructure provisioner written as pipeline-as-code, we just had to build a project-specific pipeline from the standard template and the source modules, and then run these pipelines in parallel.
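The "one design, multiple implementations" idea can be sketched roughly as follows: one shared template, instantiated once per project and environment, with the resulting pipelines run in parallel. The environment names, stage names and `run_pipeline` stand-in are assumptions for illustration, not the actual setup used in this migration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Project names follow the examples in the text; environment names are assumed.
PROJECTS = ["banking", "finance", "teller", "etl"]
ENVIRONMENTS = ["dev", "sit", "uat", "prod"]

# One standard template; only the parameters differ per project/environment.
PIPELINE_TEMPLATE = {
    "stages": ["provision_network", "provision_compute", "deploy_services", "migrate_data"],
}


def build_pipeline(project: str, env: str) -> dict:
    """Instantiate the shared template for one project/environment pair."""
    return {
        "name": f"{project}-{env}",
        "params": {"project": project, "env": env},
        "stages": list(PIPELINE_TEMPLATE["stages"]),
    }


def run_pipeline(pipeline: dict) -> str:
    """Stand-in for triggering a real pipeline; here it only walks the stages."""
    for _stage in pipeline["stages"]:
        pass  # a real runner would invoke the IaC module for this stage
    return f"{pipeline['name']}: ok"


pipelines = [build_pipeline(p, e) for p, e in product(PROJECTS, ENVIRONMENTS)]

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_pipeline, pipelines))

print(len(results))  # prints: 16
```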
In fact, one of the approaches we adopted for data migration was to codify the database and file migrations so that we could batch-transfer most of the data a day or two before the actual migration and switch-over day. Then, on the day of the production migration, we just had to re-run the data migration pipelines to copy the deltas. That saved us from a major risk of data corruption, which could have materialised had we initiated the full 3 TB transfer all at once.
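A minimal sketch of the bulk-then-delta idea, using local directories as a stand-in for the source and destination clouds. The function names are hypothetical, and comparing SHA-256 digests is just one possible way to detect changes; the actual pipelines may well have relied on database replication or provider tooling instead.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in chunks so large files do not fill memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def sync_deltas(source: Path, dest: Path) -> list[str]:
    """Copy only files that are missing or different at dest; return what was copied."""
    copied = []
    for src_file in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src_file.relative_to(source)
        dst_file = dest / rel
        if dst_file.exists() and file_digest(dst_file) == file_digest(src_file):
            continue  # already migrated and unchanged
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)
        copied.append(rel.as_posix())
    return copied
```

Because the function is safe to run repeatedly, the same call serves both purposes: the first run a day or two ahead moves the bulk of the data, and the run inside the maintenance window copies only what changed since.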
Lessons Learnt:
Always be realistic rather than optimistic about the timeline. Add comfortable buffers for the unknowns.
Avoid lift & shift. Most often, people complicate the migration process by trying to move whole systems around, which is unnecessary. Irrespective of administrative, networking or service differences, the standard compute virtualisation layer is pretty much the same on any cloud. All you need to do is migrate the data to a cloud instance with similar CPU, memory, I/O and operating system.
Do not attempt manual data migration from your local system or an SSH console; if you run into a network issue, the entire process has to be re-initiated. Codify the steps and run them through a CI/CD agent pipeline that has a direct or better connection to both the source and the destination clouds.
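To illustrate why a codified step copes better with network issues than a hand-run copy, here is a rough sketch of a transfer that resumes from where it stopped and retries on failure. It uses local files as a stand-in for a remote transfer, the function names are hypothetical, and it assumes the source file does not change between attempts.

```python
import time
from pathlib import Path


def resumable_copy(src: Path, dst: Path, chunk_size: int = 1 << 20) -> int:
    """Copy src to dst, continuing from however many bytes dst already holds."""
    offset = dst.stat().st_size if dst.exists() else 0
    with src.open("rb") as fin, dst.open("ab") as fout:
        fin.seek(offset)  # skip the part that was already transferred
        while chunk := fin.read(chunk_size):
            fout.write(chunk)
            offset += len(chunk)
    return offset


def copy_with_retries(src: Path, dst: Path, attempts: int = 5,
                      backoff_seconds: float = 2.0) -> int:
    """Retry resumable_copy on I/O errors, waiting longer after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return resumable_copy(src, dst)
        except OSError:
            if attempt == attempts:
                raise  # give up and let the pipeline mark the step as failed
            time.sleep(backoff_seconds * attempt)
    raise AssertionError("unreachable")
```

A failed attempt leaves the partial file in place, so the next attempt only sends the remaining bytes instead of starting over.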
Last but not least, don't just write code; develop modular infrastructure-as-code. That means creating independent source modules that can be reused multiple times in parallel. Parametrisation and idempotency are a must.
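A toy example of what "parametrised and idempotent" means in practice: the module takes its inputs as parameters, and running it a second time changes nothing. `FakeCloud`, `ensure_bucket`, the bucket names and the region name are purely illustrative stand-ins, not a real provider API.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCloud:
    """In-memory stand-in for a cloud API; purely illustrative."""
    buckets: dict = field(default_factory=dict)


def ensure_bucket(cloud: FakeCloud, name: str, region: str) -> bool:
    """Create or update the bucket only if needed; return True when something changed."""
    desired = {"region": region}
    if cloud.buckets.get(name) == desired:
        return False  # already in the desired state, nothing to do
    cloud.buckets[name] = desired
    return True


cloud = FakeCloud()
projects = ("banking", "teller")
first_run = [ensure_bucket(cloud, f"{p}-assets", "region-1") for p in projects]
second_run = [ensure_bucket(cloud, f"{p}-assets", "region-1") for p in projects]
print(first_run, second_run)  # prints: [True, True] [False, False]
```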
If you like this article, I am sure you will find 10-Factor Infrastructure even more useful. It compiles all these tried and tested methodologies, design patterns & best practices into a complete framework for building secure, scalable and resilient modern infrastructure.
Don’t let your best-selling product suffer due to an unstable, vulnerable & mutable infrastructure.
Thanks & Regards
Kamalika Majumder