Resolve Challenges Scaling IaC
Infrastructure as Code (IaC) has made managing infrastructure easier in a lot of ways, but there are many challenges that companies accept as the cost of adopting IaC, especially when scaling. My previous article digs into these challenges to try to understand them better. If you haven’t already, I would highly recommend reading that first.
This article will introduce Environment as Code (EaC) and how it helps resolve those challenges and makes managing environments easier across any cloud provider or even on-premises.
Before we get started, let’s define a few terms that we will be using throughout the article.
Infrastructure Components: A logical grouping of one or more Infrastructure Resources that get provisioned together. For example, Networking is an Infrastructure Component made up of Infrastructure Resources like Virtual Private Cloud (VPC), Subnets, Internet Gateways, Route Tables, etc.
Environment: A logical grouping of all the Infrastructure Components that are needed to run business applications. The grouping includes components like networking, platform-eks, database, s3 buckets, and any other components.
As I mentioned in the previous article, teams want an entire Environment, not just individual infrastructure components, to run their applications. For this article, we will use the Environment below, with networking, platform-ec2, platform-k8s, db-security-group, rds-database, etc., as an example. See Diagram 1 below.
What is Environment as Code?
Unless they use a single monolithic IaC codebase, which is not recommended, teams usually create pipelines using tools like Jenkins, CircleCI, etc., to provision entire environments and manage relationships between various infrastructure components. These tools are not optimal for the use case, require a lot of custom code, and become a maintenance nightmare. Environment as Code makes it a lot easier to provision the entire environment.
As you can see in Diagram #2 below, Environment as Code is an abstraction over Infrastructure as Code and calls the various Infrastructure as Code components in the right order. It also has a Control Plane that detects drift, reconciles the entire environment, and manages state at the environment level.
Here is the definition of Environment as Code.
Environment as Code (EaC) is an abstraction over Infrastructure as Code that provides a declarative way of defining an entire Environment. It has a Control Plane that manages the state of the environment, including relationships between various resources, detects drift, and enables reconciliation. It also supports best practices like Loose Coupling, Idempotency, Immutability, etc. for the entire environment. EaC allows teams to deliver entire environments rapidly and reliably, at scale.
The intent here is not to replace Infrastructure as Code but to create an abstraction over it that resolves the challenges teams face while implementing Infrastructure as Code.
Think about the various infrastructure components as Lego pieces that can be automated using Infrastructure as Code, but that are not useful on their own. Environment as Code puts these Lego pieces together to give something useful (i.e., an entire Environment).
Attributes of Environment as Code
Now that we have defined Environment as Code, let’s look at its key attributes.
Ability to define Entire Environment
As mentioned in the definition, Environment as Code manages an entire environment. For that to happen, it supports defining the entire environment, with its various infrastructure components, in an easy-to-understand format. It also supports specifying the relationships between those components.
EaC is used to provision the components in the correct order. In the example mentioned above, it should provision networking first and then provision platform-ec2, platform-k8s & security-group. After it provisions platform-k8s, it should provision k8s-addons.
It also supports the teardown of the entire environment by destroying the infrastructure components in the reverse of the order in which they were provisioned. So it destroys k8s-addons first, then platform-ec2, platform-k8s & security-group, and finally the networking component. EaC takes care of the logic to reverse the order.
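The ordering described above can be sketched as a topological sort over the component dependencies. This is a minimal illustration, not how CloudKnit or any particular EaC tool implements it; the `ENVIRONMENT` dictionary and both function names are hypothetical, and the component names are taken from the example in this article.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical environment definition: each component maps to the
# components it depends on (its parents).
ENVIRONMENT = {
    "networking": [],
    "platform-ec2": ["networking"],
    "platform-k8s": ["networking"],
    "security-group": ["networking"],
    "k8s-addons": ["platform-k8s"],
}


def provision_order(environment):
    """Return the components so that every parent comes before its children."""
    return list(TopologicalSorter(environment).static_order())


def teardown_order(environment):
    """Teardown is the provision order reversed: children are destroyed first."""
    return list(reversed(provision_order(environment)))


if __name__ == "__main__":
    print("provision:", provision_order(ENVIRONMENT))
    print("teardown: ", teardown_order(ENVIRONMENT))
```

The relative order of siblings such as platform-ec2 and security-group is not significant here; only the parent-before-child constraint is guaranteed.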
Diagram 4 below shows an example of Environment as Code using a custom YAML format. We use this for our product CloudKnit, but it doesn’t have to be a YAML format.
Loosely Coupled
One way to manage the entire environment is to use a single monolithic IaC codebase, which is not recommended. With EaC, it becomes a lot simpler to manage the entire environment as a set of smaller infrastructure components. Each of these components has its own IaC that provisions only that component. Thus EaC promotes loosely coupled environments, which are easier to understand and maintain.
Manage State for the entire Environment
Tools like Terraform manage state for individual infrastructure components, but they don’t manage state for the entire environment unless you create a monolith IaC. EaC manages the state for the entire environment and also does validation.
For example, if you try to destroy networking before destroying child components in the example above, it will give an error or at least a warning, so the user is aware. If a parent gets updated, it automatically triggers a run for child components.
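A rough sketch of this kind of environment-level validation follows. The `EnvironmentState` class, its fields, and its method names are all hypothetical. Two assumptions go beyond the text: the sketch raises an error (rather than a warning) when a parent still has live children, and it treats an update as triggering runs for all downstream components, not only direct children.

```python
from dataclasses import dataclass, field

# Hypothetical dependency map: component -> the parents it depends on.
PARENTS = {
    "networking": [],
    "platform-ec2": ["networking"],
    "platform-k8s": ["networking"],
    "security-group": ["networking"],
    "k8s-addons": ["platform-k8s"],
}


@dataclass
class EnvironmentState:
    parents: dict                                   # component -> list of parents
    provisioned: set = field(default_factory=set)   # components that currently exist

    def children_of(self, component):
        """Direct children: components that list `component` as a parent."""
        return [c for c, deps in self.parents.items() if component in deps]

    def destroy(self, component):
        """Refuse to destroy a component while any direct child still exists."""
        live = [c for c in self.children_of(component) if c in self.provisioned]
        if live:
            raise ValueError(
                f"cannot destroy {component!r}: still used by {sorted(live)}"
            )
        self.provisioned.discard(component)

    def runs_triggered_by_update(self, component):
        """Downstream components to re-run after an update, breadth-first."""
        queue, triggered = [component], []
        while queue:
            current = queue.pop(0)
            for child in self.children_of(current):
                if child not in triggered:
                    triggered.append(child)
                    queue.append(child)
        return triggered
```

With every component provisioned, `destroy("networking")` raises a `ValueError`, while `destroy("k8s-addons")` succeeds because nothing depends on it.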
Diagram #2 above shows the order in which the various environment components get provisioned. It also shows the Environment State, which tracks the relationships between individual components and their status. Similarly, Diagram #5 shows how an entire environment teardown happens by destroying individual infrastructure components in the reverse order.
Idempotent and Immutable for entire Environment
Individual Infrastructure Components are idempotent on their own thanks to IaC, but the entire environment is not. EaC should enable idempotency for the entire environment.
For example, suppose a failure occurs while destroying one of the components after its child components have already been destroyed. If you re-run the Environment as Code, it should give the same result (i.e., teardown of the entire environment). It shouldn’t matter how many times you run EaC or what your starting state is: you should end up with the same end state.
If you want to follow principles like Immutability for the entire environment, or share best-practice environment implementations across teams, having a mechanism to replicate environments is critical. EaC makes it a lot simpler to replicate environments and even create blueprints. Being able to replicate environments also allows spinning up ephemeral environments to run automated tests in your CI, or even doing blue-green deployments for the entire environment.
Visualize and Understand Environments
Teams usually create diagrams of their environments using diagramming tools like Visio or draw.io, but they struggle to keep them up to date as the Environment changes. Generating the diagram from EaC keeps it current and avoids the problem of out-of-date environment diagrams.
Since EaC has the entire environment and various relationships between each infrastructure component, it can be used to visualize and understand the entire environment. We have used it to visualize environments and provide details about them in our product CloudKnit (see below Diagram 6).
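As a small illustration of generating a diagram from the definition itself, the sketch below renders a dependency map as Graphviz DOT text. It is an assumption for illustration only and says nothing about how CloudKnit draws its diagrams; the `to_dot` function and the `parents` structure are hypothetical.

```python
def to_dot(name, parents):
    """Render an environment definition as Graphviz DOT text.

    `parents` maps component -> list of parents. Edges point from a parent
    to the component that depends on it. Names are quoted because
    identifiers containing hyphens must be quoted in DOT.
    """
    lines = [f'digraph "{name}" {{']
    for component in parents:
        lines.append(f'  "{component}";')
    for component, deps in parents.items():
        for parent in deps:
            lines.append(f'  "{parent}" -> "{component}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    example = {
        "networking": [],
        "platform-k8s": ["networking"],
        "k8s-addons": ["platform-k8s"],
    }
    print(to_dot("dev", example))
```

Because the diagram is derived from the same definition that drives provisioning, regenerating it on every change keeps it in sync with the environment.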
Drift Detection and Reconciliation
EaC supports drift detection (detecting the difference between the desired state in source control and the actual state of environments) for an entire environment, using details like the relationships between components. Once drift is detected, it enables reconciliation to revert the actual state to the desired state. This is based on the same concept as Kubernetes Controllers, but in the case of reconciliation with EaC, you might want to add an approval process (that shows the plan), especially since reverting a change might modify or destroy infrastructure.
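A minimal sketch of drift detection with an approval gate might look like the following. The plain dictionaries standing in for desired and actual state, the function names, and the `approve` callback are all hypothetical; a real control plane would read actual state from the cloud provider and apply changes through IaC runs. For brevity, the sketch ignores components that exist in the actual state but not in the desired state.

```python
def detect_drift(desired, actual):
    """Return {component: {setting: (desired_value, actual_value)}} for mismatches."""
    drift = {}
    for component, want in desired.items():
        have = actual.get(component, {})
        changed = {
            key: (want.get(key), have.get(key))
            for key in want.keys() | have.keys()
            if want.get(key) != have.get(key)
        }
        if changed:
            drift[component] = changed
    return drift


def reconcile(desired, actual, approve):
    """Revert drifted components to the desired state once the plan is approved.

    `approve` receives the plan (the detected drift) and returns True or
    False. Nothing is changed unless it returns True.
    """
    plan = detect_drift(desired, actual)
    if plan and approve(plan):
        for component in plan:
            actual[component] = dict(desired[component])
    return plan
```

Passing the plan to `approve` before touching anything mirrors the approval step described above: a reviewer sees what would be modified before the revert happens.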
For example, the cost of deploying a new version of a shared dependency becomes manageable when EaC knows, monitors, and redeploys the infrastructure components impacted downstream.
Compare and Promote Changes between Environments
Keeping various environments as close to each other as possible helps find any issues faster in lower environments. Since the desired state of the entire environment is specified in an easy-to-understand format and pushed to source control with EaC, it can be used to compare various environments, understand the differences between them, and promote changes from lower to higher environments.
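To make the comparison concrete, here is a rough sketch that diffs two environment definitions and promotes selected components. The dictionary layout and both function names are hypothetical. The `promote` function copies settings wholesale, which is a simplifying assumption: a real promotion would likely preserve environment-specific values such as instance sizes.

```python
def compare_environments(lower, higher):
    """Summarise how two environment definitions differ, by component name."""
    return {
        "only_in_lower": sorted(lower.keys() - higher.keys()),
        "only_in_higher": sorted(higher.keys() - lower.keys()),
        "changed": sorted(
            name
            for name in lower.keys() & higher.keys()
            if lower[name] != higher[name]
        ),
    }


def promote(lower, higher, components):
    """Return a copy of `higher` with the named components taken from `lower`."""
    promoted = {name: dict(settings) for name, settings in higher.items()}
    for name in components:
        promoted[name] = dict(lower[name])
    return promoted
```

Since both definitions live in source control, the result of `promote` could be committed as the new desired state of the higher environment and reviewed like any other change.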
You can also use GitOps for the entire Environment using EaC. Check out Diagram 8 below for the GitOps workflow for EaC.
Conclusion
Thanks for reading the article, and I hope that you find it useful. We would love to hear from you on what you think about Environment as Code. If you have any questions or comments you can reach out to me via twitter or email: adarsh@cloudknit.io.
Acknowledgments
Arielle Sullivan & Dejan Pejčev read the draft version of this article and provided feedback to improve it. Priyanka Rao helped come up with the term "Environment as Code".
If you enjoyed this article, you might like our product, which makes Environment Management easy across all 3 major cloud providers as well as on-premises. Please watch the video below to learn more, and check out our Open Source Project on GitHub.
https://www.youtube.com/watch?v=yUPJj3MJmqs