How to Manage Terraform: 8 Best Practices for Success
Terraform is one of the most popular infrastructure as code tools out there. If you have just started working with Terraform, you may be asking yourself whether you are doing things in the right way. So, in this article, you will learn eight Terraform best practices that will improve your Terraform workflows immediately and make you feel more confident when using Terraform in your projects.
Now, many of the best practices are around Terraform state and the state file. So, let’s quickly understand what they are first.
Terraform is a tool that automates creating infrastructure and then making changes and maintaining that infrastructure. To keep track of the current infrastructure state and the changes you want to make, Terraform uses state.
When you change your Terraform configuration, Terraform compares the desired state described in that configuration with the current infrastructure state and works out an execution plan to make those changes.
The state in Terraform is a simple JSON file which has a list of all the infrastructure resources that Terraform manages for you. Since it’s a simple JSON file, technically, you could make adjustments to the state file directly by manually changing stuff inside.
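For illustration, the top-level structure of a state file looks roughly like this (the values below are hypothetical and abbreviated; the exact layout varies by Terraform version):

```json
{
  "version": 4,
  "terraform_version": "1.5.0",
  "serial": 12,
  "lineage": "3f8a9c2e-...",
  "resources": [
    {
      "mode": "managed",
      "type": "aws_instance",
      "name": "web",
      "provider": "provider[\"registry.terraform.io/hashicorp/aws\"]",
      "instances": []
    }
  ]
}
```

The `serial` counter increases on every state update, and `resources` is the list of everything Terraform currently manages, which is exactly why hand-editing this file is so easy to get wrong.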
However, the first best practice is: only change the state file contents through Terraform command execution. Do not manually manipulate it; otherwise, you may get some unexpected results.
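If you do need to adjust the state, Terraform ships commands for exactly that, so the file itself is never edited by hand. A few examples (the resource address `aws_instance.web` and the instance ID are placeholders):

```
# List all resources tracked in the state
terraform state list

# Show the recorded attributes of one resource
terraform state show aws_instance.web

# Rename a resource in state after refactoring the configuration
terraform state mv aws_instance.web aws_instance.frontend

# Stop managing a resource without destroying it
terraform state rm aws_instance.web

# Bring an existing cloud resource under Terraform management
terraform import aws_instance.web i-1234567890abcdef0
```

Each of these updates the state file safely, keeping it consistent with what Terraform expects.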
Now, where does this state file come from?
When you first execute the terraform apply command, Terraform will automatically create the state file locally. But what if you're working in a team? Other team members will also need to execute Terraform commands and they will need the state file for that. In fact, every team member will need the latest state file before making their own updates.
So, the second best practice is to configure a shared remote storage for the state file. Every team member can now execute Terraform commands using this shared state file. In practice, the remote storage backend for the state file can be an Amazon S3 bucket, Terraform Cloud, Azure Blob Storage, Google Cloud Storage, and so on. You configure Terraform to use that remote storage as the state file location.
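As a sketch, configuring an S3 bucket as the remote backend looks like this (the bucket name and key are placeholders):

```hcl
terraform {
  backend "s3" {
    bucket = "my-company-terraform-state"  # existing S3 bucket (placeholder name)
    key    = "myapp/terraform.tfstate"     # path of the state file within the bucket
    region = "eu-central-1"
  }
}
```

After adding this block, running `terraform init` migrates the state to the remote backend, and every subsequent command reads and writes the shared state.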
But what if two team members execute Terraform commands at the same time? What happens to the state file when you have concurrent changes? You might get a conflict or mess up your state file.
To avoid changing Terraform state at the same time, we have the next best practice: locking the state file until an update is fully completed and then unlocking it for the next command. This way, you can prevent concurrent edits to your state file. In practice, you will have this configured in your storage backend. For example, with an S3 bucket, you can configure a DynamoDB table that Terraform uses for state file locking.
However, note that not all storage backends support this. Be aware of this when choosing a remote storage for the state file. If supported, Terraform will lock your state file automatically.
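With the S3 backend, for example, enabling locking is a one-line addition pointing Terraform at a DynamoDB table; the table name below is a placeholder, and the table must already exist:

```hcl
terraform {
  backend "s3" {
    bucket         = "my-company-terraform-state"
    key            = "myapp/terraform.tfstate"
    region         = "eu-central-1"
    dynamodb_table = "terraform-state-locks"  # table with partition key "LockID" (type String)
  }
}
```

With this in place, a second `terraform apply` started while another is running will fail fast with a lock error instead of corrupting the state.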
Now, what happens if you lose your state file? Something may happen to your remote storage location, or someone may accidentally overwrite the data, or it may get corrupted.
To avoid this, the next best practice is to back up your state file. In practice, you can do this by enabling versioning for it, and many storage backends will have such a feature. For example, in an S3 bucket, you can simply turn on the versioning feature. This also means you have a nice history of state changes, and you can revert to any previous Terraform state if you want to.
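If you also manage the state bucket itself with Terraform, enabling S3 versioning is a small change (resource and bucket names are placeholders):

```hcl
resource "aws_s3_bucket" "tf_state" {
  bucket = "my-company-terraform-state"
}

# Every write to the state file now creates a new, recoverable object version
resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id

  versioning_configuration {
    status = "Enabled"
  }
}
```

If the state ever gets corrupted or overwritten, you can restore any previous object version from the bucket.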
Great, so now you have your state file in a shared remote location with locking enabled and file versioning for backup. You have one state file for your infrastructure. But usually, you will have multiple environments like development, testing, and production.
So, which environment does this state file belong to? Can you manage all the environments with one state file?
This leads to the next best practice: use one dedicated state file per environment. Each state file will have its own storage backend with locking and versioning configured.
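One common way to do this is to give each environment its own backend key (or even its own bucket); the sketch below uses placeholder names, with each `terraform` block living in that environment's own configuration directory:

```hcl
# environments/dev/main.tf
terraform {
  backend "s3" {
    bucket = "my-company-terraform-state"
    key    = "dev/terraform.tfstate"   # dev gets its own state file
    region = "eu-central-1"
  }
}

# environments/prod/main.tf
terraform {
  backend "s3" {
    bucket = "my-company-terraform-state"
    key    = "prod/terraform.tfstate"  # prod state is fully separate
    region = "eu-central-1"
  }
}
```

This way, a mistake in the development environment can never touch the production state.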
These were the best practices related to Terraform state. The next three best practices are about how to manage Terraform code itself and how to apply infrastructure changes. These practices can be grouped into a relatively new trend that emerged in the infrastructure-as-code space, which is called GitOps.
So, let’s see the next best practices.
When you’re working on Terraform scripts in a team, it’s important to share the code to collaborate effectively. So, as the next best practice, you should host Terraform code in its own Git repository, just like your application code. This is not only beneficial for effective collaboration in a team, but you also get versioning for your infrastructure code changes. So, you can have a history of changes in your Terraform code.
Before moving on to the next best practice, I want to give a shoutout to n0, who made this article possible. n0 automates and simplifies Terraform, Terragrunt, and GitOps workflows for provisioning cloud deployments. For example, it gives you visibility into the infrastructure changes when a pull request is created and automatically deploys your changes after they are merged into your Git repository. With its self-service capabilities, n0 allows developers to spin up and destroy an environment with one click. It also integrates policies, called guardrails, to limit direct cloud access. Check out n0.com for all its use cases and capabilities.
Now, let’s continue with best practice number seven.
Who is allowed to make changes to Terraform code? Can anyone just directly commit changes to the Git repository?
The best practice is to treat your Terraform code just like your application code. This means you should have the same process of reviewing and testing the changes in your infrastructure code as you do for your application code. With a continuous integration pipeline and using merge requests to integrate code changes, your team can collaborate effectively and produce quality infrastructure code that is tested and reviewed.
Great, so now you have tested and reviewed code changes in your Git repository. But how do you apply them to the actual infrastructure? Eventually, you want to update your infrastructure with those changes, right?
The final best practice is to execute Terraform commands to apply changes in a continuous deployment pipeline. Instead of team members manually updating the infrastructure by executing Terraform commands from their own computers, it should happen automatically from an automated build. This way, you have a single location from which all infrastructure changes happen, and you have a more streamlined process of updating your infrastructure.
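In its simplest form, the deployment stage of such a pipeline just runs the standard Terraform commands non-interactively; the steps below are an illustrative sketch, and the exact flags depend on your CI system:

```
# Run by the CI/CD runner, not by individual engineers
terraform init -input=false               # configures the shared remote backend
terraform plan -input=false -out=tfplan   # computes the changes and saves the plan
terraform apply -input=false tfplan       # applies exactly the reviewed plan
```

Saving the plan to a file and applying that file ensures that what gets deployed is exactly what was reviewed, even if the configuration changes in the meantime.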
These are the eight Terraform best practices you can apply today to make your Terraform projects more robust and easier to work on as a team.
Do you know of any other best practices that I forgot? Please share them in the comments! And finally, if you want to learn more about Terraform, check out other Terraform resources linked here.