One of the most common challenges you face when you start working with Terraform is how to organize the code. This article presents the three most common patterns for managing Terraform code.
But first, let me offer some brief explanations in case you are unfamiliar with some of the terms.
Terraform is an infrastructure as code (IaC) tool that lets you define cloud and on-premises infrastructure.
Terraform Cloud: a managed service offered by HashiCorp (the company behind Terraform). Among other things, this service includes remote state management, collaboration tools, version control integration, and secure access controls.
Terraform Workspaces: a feature of Terraform that allows you to create multiple environments to manage different infrastructure resources. In a few words, a workspace is a collection of everything Terraform needs to run: configuration files, variables, and state data.
There are two strategies for structuring repositories when working with Terraform: the Mono-Repo and the Multi-Repo.
On Mono Repo:
- You have a single repository that contains all the code
- It is referenced and shared by the whole team
On Multi Repo:
- We have multiple repositories based on business domains or teams
- Modules are stored in individual repositories and referenced when needed
- We can use Terraform modules that reside in their own repositories
With the mono-repo strategy, we have two possible approaches: the “Workspace per Folder” and the “Workspace per Git Branch.”
Workspace per Folder
In the Workspace per Folder approach (this is the one we used here and here), you create a folder for each development stage. You will have a “devel” folder to keep the development infrastructure code and a “production” folder to store the production code.
This approach is recommended when short-lived branches are constantly merged into the main one, and for organizations with significant differences between environments.
The benefits of this method are that each environment's infrastructure is kept separate in its own folder, and you still have a single repository containing all infrastructure resources.
The downside of this method is that you have to promote changes between stages manually. For any infrastructure change implemented and accepted in the devel environment, you have to copy the code to the production folder by hand.
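As a rough sketch of the folder layout described above, each stage folder can hold its own root configuration. The shared "modules/network" folder and its inputs are assumptions made for illustration; they are not taken from the article's sample code.

```hcl
# devel/main.tf (hypothetical file inside the "devel" folder)
module "network" {
  source = "../modules/network" # hypothetical shared module in the same repo

  environment = "devel"        # hypothetical input variable
  cidr_block  = "10.10.0.0/16" # hypothetical input variable
}

# production/main.tf (hypothetical file inside the "production" folder)
# Promoting a change means editing this file by hand to match devel.
module "network" {
  source = "../modules/network"

  environment = "production"
  cidr_block  = "10.20.0.0/16"
}
```

Each folder is planned and applied on its own, which is why the two stages can differ as much as they need to.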
Workspace per Git Branch
In the “Workspace per Git Branch” approach, on the other hand, we still have a single repository, but each workspace is tied to a Git branch. So, for example, we will have two Git branches, production and development, and a workspace for each of these branches.
The code will be structured as in the image below: a folder containing our modules, and the main.tf, variables.tf, and outputs.tf files in the application's root.
Remember that we will not create the workspaces locally with “terraform workspace new” but define them in Terraform Cloud.
This approach works best for teams that work with long-running branches. If you want to promote changes between stages, you open a pull request and merge it.
When we create the Terraform workspace, we will tell TF Cloud to listen for changes only on a particular branch, and only if a change happens on that branch will it do the plan and apply.
The upside of this approach is that there are fewer files to maintain and plans to run.
The downside is that there is a chance for your branches to drift out of sync, and teams can cross-contaminate environments.
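Because every branch carries the same files, differences between stages usually come from workspace variables rather than from the code. The sketch below assumes a hypothetical "environment" variable set in each Terraform Cloud workspace and a hypothetical local module; neither comes from the article's sample code.

```hcl
# variables.tf (same file on every branch)
variable "environment" {
  type        = string
  description = "Stage name, set per workspace in Terraform Cloud"

  validation {
    condition     = contains(["development", "production"], var.environment)
    error_message = "The environment must be development or production."
  }
}

# main.tf (same file on every branch)
module "network" {
  source = "./modules/network" # hypothetical module in the repo's modules folder

  environment = var.environment # hypothetical module input
}
```

Keeping stage-specific values out of the code makes merges between branches cleaner and lowers the chance of drift.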
Now that we have explained the theory behind the mono-repo patterns, you can do an actual implementation.
In Terraform Cloud, create a new workspace and choose the “Version control workflow.” Select GitHub as the version control provider and connect to your GitHub account.
Select the repository and open the “Advanced Options” section.
If you work with the “Workspace per Branch” pattern, fill in the VCS branch field.
If you work with the “Workspace per Folder” pattern, fill in the “Terraform Working Directory” field.
Save the settings, and you are ready to go. Terraform Cloud will detect any GitHub changes in the branch or folder and trigger your infrastructure deployment.
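The same two settings can also be expressed as code instead of clicking through the UI. This is an alternative sketch using the hashicorp/tfe provider, not the method the article follows. The organization, workspace names, repository identifier, and the OAuth connection are all hypothetical, and the provider is assumed to get its API token from the TFE_TOKEN environment variable.

```hcl
terraform {
  required_providers {
    tfe = {
      source = "hashicorp/tfe"
    }
  }
}

variable "oauth_token_id" {
  type        = string
  description = "ID of an existing VCS OAuth connection in Terraform Cloud (assumed)"
}

# Workspace per Git Branch: the workspace only follows the "production" branch.
resource "tfe_workspace" "production_branch" {
  name         = "app-production" # hypothetical
  organization = "example-org"    # hypothetical

  vcs_repo {
    identifier     = "example-org/infrastructure" # hypothetical <owner>/<repo>
    branch         = "production"
    oauth_token_id = var.oauth_token_id
  }
}

# Workspace per Folder: the workspace runs Terraform inside the "production" folder.
resource "tfe_workspace" "production_folder" {
  name              = "app-production-folder" # hypothetical
  organization      = "example-org"           # hypothetical
  working_directory = "production"

  vcs_repo {
    identifier     = "example-org/infrastructure" # hypothetical <owner>/<repo>
    oauth_token_id = var.oauth_token_id
  }
}
```

In practice you would pick one of the two resources, depending on the pattern you follow.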
The Multi-Repo Pattern
The multi-repo is a pattern used on high-complexity projects where different teams manage different parts of the infrastructure. The transition to multi-repo starts with the modules: instead of keeping them in a single repository, we move them out and host each module in its own repository.
For example, you can have a network module, a compute one, and a security module. Each module will be maintained by its team and hosted in a separate repo.
That means our application can access these modules, hosted in Git repositories or private registries, through the source argument.
This pattern works best for organizations with large teams, where each team manages its own repositories while different applications connect to those repositories and reuse the code.
The obvious benefit of this approach is that it enables teams to self-service on shared modules and reduces the blast radius to a particular module. You can also implement stricter access control for individual repositories and better version management through tagging.
Separating the modules into different repositories is not a complicated task. You only need to use the right source. There are plenty of options available; look over this page: https://developer.hashicorp.com/terraform/language/modules/sources
I will cover two cases: GitHub and Terraform Registry.
For GitHub, declare the module like this:
module "testing" {
  source = "github.com/crerem/multi-repo-terraform-module"
}
This will work if the source repo is a public one.
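Since version management through tagging is one of the benefits of this pattern, it can help to pin the module to a tag instead of following the default branch. A minimal sketch, assuming the module repository has a tag named v1.1.0 (the tag name is hypothetical):

```hcl
module "testing" {
  # "?ref=" selects a tag, branch, or commit; v1.1.0 is an assumed tag name.
  source = "github.com/crerem/multi-repo-terraform-module?ref=v1.1.0"
}
```

With a pinned ref, an application only picks up module changes when someone deliberately bumps the tag and runs terraform init again.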
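If you are working with private repositories, you must authenticate with GitHub. Keep in mind that Terraform downloads module sources with Git, so the machine or workspace running Terraform also needs Git credentials for the repository (for example, an SSH key or an HTTPS token). In this case, you also need to include the “GitHub provider”: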
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "4.34.0"
    }
    github = {
      source  = "integrations/github"
      version = "~> 5.0"
    }
  }
}
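Later in the code, do the actual authentication using the GitHub token:
provider "github" {
  # Placeholder only: avoid committing a real token to the repository.
  # The provider can also read the token from the GITHUB_TOKEN environment variable.
  token = "your GitHub token"
}
For alternative ways to authenticate, you can look over the GitHub provider documentation: https://registry.terraform.io/providers/integrations/github/latest/docs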
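One way to keep the token out of the code is to pass it in as a sensitive variable. The sketch below is one possible arrangement rather than the article's sample code; the variable name "github_token" is hypothetical. It also shows the SSH form of the module source, which relies on an SSH key with access to the private repository being available to Git.

```hcl
variable "github_token" {
  type        = string
  sensitive   = true
  description = "GitHub token, supplied as a workspace variable (hypothetical name)"
}

provider "github" {
  token = var.github_token
}

module "testing" {
  # SSH source: Git authenticates with the SSH key of the user or runner.
  source = "git@github.com:crerem/multi-repo-terraform-module.git"
}
```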
In your workspace, you must create two environment variables, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, with your connection data, and mark them as sensitive.
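If you prefer to manage these workspace variables as code rather than through the UI, the hashicorp/tfe provider offers a variable resource. This is a sketch under assumptions: the workspace and organization names are hypothetical, and the provider is assumed to be configured elsewhere. Only the first variable is shown; AWS_SECRET_ACCESS_KEY follows the same shape.

```hcl
variable "aws_access_key_id" {
  type      = string
  sensitive = true
}

# Look up the workspace that will run the AWS code (names are hypothetical).
data "tfe_workspace" "app" {
  name         = "app-production"
  organization = "example-org"
}

resource "tfe_variable" "aws_access_key_id" {
  key          = "AWS_ACCESS_KEY_ID"
  value        = var.aws_access_key_id
  category     = "env" # environment variable, not a Terraform input variable
  sensitive    = true
  workspace_id = data.tfe_workspace.app.id
}
```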
For Terraform Registry
You first need to go into Terraform Cloud and publish a new module. The module repository has to be named in the “terraform-<PROVIDER>-<NAME>” format, something like terraform-aws-sample-module. If you do not use the proper naming convention, the repo will not appear in the TF Cloud list.
Another mandatory step is to publish a version tag for your code, so add the 1.1.0 tag (or your own version tag) to the module repository.
Confirm the selection and publish your module. This is what the page will look like:
In the right column, you will have the configuration details. You can copy and paste those into your code.
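For a private registry module, the copied configuration generally has the shape shown below. The organization name "example-org" is hypothetical; the module name and provider follow from the terraform-aws-sample-module repository name and the 1.1.0 tag used above.

```hcl
module "sample_module" {
  # Format: app.terraform.io/<ORGANIZATION>/<MODULE_NAME>/<PROVIDER>
  source  = "app.terraform.io/example-org/sample-module/aws" # example-org is hypothetical
  version = "1.1.0"
}
```

The version argument works for registry sources because the registry tracks the tags published on the module repository.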
Since this is a private module, the simplest way to use it is from a TF Cloud workspace in the same organization, which already has access to the registry. And, of course, create two environment variables, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, with your connection data, and mark them as sensitive.
This is the sample code for using the module from Terraform Registry: https://github.com/crerem/multi-repo-terraform
Summary
Each team should pick the best pattern for its case. The mono-repo patterns are a good fit if you need a single source of truth for infrastructure resources, an optimized build, and better collaboration between teams.
You should use the “Workspace per Git Branch” pattern for organizations that need long-running branches. In contrast, the “Workspace per Folder” pattern should be used where you have significant differences between environments.
The Multi-Repo pattern suits teams that collaborate on complex infrastructure systems. It gives you better access control over modules, faster development, and more flexibility.