Use Terraform to Manage Git Repositories

Shanmuganathan Raju
7 min read · Jun 17, 2022

Storing code and other artifacts in a repository backed by a version control system (VCS) is a well-understood and widely accepted technique. However, many organizations I work with are still creating and managing repositories by hand or with one-off scripts. While this method does work and is often good for beginners to get a grasp on the fundamentals, it brings tough challenges with scale, consistency, security, and cleanup. It is far more powerful to use code to deploy, secure, and manage repositories.

In this post, I start with a design overview on how to piece together a repository manager to build and maintain multiple repositories. From there, I switch into “guide” mode and dive into the initial setup, using a personal access token (PAT), securing the root repository, and generating a new repository with Terraform. Finally, I showcase how to import an existing repository into Terraform, address any drift concerns, and drop a few handy tips to consider for the future.

I use Terraform to declaratively build all of my repositories across GitHub, GitLab, and Bitbucket. Each service is used for different organizations (work, personal, community) and for different use cases (internal code, external code, examples). My Terraform code is stored in a repository called the Repository Manager.

The Repository Manager configuration describes how each production repository should be created, including name, labels / tags, members, teams, readme details, licensing, CI settings, visibility, and more. Specific configuration settings vary depending on the provider used, such as GitHub accepting a template repository to create new repositories.
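As a rough illustration of the template option mentioned above, the GitHub provider's github_repository resource accepts a template block. The sketch below assumes a hypothetical organization (example-org) and template repository (service-template); neither name comes from this post.

```hcl
# Sketch only: the owner and repository names are hypothetical placeholders.
resource "github_repository" "from_template" {
  name        = "service-from-template"
  description = "Repository generated from a template repository"
  visibility  = "private"

  # Seed the new repository with the contents of an existing template repository.
  template {
    owner      = "example-org"
    repository = "service-template"
  }
}
```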

Colleagues are able to submit a pull request against the Repository Manager configuration to meet specific requirements such as creating, modifying, or archiving a repository. Pull requests are subject to policy, linting, and validation jobs by way of continuous integration (CI).

This system works well for distributed teams managing numerous repositories. It is especially handy when dealing with a variety of hosted and internal services. It allows everyone to focus on writing code instead of worrying over the operational toil of managing repositories.

Initial Setup

There are a few components necessary to begin setup:

  1. An account with the desired VCS. I will use GitHub in this example.
  2. A personal access token (PAT) for the aforementioned account. The documentation from GitLab and GitHub does a nice job of explaining this step.
  3. A local copy of Terraform CLI.

Set up the root organization and the Repository Manager repository by hand. This avoids circular dependencies and gives the code a place to live during development.

Clone the repository locally.

> git clone git@github.com:WahlNetwork/repository-manager.git
Cloning into 'repository-manager'...
remote: Enumerating objects: 3, done.
remote: Counting objects: 100% (3/3), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
Receiving objects: 100% (3/3), done.

Set up the usual Terraform suspects for a new project: variables, providers, versions, .gitignore, and so forth. Initialize the Terraform configuration with terraform init. Add the changes and proceed as below.
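For the versions piece, a minimal sketch might look like the following. The version constraints are assumptions for illustration, not values taken from this post; pin them to whatever you have actually tested.

```hcl
# versions.tf (sketch; the version constraints are illustrative assumptions)
terraform {
  required_version = ">= 1.0"

  required_providers {
    github = {
      # The GitHub provider is published under the "integrations" namespace.
      source  = "integrations/github"
      version = ">= 4.0"
    }
  }
}
```

Running terraform init after adding this file downloads the provider into the working directory.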

First, Terraform needs our PAT to access GitHub.

Each provider will require the PAT for authentication. In the case of GitHub, the token is passed in the provider section. I advise using a Terraform variable and passing the token value as an environment variable or tfvars file while working through this guide.

To get the PAT …

Once in the settings, go to Developer settings and select Personal access tokens.

Select Generate new token, then create the token, choosing the permissions it should have on your GitHub account and the expiration date you want. Finally, generate the token at the bottom of the page.

Now we can use that token in our provider to give Terraform access to our GitHub account. In this case, we will keep the value in the github_token variable.

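First we create the variable. Leave out a default so the token is never hard-coded in the configuration:

variable "github_token" {
  description = "Personal access tokens (PATs) for authentication to GitHub."
  type        = string
  sensitive   = true # redact the value in plan and apply output
}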

Now we will call it in our provider block:

provider "github" {
  owner = "apotitech"
  token = var.github_token
}

If the token is not defined, Terraform will request the value during execution.

> terraform plan
var.github_token
Personal access tokens (PATs) for authentication to GitHub.
Enter a value: 12345 (I've got the same combination on my luggage!)
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be persisted to local or remote state storage.
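To skip the interactive prompt, the value can be supplied through a tfvars file, as suggested earlier. The sketch below uses a placeholder value; the file name terraform.tfvars is one that Terraform loads automatically.

```hcl
# terraform.tfvars (sketch; keep this file out of version control via .gitignore)
# The value below is a placeholder, not a real token.
github_token = "replace-with-your-token"
```

Alternatively, exporting an environment variable named TF_VAR_github_token before running Terraform sets the same variable without writing the token to disk.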

Take a moment to lock down the Repository Manager repository:

  1. Protect the main branch. No one should be able to push code without a review and approval by a maintainer or owner.
  2. Require at least one review for pull requests.
  3. Set the repository visibility to private. However, my repository is public so it can serve as an example for this post.
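These settings are applied by hand for the Repository Manager itself. For repositories that Terraform does manage, similar rules could be expressed in code. The sketch below is an assumption-laden example: the repository name is hypothetical, the default branch is assumed to be main, and branch protection on private repositories may depend on your GitHub plan.

```hcl
# Sketch only: a hypothetical repository with review rules on its main branch.
resource "github_repository" "protected_example" {
  name       = "protected-example"
  visibility = "private"
  auto_init  = true # create an initial commit so a default branch exists
}

resource "github_branch_protection" "main" {
  repository_id  = github_repository.protected_example.node_id
  pattern        = "main" # assumed default branch name
  enforce_admins = true   # apply the rules to administrators as well

  # Require at least one approving review before a pull request can merge.
  required_pull_request_reviews {
    required_approving_review_count = 1
  }
}
```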

The repository is now clean and is ready to churn out new repositories. Inception? 🙂

I can now add new repositories using a small bit of Terraform configuration. This can be daunting at first; GitHub accepts a large number of arguments. However, many arguments are not required and have acceptable defaults.

The creation of a new repository named demo-1 is performed using the code below:

resource "github_repository" "demo-1" {
  name             = "demo-1"
  description      = "A demo GitHub repository created by Terraform"
  visibility       = "public"
  homepage_url     = "https://apoti.tech/"
  has_projects     = false
  has_wiki         = false
  has_downloads    = false
  has_issues       = true
  license_template = "mit"
  topics           = ["example", "public", "infrastructure-as-code", "operations", "terraform", "github"]
}

Run terraform plan -out plan.tfplan to validate the configuration meets expectations. Add or modify any arguments that need adjustment and repeat as necessary.

Terraform will perform the following actions:
  # github_repository.demo-1 will be created
  + resource "github_repository" "demo-1" {
      + allow_merge_commit     = true
      + allow_rebase_merge     = true
      + allow_squash_merge     = true
      + archived               = false
      + default_branch         = (known after apply)
      + delete_branch_on_merge = false
      + description            = "A demo GitHub repository created by Terraform"
      + etag                   = (known after apply)
      + full_name              = (known after apply)
      + git_clone_url          = (known after apply)
      + has_downloads          = false
      + has_issues             = true
      + has_projects           = false
      + has_wiki               = false
      + homepage_url           = "https://apoti.tech/"
      + html_url               = (known after apply)
      + http_clone_url         = (known after apply)
      + id                     = (known after apply)
      + license_template       = "mit"
      + name                   = "demo-1"
      + node_id                = (known after apply)
      + visibility             = "public"
      + ssh_clone_url          = (known after apply)
      + svn_url                = (known after apply)
      + topics                 = [
          + "example",
          + "github",
          + "infrastructure-as-code",
          + "operations",
          + "public",
          + "terraform",
        ]
    }
Plan: 1 to add, 0 to change, 0 to destroy.

Once the configuration looks solid, run terraform apply plan.tfplan to bring the repository to life. Note that the plan.tfplan file contains an encoded version of the token value and should be kept private.

> terraform apply plan.tfplan
github_repository.demo-1: Creating...
github_repository.demo-1: Creation complete after 10s [id=demo-1]
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Check out the new repository and bask in the glory of automation.
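Since the goal is to manage many repositories, one possible next step is to drive the resource from a map with for_each rather than copying the block for every repository. This is only a sketch: the repository names and descriptions are hypothetical, and the output simply surfaces the html_url attribute that the plan listed as known after apply.

```hcl
# Sketch only: hypothetical repositories keyed by name, with a description each.
locals {
  repositories = {
    "demo-a" = "First hypothetical repository"
    "demo-b" = "Second hypothetical repository"
  }
}

# One github_repository instance is created per map entry.
resource "github_repository" "managed" {
  for_each = local.repositories

  name        = each.key
  description = each.value
  visibility  = "public"
  has_issues  = true
}

# Map of repository name to its web URL, available after apply.
output "repository_urls" {
  value = { for name, repo in github_repository.managed : name => repo.html_url }
}
```

Adding a repository then becomes a one-line change to the map, which keeps pull requests against the Repository Manager small.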

This is great for new repositories. But, what about existing ones?

It’s rare to find a truly greenfield environment without any existing repositories. Terraform’s import command is great for adding the existing repositories into management. I have manually created a repository named demo-3 that will be imported into Terraform using the steps below.

I’ve started by defining the repository in the Terraform configuration using the demo-1 example from earlier. The name and description have been updated to provide the correct information for demo-3.

resource "github_repository" "demo-3" {
  name             = "demo-3"
  description      = "A demo GitHub repository created by hand and imported into Terraform"
  visibility       = "public"
  homepage_url     = "https://wahlnetwork.com/"
  has_projects     = false
  has_wiki         = false
  has_downloads    = false
  has_issues       = true
  license_template = "mit"
  topics           = ["example", "public", "infrastructure-as-code", "operations", "terraform", "github"]
}

The next step is to import the repository into Terraform. This is also detailed in the GitHub provider documentation. GitHub only requires the name of the repository — easy!

> terraform import github_repository.demo-3 demo-3
github_repository.demo-3: Importing from ID "demo-3"...
github_repository.demo-3: Import prepared!
Prepared github_repository for import
github_repository.demo-3: Refreshing state... [id=demo-3]
Import successful!
The resources that were imported are shown above. These resources are now in your Terraform state and will henceforth be managed by Terraform.
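The same import can also be declared in configuration instead of run from the command line. The sketch below assumes Terraform 1.5 or later, which is newer than the tooling this post was written with, and it relies on the demo-3 resource block defined above.

```hcl
# Sketch, assuming Terraform 1.5+: a declarative alternative to the terraform import command.
import {
  to = github_repository.demo-3 # resource address defined in the configuration
  id = "demo-3"                 # GitHub only needs the repository name
}
```

With this block in place, the import shows up in terraform plan and is carried out on the next terraform apply, so it can be reviewed in a pull request like any other change.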

It is now a good idea to run another terraform plan && terraform apply to see if any parameters have drifted from the current configuration.

The plan shows the changes Terraform wants to make to bring the imported repository in line with the configuration. Once those are applied, the demo-3 repository is fully under Terraform's management.

This guide should be enough to get the creative juices flowing to meet the requirements of a specific use case. However, there is so much more that can be done, including:

  • Use the native CI capabilities of GitHub or GitLab to lint, test, and validate pull requests based on your team’s standards and policies.
  • Deploy a service account or bot user to perform the Terraform work instead of using your own PAT.
  • Store the PAT as a secret in VCS instead of using an environment variable or tfvars file.
  • Add a cron job to CI to check, and potentially remediate, configuration drift.
  • Add the prevent_destroy lifecycle argument so that Terraform rejects any plan that would destroy the protected resources. Alternatively, limit the permissions bound to the PAT to exclude destroying resources.
  • Use remote state for the Terraform configuration, such as with Terraform Cloud, instead of a local state file. Yes, there is a provider for this. 🙂
  • Split the Terraform configuration files into small chunks, such as a main.tf for data sources and common definitions and a use-case.tf for a specific project or use case.
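Two of the tips above, remote state and prevent_destroy, can be sketched together. Everything named here is hypothetical: the Terraform Cloud organization, the workspace, and the repository. The cloud block also assumes Terraform 1.1 or later.

```hcl
# Sketch only: organization, workspace, and repository names are hypothetical.
terraform {
  # Keep state in Terraform Cloud instead of a local state file.
  cloud {
    organization = "example-org"

    workspaces {
      name = "repository-manager"
    }
  }
}

resource "github_repository" "critical" {
  name = "critical-example"

  lifecycle {
    # Terraform returns an error for any plan that would destroy this repository.
    prevent_destroy = true
  }
}
```

Note that prevent_destroy only guards against destruction while the resource block is still in the configuration, so limiting the token's permissions remains a useful second layer.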

Please accept a crisp high five for reaching this point in the post!

If you’d like to learn more about Infrastructure as Code, or other modern technology approaches, please keep on reading my posts.

Originally published at https://medium.com on June 17, 2022.

Shanmuganathan Raju

A Multicloud Architect, with more than 16 years in the information technology industry with experiences in architecting, solution designing and Cloud Migration.