Terraform CDK - Managing our GitHub Organisation

Preface

Introduction

At reecetech, we’re in the process of migrating our repositories from Stash (Bitbucket) to GitHub. The Delivery Engineering team has been at the forefront of this migration, gradually onboarding teams and helping them move their build and deploy plans over. The migration has been in the works for several months.

What is Terraform?

Terraform is an infrastructure-as-code tool that provisions cloud resources from straightforward, declarative configuration. When setting up infrastructure, you might find yourself in the GUI of your preferred cloud provider, click-ops’ing your way through the setup. What happens when you want to replicate that process? You may not remember all the previous steps, and the result is chaos. Surely there’s an easier way, right? Right. This is where Terraform comes in.

Implementation

Brainstorm

While brainstorming what we could use Terraform for in our GitHub organisation, we came up with the following list.

There’s much more that could be done; these are just the few we brainstormed.

Actions Whitelist - Terraform

Introduction

The GitHub Actions Whitelist is based on GitHub’s recommendations and best practices for third-party actions, which can be located here:

S3 Backend

Our first project using Terraform to manage our GitHub Organisation was a GitHub Actions Whitelist to secure our workflow runs. We store the Terraform state file in an encrypted S3 bucket rather than alongside the code, because state can include sensitive values in plain text, which would be a serious security risk if the file were committed to a repository. A remote backend also lets our developers collaborate on the Terraform project.

An example of setting up an S3 backend for Terraform is the following.

terraform {
  backend "s3" {
    bucket         = "whitelist-bucket"
    key            = "gha-whitelist/terraform.tfstate"
    region         = "ap-southeast-2"
    encrypt        = true              # encrypt the state object at rest
    dynamodb_table = "whitelist-table" # used for state locking
  }
}

Providers

We need to set up the providers we’re going to use with Terraform. In this case we’re using GitHub (for org management) and AWS (for the S3 backend). Authenticating with a PAT (Personal Access Token) is frowned upon, so we authenticate with a GitHub App instead. GitHub Apps are the recommended way to integrate with GitHub because they offer more granular permissions over what the integration (in this case Terraform) can access. An example of setting up providers for Terraform is the following.

provider "github" {
  app_auth {
    id              = var.app_id
    installation_id = var.app_installation_id
    pem_file        = var.app_private_key
  }
  owner = var.github_organization
}

provider "aws" {
  region = "ap-southeast-2"
}

Resources

We also need to set up our resources. Since this project only controls which GitHub Actions we allow, there aren’t many resources to create. We’re only really setting up three things here: the AWS S3 bucket, the AWS DynamoDB table and, of course, the GitHub Actions Organisation Permissions. We won’t cover setting up the S3 bucket and DynamoDB table here, but the GitHub Actions Organisation Permissions are set up as follows.

resource "github_actions_organization_permissions" "repository" {
  allowed_actions      = "selected"
  enabled_repositories = "all"
  allowed_actions_config {
    github_owned_allowed = true
    patterns_allowed     = [for action in local.actions : length(regexall("/", "${action.name}")) > 0 ? "${action.name}@${action.sha}" : "${action.name}/${action.sha}"]
    verified_allowed     = false
  }
}

The main part that glues all of this together is the following line:

patterns_allowed     = [for action in local.actions : length(regexall("/", "${action.name}")) > 0 ? "${action.name}@${action.sha}" : "${action.name}/${action.sha}"]

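We will elaborate on what exactly local.actions is in the next subheading. What we have configured here is very much a reece-ism, but you can probably follow what the expression is doing: if the name contains a slash, it is a specific action and is pinned as name@sha; otherwise it is a whole organisation and becomes name/sha.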
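
To make the conditional easier to follow, here is a minimal Python sketch of the same logic. The function name allowed_pattern and the sample list are illustrative assumptions and are not part of our Terraform code.

```python
def allowed_pattern(name: str, sha: str) -> str:
    """Mirror the HCL conditional used for patterns_allowed."""
    if "/" in name:
        # owner/repo entry: pin the action to a specific commit SHA
        return f"{name}@{sha}"
    # owner-only entry: the "sha" is a wildcard such as "*"
    return f"{name}/{sha}"


# Hypothetical sample of what local.actions might contain
actions = [
    {"name": "ibiqlik/action-yamllint", "sha": "76fdac38393593cc46d89860c3e9698d4fe6b1a4"},
    {"name": "hashicorp", "sha": "*"},
]
patterns = [allowed_pattern(a["name"], a["sha"]) for a in actions]
```

The first entry becomes a SHA-pinned pattern for a single action, while the second becomes a wildcard pattern covering everything under that organisation.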

Locals

Now this is really the crux of the whole whitelisting process.

locals {
  actions_whitelist = yamldecode(file("${path.module}/actions-whitelist/whitelist.yaml"))
  actions = flatten([
    for action_name, values in local.actions_whitelist : [
      for sha, s in values.sha : {
        name = action_name
        sha  = sha
      }
    ]
  ])
}

This iterates through the YAML file that lists the GitHub Actions we allow in our organisation, producing one name and sha pair per entry. Here’s a sample of our current whitelist YAML file.

ibiqlik/action-yamllint:
  sha:
    76fdac38393593cc46d89860c3e9698d4fe6b1a4:
      description: yaml linting action (v2)
      whitelisted: 2022-04-14
      whitelistedBy: --
    81e214fd484713882ce4237cb7cd264d550856cf:
      description: yaml linting action (v3)
      whitelisted: 2022-05-27
      whitelistedBy: --
hashicorp:
  sha:
    '*':
      description: official hashicorp actions
      whitelisted: 2022-04-14
      whitelistedBy: --

The whitelist lets us pin third-party actions we deem safe, either by SHA or by allowing the action source in general. For organisations we trust, we use a wildcard * to whitelist the whole organisation. For other third-party actions, where our Security Team has reviewed a specific version and we don’t want to allow every version, we pin by the SHA. Pinning by SHA is the safest way to harden our organisation’s security, because a SHA cannot be moved to point at different code the way a tag can.

As seen in the sample whitelist yaml above, we allow everything under the hashicorp organisation but we’ve pinned certain versions from ibiqlik for the action-yamllint GitHub action.
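
As a rough illustration of what the locals block computes, the following Python sketch flattens the same structure into name and sha pairs. The function name flatten_whitelist is an assumption for illustration, and the sample is written as the dict a YAML parser would produce (with the metadata trimmed) so that the snippet has no third-party dependencies.

```python
def flatten_whitelist(whitelist: dict) -> list:
    """Return one {"name", "sha"} entry per whitelisted SHA or wildcard."""
    return [
        {"name": action_name, "sha": sha}
        for action_name, values in whitelist.items()
        for sha in values["sha"]  # iterating a dict yields its keys (the SHAs)
    ]


# The sample whitelist, as a parsed dict (metadata fields trimmed)
sample = {
    "ibiqlik/action-yamllint": {
        "sha": {
            "76fdac38393593cc46d89860c3e9698d4fe6b1a4": {"description": "yaml linting action (v2)"},
            "81e214fd484713882ce4237cb7cd264d550856cf": {"description": "yaml linting action (v3)"},
        }
    },
    "hashicorp": {"sha": {"*": {"description": "official hashicorp actions"}}},
}
actions = flatten_whitelist(sample)
```

Each pinned version of action-yamllint produces its own entry, and the hashicorp wildcard produces a single entry whose sha is "*".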

Conclusions - Terraform

Terraform in and of itself is an amazing IaC tool when used in a purely declarative way. However, we learnt that programming in Terraform’s HCL files is not the right way of going about things. It can get messy quickly, especially with more complex data structures and logic, such as trying to dynamically create resources.

Team & Member Management - Terraform CDK

Introduction

Learning from the previous task with Terraform, we decided to use the Terraform CDK to have a more programmable way of dynamically creating our resources.

This is important as we have multiple teams in a hierarchy and members who need to be in those teams. Similar to how we implemented the GitHub Actions Whitelist, we have a dedicated repo for the data (teams and members) and another repo that consumes the data in the CDK.

Terraform CDK supports multiple programming languages, but since I was most familiar with Python, I decided to go with that (close runner up was Golang).

Data Repository

We had several discussions on how to set up our data repository. This was somewhat complex, as hierarchies are annoying to represent. The following is a sample of our teams hierarchy.

└── Grandparent Team
    ├── membership.yml
    └── Parent Team
        ├── Child Team 1
        │   └── membership.yml
        ├── Child Team 2
        │   └── membership.yml
        └── membership.yml

Members are added to teams in the membership.yml files.

---
description: This is a Grandparent team
members:
  - user: firstname-lastname@reece.com.au
    role: maintainer
  - user: firstname-lastname@reece.com.au
    role: member

We have several parent and child teams to consider, and you cannot reference an instance created in the same resource.

This was a big headache to deal with: this type of reference is not supported in Terraform, as it needs to evaluate the resource as a whole first. The solution is explained further below, but it involves creating all the teams first, then updating each one with its parent team afterwards.

Terraform CDK

After working with Terraform’s CDK for the past couple of months, I would like to highlight several drawbacks. However, these issues might be solved in the future as this project is still in its infancy.

  1. Lack of documentation on the classes and their implementation. I had to go through the Python package to find the implementation of each class. For the majority of the time I also had no syntax highlighting in my IDE (VS Code), but that was an issue on my end: I had to point the Python interpreter at the virtual env where all the packages resided.

UPDATE: Was able to find documentation for the GitHub Provider

  2. Lack of example implementations in programming languages other than TypeScript. At the time I was working on the Terraform CDK, the documentation and examples heavily favoured TypeScript developers, and unit tests were only supported using Jest.

UPDATE: In the latest update (v0.12.x), features were added to support unit tests in multiple programming languages, with examples. They can be found in the documentation for unit tests. There is still no integration testing.

  3. Error messages were highly verbose. A simple error would spit out 30+ lines of output; you could still determine where and what the error was, but it made debugging convoluted.

  4. Compiling is too slow. Especially with Python, having to import 50MB+ files is extremely slow.

Where the Terraform CDK really shines is being able to dynamically create resource attributes.

Creating Teams Hierarchy

Creating teams dynamically is easily achievable.

# create github team
github_teams[team_name] = Team(
    self,
    team_name,
    name=team_name,
    description=data['description'],
    privacy="closed"
)

We have some predefined functions and variables that supply the parameter values. team_name is extracted from the directory names by walking through the directory structure, and data['description'] is read from the description key in that team’s membership.yml file.

To read more on the Team class, go here
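
As a minimal sketch of how team names and their parents could be derived from the directory layout, assuming every team directory contains a membership.yml and directory names are unique. The helper name discover_teams is hypothetical and is not our actual implementation.

```python
import os


def discover_teams(root: str) -> dict:
    """Map each team name to its parent team name ("" for top-level teams).

    A directory counts as a team only if it contains membership.yml.
    """
    teams = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        if "membership.yml" not in filenames:
            continue
        team_name = os.path.basename(dirpath)
        parent_dir = os.path.dirname(dirpath)
        # the parent is a team only if it also has a membership.yml
        has_parent = os.path.isfile(os.path.join(parent_dir, "membership.yml"))
        teams[team_name] = os.path.basename(parent_dir) if has_parent else ""
    return teams
```

Calling discover_teams on the root of the data repository would return a mapping such as "Parent Team" to "Grandparent Team", which can then drive both team creation and parent assignment.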

Adding Parent Team

As mentioned above, you cannot reference an instance created in the same resource. So what we decided to do is this: once all the teams have been created, we go back and add an override on each team that sets its parent, as follows.

if parent_team_name != "":
    github_teams[team_name].add_override(
        "parent_team_id",
        Token.as_number(github_teams[parent_team_name].id)
    )
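
The two-pass idea can be sketched in plain Python without the CDK. StubTeam below is a hypothetical stand-in for the Team construct that only records overrides, and the id values are placeholders rather than real Terraform tokens.

```python
class StubTeam:
    """Hypothetical stand-in for a Team construct; it only records overrides."""

    def __init__(self, name: str):
        self.name = name
        self.id = f"id-of-{name}"  # placeholder, not a real Terraform token
        self.overrides = {}

    def add_override(self, path: str, value) -> None:
        self.overrides[path] = value


def build_teams(parents: dict) -> dict:
    """Create every team first, then link each one to its parent."""
    # Pass 1: create all teams with no parent set
    teams = {name: StubTeam(name) for name in parents}
    # Pass 2: every team now exists, so lookups work regardless of input order
    for name, parent_name in parents.items():
        if parent_name != "":
            teams[name].add_override("parent_team_id", teams[parent_name].id)
    return teams


# Children are deliberately listed before their parents
teams = build_teams({
    "Child Team 1": "Parent Team",
    "Parent Team": "Grandparent Team",
    "Grandparent Team": "",
})
```

Because the linking happens in a second pass, the order in which teams are discovered does not matter.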

Adding Members to Teams

All of our members are defined in the team directory YAML files. Just as we read the team description from the YAML file, we read each member’s email and their role within the team (maintainer or member).

for d in data['members']:
    # emails are matched case-insensitively
    lowercase_user_email = d["user"].lower()

    # only create a membership for users with a known GitHub username
    if lowercase_user_email in gh_users:
        gh_username = gh_users[lowercase_user_email]['github-username']

        TeamMembership(
            self,
            f"{gh_username}-{team_name}",  # id must be unique per member and team
            team_id=github_teams[team_name].id,
            username=gh_username,
            role=d['role']
        )

There’s some logic to note here, especially for the construct id: the member-to-team relationship is one-to-many, meaning a single member can be a part of multiple teams. That is why the id passed to the TeamMembership class is the username hyphenated with the team name.

To read more on the TeamMembership class, go here
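
To show why the hyphenated id matters, here is a small CDK-free sketch that plans memberships for two teams and keeps the ids distinct. The function name plan_memberships and the sample data are assumptions for illustration only.

```python
def plan_memberships(team_name: str, members: list, gh_users: dict) -> list:
    """Return (construct_id, username, role) for each member with a known GitHub username."""
    planned = []
    for member in members:
        email = member["user"].lower()
        if email not in gh_users:
            continue  # no known GitHub account for this email, so skip it
        username = gh_users[email]["github-username"]
        planned.append((f"{username}-{team_name}", username, member["role"]))
    return planned


# Hypothetical lookup table and membership data
gh_users = {"jane-doe@example.com": {"github-username": "jane-doe"}}
members = [
    {"user": "Jane-Doe@example.com", "role": "maintainer"},
    {"user": "unknown@example.com", "role": "member"},
]
team_a = plan_memberships("Team A", members, gh_users)
team_b = plan_memberships("Team B", members, gh_users)
```

The same user appears in both teams, but the ids differ because the team name is part of each one. Using the username alone would produce a clash as soon as a member joined a second team.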

Conclusion - Terraform CDK

The Terraform CDK is a great tool to leverage; however, it has some pretty serious drawbacks, as mentioned above. Under the hood, it generates JSON which Terraform then simply applies. It is still in its infancy and is constantly gaining new features and improvements, so I see nothing but positives in the future, and hopefully the concerns presented above will be addressed.