Migration to GitHub & GitHub Actions CI

Introduction

In less than 2 years reecetech was able to migrate around 2,500 git repositories from Atlassian Stash* to GitHub.com. This necessitated simultaneously translating and reworking all continuous integration & deployment workflows from Atlassian Bamboo to GitHub Actions.

Motivation for this shift came directly from feedback development teams had provided the Platform Engineering team. Areas of concern included:

The Platform Engineering team was also keen to make use of a SaaS solution for git & CI. SaaS would mean fewer VMs to take care of, allowing the team to focus on more value-adding activities.

* you may be more familiar with the Atlassian BitBucket product name, but I still think Stash was a better name and that Atlassian made a mistake in 2015. After all, a bit bucket is where you throw away data!

Challenge

If you have spent more than a decade in the IT industry, you’ve probably experienced at least one VCS migration+. Based on our experience, these kinds of migrations tend to have a long tail of un-migrated repositories due to the effort of rewriting the associated CI workflows. This tends to be the case because, in a lean manufacturing sense, doing this kind of migration is waste. Waste in this sense is commonly defined as any action that does not add value to the customer. Hence deferring this kind of work is easily justified.

We had our doubts at the beginning of the project whether an aim to complete the migration by September 2023 was realistic.

To our advantage, however, most of our codebase was built & deployed using reusable snippets of Bamboo workflows. If the Platform Engineering team could provide equivalent reusable actions in GitHub Actions, then we could smooth the migration path for the other teams.

+ reecetech had migrated from Subversion to git previously, and before Subversion had used CVS. Some of the active codebase dates back to the mid-1990s.

Success

Overall we’re very happy with the outcome of the migration. We decommissioned Stash and Bamboo a month earlier than our deadline of September 2023.

Let’s examine the factors that made this migration successful, including being delivered on time.

Migration was not “a project”

A project would have a project manager, a budget, and planned out milestones. Delivery at reecetech tends to avoid this model, instead adopting agile delivery with long-running, stable, teams. Do we throw out any obligation to deliver results? Of course not. We had two easy-to-understand guidelines to work with:

“Working software over comprehensive documentation” dictated the first few months (from July 2021) of the migration. The Platform Engineering team played with the setup of GitHub.com, exploring OIDC & SCIM, GitHub teams, single organisation vs. multiple organisations, etc.

The team also explored what it meant to get typical-looking Reece software built on GitHub. This meant iterating on self-hosting GitHub Actions runners and creating reusable actions that made it easy to publish artefacts to our Artifactory server.

The unknowns would have made it hard to plan all that we developed up-front. We didn’t know exactly what we needed, or how long it would take.

Because the migration was not only about the Platform Engineering team, after the initial setup we needed to shift focus to “Customer collaboration over contract negotiation” (noting that in this case, customers = colleagues). In October 2021, we invited our first pilot users into the new GitHub.com environment well before it was production-ready. We worked closely with other teams to find out things we hadn’t discovered yet.

Iterative work on various reusable actions, building all the different types of software at Reece, and solving how to deploy to Kubernetes and AWS continued for many months. In fact, the “official” migration didn’t commence until June 2022! Those measuring the metric “number of repositories remaining in Stash” had to remain calm, and keep faith in the process.

Batch size of one

Small batch sizes are an important topic in the IT parable “The Phoenix Project” (definitely worth reading!). And what’s the best batch size? One.

We implemented a GitHub Actions workflow that would migrate a single repository at a time. This workflow could be dispatched from our chatbot - Helmet - giving the people of reecetech a familiar and easy-to-use interface to get the job done. Once this was available it meant any person in any team at Reece could migrate their code to GitHub.
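The post does not say how Helmet triggered the workflow, so the following is only a sketch of one way a chatbot could do it: by calling GitHub’s REST endpoint for `workflow_dispatch` events. The organisation, repository, workflow file name and the `stash_repo` input are all hypothetical, and the function only builds the request rather than sending it.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_dispatch_request(owner, repo, workflow_file, ref, inputs, token):
    """Build (but do not send) a workflow_dispatch request for one repository.

    All argument values are supplied by the caller; nothing here reflects
    reecetech's real organisation, workflow names or inputs.
    """
    url = (
        f"{GITHUB_API}/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    # The endpoint expects the git ref to run on plus the workflow's inputs.
    body = json.dumps({"ref": ref, "inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical example: one chatbot command migrates exactly one repository.
request = build_dispatch_request(
    owner="example-org",
    repo="migration-tooling",
    workflow_file="migrate-repository.yml",
    ref="main",
    inputs={"stash_repo": "PROJ/my-service"},
    token="placeholder-token",
)
```

Sending the request would be a call to `urllib.request.urlopen(request)` with a real token. Keeping one repository per dispatch is what makes the batch size of one hold, even when many people run the command at once.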

A screenshot of the first Slack thread to migrate a repository to GitHub

Whilst we had received advice that we should migrate larger batches of repositories at a time (e.g. pick a department, and move all their repositories), we’re glad we didn’t follow that advice. One, it would have impacted the teams affected by the large batch. They would have had to stop work on other things and translate all of the build, test & deploy workflows in bulk. Until that was done they couldn’t continue with their other work. Two, it would not have allowed our people to learn and adjust after each migration. We’re fond of the saying “software is never done”, and the saying was true here. The migration workflow and the supporting reusable actions continued to be improved after the launch of the chatbot command.

Self-service

It was important to us that the migration work be pulled into teams, rather than pushed onto teams. It allowed teams to run the migration of their repositories when it suited them. Factors such as the current workload and the team’s knowledge of how to translate Bamboo to GitHub Actions are difficult to assess from outside the team. This is why we exposed the migration workflow to all staff via the chatbot. Teams could take ownership of their success in this migration.

Swarm

The next success factor might seem to contradict the previous concept of “Batch size of one”, but stick with us whilst we explain why it does not.

In December 2022, we organised a “GitHub Migration Week” for the department in reecetech with the heaviest VCS + CI usage. This involved a break from usual work for all the developers in the department, and a one-week focus on migrating to GitHub. The week was gamified with the various teams in the department competing for glory and some GitHub t-shirts. Whilst this means we were migrating lots of repositories in a short period, each repository was still migrated individually via the chatbot & workflow. This kind of gamified approach was something we’d done before with success.

Keep in mind the labour-intensive part of a repository migration is the re-writing of the workflows to build, test & deploy the software. To encourage teams to not just migrate, but to also re-write, teams scored 10 points for migration, an additional 20 points for building and publishing, and finally an additional 40 points for deployment to production. This emphasised shipping working software, not just batch-processing migrations. The week gave the teams time to focus on and learn the specific technical skills required to make use of GitHub.com & GitHub Actions in a reecetech context.
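The scoring rule above can be sketched in a few lines. The point values come from the text; the `RepoProgress` record and the assumption that each stage only counts once the previous one is complete are illustrative, not a description of the actual scoreboard.

```python
from dataclasses import dataclass

# Point values as described in the scoring rules above.
POINTS_MIGRATED = 10
POINTS_BUILT_AND_PUBLISHED = 20
POINTS_DEPLOYED_TO_PRODUCTION = 40


@dataclass
class RepoProgress:
    """Hypothetical record of how far one repository has progressed."""
    name: str
    migrated: bool = False
    built_and_published: bool = False
    deployed_to_production: bool = False


def score(repo):
    """Return the cumulative points for one repository.

    Assumption: a later stage only scores if the earlier stages are done.
    """
    if not repo.migrated:
        return 0
    points = POINTS_MIGRATED
    if repo.built_and_published:
        points += POINTS_BUILT_AND_PUBLISHED
        if repo.deployed_to_production:
            points += POINTS_DEPLOYED_TO_PRODUCTION
    return points


def team_score(repos):
    """Sum the points for every repository a team owns."""
    return sum(score(repo) for repo in repos)
```

Under this rule a repository taken all the way to production is worth 70 points, seven times a bare migration, which is how the scoring rewarded working software over simply moving code.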

A screenshot of a score update in Slack during GitHub migration week

The week increased the percentage of repositories migrated from 5% to 16%. In raw numbers, we still had a long way to go to get to 100%, but we finished the week with lots more people who now had hands-on experience doing the work required to get us there.

Managerial support

It was quite nice that the Platform Engineering team could play the “good cop” role in the migration: supporting the teams’ migration by providing the chatbot, authoring reusable actions, and providing advice. However, the overall migration still needed a “bad cop” somewhere in the process. That’s where the senior leader that the Platform Engineering team reports to came into the picture. They went to other senior leaders with our key metric (number of repositories remaining in Stash) and the deadline, and made sure the migration was given managerial support in other areas of reecetech.

Make it visible

Another theme of The Phoenix Project is to make work visible - because IT work can quite easily be hidden. We created a website that showed the number of repositories yet to be migrated, grouped by departments that owned the repositories. People could follow a link to their department to see the specific repositories along with some meta-data such as the last committer and associated build plans.

A screenshot of the information radiator website showing a department view

Whilst the website was not strictly an information radiator, which by definition needs to be in a “highly visible location”, it was able to be shared easily and used in discussions about prioritisation and ownership of migration activities.
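The grouping behind the department view is simple to sketch. The dictionary keys used here (`name`, `department`, `migrated`) are hypothetical, and the real website also showed metadata such as the last committer and build plans, which this example leaves out.

```python
from collections import defaultdict


def remaining_by_department(repos):
    """Group the names of not-yet-migrated repositories by owning department.

    `repos` is an iterable of dicts with the hypothetical keys
    'name', 'department' and 'migrated'.
    """
    grouped = defaultdict(list)
    for repo in repos:
        if not repo["migrated"]:
            grouped[repo["department"]].append(repo["name"])
    # Sort departments and repository names so the output is stable.
    return {dept: sorted(names) for dept, names in sorted(grouped.items())}


def remaining_counts(repos):
    """Return the headline number per department: repositories still to migrate."""
    return {dept: len(names) for dept, names in remaining_by_department(repos).items()}
```

The counts feed the summary page, and the per-department lists feed the page that people reached by following the link to their own department.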

Targeted assistance

With a handful of months to go, it became clear that some teams needed assistance to complete their migrations. These teams had one thing in common: they didn’t use Stash and Bamboo in a common reecetech way.

To help these teams meet the deadline, the Platform Engineering team assisted in a consulting capacity. The teams themselves still had to deliver the migration, but the Platform Engineering team dedicated time to teaching GitHub Actions, pair programming workflow translations, devising cross-repository access schemes, etc.

Deal with the leftovers

Towards the end of the migration, it was apparent there were a lot (hundreds!) of repositories that were “abandonware”. Any repository that had gone more than 2 years without a commit and had no associated Bamboo workflows was migrated to GitHub and immediately archived. These repositories were migrated in bulk by the Platform Engineering team, but still made use of the migration workflow for consistency.
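The abandonware rule can be written as a single predicate. This is a sketch only: the function name, its parameters, and the approximation of “2 years” as 730 days are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "2 years" is approximated as 730 days.
STALE_AFTER = timedelta(days=2 * 365)


def is_abandonware(last_commit, bamboo_plan_count, now):
    """Return True if a repository should be migrated and archived straight away.

    Both conditions from the rule must hold: no commit for more than
    2 years, and no associated Bamboo workflows.
    """
    stale = (now - last_commit) > STALE_AFTER
    return stale and bamboo_plan_count == 0


# Hypothetical example dates.
now = datetime(2023, 6, 1, tzinfo=timezone.utc)
old_commit = datetime(2020, 1, 1, tzinfo=timezone.utc)
recent_commit = datetime(2022, 12, 1, tzinfo=timezone.utc)
```

A stale repository that still had a Bamboo plan attached would not match, so it would stay with its owning team rather than being archived in bulk.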

Thanks everyone!

The success of the migration is shared among all staff at reecetech, not just the Platform Engineering team.

A screenshot of a Slack message celebrating the conclusion of the GitHub migration