Success! 🙌 Finishing the Migration from Heroku CI to Jenkins on AWS

In Migrating from Heroku CI to Jenkins on AWS – Part One, I went into depth about our migration. We containerized our CI/CD using Amazon Elastic Container Service (ECS) and the Amazon EC2 Container Service Plugin for Jenkins. This gave us the flexibility to define each type of build agent we needed as its own Docker image. Combined with the scalability of AWS, it let us scale our compute resources up and down as demand required.

At first, we were only running the build agents as Docker containers. Then, at the client’s suggestion, we investigated running the Jenkins master as a Docker container. There were a few tripping points along the way, but we ultimately achieved this as well. The benefits of this configuration made it worth the extra effort.

Persisting Files After Migrating from Heroku CI

One key consideration in running the Jenkins master as a container is maintaining state. By design, each time you create a Docker container from an image, it starts from a clean slate. This works well for the build agents as it ensures consistent results. Each job execution will not be affected by the results of a previous execution. This, however, is not the functionality we want for the Jenkins master. We want to be able to upgrade the Jenkins master without losing the job configurations or build results.

Docker Volumes and Amazon Elastic File System Benefits

Docker volumes allow for the creation of persistent storage for a Docker container. They ensure that data is maintained beyond the lifecycle of the container. This only gets us one step further, however. The volume still lives on the disk of a single EC2 instance, so we would be tied to the lifecycle of that instance. With frequent updates to the AMI that AWS maintains for ECS, keeping the operating system up to date would quickly become a headache. It would require frequent manual maintenance that would take time away from other areas of the project. It also isn’t a cloud-native approach to this problem.

Amazon Elastic File System (EFS) allows us to create a network file store. Just as a Docker volume outlives its container, an EFS file system outlives the EC2 instances to which it is attached. It also scales automatically to accommodate the files stored on it, so we don’t need to keep expanding storage as our needs grow.

To leverage EFS, we made two changes. First, we modified the launch configuration of our Auto Scaling group so that each new instance launched in the group mounts the EFS file system as an NFS share. This ensures that the files for the Jenkins master are always available at a known location on every instance in the group. Second, the task definition that we create for the Jenkins master declares a data volume that maps that mounted path on the host to the location within the container where Jenkins stores its files (/var/jenkins_home by default).
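
To make the second change concrete, here is a minimal sketch of what such a task definition could look like, written as the Python dictionary you would pass to the ECS RegisterTaskDefinition API. The host path /mnt/efs/jenkins, the names jenkins-master and jenkins-home, the image tag, and the memory figure are all assumptions for illustration; they are not values from our actual template.

```python
import json

# Assumed values for illustration only.
EFS_HOST_PATH = "/mnt/efs/jenkins"   # where the instance is assumed to mount EFS
JENKINS_HOME = "/var/jenkins_home"   # default Jenkins home inside the container


def jenkins_master_task_definition(image: str = "jenkins/jenkins:lts") -> dict:
    """Build a task definition whose volume maps the EFS mount into the container."""
    return {
        "family": "jenkins-master",
        # Host volume: points at the directory where EFS is mounted on the instance.
        "volumes": [
            {"name": "jenkins-home", "host": {"sourcePath": EFS_HOST_PATH}},
        ],
        "containerDefinitions": [
            {
                "name": "jenkins-master",
                "image": image,
                "essential": True,
                "memoryReservation": 2048,  # assumed soft limit in MiB
                # Mount point: exposes the host volume at the Jenkins home path.
                "mountPoints": [
                    {"sourceVolume": "jenkins-home", "containerPath": JENKINS_HOME},
                ],
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(jenkins_master_task_definition(), indent=2))
```

The important detail is that sourceVolume in the mount point matches the volume name, and that the volume's sourcePath matches wherever the instances mount the NFS share at boot.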

Now, each time we upgrade our EC2 instances or the container running the Jenkins master, we maintain our configuration and build results.

Accessing Jenkins Logs

Now that the master is running as a container and is persisting its files to an EFS volume, we need to do the initial configuration. During the setup, Jenkins writes an initial admin password to a file in the Jenkins directory. It also outputs this password in the logs. You can get to the logs by SSH’ing into the EC2 instance and pulling them directly from the container, but this is tedious, and you may not want to grant certain individuals (or anyone, for that matter) SSH access to the EC2 instance.

ECS allows for the configuration of log forwarding to CloudWatch. The task definition for the Jenkins master container can be configured to forward its logs to CloudWatch. That way, we can view the logs directly within the AWS console instead of having to SSH into the EC2 instance, although reading them from the container remains an option. With this configuration in place, once the Jenkins master container starts, the initial admin password will be in the CloudWatch logs within the log group configured on the task definition.
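
As a rough sketch of that configuration, the fragment below builds the logConfiguration section of a container definition using the awslogs log driver. The log group name, region, and stream prefix are placeholder assumptions, not the values we used.

```python
def awslogs_configuration(log_group: str, region: str, stream_prefix: str) -> dict:
    """Build the logConfiguration block that forwards container logs to CloudWatch."""
    return {
        "logDriver": "awslogs",
        "options": {
            "awslogs-group": log_group,
            "awslogs-region": region,
            "awslogs-stream-prefix": stream_prefix,
        },
    }


# Hypothetical container definition fragment for the Jenkins master.
container_definition = {
    "name": "jenkins-master",
    "image": "jenkins/jenkins:lts",
    "logConfiguration": awslogs_configuration(
        log_group="/ecs/jenkins-master",  # assumed log group name
        region="us-east-1",               # assumed region
        stream_prefix="jenkins",          # assumed stream prefix
    ),
}
```

The log group generally needs to exist already, and the container instance needs permission to write to CloudWatch Logs, before the forwarded output shows up in the console.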

Networking Master and Agents

With the Jenkins master installed and running, we next needed to set up networking so that the build agents and the master could communicate. We could have used the public web address we were already using for the master, but then all traffic between the master and the build agents would travel over the public web, which we wanted to avoid. Addressing the master directly is also difficult because the Jenkins master container, as well as the underlying EC2 instance, could be terminated and replaced at any time if one of them fails its health checks.

Our solution was to run the Jenkins master service behind an internal load balancer. It may seem like overkill for just a single container running on a single EC2 instance. But it gives us a consistent private web address to communicate with the Jenkins master. Even if a new container or a new EC2 instance is created, the address of the master will always be the same. By using an internal load balancer, we also ensure that the communication between the master and build agents takes place within a VPC and not over the public web.

Accessing Jenkins Master from the Public Web

Now that the Jenkins master is running behind an internal load balancer, we need a way to reach it ourselves, preferably over the public internet. NGINX maintains a Docker image for its HTTP server that works great as a reverse proxy. By mounting a configuration file at a known location within the container, we can configure the NGINX container to act as a reverse proxy for the internal load balancer.
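
For illustration, a reverse proxy configuration of this kind can be quite small. The sketch below renders one from Python so that the upstream address stays a parameter; the host name internal-jenkins.example.internal is made up, and the header set is a common choice rather than a copy of our production file.

```python
# Doubled braces are literal braces in str.format; {upstream} is the only placeholder.
NGINX_TEMPLATE = """\
server {{
    listen 80;

    location / {{
        # Forward every request to the internal load balancer.
        proxy_pass http://{upstream};
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        # Pass through the scheme reported by the public load balancer.
        proxy_set_header X-Forwarded-Proto $http_x_forwarded_proto;
    }}
}}
"""


def render_nginx_conf(upstream: str) -> str:
    """Render a reverse proxy server block that targets the given upstream host."""
    return NGINX_TEMPLATE.format(upstream=upstream)


if __name__ == "__main__":
    # Hypothetical DNS name of the internal load balancer.
    print(render_nginx_conf("internal-jenkins.example.internal"))
```

In the official NGINX image, a file like this is typically mounted under /etc/nginx/conf.d/ so that the default configuration picks it up.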

By placing this container behind an internet-facing load balancer, we were able to access it from the public web. And again, by using a load balancer, we can access it by a consistent web address even if the NGINX containers are recreated. We attached an SSL certificate generated by the AWS Certificate Manager and configured a Route 53 alias to the load balancer. Now, we can access the Jenkins master at a consistent, user-friendly web address utilizing SSL.

Granting Permissions to Jenkins Master

For the Jenkins master to launch build agents in ECS, it requires access to parts of the AWS API. Amazon ECS allows for the assignment of IAM roles to ECS tasks. By assigning a role to the task running the Jenkins master, you don’t have to save an AWS access key in the credential store of Jenkins. Instead, it will use the credentials assigned to the task itself. This way, we can define a role that is individual to the Jenkins master instance and only contains the permissions it requires to function.
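
As a sketch of what such a role might contain, the code below builds two policy documents: a trust policy that lets ECS tasks assume the role, and a permissions policy for launching agents. The list of ECS actions is an assumption about what the plugin needs, and the cluster ARN is a placeholder; check the plugin’s documentation for the exact set before relying on it.

```python
import json


def ecs_task_trust_policy() -> dict:
    """Trust policy allowing ECS tasks to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ecs-tasks.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def jenkins_master_policy(agent_cluster_arn: str) -> dict:
    """Permissions policy for launching agents (assumed action list)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read and manage task definitions and discover clusters.
                "Effect": "Allow",
                "Action": [
                    "ecs:ListClusters",
                    "ecs:DescribeContainerInstances",
                    "ecs:ListTaskDefinitions",
                    "ecs:DescribeTaskDefinition",
                    "ecs:RegisterTaskDefinition",
                    "ecs:DeregisterTaskDefinition",
                ],
                "Resource": "*",
            },
            {
                # Start and stop agent tasks, restricted to the agent cluster.
                "Effect": "Allow",
                "Action": ["ecs:RunTask", "ecs:StopTask", "ecs:DescribeTasks"],
                "Resource": "*",
                "Condition": {"ArnEquals": {"ecs:cluster": agent_cluster_arn}},
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder ARN for a hypothetical agent cluster.
    arn = "arn:aws:ecs:us-east-1:123456789012:cluster/jenkins-agents"
    print(json.dumps(jenkins_master_policy(arn), indent=2))
```

Scoping the task-related actions to the agent cluster keeps the role limited to what the master needs in order to function.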

A Couple of Notes

By default, EC2 instances are configured with a timezone of UTC. Docker containers generally default to UTC as well unless the image or its environment says otherwise. This means that by default, the Jenkins master container will be running in the UTC timezone. In most cases, this is not desirable. We can change this by setting the TZ environment variable in the task definition for the Jenkins master. When the task runs, it will run in the specified timezone, and the timestamps (and cron schedules) in the Jenkins master will reflect it.
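
In task definition terms, this is one extra entry in the container’s environment list. The helper below is a small sketch of adding it; the function name and the America/Chicago value are illustrative assumptions.

```python
def with_timezone(container_definition: dict, tz_name: str) -> dict:
    """Return a copy of the container definition with the TZ variable set."""
    # Drop any existing TZ entry so the variable is not defined twice.
    environment = [
        entry
        for entry in container_definition.get("environment", [])
        if entry["name"] != "TZ"
    ]
    environment.append({"name": "TZ", "value": tz_name})
    return {**container_definition, "environment": environment}


# Hypothetical container definition fragment for the Jenkins master.
master = {"name": "jenkins-master", "image": "jenkins/jenkins:lts"}
master_in_chicago = with_timezone(master, "America/Chicago")
```

Whether the variable takes effect depends on the image shipping the relevant time zone data, so it is worth checking the timestamps in the Jenkins UI after the first deployment.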

Another lesson we learned was to keep the Jenkins master and the build agents running in separate ECS clusters on separate EC2 instances. While both are running the ECS-optimized AMI, we found that running the master and agents on the same instances led to situations where the build agents would consume too much CPU or memory. This left the master slow or unresponsive.

With health checks occurring at regular intervals, an unresponsive master would fail them, and ECS would occasionally restart it. As a result, the occasional job started and never finished because ECS had forcibly stopped the master. By keeping the master and agents separated, we can ensure that there are always enough resources for the master. Additionally, we don’t run any builds on the master. We only use agents running as containers, so the master isn’t bogged down by builds.

Next Steps

This post (and the last one) has mostly told the story of how we achieved the migration from Heroku CI to Jenkins on AWS. Next, I will dive into the CloudFormation template we use to maintain our Jenkins infrastructure and explain how this solution works on AWS.