CI/CD: Using GitLab and Ansible to deploy to Docker Swarm
How we deploy to Swarm from GitLab using Ansible
August 25, 2021
Yes, this is an article about Docker Swarm in 2021!
We have previously explained how we use GitLab CI and Ansible to deploy services.
In this post, we will show how we use the same setup (GitLab and Ansible) to build and deploy containers to Docker Swarm.
TL;DR
- We use GitLab CI to build and store a Docker image matching a Git tag
- We use Ansible (as a shell runner) to template our Docker Stack file
- We use the docker_stack module to deploy to Swarm
Quick reminder
- GitLab is a web-based Git repository manager with CI/CD pipeline features.
- Ansible is an automation tool for provisioning, configuration management, and application deployment.
- Docker is a program (and much more) that runs containers.
- Docker Swarm is a container orchestration tool provided by Docker.
Workflow
GitLab CI is at the center of our CI/CD system.
For one of our projects that runs on a Docker Swarm cluster, the CI/CD pipeline looks like this:
- Test & Build: build a Docker image from a Dockerfile. The image is stored on the GitLab Container Registry.
- Deploy with Ansible to specific environments.
The pipelines are run on tags only.
Here is the complete workflow:
                build stage        deploy stage           swarm post deployment
            ┌─────────────────┐ ┌────────────────┐ ┌─────────────────────────────────┐
┌────────┐   ┌───────────────┐   ┌──────────────┐   ┌──────────────┐ ┌─> worker
│ GitLab │ ► │ GitLab Runner │ ► │   Ansible    │ ► │ Docker Swarm │ ──> worker
└────────┘   └───────────────┘   └──────────────┘   └──────────────┘ └─> worker(s)...
 private       build, push         stack deploy       manager
 instance      docker executor     shell executor
Step 1: build the image
For the first step, we use the following job in our .gitlab-ci.yml file:
    Build:
      image: docker:20-git
      stage: build
      only:
        - tags
      script:
        - echo -n $CI_REGISTRY_PASSWORD | docker login -u $CI_REGISTRY_USER --password-stdin $CI_REGISTRY
        - >
          docker build
          --build-arg http_proxy=$http_proxy
          --build-arg https_proxy=$https_proxy
          --build-arg no_proxy=$no_proxy
          --tag $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_NAME
          .
        - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_NAME
We use docker:20-git because our Dockerfile requires git at some point (Composer dependencies are sometimes fetched with git). We also pin a version of the docker image, because we like reproducibility: using latest would break the CI at some point.
The script has 3 commands:
- docker login
- docker build (with proxy build args): the image is tagged the same as the Git tag with $CI_COMMIT_REF_NAME
- docker push to push the image to the GitLab Container Registry
At this point, the GitLab Container Registry contains an image tagged the same as the Git tag being deployed.
Step 2: deploy to swarm
On GitLab
To deploy the containers, we use the following jobs:
    Deploy to production:
      stage: deploy
      script: *deploy
      when: manual
      only:
        - tags
      environment:
        name: production
      tags:
        - deployment
      variables:
        GIT_STRATEGY: none
        ANSIBLE_INVENTORY: prod
        ANSIBLE_SUBSET: all

    Deploy to TH2:
      stage: deploy
      script: *deploy
      when: manual
      only:
        - tags
      environment:
        name: production-th2
      tags:
        - deployment
      variables:
        GIT_STRATEGY: none
        ANSIBLE_INVENTORY: prod
        ANSIBLE_SUBSET: th2
You can see we use two options here:
- Deploy to environment “production”, or
- Deploy to environment “TH2”.
Both jobs set GIT_STRATEGY to none because we do not need the source code anymore at this point: the Docker images are ready.
Both jobs use a common script defined elsewhere, using a YAML-specific feature: anchors.
Here is the .deploy reusable hidden job:
    .deploy: &deploy
      - >
        cd /var/ansible &&
        sudo -E ansible-playbook ci_api.yml
        --diff
        --private-key="/var/ansible/ssh_keys/gitlab/gitlab-ci"
        --inventory="inventories/$ANSIBLE_INVENTORY"
        --limit="$ANSIBLE_SUBSET"
        -e "API_IMAGE=$CI_REGISTRY_IMAGE:$CI_COMMIT_REF_NAME"
        -e "GITLAB_USER_ID=$GITLAB_USER_ID"
        -e "GITLAB_USER_LOGIN=$GITLAB_USER_LOGIN"
        -e "GITLAB_USER_NAME=$GITLAB_USER_NAME"
        -e "CI_COMMIT_REF_NAME=$CI_COMMIT_REF_NAME"
        -e "CI_COMMIT_SHA=$CI_COMMIT_SHA"
        -e "CI_ENVIRONMENT_NAME=$CI_ENVIRONMENT_NAME"
        -e "CI_PIPELINE_ID=$CI_PIPELINE_ID"
        -e "CI_PROJECT_URL=$CI_PROJECT_URL"
        -e "CI_RUNNER_DESCRIPTION=$CI_RUNNER_DESCRIPTION"
        -e "CI_COMMIT_MESSAGE=$CI_COMMIT_MESSAGE"
        -e "CI_REGISTRY_USER=$CI_REGISTRY_USER"
        -e "CI_REGISTRY_PASSWORD=$CI_REGISTRY_PASSWORD"
        -e "CI_REGISTRY=$CI_REGISTRY"
This job is the link between GitLab and Ansible. Because we set a specific deployment tag on our jobs, only some selected runners receive the job. They are configured as shell runners, where Ansible is available and ready.
The previously defined ANSIBLE_INVENTORY and ANSIBLE_SUBSET variables are passed to the Ansible arguments --inventory and --limit respectively.
The deploy job passes a lot of variables to Ansible using -e, and these are used in the Ansible tasks. Most of them serve to log what is being deployed and by whom. The two most important arguments are:
- the image to deploy: API_IMAGE=$CI_REGISTRY_IMAGE:$CI_COMMIT_REF_NAME
- the registry URL and auth: CI_REGISTRY_USER, CI_REGISTRY_PASSWORD, CI_REGISTRY
All variables on the right side of the -e assignments are GitLab predefined variables.
The image to deploy is $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_NAME, which is exactly the image we pushed at the build stage using docker push.
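As an illustration of what the deploy script assembles, here is a minimal Python sketch that builds an equivalent ansible-playbook command line from a mapping of GitLab variables. It is not the script we run: the build_command helper and the example values are hypothetical, the --private-key option and the sudo wrapper are left out for brevity, and the variables are grouped into a single JSON --extra-vars payload (a form ansible-playbook also accepts) instead of one -e per variable.

```python
import json

# GitLab variables forwarded as-is to Ansible by the .deploy script.
FORWARDED = [
    "GITLAB_USER_ID", "GITLAB_USER_LOGIN", "GITLAB_USER_NAME",
    "CI_COMMIT_REF_NAME", "CI_COMMIT_SHA", "CI_ENVIRONMENT_NAME",
    "CI_PIPELINE_ID", "CI_PROJECT_URL", "CI_RUNNER_DESCRIPTION",
    "CI_COMMIT_MESSAGE", "CI_REGISTRY_USER", "CI_REGISTRY_PASSWORD",
    "CI_REGISTRY",
]


def build_command(env):
    """Return the ansible-playbook argv for a mapping of CI variables.

    Hypothetical helper: missing forwarded variables become empty strings.
    """
    extra_vars = {name: env.get(name, "") for name in FORWARDED}
    # The image reference is the registry path plus the Git tag.
    extra_vars["API_IMAGE"] = (
        f"{env['CI_REGISTRY_IMAGE']}:{env['CI_COMMIT_REF_NAME']}"
    )
    return [
        "ansible-playbook", "ci_api.yml",
        "--diff",
        f"--inventory=inventories/{env['ANSIBLE_INVENTORY']}",
        f"--limit={env['ANSIBLE_SUBSET']}",
        # A JSON payload keeps values with spaces or quotes intact.
        "--extra-vars", json.dumps(extra_vars),
    ]


# Example values only; in a real job they come from the environment.
example_env = {
    "CI_REGISTRY_IMAGE": "registry.example.com/group/api",
    "CI_COMMIT_REF_NAME": "v1.2.3",
    "ANSIBLE_INVENTORY": "prod",
    "ANSIBLE_SUBSET": "th2",
    "CI_COMMIT_MESSAGE": "Fix login bug",
}
command = build_command(example_env)
print(command)
```

Returning a list of arguments rather than a single string means the command can be handed to a process launcher without another round of shell quoting.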
Ansible
The playbook is quite simple:
    - name: Deploy API
      hosts:
        - docker-swarm-managers
      roles:
        - callr/ci_api
First, we template our stack file to all manager nodes:
    - name: Copy stack file
      template:
        src: stack/swarm-stack-api.yml
        dest: /opt/swarm-stack-api.yml
        mode: 0444
      register: stack_file
      tags: stack
The interesting parts of the stack file are:
    version: "3.7"
    services:
      backend:
        image: "{{ API_IMAGE }}"
        deploy:
          mode: replicated
          replicas: 8
Notice how we are using the API_IMAGE variable passed from GitLab to Ansible. This is how a Git tag becomes a Docker image and then a Docker Stack service image.
Then, we task Ansible to authenticate with the GitLab Container Registry:
    - name: Docker login
      docker_login:
        registry_url: "{{ CI_REGISTRY }}"
        username: "{{ CI_REGISTRY_USER }}"
        password: "{{ CI_REGISTRY_PASSWORD }}"
To authenticate, we use the GitLab CI/CD predefined variables, specifically CI_REGISTRY_USER and CI_REGISTRY_PASSWORD. Those are only valid for the job.
Then, we can deploy our updated stack:
    - name: Deploy stack
      docker_stack:
        state: present
        name: api
        prune: yes
        with_registry_auth: yes
        compose:
          - /opt/swarm-stack-api.yml
      run_once: yes
      when: stack_file.changed
If the stack file has changed, we run the docker_stack module, with:
- prune: yes to remove the services not used anymore,
- with_registry_auth: yes because we want to send the registry authentication details to swarm agents,
- run_once: yes because the stack deployment needs to happen on one manager node only.
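The "when: stack_file.changed" condition means the deployment is driven by the content of the rendered stack file. The Python sketch below mimics that decision under simplifying assumptions: render_stack is a hypothetical stand-in for the Jinja2 rendering of the stack file, and the comparison is a plain checksum of the rendered text against what is already on the manager. One consequence it makes visible is that re-running the job for a tag that is already deployed renders an identical file, so the deploy step is skipped.

```python
import hashlib


def sha1(text):
    """Checksum of a text, used to compare rendered and existing content."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def render_stack(api_image):
    """Hypothetical stand-in for templating the stack file with API_IMAGE."""
    return (
        'version: "3.7"\n'
        "services:\n"
        "  backend:\n"
        f'    image: "{api_image}"\n'
    )


def should_deploy(current_file_content, api_image):
    """Mimic `when: stack_file.changed`.

    current_file_content is the stack file already on the manager,
    or None if it does not exist yet.
    """
    rendered = render_stack(api_image)
    return (
        current_file_content is None
        or sha1(current_file_content) != sha1(rendered)
    )


on_disk = render_stack("registry.example.com/group/api:v1.0.0")
# Same tag as the file on disk: nothing changed, deployment is skipped.
print(should_deploy(on_disk, "registry.example.com/group/api:v1.0.0"))  # False
# New tag: the rendered file differs, so the stack is deployed.
print(should_deploy(on_disk, "registry.example.com/group/api:v1.1.0"))  # True
```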
We finish with a Slack notification, using the what/whom variables:
    - name: Send slack notification
      slack:
        token: "{{ slack_token }}"
        channel: "#ops-prod"
        attachments:
          - text: "API deployed from Gitlab CI\n"
            color: "#39932A"
            fields:
              - title: "Environment"
                value: "{{ CI_ENVIRONMENT_NAME }}"
                short: yes
              - title: "Git tag"
                value: "{{ CI_COMMIT_REF_NAME }}"
                short: yes
              - title: "Hosts"
                value: "{{ ansible_play_hosts|join(', ') }}"
                short: false
              - title: "Git commit hash"
                value: "{{ CI_COMMIT_SHA }}"
                short: false
              - title: "Deploying user"
                value: "{{ GITLAB_USER_NAME }} (login:{{ GITLAB_USER_LOGIN }})"
                short: yes
              - title: "CI Runner"
                value: "{{ CI_RUNNER_DESCRIPTION }}"
                short: yes
              - title: "Last commit message"
                value: "{{ CI_COMMIT_MESSAGE }}"
                short: false
              - title: "Pipeline"
                value: "{{ CI_PROJECT_URL }}/pipelines/{{ CI_PIPELINE_ID }}"
                short: false
      delegate_to: localhost
      run_once: yes
      when: stack_file.changed
      ignore_errors: yes
At this point, we have reached the end of the CI job, and the end of the “deploy” stage of GitLab CI.
                                             we are here
                                             -----------
                                                  ˅
                build stage        deploy stage           swarm post deployment
            ┌─────────────────┐ ┌────────────────┐ ┌─────────────────────────────────┐
┌────────┐   ┌───────────────┐   ┌──────────────┐   ┌──────────────┐ ┌─> worker
│ GitLab │ ► │ GitLab Runner │ ► │   Ansible    │ ► │ Docker Swarm │ ──> worker
└────────┘   └───────────────┘   └──────────────┘   └──────────────┘ └─> worker(s)...
 private       build, push         stack deploy       manager
 instance      docker executor     shell executor
Docker Swarm
However, the deployment is not done yet. We just gave the order to Swarm, but the deployment itself will take some time, depending on your deployment strategy.
The docker_stack module returns immediately: it does not wait for services to converge, although it can return the stack diff.
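If you need to know when the rollout has settled, one option is to poll Swarm from a manager node. The Python sketch below is an illustration rather than part of our playbook: wait_for_stack and its helpers are hypothetical names, and it assumes the docker CLI is available and that comparing running to desired replicas is a good enough signal. That check is rough, since replica counts can match while a rolling update is still replacing tasks; inspecting each service's update status would be stricter.

```python
import subprocess
import time


def parse_replicas(output):
    """Parse '<service> <running>/<desired>' lines into a dict."""
    services = {}
    for line in output.strip().splitlines():
        name, replicas = line.split()[:2]
        running, desired = replicas.split("/")
        services[name] = (int(running), int(desired))
    return services


def converged(services):
    """True when at least one service exists and all have their replicas."""
    return bool(services) and all(
        running == desired for running, desired in services.values()
    )


def wait_for_stack(stack, timeout=300, interval=5, run=subprocess.run):
    """Poll `docker service ls` until the stack converges or timeout expires.

    `run` is injectable so the loop can be exercised without Docker.
    """
    cmd = [
        "docker", "service", "ls",
        # Services created by `docker stack deploy` carry this label.
        "--filter", f"label=com.docker.stack.namespace={stack}",
        "--format", "{{.Name}} {{.Replicas}}",
    ]
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = run(cmd, capture_output=True, text=True, check=True)
        if converged(parse_replicas(result.stdout)):
            return True
        time.sleep(interval)
    return False
```

Calling wait_for_stack("api") on a manager returns True once every service of the api stack reports all its replicas, or False after the timeout.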
One important note: remember that we authenticated to the GitLab Container Registry with CI_REGISTRY_USER and CI_REGISTRY_PASSWORD, credentials that are only valid for 5 minutes after the job is done. If the Swarm deployment takes longer than 5 minutes, your deployment may fail, because nodes will not be able to fetch the image from the registry. You have other options here:
- Use a deploy token
- Use a personal access token
- Extend the default token expiration timeout: Admin area > Settings > CI/CD > Container Registry > Authorization token duration (minutes)
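To judge whether the 5-minute window is a risk for a given service, a back-of-the-envelope estimate of the rollout duration can help. The sketch below is only a rough model under stated assumptions: tasks are updated in batches of "parallelism" tasks, every batch takes about the same time to pull the image and start, and an optional delay separates batches. The function names and the 45-second figure are hypothetical; only the 8 replicas and the 5-minute validity come from this article.

```python
import math


def rollout_seconds(replicas, parallelism, seconds_per_batch, delay=0):
    """Rough rolling-update duration: batches run one after another."""
    batches = math.ceil(replicas / parallelism)
    return batches * seconds_per_batch + (batches - 1) * delay


def token_outlives_rollout(replicas, parallelism, seconds_per_batch,
                           delay=0, token_minutes=5):
    """True if the estimated rollout fits in the token validity window."""
    estimate = rollout_seconds(replicas, parallelism, seconds_per_batch, delay)
    return estimate <= token_minutes * 60


# 8 replicas updated one at a time, 45 s per task (assumed figure).
print(rollout_seconds(8, 1, 45))             # 360
print(token_outlives_rollout(8, 1, 45))      # False: longer than 5 minutes
# Updating two tasks at a time with a 10 s delay between batches.
print(rollout_seconds(8, 2, 45, delay=10))   # 210
print(token_outlives_rollout(8, 2, 45, 10))  # True
```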
Final Notes
We have been using this workflow for 2 years now, and it has been working great. We are still considering using docker_swarm_service to get finer control, but we like having our dedicated stack file.
We will probably give Nomad a try in the coming months, and see how it compares to our Swarm cluster. We like simple things, so it might be a match. Time will tell!