[debops-users] Current state of the DebOps test suite

Friday, 15 December 2017

Hello everyone,

The tl;dr version
-----------------

Today I pushed the code of the new test suite for the DebOps monorepo that
I've been working on for the last three weeks or so. The test suite uses
GitLab CI, Vagrant, KVM and LXC to test DebOps roles and playbooks in
different configurations. You can check out the results here:

    https://gitlab.com/debops/debops/pipelines/15127360

The DebOps repository on GitLab is automatically mirrored from the one on
GitHub, to perform the role tests using the new GitLab CI pipeline. Along
with it, you can check out the DebOps tests performed on Travis-CI:

    https://travis-ci.org/debops/debops

The tests on Travis lint the code in the repository (YAML, Python and soon
Bash source code, reStructuredText documentation based on Sphinx), and check
that the DebOps Docker image builds correctly before the new changes are
pushed to Docker.

How DebOps was tested in the past
---------------------------------

The previous iteration of DebOps, which used separate repositories for each
Ansible role and the set of playbooks, relied on Travis-CI to perform the code
tests. The test suite was created by me and Nick Janetakis around 2015 and
used a central git repository to store the test suite itself; Rolespec,
written by Nick, was an application that performed the testing. The role
repositories themselves didn't have any test code in them - centralization
allowed for a common test environment for all roles and made the role commits
a bit cleaner since any issues in the test suite could be resolved in its own
repository.

Unfortunately, with the merge of the project repositories into a single
monorepo this system had to be abandoned. Travis-CI has a short time limit
for a test, which the larger roles like 'debops.gitlab' could occasionally
exceed. One option to test the monorepo was to use the Travis build matrix,
but its limit of 200 separate builds would quickly become insufficient with
the current ~120 roles. Using the build matrix in this way would also mean
abusing the free resources provided by Travis, which I really wanted to
avoid.

Therefore I decided to rely on Travis-CI only for the code linting and
ensuring that the monorepo was "sane" enough to work. The testing of the
individual roles and their combinations in playbooks had to be done
separately.

Current state of the DebOps tests on Travis-CI
----------------------------------------------

The main project repository on GitHub is hooked up to the Travis-CI public
infrastructure to perform automatic tests of any new pull requests. Because of
this, it's a good place to validate the incoming code and quickly find simple
errors. At the moment Travis-CI performs the following tests:

- PEP8 compliance of the Python scripts is tested using 'pycodestyle'. At the
  moment the result of this test is ignored due to the large number of errors
  currently present in the various Python scripts in the repository. When
  everything is cleaned up, this test will be enforced.

- The 'debops-tools' scripts are tested using the nose2 package and the
  existing set of tests written by Hartmut Goebel, back when the scripts were
  in a separate repository.

- Build of the DebOps documentation located in the 'docs/' subdirectory, based
  on reStructuredText, Sphinx and ReadTheDocs. This test should ensure that
  the documentation available on https://docs.debops.org/ will be rendered
  correctly. The documentation included in the monorepo is not published
  there yet; the role documentation needs to be moved in place first.

- Ansible playbook syntax check performed on the 'site.yml' playbook which
  includes almost all other Ansible code. This test should catch simple issues
  with the Ansible code before more involved testing is performed. At the
  moment there's a large number of warnings due to Ansible not finding the
  inventory groups defined in the playbooks; the Ansible developers already
  know about this issue and hopefully it will be fixed in the future.

- YAML syntax of the entire project is tested using 'yamllint' with a
  relaxed configuration that accounts for the style of the YAML used in
  DebOps. At the moment the result of this test is ignored due to the large
  number of errors in YAML files all over the code base. When they are fixed,
  this test will be enforced.

- The syntax of the Dockerfile is checked by 'hadolint' to ensure that
  changes to it will result in a usable container image.

- The 'debops/debops' Docker image is built locally on Travis-CI to check if
  new code affects the build (official DebOps Docker images are automatically
  rebuilt when new commits are accepted in the monorepo). This also serves as
  the sanity test; the 'debops.core' role is executed inside of the created
  Docker container to check if Ansible stack works correctly.

- At the end, the test suite checks if the DebOps repository contains any
  changes or untracked files not accounted for by the '.gitignore' file, to
  ensure that the repository stays in a pristine state.
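
As an illustration of the last check in the list: `git status --porcelain`
prints one line per modified or untracked file and honors '.gitignore', so an
empty output means a pristine repository. The Python sketch below is only an
approximation of that idea, not the code used by the DebOps Makefile; the
function names are made up for this example.

```python
import subprocess


def dirty_paths(porcelain_output):
    """Return the paths listed in `git status --porcelain` output."""
    paths = []
    for line in porcelain_output.splitlines():
        if not line.strip():
            continue
        # Porcelain v1 format: two status characters, a space, then the path.
        paths.append(line[3:])
    return paths


def repository_is_pristine(repo_dir="."):
    """True when the work tree has no modified or untracked files."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return not dirty_paths(result.stdout)
```

Calling `repository_is_pristine()` at the end of a test run and failing when
it returns False would catch build artifacts that are missing from
'.gitignore'.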

The above test suite is accessible locally using the Makefile targets, as long
as the required applications are available on the host. You can run the 'make
help' command in the root of the DebOps monorepo to see available Makefile
targets; executing the 'make test' command will perform all of the above
tests. The test suite will check if Docker is available locally and if not,
the Docker tests will be skipped; other tests will result in an error if the
required applications are not available.
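
The "skip Docker tests, fail on anything else that is missing" behaviour
described above can be sketched as follows. This is a hypothetical Python
illustration, not the actual Makefile logic; the test names and the
`plan_tests` function are assumptions made for the example.

```python
import shutil


def plan_tests(required_tools, which=shutil.which):
    """Split tests into runnable, skipped, and failing groups.

    `required_tools` maps a test name to the executable it needs.
    Only the Docker tests are skipped when their tool is missing;
    every other missing tool is reported as an error.
    """
    plan = {"run": [], "skip": [], "error": []}
    for test, tool in required_tools.items():
        if which(tool) is not None:
            plan["run"].append(test)
        elif tool == "docker":
            plan["skip"].append(test)
        else:
            plan["error"].append(test)
    return plan
```

The `which` parameter defaults to `shutil.which`, which looks up an
executable on the PATH; passing a replacement makes the function easy to
try out without installing anything.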

One thing I definitely plan to add to the above test suite is using
'shellcheck' to test all of the Bash scripts present in the DebOps monorepo.
If you have any other useful suggestions for what to add to the test suite at
this stage, let me know.

GitLab CI to the rescue
-----------------------

What remained to implement was testing of the DebOps roles and playbooks
themselves in a real environment, as similar to the actual production
environment as possible - that means separate "hosts", a real DNS domain,
networking, and Debian instead of Ubuntu as the test platform.

Due to the scale of the project, writing custom software to manage all of the
tests was out of the question. Thankfully, GitLab provides ready-to-use
Continuous Integration in the form of GitLab CI (integrated in the main
application) and separate GitLab Runners which perform the actual operations.

Choosing GitLab for this task was easy. DebOps has included the
'debops.gitlab' and 'debops.gitlab_runner' roles for a long time, which
allows for easy installation of a local development environment. This is
important because the tests used by DebOps are performed using the 'shell'
executor, access to which should be tightly controlled (see below for a
longer explanation). Easy deployment of GitLab Runners allows for simple
scaling if necessary, which is a plus with a large number of tests.

Due to the recent changes in the GitLab licensing model, the Debian Project
plans to replace its existing community collaboration platform, Alioth, with
a GitLab instance. Choosing the same development platform should ensure that
inclusion of DebOps in the new collaboration platform will be easy to
perform, if and when that happens.

Unfortunately, the GitLab feature set available at the time restricted how the
new DebOps test suite could be designed. GitLab supports several different
"executors" which can be used to perform operations on the Runner hosts. Some
of them (Parallels, VirtualBox, SSH and Kubernetes) were unsuitable right from
the get-go, either because I don't use the mentioned virtualization solutions
or because remote execution via SSH was unnecessary.

That left the Docker and shell executors. The Docker executor is very handy
- installing Docker on the GitLab Runner and giving the Runner UNIX account
access to the Docker daemon is enough to have a complete testing solution
with good isolation for test jobs. Unfortunately, Docker itself is not
sufficient to test all of the DebOps roles, some of which operate on
low-level host functionality like network interfaces, bootloader, LVM, etc.
Additionally, Docker containers are designed to run only one application at
a time, whereas DebOps roles usually manage multiple services at once to
provide the desired functionality, for example a database engine, web server
and the application itself. Using Docker for testing DebOps roles would
severely limit the available options - DebOps is focused on management of
real hosts, therefore a suitable alternative was required, either LXC
containers or virtual machines.

This left the 'shell' executor, which provides full access to the UNIX
environment on the GitLab Runner host. The runner itself is executed using an
unprivileged user account (at least that's how the DebOps role sets it up),
however managing LXC containers or virtual machines requires elevated
privileges. This is risky, and a careless deployment can lead to a security
breach where anonymous users put malicious code in the git repository and
create a pull request, which then gets executed automatically on GitLab
Runners. Because of that risk, direct access to the GitLab tests is not
possible - only code reviewed by DebOps maintainers and accepted in the
GitHub repository gets pulled by GitLab and executed in the CI pipeline. The
equivalent feature on GitLab, merge requests, is disabled in the
'debops/debops' repository. This shouldn't be a huge issue, since the whole
test environment can be self-hosted and interested users can deploy it on
their own infrastructure if they so wish.

The 'shell' executor works directly on the GitLab Runner host, and running
DebOps playbooks directly on it would quickly make that host unusable.
Configuring GitLab Runners in each LXC container or a virtual machine was
unsuitable because that would require adding them over and over as runners in
GitLab CI. Because of that, an additional layer of abstraction was required.

Vagrant brings in isolation
---------------------------

Vagrant provides this form of abstraction, which can be easily used from the
command line, and therefore automated. Fortunately, the Debian Stretch release
includes packages for 'vagrant-lxc' and 'vagrant-libvirt', which make the
whole solution almost entirely native to Debian. The Vagrant libvirt plugin
can manage KVM hosts through libvirt, so a suitable virtual machine platform
becomes available; the Vagrant lxc plugin can manage LXC containers, which are
faster than virtual machines and can happily test the bulk of the DebOps roles
which don't require full virtualization.

To make support for these plugins available out of the box in DebOps,
I updated the 'debops.gitlab_runner' role to detect the presence of LXC or
libvirtd support and configure corresponding Vagrant plugins on the host. This
fact is reflected in the tags defined automatically for each GitLab Runner,
and using these tags, specific DebOps role tests can be redirected to
a suitable platform. The whole project can be tested using libvirt/KVM;
support for LXC containers is optional and makes the execution of the entire
test suite faster.

A few pain points were left to resolve at this point. DebOps is maintained as
a monorepo, with multiple roles that can work and be tested independently.
However, due to how GitLab CI is designed, a small change in one file requires
all of the CI jobs to be executed, every time. The current number of tests
(120+) and the slow test platform I'm using result in 3-4 hour test times for
the entire project, which was definitely unacceptable.

One way to solve this was caching the container and virtual machine images
- installation of required packages like Ansible, pycodestyle, etc. could be
performed once and reused for multiple test jobs afterwards. At first,
I planned to create a separate CI stage that created the container/VM image
which could be reused later, but this became problematic because it prevented
scaling horizontally from just one GitLab Runner to multiple Runners - each CI
job is directed towards one Runner, so with this arrangement there was no way
to use multiple GitLab Runners at a time.

Configuration for each job was also an issue. Ideally, each DebOps role,
playbook, sets of multiple playbooks, etc. would have their own entry in the
one single '.gitlab-ci.yml' file (multiple separate files are not supported).
Due to the number of jobs this had to be somewhat standardized so that
management of the tests and adding new ones would be easy and sane enough to
do.

The sheer number of tests called for a tool that knew how to execute Ansible
playbooks multiple times and check the results, for example to verify
idempotency. Having a way to augment each test independently as needed due to
the role requirements, and a way to include notifications about the state of
each job, would be a bonus.
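
To make the idempotency idea concrete: a playbook is run twice, and the
second run should report no changed tasks. A minimal Python sketch of that
check is shown below; it parses the "PLAY RECAP" summary that
'ansible-playbook' prints at the end of a run. This is an illustration under
the assumption of plain, uncolored output, and not a description of how any
of the tools mentioned here implement it.

```python
import re

# Matches recap lines such as: "web1 : ok=12 changed=0 unreachable=0 failed=0"
RECAP_LINE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<stats>(?:\w+=\d+\s*)+)$")


def parse_recap(output):
    """Extract per-host counters from the PLAY RECAP section."""
    hosts = {}
    in_recap = False
    for line in output.splitlines():
        if line.startswith("PLAY RECAP"):
            in_recap = True
            continue
        if not in_recap:
            continue
        match = RECAP_LINE.match(line.strip())
        if match:
            pairs = (item.split("=") for item in match.group("stats").split())
            hosts[match.group("host")] = {key: int(val) for key, val in pairs}
    return hosts


def is_idempotent(second_run_output):
    """True when the second run changed nothing and nothing failed."""
    recap = parse_recap(second_run_output)
    return bool(recap) and all(
        stats.get("changed", 0) == 0
        and stats.get("failed", 0) == 0
        and stats.get("unreachable", 0) == 0
        for stats in recap.values()
    )
```

An empty or missing recap is treated as a failure, so a playbook that
crashed before finishing is not mistaken for an idempotent one.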

There are a few test frameworks available that could fit some of the bill,
like Rolespec, Test Kitchen, Molecule, but due to unique DebOps requirements
I felt that none of them was suitable enough. So, I wrote my own.

Introducing Jane - Just ANother Executor
----------------------------------------

Jane is a fictional character in the Orson Scott Card series of books about
Ender, which also include ansibles. She is a distributed, omnipresent entity
evolved from a computer program, who manages Ender's financial assets, among
other things.

The 'jane' script, located at 'lib/tests/jane' in the DebOps monorepo, is
a Bash 4 script that serves multiple roles within the DebOps test suite.
GitLab CI is the primary "executor" of the test jobs, hence Jane is just
another executor, working on behalf of the GitLab Runner shell executor in
each separate job in parallel.

Jane is used at each stage of the CI pipeline - in GitLab Runner itself to
execute different steps of the job, during Vagrant host provisioning to inform
about different stages of the provisioner script execution, and inside of the
VM/container to execute Ansible playbooks and other tests. At each step of the
CI job, Jane takes care to transfer interesting environment variables over
from GitLab Runner, through Vagrant provisioning to the resulting VM/container
so that each job has a consistent environment.
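
One way to picture the environment transfer is a filter that selects
variables by prefix and renders them as shell 'export' lines which the next
layer can source. The Python sketch below is only an illustration of that
idea (Jane itself is a Bash script); the "JANE_" prefix and the function name
are assumptions, while "CI_" is the prefix GitLab CI uses for its predefined
variables.

```python
import os
import shlex

# "CI_" is used by GitLab CI predefined variables; "JANE_" is hypothetical.
FORWARDED_PREFIXES = ("CI_", "JANE_")


def export_lines(environ=None, prefixes=FORWARDED_PREFIXES):
    """Render selected variables as shell `export` statements."""
    environ = os.environ if environ is None else environ
    lines = []
    for name in sorted(environ):
        if name.startswith(prefixes):
            # shlex.quote keeps values with spaces or quotes intact.
            lines.append("export {}={}".format(name, shlex.quote(environ[name])))
    return lines
```

Quoting the values matters here, because the generated lines pass through a
shell on their way into the VM or container.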

Jane can detect if a specific Vagrant box with cached VM/container image is
available; if there isn't one, a random Jane instance on a given GitLab Runner
host starts to create one automatically. The rest of Jane instances pause and
wait for the box image to be finished, before using it themselves. This
happens independently on each GitLab Runner host, therefore scaling out to
multiple Runners is painless.
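
The "one instance builds, the rest wait" behaviour can be sketched with an
exclusive file lock that is local to each Runner host. The Python example
below is a simplified illustration of that pattern and not Jane's actual
implementation; `ensure_box`, the lock file name and the `build` callback are
all hypothetical.

```python
import fcntl
import os


def ensure_box(box_path, build, lock_path=None):
    """Build the cached box once; concurrent callers wait, then reuse it.

    Returns True if this call built the box, False if it was reused.
    """
    lock_path = lock_path or box_path + ".lock"
    with open(lock_path, "w") as lock:
        # Blocks until no other process on this host holds the lock.
        fcntl.flock(lock, fcntl.LOCK_EX)
        try:
            if os.path.exists(box_path):
                return False
            build(box_path)
            return True
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```

Because the lock file lives on the local filesystem, each Runner host
coordinates only its own jobs, which matches the per-host behaviour
described above.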

Very early in the job execution, Jane checks for changed files in specific
paths in the repository or special strings in the git log of all commits in
a particular pull request, including changes from the previous set of merged
commits. If nothing interesting is found, a given Jane instance can end the
particular CI job early, which results in shortening the time for the complete
pipeline run from 3-4 hours to about 20-40 minutes, depending on the scope of
changes. The tests can also be forced to run using a special environment
variable, either for a specific CI job, an entire CI stage or globally for all
jobs.
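
The early-exit decision boils down to three questions: is the job forced,
does the git log contain a trigger string, and did any changed file match
the paths the job cares about. A hedged Python sketch of that decision is
shown below; the function name, the example path pattern and the
"[test all]" trigger string are invented for illustration and are not the
ones Jane uses.

```python
import fnmatch


def should_run(changed_files, patterns, commit_log="", keywords=(), force=False):
    """Decide whether a CI job needs to run for a given set of changes."""
    if force:
        return True
    # A trigger string anywhere in the commit messages runs the job.
    if any(word in commit_log for word in keywords):
        return True
    # Otherwise run only when a changed file matches a watched pattern.
    return any(
        fnmatch.fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in patterns
    )
```

For example, `should_run(["docs/index.rst"], ["ansible/roles/debops.nginx/*"])`
returns False, so a job watching only that role directory could stop right
away for a documentation-only change.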

Jane is very talkative and colorfully describes each step of a CI job, in
excruciating detail. This can be very useful in debugging of a failed job or
checking why a particular CI job was even executed.

With Jane, each CI job has an easy-to-use, templated and standardized
configuration defined in the '.gitlab-ci.yml' file using environment
variables. The configuration can be used to augment the Ansible inventory
(Jane is used during playbook execution as a dynamic Ansible inventory
source) with custom inventory groups or variables, as necessary. Multiple
playbooks can be executed in sequence, so testing roles that depend on other
service roles like databases is certainly possible. In the future, Jane will
learn how to manage multiple Vagrant hosts at once, so that multi-host tests
will become available.
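
For readers unfamiliar with dynamic inventories: Ansible can call an
executable that prints the inventory as JSON, with groups at the top level
and per-host variables under "_meta". The Python sketch below shows how such
JSON could be assembled from environment variables. It is an illustration
only; the variable names EXAMPLE_INVENTORY_GROUPS and
EXAMPLE_INVENTORY_HOSTVARS are hypothetical and are not the ones Jane reads.

```python
import json


def build_inventory(environ, hostname):
    """Build an Ansible dynamic inventory dict from environment variables."""
    raw_groups = environ.get("EXAMPLE_INVENTORY_GROUPS", "")
    groups = [name.strip() for name in raw_groups.split(",") if name.strip()]
    # Host variables are passed as a JSON object in this sketch.
    hostvars = json.loads(environ.get("EXAMPLE_INVENTORY_HOSTVARS", "{}"))
    inventory = {"_meta": {"hostvars": {hostname: hostvars}}}
    for group in groups:
        inventory[group] = {"hosts": [hostname]}
    return inventory


def render(environ, hostname):
    """Return the JSON text an inventory script would print for `--list`."""
    return json.dumps(build_inventory(environ, hostname), indent=2, sort_keys=True)
```

A CI job could then set the groups variable to something like
"debops_all_hosts,debops_service_nginx" to place the test host in the
inventory groups a role expects.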

Lots of things still to do
--------------------------

It's getting late and this mail is long enough already, therefore I'll try to
finish up quickly. There are still a few things I want to do before the new
test suite is finished. I think that right now the most important thing is
documentation - the new test suite feels almost like a separate project in and
of itself beside the main one (DebOps), and it could probably be extracted
from the DebOps monorepo and published as standalone, for other similar
projects to use. Extensive documentation should help with that. I'm heading
out for the holidays and will be mostly offline until the New Year, but in
some free time I'll try to document everything properly; there are many
interesting things this test suite is capable of.

At the moment the roles and playbooks are only validated by Ansible itself via
an idempotency test, but there's no further integration testing done. I plan
to implement this using testinfra, because it's written in Python, and using
the same language as Ansible should hopefully make maintenance of the
integration tests easier. Unfortunately testinfra is not currently packaged
in Debian, therefore it's installed from PyPI; I hope that it will be
available natively in a future Debian release.

That would be it for now. I wish all of you happy holidays, and see you back
on IRC in 2018. :-)

Cheers,
Maciej
