Arie Bregman

Linux And Stuff

OpenStack Infra: Understanding Zuul

Recently I had the time to explore Zuul. I decided to gather everything I learned here in this post. Perhaps you’ll find it useful for your understanding of Zuul.

I split it into two posts. This post focuses on understanding what Zuul is and how it works. The second post focuses on how to deploy it, along with common failures in the process.

What is Zuul?

In the beginning there was only Jenkins. The infra was unformed and broken, and darkness and sadness were among the developers. So the wise infra people decided to create Zuul.

Zuul acts as the gatekeeper: it handles patches from the moment they are submitted by the developer. It manages the triggering of the relevant jobs and makes sure a patch is merged only after all the tests have passed successfully.

At this point in time, Zuul v2 is the stable version and v3 is under active development.

Zuul components

To better understand what Zuul is and how it works, we first need to get familiar with its different components.

Zuul-server

The main service of Zuul, also known as the scheduler. It handles events from Gerrit. Once an event is received, it is responsible for starting the job builds, collecting the build results and posting them to Gerrit.

It communicates with Gerrit by executing “gerrit stream-events” and listening for events. (Code reference)
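To make the events stream a bit more concrete, here is a minimal sketch of handling a single event. Gerrit emits one JSON object per line; the sample event below is made up (project name, numbers and ref are placeholders), and summarize_event is a hypothetical helper of mine, not something taken from the Zuul code.

```python
import json

# Sample line in the shape emitted by `gerrit stream-events` (one JSON object
# per line). The project name, change number and ref are made up.
SAMPLE_LINE = json.dumps({
    "type": "patchset-created",
    "change": {"project": "example/project", "branch": "master", "number": "1234"},
    "patchSet": {"number": "2", "ref": "refs/changes/34/1234/2"},
})


def summarize_event(line):
    """Turn one stream-events line into a small summary dict (hypothetical helper)."""
    event = json.loads(line)
    change = event.get("change", {})
    patchset = event.get("patchSet", {})
    return {
        "type": event.get("type"),
        "project": change.get("project"),
        "branch": change.get("branch"),
        "change": change.get("number"),
        "patchset": patchset.get("number"),
        "ref": patchset.get("ref"),
    }


summary = summarize_event(SAMPLE_LINE)
print(summary["type"], summary["project"], summary["ref"])
```

In a real setup the lines would come from something like “ssh -p 29418 zuul@gerrit.example.com gerrit stream-events”, where the user and host are placeholders for your own dedicated Zuul user and Gerrit server.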

It’s recommended to have a dedicated user for the communication with Gerrit. The user should have permissions to the projects you would like to manage with Zuul.

In addition, the scheduler also communicates with Gearman.

Zuul server source code

Gearman

There is a lot you can say about Gearman. Let’s focus on the Zuul-related stuff.

In short, Gearman is a protocol for running jobs on distributed workers. It wasn’t developed specifically for Zuul, but the Zuul developers decided to use it since it was a good match for their needs.

You can either set up your own separate Gearman server or let Zuul start one for you through the [gearman_server] section of /etc/zuul/zuul.conf.

Including this section, specifically ‘start = true’, will cause Zuul to take care of starting and running the Gearman server. (Code reference)
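Here is a small sketch of what such a configuration could look like and how the ‘start’ flag can be read. The zuul.conf excerpt is my own illustration in Zuul v2 style (the server address is a placeholder), and the parsing uses Python’s standard configparser rather than Zuul’s own code.

```python
import configparser

# Hypothetical excerpt of /etc/zuul/zuul.conf (Zuul v2 style). The server
# address is a placeholder.
ZUUL_CONF = """
[gearman]
server = 127.0.0.1

[gearman_server]
start = true
"""

config = configparser.ConfigParser()
config.read_string(ZUUL_CONF)

# Zuul should start its own Gearman server only when 'start' is true.
start_internal = (
    config.has_section("gearman_server")
    and config.getboolean("gearman_server", "start", fallback=False)
)
print("Zuul starts Gearman itself:", start_internal)
print("Gearman server address:", config.get("gearman", "server"))
```

If you run a dedicated Gearman server instead, you would leave out the [gearman_server] section and point ‘server’ at that machine.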

Zuul Gearman source code

Zuul-merger

Let’s start by saying it’s not the component that merges your change once all the tests have passed.

When a change is submitted by a developer, zuul-merger merges it locally, in its own copy of the repository, so it can be served for testing.

You can have zuul-merger running on a separate node and, more importantly, you can have multiple nodes, each with its own zuul-merger daemon.

Its configuration also lives in /etc/zuul/zuul.conf, under the [merger] section. The main options in such a section are:

git_dir is where Zuul keeps its own copies of the Git repositories it’s monitoring.

log_config is the path to the logging configuration file of zuul-merger.

pidfile is the path to the PID lock file.

zuul_url is the URL under which this merger’s Git repositories are reachable by the test workers. Remember, the merger can run on a separate node from the Zuul server.
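As an illustration, a [merger] section could look like the one embedded in the sketch below. Every path and URL in it is a placeholder I picked, not a value from a real deployment, and the check for missing options is just a convenience of mine.

```python
import configparser

# Hypothetical [merger] section. Every path and URL below is a placeholder.
MERGER_CONF = """
[merger]
git_dir = /var/lib/zuul/git
log_config = /etc/zuul/merger-logging.conf
pidfile = /var/run/zuul-merger/merger.pid
zuul_url = http://zuul.example.com/p
"""

EXPECTED_OPTIONS = ("git_dir", "log_config", "pidfile", "zuul_url")

config = configparser.ConfigParser()
config.read_string(MERGER_CONF)

# Collect any of the four options discussed above that are not set.
missing = [opt for opt in EXPECTED_OPTIONS if not config.has_option("merger", opt)]

for opt in EXPECTED_OPTIONS:
    print(opt, "=", config.get("merger", opt, fallback="<not set>"))
print("missing options:", missing)
```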

Merger source code

Zuul-cloner

Unlike the other components, there is no daemon for zuul-cloner. It’s a client used to create the job’s workspace. (Code reference)

Cloner source code

Zuul Workflow

Let’s review the Zuul workflow from the moment a patch is submitted to Gerrit. For this, I created the following drawing, which should make the workflow easier to follow.

[Figure: Zuul workflow diagram]

Gerrit

The first step is a simple one – the user submits a new patch to Gerrit. It can also be a new comment; it really depends on your Zuul layout file.

Once the patch is submitted, Gerrit publishes a new event, letting other services that consume the events stream (such as Zuul, Jenkins, etc.) know about the newly submitted patchset.

Zuul Server

When you start zuul-server, it registers connections based on what you configured in your zuul.conf. One of those connections is Gerrit. Zuul server basically starts the GerritWatcher, which watches the Gerrit events stream (step 3). Right after registering the connections, zuul-server also starts Gearman (unless you set up a dedicated server for it) and the scheduler.

When an event is added to the events stream, GerritWatcher lets the scheduler know about it. The scheduler checks whether it is interested in this event (= whether it matches the conditions in layout.yaml). If yes, it adds it as a trigger event (step 4).

Next, the scheduler processes the event and adds it to the relevant pipeline (step 5; again, this depends on your settings in layout.yaml). The change is then processed in the pipeline, and the scheduler also checks whether any additional work is required (e.g. merging the change).
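The matching of an event against the layout can be pictured with a tiny sketch. This is a heavily simplified model of my own: the pipeline names and triggers below are assumptions, and the real scheduler matches on much more than the event type (branches, approvals, comments and so on).

```python
# Hypothetical, simplified view of the triggers defined in layout.yaml:
# pipeline name -> set of Gerrit event types it reacts to.
PIPELINE_TRIGGERS = {
    "check": {"patchset-created"},
    "gate": {"comment-added"},
}


def pipelines_for_event(event_type, triggers=PIPELINE_TRIGGERS):
    """Return the pipelines whose triggers match the event type, sorted by name."""
    return sorted(name for name, types in triggers.items() if event_type in types)


print(pipelines_for_event("patchset-created"))  # ['check']
print(pipelines_for_event("ref-updated"))       # [] (no pipeline is interested)
```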

Zuul Merger

Now, apart from starting zuul-server, you also need to start zuul-merger (if you didn’t, now is a good time 😀 ). zuul-merger has an important role in the workflow. While the scheduler is responsible for reacting to events and identifying that a change occurred, the actual work of cloning the project and making changes to it is done by the merger.

So in step 6, the merger gets what is called a “merge job” from Gearman. The merge job includes a lot of details, like project name, branch, ref and change number. As a first step, the merger checks whether there is already a commit for this ref in the repo. If yes, it proceeds with the commit it found. If not, it tries to get the most recent commit; if there is no recent commit either, it goes on to reset the repository.

Resetting is done against the HEAD ref, unless HEAD references an invalid branch, in which case Zuul simply chooses the first valid branch. Next, the merger updates the repo by running ‘git fetch’ and then checks out the ref.

Now the merger reaches the interesting step of trying to merge your change into the repo. If it succeeds, it creates a reference (also known as ZuulReference in the code/logs) in the repo on your Zuul merger server.

Finally, it sends a notification that its work is complete.
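To get a feeling for what a merge job boils down to, here is a rough sketch that lists plain git commands for the steps above. It is only an approximation under my own assumptions: the merger does not literally shell out like this, and the job fields (branch, ref, zuul_ref) and their values are hypothetical.

```python
def merge_job_commands(job):
    """List git commands roughly equivalent to one merge job (illustrative only)."""
    return [
        ["git", "fetch", "origin"],                      # update the local repo
        ["git", "checkout", "origin/" + job["branch"]],  # start from the branch tip
        ["git", "fetch", "origin", job["ref"]],          # fetch the proposed change
        ["git", "merge", "FETCH_HEAD"],                  # try to merge it
        ["git", "update-ref", job["zuul_ref"], "HEAD"],  # record the result as a ref
    ]


# Made-up merge job; in Zuul these details arrive through Gearman.
job = {
    "branch": "master",
    "ref": "refs/changes/34/1234/2",
    "zuul_ref": "refs/zuul/master/Zexample",
}
commands = merge_job_commands(job)
for command in commands:
    print(" ".join(command))
```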

Gearman & Jenkins

Now that the repo is ready, after the merger’s work, we are ready to run the job(s) and test the change.

This is actually done by the scheduler, but it will not work without Gearman. The scheduler launches the job by submitting a function named build:<job_name> to Gearman. The scheduler also provides several interesting variables which start with ‘ZUUL_’ (very similar to the ‘GERRIT_’ variables), and you’ll probably want to use them in your jobs.
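A typical use of the ‘ZUUL_’ variables inside a job is to fetch the reference prepared by the merger. The sketch below builds the git commands from ZUUL_URL, ZUUL_PROJECT and ZUUL_REF; the example values are made up, and fetch_commands is a hypothetical helper of mine.

```python
def fetch_commands(env):
    """Build the git commands that check out the change prepared by Zuul."""
    repo_url = env["ZUUL_URL"].rstrip("/") + "/" + env["ZUUL_PROJECT"]
    return [
        ["git", "fetch", repo_url, env["ZUUL_REF"]],
        ["git", "checkout", "FETCH_HEAD"],
    ]


# Inside a real job these values come from the environment; here they are made up.
example_env = {
    "ZUUL_URL": "http://zuul.example.com/p",
    "ZUUL_PROJECT": "example/project",
    "ZUUL_REF": "refs/zuul/master/Z0123456789abcdef",
}
for command in fetch_commands(example_env):
    print(" ".join(command))
```

In an actual job you would pass os.environ instead of the example dictionary.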

Gearman then sends the data to Jenkins, which triggers the job on one of the available slaves. The result is reported back to Zuul, which notifies Gerrit and possibly merges the change (assuming all tests passed).

Do I need to use Jenkins?

Only if you want to 🙂

Not long ago, OpenStack Infra decided to abandon Jenkins and use only Zuul. If your CI depends heavily on Jenkins and its plugins, it might be difficult to make this switch to a non-Jenkins environment.

A good strategy would probably be to transition slowly: include Zuul in your environment alongside Jenkins and switch, step by step, to other infra projects such as nodepool, until you no longer need Jenkins.

If you do choose to use Jenkins, you will need to use the Jenkins Gearman plugin.

Pipelines

The pipeline is an important concept in Zuul. Each Gerrit change is processed by one or more of the existing pipelines.

Pipeline source code

Upstream

In upstream, each change, when submitted, enters the check pipeline, where it is verified and receives +1 or -1 according to the results.

Next, after getting approved by two core developers (= two ‘+2’ votes), it’s processed in the gate pipeline. If it passes the tests in the gate pipeline, the change is merged.

Next, there is the post pipeline, used to trigger jobs after the change has been merged.

There are even more useful pipelines:

  • periodic – time-based trigger for jobs. Useful for running nightly jobs, for example.
  • experimental – if you have new jobs and you are not sure how they work, you can test them in the experimental pipeline
  • silent – I have no idea why you would need this pipeline 🙂 if you know, make sure to leave a comment
  • pre-release – trigger jobs based on new pre-release tags
  • release – trigger jobs when the change is marked as a release

Custom

In your internal environment, you will probably want to have the same pipelines as in upstream, but you don’t have to.

You can call your internal pipelines whatever you want and decide how they process changes. It’s all configured in your Zuul layout file.

As an example, consider a pipeline defined in layout.yaml called ‘test’, which processes every new patchset submitted to our Gerrit server. If all the jobs run successfully, it votes +1 on Gerrit; if one of the jobs fails, it votes -1.
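Such a ‘test’ pipeline can be sketched as the Python structure that a layout.yaml entry would load into. The keys follow the Zuul v2 layout format as far as I know it, but treat the whole definition, the job names and the vote_for helper as my own illustration.

```python
# Hypothetical pipeline definition, written as the Python structure that the
# corresponding layout.yaml entry would load into.
TEST_PIPELINE = {
    "name": "test",
    "manager": "IndependentPipelineManager",
    "trigger": {"gerrit": [{"event": "patchset-created"}]},
    "success": {"gerrit": {"verified": 1}},
    "failure": {"gerrit": {"verified": -1}},
}


def vote_for(pipeline, job_results):
    """Pick the Gerrit vote: the success vote only if every job succeeded."""
    all_passed = all(result == "SUCCESS" for result in job_results.values())
    outcome = "success" if all_passed else "failure"
    return pipeline[outcome]["gerrit"]["verified"]


print(vote_for(TEST_PIPELINE, {"unit-tests": "SUCCESS", "pep8": "SUCCESS"}))  # 1
print(vote_for(TEST_PIPELINE, {"unit-tests": "SUCCESS", "pep8": "FAILURE"}))  # -1
```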

A pipeline definition also sets a ‘manager’ option, which determines how the changes in your pipeline are handled.

IndependentPipelineManager

You should use the independent pipeline manager when the order in which the changes are processed has no significance.

This is especially true when the patch is not going to be merged after running the jobs, so it’s safe to test the change at the same time as other changes are tested.

Examples of such pipelines in upstream are the ‘check’ and ‘post’ pipelines. The ‘check’ pipeline tests the change right after it is uploaded, before it gets any review from the cores. The ‘post’ pipeline runs after merging, so the order is insignificant.

DependentPipelineManager

I believe it would be safe to say that the dependent pipeline is more complex, due to the dependency assumed between the patches.

The ‘gate’ pipeline uses the ‘DependentPipelineManager’, since gating, in CI jargon, basically means testing a change in order to verify it’s ready for merging, so the order is very important here.

There is some speculation in the process: when multiple changes are queued, each one is tested on top of the changes ahead of it, so you basically assume that each of those changes is going to pass the gate and be merged.

If one of the changes fails to pass the gate, all the changes queued behind it are tested again, without this patch.
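The speculative behaviour can be simulated in a few lines. This is a simplified model of my own (it works in rounds, while real Zuul reacts as soon as a failure is known), and the ‘passes’ callback is a hypothetical stand-in for actually running the jobs.

```python
def gate(queue, passes):
    """Simulate a dependent pipeline.

    queue:  change names in the order they entered the gate.
    passes: hypothetical test oracle, called as passes(change, changes_ahead).
    Returns (merged, rejected, test_runs).
    """
    merged, rejected, test_runs = [], [], []
    pending = list(queue)
    while pending:
        # Speculatively test every pending change on top of all changes ahead of it.
        results = []
        for i, change in enumerate(pending):
            ahead = tuple(merged + pending[:i])
            test_runs.append((change, ahead))
            results.append(passes(change, ahead))
        if all(results):
            merged.extend(pending)
            break
        first_bad = results.index(False)
        # Changes ahead of the failure were tested against a valid state: merge them.
        merged.extend(pending[:first_bad])
        rejected.append(pending[first_bad])
        # Changes behind the failure are tested again without it.
        pending = pending[first_bad + 1:]
    return merged, rejected, test_runs


# Change B always fails its tests; A and C always pass.
merged, rejected, test_runs = gate(["A", "B", "C"], lambda change, ahead: change != "B")
print(merged, rejected)  # ['A', 'C'] ['B']
```

Note that C is tested twice here: first on top of A and B, and then again on top of A only, once B is thrown out of the queue.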

There is one section in the Zuul documentation which I really liked and which describes the testing process very well. I recommend reading it.

How to deploy Zuul?

This part, due to its length, is described in a separate post called (surprisingly) “How to deploy Zuul”.

1 Comment

  1. Based on my experience, the ‘silent’ pipeline is mainly used by the infra team themselves when they want to test a job without leaving a trace (result, comment) in Gerrit.

© 2017 Arie Bregman
