Gitlab

Constructing the pipeline

September 22, 2016 – DevOps, Gitlab, Haskell, Ubuntu

If you have successfully added a Gitlab project and pushed the Yesod code there, you might notice that some builds are already being executed on your runner. That’s because the project already contains a .gitlab-ci.yml file. As you can see, it’s pretty much empty – just a few prints for the sake of checking whether the runner is configured properly.
Since it is, now is the time to adjust our pipeline to a more complex scenario. Obviously, there are dozens of pipeline layouts used for different cases. Here I want to present one quite simple deployment routine. We won’t be using all the steps yet (since we only have unit tests now), but they will come in handy later during development (if not on this blog, then during your own coding sessions).

I propose a four-stage pipeline, expressed in .gitlab-ci.yml as:

stages:
  - dev-testing
  - packaging
  - integration-testing
  - publishing

dev-testing is the part executed by developers on their local machines – this usually boils down to some linting, compilation and unit test execution. I treat “unit tests” as tests which do not require any particular binaries or services available on the build server (for example a separate database instance or some set of external services). For that reason, tests that use only sqlite are fine for me in this phase. Of course, feel free to disagree – I’m not going to argue about it. This phase goes first (before the “official” build phase), because – for compiled languages like Haskell – a separate build is required for the test cases, and it acts as a commit sanity check. This stage should only fail if the developer didn’t run the proper scripts before committing (ideally never), or if they aren’t required to (e.g. for a really small project).

The next stage, packaging, is a phase that should never fail. It consists of building the whole deployment package: resolving dependencies, constructing an RPM, DEB, Docker image or whatever deployment format you use, and pushing it to a test-package repository (not necessarily – sometimes it may simply be passed as an artifact between builds).

The third stage, integration-testing, is arguably the most important piece of the whole pipeline. It is needed to verify whether all the pieces fit together. These tests require a full environment set up, including databases, servers, security rules, routing rules etc. I’m a big fan of performing this phase automatically, but many real-world projects require manual attention. If you have such a project, the best advice I can give you is: run whatever is reasonable here, and publish internally if it passes. Then hand the passing builds over to your testers and add another layer of testing-publishing (possibly using a tool dedicated to release management). This stage will fail often – mostly due to bugs in either your code or your scripts (which are also your code) – there will be races, data overrides and environment misalignments. Be prepared. Still, that’s the purpose of this stage – things that fail here would most probably have failed on production otherwise, so it’s still good!

The last stage, publishing, is simple and should never fail – it simply connects to the release repository and puts the new package there. It might be an input point for your Ops people to take it and deploy, or an input point for the testers. This stage should be executed only for your release branches (not ones hidden in developer repositories) and is the end of the automated road – the next step has to be initiated by a human being, be it deployment to production or further testing. This job should also put a proper version tag on the repository (this may be done in packaging as well, but I prefer to have fewer versions).

Of course, all stages may additionally fail for a number of reasons – invalid server configuration, network outages, out-of-memory errors, misconfiguration etc. I didn’t mention them earlier, because they aren’t really related to (most of) the code you create and will occur pretty much at random. However, remember my warning: while they might seem random, you should investigate them the first time you encounter any of them. Later on they will only become more and more annoying, and in the end you’ll either spend your most-important-time-just-before-release-oh-my solving them, or ignore the testing stage (which is bad).

A few more words about the choice of tooling: I tend to agree that Gitlab CI might not be the best Continuous Deployment platform ever, especially due to its limited release management capabilities and tight coupling to automated-everything (I like it, but most projects require some manual testing). Perhaps Jenkins or Electric Flow would be a better choice, but they would require significantly more attention – first, installing and configuring a separate service, and second, managing the integration. Configuring Gitlab CI only takes a few lines of YAML, but for Jenkins it’s not that easy anymore!

Now, after we’ve managed to design the pipeline, let us create some example jobs for it.

dev-testing is easy – it should simply run stack setup && stack test (we have no linters for now).
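A minimal sketch of that job (assuming the stage names defined above and the same cache approach as the packaging job below) could look like this:

dev-testing:
  stage: dev-testing
  script:
    - stack setup
    - stack test
  cache:
    paths:
      - .stack-work/
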
preparing-package is a little trickier:

preparing-package:
  stage: packaging
  script:
    - stack setup
    - stack install --local-bin-path build
  artifacts:
    paths:
      - build/
    expire_in: 1 hour
  cache:
    paths:
      - .stack-work/

First, we need to install the package to the build directory (otherwise it would remain in a hash-based location or be installed into the local system – which is not what we want), then define the artifacts (the whole build directory) and their expiration (1 hour – should be enough for us). The cache option is useful to speed up compilation – the workspace is not fully cleared between builds. Note that this might be dangerous if your tools don’t deal well with such “leftovers”. However, a clean installation of GHC and all packages takes about a year, so caching is required (of course, you may also set up your own package server with a cache for the used packages, if your company is a tad bigger).
The rest of the stages are just prints for now – we have no integration tests, and setting up an APT repository or a Hackage server seems to be a bit of overkill right now. I also hate polluting the public space (public Hackage) with dozens of packages, so I won’t do anything there right now (I might reconsider later on, of course!).
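For completeness, the remaining stages can be stubbed out roughly like this (the echo lines are placeholders until we have real integration tests and a package repository):

integration-testing:
  stage: integration-testing
  script:
    - echo "TODO: run integration tests against a full environment"

publishing:
  stage: publishing
  script:
    - echo "TODO: push the package to the release repository"
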

If you download the code from GitHub, you will see that it doesn’t work in Gitlab. Apparently, stack is not installed in our Runner container! Installing it requires quite a few commands, but luckily, they are all listed in the Stack installation manual on GitHub.

For Ubuntu Server 16.04 this goes as follows:

# add repository key
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 575159689BEFB442
# add repository
echo 'deb http://download.fpcomplete.com/ubuntu xenial main'|sudo tee /etc/apt/sources.list.d/fpco.list
# update index and install stack
sudo apt-get update && sudo apt-get install stack -y

Manual configuration management and tool installation is not the best practice ever, but it’s often good enough, as long as the project is relatively small (or you have dedicated people managing your servers). We might consider switching to a proper configuration management tool later on, when the dependencies get more complex.

Aaand, that’s it! The first pipeline builds should now pass successfully in Gitlab. Congratulations!

Next post, promised a long time ago – GHCJS instead of jQuery in our app – comes soon.

Stay tuned!

Preparing the deployment system

September 17, 2016 – Containers, DevOps, Gitlab, Linux, LXC, Tools, Ubuntu, Virtualization

To deploy Yesod applications we’ll need quite a lot of infrastructure – first, a machine that performs the build, with its configuration and tooling, and second, a machine that acts as the production server. That’s the minimum – we may also need special machines for test runners, client emulation, performance tests etc. – but let’s ignore that for now and focus on the two machines: the build server and the production server.
To make our life simpler, we’ll be using virtualization instead of physical servers – it’s cheaper and easier to maintain, plus rollbacks are much easier (a machine snapshot before a risky change is sufficient). I use Windows 10 on my host, so I’ll be using VirtualBox as the virtualization tool, but you may as well use KVM or Xen, or even VMWare if you happen to have a license (the free version doesn’t provide the snapshot feature).
At first I simply set up a virtual machine with Ubuntu Server 16.04. I chose Ubuntu because it’s the only distribution I know of which provides LXC/LXD (Linux Containers – kinda like Docker, but more flexible) straight from the official repositories. Since we’ll be using Linux Containers a lot, being sure that they work correctly is a must.
The Ubuntu installer is really nice, so it’ll lead you step-by-step through the installation (I’m not sure if Ubuntu has kickstart/preseed installation accessible as simply as in RedHat/CentOS, but we’re only going to do it once, so we can live with it taking a bit longer and requiring our attention – at least for now). Remember to install the OpenSSH server and the Virtual Machine host (KVM). We won’t need DNS for now – we’ll just stick with mDNS (broadcast domain names for internal networks, implemented by Bonjour or Avahi).
You might wonder why we do this manually instead of automating installation via Packer and management via Vagrant – the short answer is: it’s simpler. It’s generally simpler to perform a task manually than to automate it, since automation has to handle error cases – when working by hand, you just deal with them as they appear. Plus, you need to know the interface, which (arguably) – in tools such as most OS installers – is designed for humans (unless you’re deploying RHEL/CentOS – they have a great kickstart installation). I also like to learn how things work before automating them – it really simplifies the automation development later on. Unfortunately, this also means that you won’t be able to simply download a script from the repository and run it (of course, CI scripts will be provided, but deployment ones – not yet).
Anyway, after installing the OS, the next step is to set up LXC containers. To do that, we’ll simply follow the instructions from the Linux Containers webpage to install LXC and configure unprivileged containers (it’s best done for a user without admin rights – otherwise it doesn’t make much sense, except as an exercise). Default container images are downloaded from “somewhere on the Internet”, but this can be changed by setting the MIRROR variable (details are available here). I used the original mirror (ubuntu, xenial, amd64 – watch out, you might not be able to install Gitlab on the i386/i686 release, even on new Intel i7 processors), and recommend it for a start. Remember to set up the LXD configuration (via lxd init). After creating and starting the container you won’t be able to log in via ssh, since the SSH server is not installed by default and there are no users – you have to use lxc-attach to create the first user, and then you’re ready to go!
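As a rough sketch, the whole sequence could look like this (container name, release and user are examples; the unprivileged-container prerequisites from the Linux Containers instructions are assumed to be in place):

# create an unprivileged container from the download template
lxc-create -t download -n template -- -d ubuntu -r xenial -a amd64
lxc-start -n template
# the image ships with no users and no SSH server – attach and fix both
lxc-attach -n template -- useradd -m -s /bin/bash admin
lxc-attach -n template -- passwd admin
lxc-attach -n template -- apt-get update
lxc-attach -n template -- apt-get install -y openssh-server
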
Today we’re going to manually set up two containers – one with the Gitlab Server, and a second with the Gitlab Runner (CI), together with the infrastructure needed by our application. Do not be afraid – all these operations can be automated using Puppet, Chef, Ansible or Salt, but today we’ll perform them manually, to get to know the problem better. Later on, when our infrastructure gets bigger (including a Nexus repository and multiple runners), we’ll start provisioning the machines using one of these tools.
For now, just clone this container twice (lxc-copy --name [old-name] --newname [new-name]) – after that we’ll have three containers on our VM: one for the Gitlab Server, one for the Gitlab Runner, and one as a template for future containers (remember to install openssh!).
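With the template container in place, that is (names are examples):

lxc-copy --name template --newname gitlab-server
lxc-copy --name template --newname gitlab-runner
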
Installing the Gitlab Server is really simple – just follow the instructions available on the Gitlab webpage. Be careful – if you installed a 32-bit container, you might not be able to use the repository maintained by the Gitlab team, and the one maintained by Ubuntu didn’t work for me. On an x86_64 container it was a piece of cake, so I won’t dive into details, as now we’ve got a problem to solve – how do we expose the Gitlab Server, installed in a container (automatically served on the container’s port 80), to our host system (the VM host, not the container host)? Unfortunately, it’s less obvious than I’d expect.
First of all, you need network access from the VM host to the VM guest – in case of VirtualBox I use a Host-Only network. It needs to be configured on startup – you can request that by adding to /etc/network/interfaces:

auto [interface-name-eg-enp0s8]
iface [interface-name-eg-enp0s8] inet dhcp

Of course, you can also use static IP assignment if you prefer to.
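For reference, a static variant of the same stanza could look like this (interface name and addresses are examples for a typical VirtualBox host-only network):

auto enp0s8
iface enp0s8 inet static
    address 192.168.56.10
    netmask 255.255.255.0
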
Then you need to install the iptables-persistent package – it makes it possible to persist our new forwarding rules. Next, you need to assign fixed IP addresses to your containers. To do this, first open /etc/default/lxc-net and uncomment the line: LXC_DHCP_CONFILE=/etc/lxc/dnsmasq.conf.
If you wish, you can also modify the internal network addresses – I prefer the 192.168.x.y network, so I changed them. Next, open /etc/lxc/dnsmasq.conf and add a line: dhcp-host=[container-name],[ip-address]. This will make LXC’s internal DHCP server assign static IPs to these containers. So far so good, now it’s time for port forwarding. Luckily, the magical command is available on the LXC page: iptables -t nat -A PREROUTING -p tcp -i [external-connection-name-eg-enp0s3] --dport [host-port] -j DNAT --to-destination [container-ip]:[container-port]. After that you might have to open the ports (we want to be able to access the Gitlab Server from outside the virtual machine), depending on whether a firewall is enabled. Persist the new rules (sudo netfilter-persistent save), and voila, the Gitlab Server should be accessible!
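Putting the pieces together, this is roughly what ends up on the VM (container names, addresses, interface and ports are examples – adjust them to your setup):

# /etc/lxc/dnsmasq.conf – pin the containers to fixed addresses
dhcp-host=gitlab-server,192.168.100.10
dhcp-host=gitlab-runner,192.168.100.11

# forward the VM's port 80 to the Gitlab container and persist the rule
# (use the interface the requests arrive on, e.g. the host-only adapter)
sudo iptables -t nat -A PREROUTING -p tcp -i enp0s8 --dport 80 -j DNAT --to-destination 192.168.100.10:80
sudo netfilter-persistent save
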

One more thing left – we don’t want to manually start the container after each VM restart, so we need to make it start automatically. To do this, we need to add a flag to our container config:

lxc.start.auto = 1

You can also add lxc.group = onboot (it should just start up earlier), but it didn’t work for me – possibly because I’m using unprivileged containers, which means they aren’t cleanly run on startup. Root containers use a nice approach of automatically running lxc-autostart on boot – but this is executed with root privileges. For unprivileged containers we have to help ourselves with nasty tricks like the one proposed in a ServerFault answer – adding @reboot lxc-autostart to the crontab.
Well, as long as it works it’s fine enough – at least until we get some nicer solution (like an additional service, which we don’t want to write now, so we’ll use cron).
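In practice that means a single extra line in the unprivileged user’s crontab (crontab -e):

# start the containers marked with lxc.start.auto after a reboot
@reboot lxc-autostart
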
Now let’s log in to Gitlab (root is the default user name; you choose the password) and create some project. Then go to the Admin Area (the tool in the top right corner of the screen), then the Overview tab and the Runners subtab. As you can see, we currently have no runners (quite understandable – we didn’t add any). Now is the time to add one.
Remember our second container? Log in to it now. Then run the bad command (curl -L https://packages.gitlab.com/install/repositories/runner/gitlab-ci-multi-runner/script.deb.sh | sudo bash – it’s bad because it gives an unknown script root access to your machine. The good news is that we’re running it in an unprivileged container, so if something bad happens it should be simple to fix). We can avoid the script by manually extracting the repository address from it and adding it to our apt repositories.
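If piping a script straight into a root shell makes you uneasy, a slightly safer routine is to download and review it first (same URL, just split into steps):

curl -L -o /tmp/runner-repo.sh https://packages.gitlab.com/install/repositories/runner/gitlab-ci-multi-runner/script.deb.sh
less /tmp/runner-repo.sh     # review what it is about to do
sudo bash /tmp/runner-repo.sh
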
The next steps are similar to the ones done for the server – sudo apt install gitlab-ci-multi-runner, then adding the container to autostart. One more special step is left for the runner: registering it with the Gitlab Server. This can be done by running sudo gitlab-ci-multi-runner register. To perform the registration you need the network address of the server (either hostname or IP, with /ci appended, e.g. http://10.0.0.1/ci) and the registration token (available on the Gitlab page we opened before the installation). Additionally you have to choose a name for the runner (doesn’t matter), tags (they don’t really matter for now – they may be used to indicate machine power, architecture or something) and the runner type – that’s quite important – I’ve chosen the shell runner (which is pretty unsafe, but again – that’s why it’s in a container).
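The runner-side sequence then boils down to the following (the URL and the wizard answers are examples – use your own server address and the token from the Runners page):

sudo apt install gitlab-ci-multi-runner
sudo gitlab-ci-multi-runner register
# the registration wizard asks for:
#   coordinator URL : e.g. http://10.0.0.1/ci
#   token           : copied from the Runners page in the Admin Area
#   description     : any name you like
#   tags            : optional
#   executor        : shell
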
If you reload the Gitlab webpage, you should see that a runner was registered.

Congratulations! You’ve managed to set up a Continuous Integration/Deployment environment and a Version Control Repository today! All in nice, separate virtual machines. How cool is that?

In the next post we’re going to provision our environment with the necessary tools – GHC, Stack – and run first automated, on-push compilations for our repository.

Stay tuned!