Reusing Apps Between Teams and Environments Through Containers

This was my talk for the WeAreDevelopers Container Day on February 3rd, 2021. The talk featured a live demo showing how to build, optimize, and distribute containers so that they can be reused in as many environments as possible, 100% based on the experience of the VSHN team.

Introduction

Thanks a lot everyone for attending this presentation, and thanks to the WeAreDevelopers team for having me today on this stage. My name is Adrian Kosmaczewski, I am in charge of Developer Relations at VSHN, the DevOps Company located in Zürich, Switzerland.

When organizations start using containers, they usually see them primarily as lightweight virtual machines. Developers find it easy to learn how to write Dockerfiles, and Operations teams appreciate being able to start and stop them quickly and with low requirements. As time passes, the usual path for those organizations consists of assembling larger applications with Docker Compose, and finally migrating those apps to Kubernetes.

But this is not the focus of our meeting today. In this talk I will show you how containers are used inside VSHN not only to run cloud native applications, but also to share internal tools across the company.

The Challenge of Internal Tools

In the latest “State of Internal Tools” report by Retool,

More than 80% of respondents said that internal tools are critical to their company’s operational success.

However,

Despite how important internal tooling is, teams still struggle to get the buy in they need: more than half of respondents indicated that one of their biggest problems building internal tools is that they don’t have enough time or resources to get the job done.

We at VSHN have lots of internal tools, and our teams have to juggle the maintenance of our customers’ systems, which is our core business, with maintaining and keeping up to date the myriad tools that we use each and every day. Most of those custom tools are command-line based.

To make things a bit more complex, each VSHNeer can choose their preferred operating system, which means that our technical teams include users running various Linux distributions (the vast majority, on Ubuntu, Arch, Fedora, and others), Macs, and even Windows PCs.

We also use the best possible programming language for the job; we have internal tools written in Go, C++, Java, Python, and JavaScript; each has its own requirements, libraries, workflows, and ecosystem.

Go, C, and C++ are great languages for multi-platform command-line tools, but building them requires a certain degree of knowledge that not all teams might have.

On the other hand, languages like Ruby, Python, and JavaScript have a great library ecosystem, which makes them a fantastic choice for prototyping and releasing new features faster. But they require quite a bit of knowledge in terms of libraries, runtimes, and even language versions.

To make matters more complicated, some tools require custom configuration and resources: fonts, styles, images, default configuration files. All of this increases the chances for something to go wrong during installation or runtime.

And to top it off, we need to reuse that knowledge in CI/CD pipelines; after all, that is one of the core pillars of DevOps, and our slogan is, quite literally, “The DevOps Company”. We have to eat our own dogfood.

Our challenge, then, was to find a way to share tools and knowledge with one another and with other systems, with the least possible amount of friction. And for that, we have chosen containers.

Sharing with users on other platforms

The first challenge is then to reuse tools with users on other platforms. To illustrate how containers help us do that, I have created a small TypeScript application. I have chosen to show an example in the Node.js / JavaScript ecosystem, because this is arguably the one with the highest headache-per-minute ratio in the industry.

TypeScript is a language that compiles to JavaScript, which means that we need quite a bit of infrastructure to first compile the application, and then to run it. Lots of dependencies, and lots of “downloading the internet”; we live in the times of npm, which has taken the crown from mvn in the past decade.

The project I’m going to show is called “Greeter”; it is a very simple command-line tool that greets the user on the command line using various characters: Snoopy, R2-D2, a cat, and many more. Characters can “talk” or “think”, that is, their text bubbles change depending on the context.

All of this has a certain number of dependencies, and anyone who would like to run this app would need to install Node.js and perform quite a few operations before being able to use it.
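For the sake of comparison, running the project natively would look roughly like this; note that the repository URL, script names, and command-line argument below are placeholders for illustration, not necessarily the project’s actual ones:

$ git clone https://gitlab.com/example-group/greeter.git   # placeholder URL
$ cd greeter
$ npm install                     # download all the dependencies
$ npm run build                   # compile TypeScript to JavaScript (assuming a "build" script)
$ node dist/index.js "Hello!"     # illustrative entry point and argument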

Our objective today will be reusing this tool in as many environments as possible, with the least amount of stress.

To make things even more interesting, we are going to use Podman instead of Docker, and as you can see, I am not using an alias in my command line: this is actually Podman running. And because Docker Hub now imposes certain download restrictions, we are going to use two different container registries: GitLab and Quay.

As the Joker said, “Gentlemen, let’s broaden our minds”.

The Dockerfile uses that great technique called “Multi-Stage Builds”. This is probably one of the greatest features Docker has introduced in the past 5 years, and it has saved countless petabytes of storage all over the planet.

This technique consists of using one image for the build process, and another for runtime. The advantage is that the resulting image, the one that is shared and used, is, well, as small as possible, and does not contain libraries that are only required during the build.

In the first stage of our Dockerfile we install all the requirements for building our app; that includes the TypeScript compiler, Gulp, and some Gulp plugins. The package.json file of the project shows that there are quite a few dependencies required.

Then we use the pkg project to create a single executable file that contains all the required libraries.

Finally, in the second stage of the Dockerfile, we use the most lightweight distribution possible (Alpine, of course), we install glibc (sadly, pkg does not yet work very well with musl alone), and we copy our executable into it.
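To make the idea concrete, here is a minimal sketch of such a two-stage Dockerfile. The base images, versions, Gulp task name, and glibc installation recipe are assumptions for illustration, not the exact contents of the real “Greeter” Dockerfile:

# Stage 1: build the TypeScript app and bundle it into a single executable with pkg
FROM node:14 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npx gulp build                                  # assumed Gulp task that compiles TypeScript
RUN npx pkg . --targets node14-linux-x64 --output greeter

# Stage 2: ship only the resulting executable on a tiny base image
FROM alpine:3.13
# one common way to add glibc to Alpine; the version may need updating
RUN apk add --no-cache ca-certificates wget && \
    wget -q -O /etc/apk/keys/sgerrand.rsa.pub https://alpine-pkgs.sgerrand.com/sgerrand.rsa.pub && \
    wget -q https://github.com/sgerrand/alpine-pkg-glibc/releases/download/2.32-r0/glibc-2.32-r0.apk && \
    apk add glibc-2.32-r0.apk
COPY --from=builder /app/greeter /usr/local/bin/greeter
ENTRYPOINT ["/usr/local/bin/greeter"]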

To share our container with others, let’s use a public container repository. In this case I will use Red Hat’s own Quay.io service, where I create a public project for our “Greeter” image.

We can now push our final container image to Quay.io. To reuse it, I will switch to a VMware virtual machine where I am running Windows and where I have installed Docker. This is all very “Inception”-like, or as Edgar Allan Poe would say, “A dream within a dream”.
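On the Linux side, the push itself is just a handful of Podman commands (with placeholder account and repository names):

$ podman login quay.io
$ podman build -t quay.io/example-account/greeter:1.0.0 .
$ podman push quay.io/example-account/greeter:1.0.0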

If the Quay.io project were private, I would need to log in to my Quay.io account using Docker, of course. But this one is public, so I can pull my newly created image with Docker Desktop for Windows, and boom, I can run it. Very simple. Podman produces images using the same standard as Docker, which means that the interoperability is perfect.
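On the Windows side, the whole exercise boils down to two Docker commands (again with placeholder names):

> docker pull quay.io/example-account/greeter:1.0.0
> docker run --rm quay.io/example-account/greeter:1.0.0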

Reusing in GitLab CI/CD

But as you can see, I am hosting my application on GitLab, and GitLab has a lovely CI/CD system and a fantastic container registry, all built in. Let’s add a simple .gitlab-ci.yml file to our “Greeter” project, and this way we will be able to store the image produced by our Dockerfile directly in our project’s own container registry.
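A minimal version of that .gitlab-ci.yml could look like the sketch below; it relies on the CI_REGISTRY* variables that GitLab injects automatically, but the job name and image versions are assumptions:

build-image:
  image: docker:20.10
  services:
    - docker:20.10-dind
  variables:
    DOCKER_TLS_CERTDIR: "/certs"
  script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"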

This is even better! Now I can log in to registry.gitlab.com from Podman or Docker, on any platform, and pull and run this image as I need it.

But this is not all; I would like to reuse my greeter image in other contexts. Take, for example, the case of CI/CD pipelines. We have all faced the hurdles of properly configuring pipelines, be it in Jenkins, TeamCity, GitHub Actions or GitLab. It ain’t easy, it ain’t pretty, but OMG how cool it is when it works, right?

When it comes to CI/CD pipelines, we at VSHN are great fans of not adding much complexity to the pipelines themselves; we like to be able to reproduce on our own machines the steps performed by the pipelines. This makes it much simpler to diagnose and troubleshoot problems, and this is where containers shine.

To show an example of this at work, here’s a little project1 called “Fortune”; it is a Flask application that returns a random “fortune cookie of the day” over HTTP. This app is built with Python, which has nothing to do with JavaScript (and thankfully so!).

We are going to reuse our beloved “multi-stage build” procedure, this time introducing a new step in the Dockerfile. Let us commit and push this change; now our CI/CD pipeline run displays an image of Snoopy greeting us directly from the pipeline. Isn’t this lovely?
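One way to get that effect, as a sketch and not necessarily the exact setup of the original project, is to add an extra stage to the Fortune Dockerfile that is based on the published greeter image and runs it during the build, so that the greeting shows up in the pipeline’s build log (the image path, tag, and binary location below are placeholders taken from the earlier sketch):

# extra stage reusing the published greeter image
FROM registry.gitlab.com/example-group/greeter:1.0.0 AS greeting
RUN ["/usr/local/bin/greeter", "Build started!"]

# ...the usual build and runtime stages of the Flask application follow here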

In a few steps, we have taken an application with lots of dependencies, and have bundled it in a format that is suitable for collaboration and reuse in as many systems as we could think of.

Task Examples

What kinds of things can you do with this approach? At VSHN we have created container images that allow us to do the following things:

Another nice benefit is that users can run multiple versions of the same tool, without conflicts! The trick for that consists of always tagging your images properly, and never using latest as a tag.
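For example, nothing prevents two properly tagged versions of the same tool from coexisting on the same machine (placeholder image names and arguments):

$ podman run --rm quay.io/example-account/greeter:1.0.0 "Hello from 1.0.0"
$ podman run --rm quay.io/example-account/greeter:2.0.0 "Hello from 2.0.0"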

Gotchas & Drawbacks

As with any technique or approach, there are some things to keep in mind; remember that there is No Silver Bullet.

Regarding GitLab

Regarding creating command-line tools

Regarding building containers

Tips for Python command-line tools

$ python3 -m venv .venv              # create an isolated virtual environment
$ source .venv/bin/activate          # activate it for the current shell
$ pip install PyYAML
$ pip install PyGithub
$ pip freeze > requirements.txt      # pin the exact versions currently installed
$ cat requirements.txt

# later on...
$ pip install -r requirements.txt    # recreate the same environment elsewhere
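The resulting requirements.txt can then be used to pin the exact same versions inside the container image; a minimal sketch, where the entry point file name is just a placeholder:

FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
ENTRYPOINT ["python", "my_tool.py"]    # placeholder entry point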

Tips for tools built with Go
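Go binaries can be statically linked, which makes the runtime stage of a multi-stage build almost trivial; here is a hedged sketch, with the binary name as a placeholder (and note that a scratch image contains no CA certificates, shell, or libc):

FROM golang:1.16 AS builder
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /mytool .

FROM scratch
COPY --from=builder /mytool /mytool
ENTRYPOINT ["/mytool"]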

Tips for command-line tools built with JavaScript

Regarding CMD vs ENTRYPOINT

Pay attention to the differences between CMD and ENTRYPOINT:

In short, CMD defines default commands and/or parameters for a container. CMD is an instruction that is best to use if you need a default command which users can easily override. If a Dockerfile has multiple CMDs, it only applies the instructions from the last one.

On the other hand, ENTRYPOINT is preferred when you want to define a container with a specific executable. You cannot override an ENTRYPOINT when starting a container unless you add the --entrypoint flag.

Combine ENTRYPOINT with CMD if you need a container with a specified executable and a default parameter that can be modified easily. For example, when containerizing an application use ENTRYPOINT and CMD to set environment-specific variables.

(Source)
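A tiny, generic example, not tied to any of the projects above, makes the difference tangible. Given this Dockerfile:

FROM alpine:3.13
ENTRYPOINT ["echo"]
CMD ["Hello from the container"]

…running the image with and without arguments shows how CMD is overridden while ENTRYPOINT stays in place:

$ podman build -t demo .
$ podman run --rm demo                   # prints "Hello from the container"
$ podman run --rm demo "Good morning!"   # prints "Good morning!"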

Regarding ADD vs COPY

…in a nutshell, the major difference is that ADD can do more than COPY:

  • ADD allows <src> to be a URL
  • Referring to comments below, the ADD documentation states that:

If <src> is a local tar archive in a recognized compression format (identity, gzip, bzip2 or xz) then it is unpacked as a directory. Resources from remote URLs are not decompressed.

Note that the Best practices for writing Dockerfiles suggests using COPY where the magic of ADD is not required. Otherwise, you (since you had to look up this answer) are likely to get surprised someday when you mean to copy keep_this_archive_intact.tar.gz into your container, but instead, you spray the contents onto your filesystem.

(Source)
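In practice, the difference looks like this (a generic snippet, reusing the file name from the quote above):

# COPY keeps the archive exactly as it is
COPY keep_this_archive_intact.tar.gz /data/

# ADD with a local tar archive silently unpacks it into /data/
ADD keep_this_archive_intact.tar.gz /data/

# ADD also accepts remote URLs, which COPY does not (and these are not decompressed)
ADD https://example.com/file.txt /data/file.txt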

Conclusion

Thanks to containers, at VSHN we have reached a very high degree of code reuse. We have reduced the requirements to just one: Docker (or, as you have seen, Podman).

Most of the tools shown in this presentation are open source and free to use. Feel free to build upon them, reuse them, and if you want, to submit your pull requests. We will be very happy to consider your ideas for future versions!

I hope that this presentation has been useful to you, and keep building containers!


  1. Unfortunately the project does not exist anymore at the original location. Will restore it as soon as I find it. ↩︎