Use Docker containers as dev environments

Planted October 17, 2022

Historically, we installed our development environments directly on our local systems and tested there too. We fought library conflicts, runtime version incompatibilities, and related errors and instabilities on those same systems. These problems kept us from upgrading runtimes and libraries, which led to security concerns and, ultimately, to major productivity loss when an EOL notice or a security incident forced an upgrade.

At production deploy, we would hear things like “What do you mean it doesn't work in production? It worked fine on my machine!” Developer machines are often not as strictly controlled as production systems; often they do not even run the same OS (e.g., Windows 10 vs. Windows Server)! These differences are all but impossible to track and test for.

Containers provide a relatively easy and elegant solution to this problem.

Let's start with the basics.

What is a container?

Directly from Docker:

“A container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another.”

This is it: the holy grail of reproducibility, stable testing environments, and predictable deployments. With Docker, Podman, and other container technologies it is now possible to define an environment that has only what we need to develop, build, and test our application. We can define the environment concisely in a config file, distribute it across a team, and deploy it unchanged to production.

How to use containers for development

The typical software development life cycle looks something like this:

Check out project source code to the dev environment
Develop new features, make bug fixes, etc.
Write and execute unit tests (hopefully)
Push the code to the upstream source control
Pull the code and run functional regression tests (manually, or better, automatically)
Deploy tested code to production

This process likely involves at least 3 different environments: Dev, QA, and Prod. Maybe more. Each of the non-production environments potentially differs from production. By defining containers that mirror the production environment, and distributing these to the full team, one can ensure that everyone is developing, testing, and deploying in an environment that matches production. This significantly reduces the risk of environment-induced errors in production!
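One way to make this concrete is to publish a single image to a registry and have every environment pull the same reference. The sketch below is only an illustration, not a prescribed workflow: the repository name example/myapp and the tag 1.0.0 are hypothetical, and the script prints the docker commands (a dry run) rather than executing them.

```shell
#!/bin/sh
# Hypothetical image coordinates; substitute your own registry, repository, and tag.
IMAGE_REPO="example/myapp"
IMAGE_TAG="1.0.0"

# Build the single image reference that every environment shares.
image_ref() {
    printf '%s:%s\n' "$IMAGE_REPO" "$IMAGE_TAG"
}

# Publish once from the build machine.
echo "docker push $(image_ref)"

# Dev, QA, and Prod all pull the identical reference.
for env in dev qa prod; do
    echo "[$env] docker pull $(image_ref)"
done
```

Pinning by tag is the simplest option. Pulling by image digest is stricter, because a tag can later be moved to point at a different image.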

Define an Environment in a Dockerfile

(e.g., Julia Development for Financial Market Data)

Let's take a look at one of the environments we use regularly when working with Julia.

We won’t cover how to write a Dockerfile here, though we might elaborate at a later time. For now, one can always refer to the “Best practices for writing Dockerfiles” guide from the Docker team.

Here is our Dockerfile:

FROM julia:1.8.2-bullseye

RUN set -eux; \
        apt-get update; \
        apt-get install -y --no-install-recommends \
                vim \
                git \
                ssh \
        ; \
        rm -rf /var/lib/apt/lists/*

# Copy a .vimrc file into the container with basic settings
COPY ./julia_vimrc /root/.vimrc

RUN set -eux; \
# Install Julia Vim plugin
        curl -fLo ~/.vim/autoload/plug.vim --create-dirs https://raw.githubusercontent.com/junegunn/vim-plug/master/plug.vim; \
        vim -es -u /root/.vimrc -i NONE -c "PlugInstall" -c "qa"; \
# Install some Julia packages
        julia -e 'using Pkg; Pkg.update()'; \
        julia -e 'using Pkg; Pkg.add.(["AbstractTrees", "Actors", "ArgCheck", "ArgParse", "Avro", "BenchmarkTools", "Chain", "CSV", "DataFrames", "DataFramesMeta", "DataStructures", "Dates", "DuckDB", "ErrorTypes", "HTTP", "InMemoryDatasets", "JSON3", "LazyArrays", "LazyJSON", "Logging", "MethodAnalysis", "MiniLoggers", "Parameters", "Parquet", "PkgTemplates", "Random", "Redis", "ResultTypes", "Revise", "RDKafka", "StatsBase", "StructTypes", "Temporal", "TimeSeries", "TimeZones", "XLSX"])'; \
        julia -e 'using Pkg; Pkg.precompile()'; \
# smoke test
        julia --version

CMD ["julia"]

So what does this do?

In this example we begin with the community Docker image for Julia 1.8.2, which is based on the official Debian bullseye-slim image.
We add vim, git, and ssh to facilitate development and interaction with the source repository.
We copy in a .vimrc file with our preferences for Julia development, and add the Julia plug-in for vim. This could be customized per user, and would not be needed in the production instance.
We then install all of the Julia packages that are needed in our application, precompile them, and test the Julia environment.
Finally we set the default container command to “julia”. This allows us to use the container either for interactive development or as an execution environment for code that we mount into it.

Build the Docker image

Once we have the Dockerfile we can build a new image with the docker build command and give it a tag with -t so we know what the image contains.

$ docker build -t testlakehaus/juliadev:1.8.2-bullseye .
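After the build finishes, it can be worth confirming that the image exists and that the Julia packages were actually installed. The following is a minimal sketch: the run helper and the DRY_RUN variable are conveniences introduced here (they are not part of Docker), and by default the script only prints the commands, so it can be read or tried without Docker installed.

```shell
#!/bin/sh
# Image tag from the build command above.
IMAGE="testlakehaus/juliadev:1.8.2-bullseye"

# Hypothetical helper: print the command when DRY_RUN=1 (the default), run it otherwise.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# List the freshly built image (shows its ID, creation time, and size).
run docker image ls "$IMAGE"

# Ask Julia inside the container which packages are installed.
run docker run --rm "$IMAGE" julia -e 'using Pkg; Pkg.status()'
```

Set DRY_RUN=0 to execute the commands for real.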

Use the new Docker environment

Start an interactive Julia REPL (shell)

$ docker run -it --rm testlakehaus/juliadev:1.8.2-bullseye

Run an interactive bash shell in the container

$ docker run -it testlakehaus/juliadev:1.8.2-bullseye bash

Run a Julia script from your local directory inside the container

$ echo 'print("Hello World")' > julia.jl
$ docker run -it --rm -v "$PWD":/usr/myapp -w /usr/myapp testlakehaus/juliadev:1.8.2-bullseye julia julia.jl
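Typing the mount and working-directory flags every time gets tedious, so a small shell wrapper can help. This is only a sketch: the function name in_container and the DRY_RUN switch are hypothetical conveniences, the mount point /usr/myapp follows the example above, and by default the commands are printed rather than executed. The -it flags are left out because these commands are not interactive.

```shell
#!/bin/sh
# Image tag and mount point as used in the examples above.
IMAGE="testlakehaus/juliadev:1.8.2-bullseye"
WORKDIR="/usr/myapp"
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: run any command inside the container with the
# current directory mounted at $WORKDIR. Prints the command when DRY_RUN=1.
in_container() {
    set -- docker run --rm -v "$PWD:$WORKDIR" -w "$WORKDIR" "$IMAGE" "$@"
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run the script from the example above.
in_container julia julia.jl

# Run a project's test suite, assuming the current directory holds a Julia
# package with a Project.toml and a test/runtests.jl file.
in_container julia --project=. -e 'using Pkg; Pkg.test()'
```

Set DRY_RUN=0 to execute the commands instead of printing them.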
