This post is outdated as of October 2016. Refer to the Docker Project Boilerplates for updated versions.
Ruby (like many other languages) is easy to dockerize for TDD and/or production. The general approach follows these steps:
- Use docker to download application dependencies and commit them to source control
- Use those dependencies to build a docker image
- Use the previously built docker image to run arbitrary code without rebuilding new docker images
The same process may be applied to other interpreted languages such as Node.js or Python, and can be orchestrated through various build tools. This post uses `make` because it is adequate and ubiquitous. (Plus `make` is great.)
First, create the `Gemfile`. This file lists all application dependencies. Concepts like "development" and "test" are irrelevant because all dependencies will be committed to source control and the final docker image will be used in all environments/stages[^1]. All changes to dependency versions must be committed to the `Gemfile`. Let's consider a simple web application using sinatra. Here is a sample `Gemfile`:
```ruby
source 'https://rubygems.org'

gem 'sinatra', '~> 1.4.7'
```
Next, the runtime environment may be created from the dependencies. This is done with `bundle package`. All dependencies will be added to `vendor/cache` and committed to source control[^2]. Given we are using docker, dependency installation should also run through docker. The current directory may be mounted as a volume to capture the generated artifacts. `make` makes this easy enough.
```make
Gemfile.lock: Gemfile
	docker run -w /data -v "$(CURDIR):/data" -u $$(id -u) ruby:2.3 \
		bundle package --all
```
`make Gemfile.lock` will run `bundle package --all`, downloading all `.gem` files into the local `vendor/cache`. Note the `-u` argument. This is important because the current directory is a mounted volume in the docker container. New files will be created with the user's ID instead of root (the docker default user).
Now it's time to build the application docker image. This takes two parts: a `Dockerfile` and a new `make` target. Let's start with the `Dockerfile`:
```dockerfile
FROM ruby:2.3

ENV LC_ALL C.UTF-8

RUN mkdir -p /app/vendor
WORKDIR /app
ENV PATH /app/bin:$PATH

COPY Gemfile Gemfile.lock /app/
COPY vendor/cache /app/vendor/cache
RUN bundle install --local -j $(nproc)

COPY . /app/

CMD [ "irb" ]
```
The `Dockerfile` creates a directory for all the source code in `/app`. Next, everything in `vendor/cache` is copied into the image. Then `bundle install --local` runs. `--local` ensures that only `.gem` files in `vendor/cache` are used. Finally, every other file is copied over to `/app`. Now it's time to build the image with a `make` target:
```make
DOCKER_IMAGE:=tmp/image
IMAGE_NAME:=my_company/my_app

$(DOCKER_IMAGE): Gemfile.lock
	docker build -t $(IMAGE_NAME) .
	mkdir -p $(@D)
	touch $@
```
This `make` target builds the docker image with the dependencies installed plus the source code. The docker image is ready for production/staging/etc. Great, but is it possible to avoid rebuilding the docker image on each code change? Yes! Given Ruby is an interpreted language, all it needs is the `ruby` interpreter and the available dependencies. Given those two things, we can run code. We can do this with a shared volume. Our docker image expects application code at `/app`, so mount the current directory (with all current application code) at `/app` and we're off. The below `make test` target does exactly that:
```make
.PHONY: test
test: $(DOCKER_IMAGE)
	docker run --rm -v $(CURDIR):/app $(IMAGE_NAME) \
		ruby test/some_test.rb
```
The `test` target works by mounting the current directory at `/app` (the source code directory specified in the `Dockerfile`).
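The post never shows `test/some_test.rb` itself; any plain Ruby test file runnable with `ruby test/some_test.rb` works. A minimal sketch using minitest (shipped with Ruby's standard distribution) might look like this -- the filename and assertions are illustrative, not from the original repo:

```ruby
# test/some_test.rb -- a hypothetical minimal test file.
# The test name and assertion are illustrative placeholders.
require 'minitest/autorun'

class SomeTest < Minitest::Test
  def test_addition
    assert_equal 4, 2 + 2
  end
end
```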
Everything is packaged up in a handy example repo for use on your projects.
[^1]: Typically Ruby applications declare dependencies for a particular environment. Example: `rack-test` is not needed in production. This is all well and good when things are running directly on a given machine. However, using docker as a delivery mechanism negates this problem. Installing dependencies (and thus things like C extensions) is entirely encapsulated, so there is no need to enforce context-specific dependencies at installation time.
[^2]: Dependencies should be committed to source control. This removes a dependency on the upstream package system. It also insulates you from upstream deletions. (See the Node.js leftpad module discussion for an example.) People argue this creates undue bloat in the git repo. I do not see this as a big enough tradeoff to forgo the stability that vendoring everything provides. Once you're hit by a problem that could have been solved by vendoring everything, you will do it. Take my advice and vendor everything from the beginning.