It’s straightforward to create a data container with pre-baked archives included: COPY them in during docker build and off you go. But what if you can’t or don’t want to create the data ahead of time?
At RealScout, we recently started experimenting with the approach detailed below to create docker images from postgres snapshots for distribution to development environments. Briefly, it looks like:
Create and initialize a data container
Start postgres using the volume from that data container
Restore data
Stop postgres
Create an image from the populated data container
Setup
First, build a data container image, for example with this Dockerfile:
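Here’s a minimal sketch of what that Dockerfile might contain (the busybox base, the standard postgres data path, and the postgres-data tag are assumptions rather than exactly what we use):

```dockerfile
# Data-only container: its only job is to own the postgres data volume
FROM busybox
ENV PGDATA /var/lib/postgresql/data
# Declare the data directory as a volume so other containers can
# attach it with --volumes-from
VOLUME /var/lib/postgresql/data
# Nothing to run; the container just holds the volume
CMD ["true"]
```

Build it with something like:

```sh
docker build -t postgres-data .
```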
Now create a postgres data container using that image:
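For example (pgdata is just a placeholder name; the container exits immediately, which is fine since it only exists to hold the volume):

```sh
docker run --name pgdata postgres-data
```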
Now start a postgres server (here, an image from Docker Hub) using the volume from that data container:
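Something along these lines, with the image tag and published port as assumptions:

```sh
# --volumes-from attaches the data container's volume, so postgres
# writes its data files onto that volume
docker run -d --name pg --volumes-from pgdata -p 5432:5432 postgres:9.4
```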
Now we can load up some data. At this point, you should probably scrub out anything from the database backup that you don’t want to include in the image.
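A sketch of the restore, assuming a scrubbed plain-SQL dump on the host and the placeholder container names above; once the restore finishes, stop postgres so the data files on the volume are in a consistent state before we snapshot them:

```sh
# Pipe the dump into psql inside the running container
docker exec -i pg psql -U postgres < scrubbed_backup.sql

# Stop the server cleanly before snapshotting the volume
docker stop pg
```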
The clever part
We’re going to run docker build inside a third container that has the data volume mounted. It’s not quite docker-in-docker, but it’s related.
First, take the Dockerfile above and add COPY data/ ${PGDATA} to it, so it looks like:
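Roughly like this (same assumptions as before). One detail worth noting: the COPY has to come before the VOLUME declaration, because changes made to a volume’s path after it is declared are discarded at build time:

```dockerfile
FROM busybox
ENV PGDATA /var/lib/postgresql/data
# Bake the snapshot into the image; must precede VOLUME or the
# copied files are thrown away during the build
COPY data/ ${PGDATA}
VOLUME /var/lib/postgresql/data
CMD ["true"]
```

Depending on your postgres image, you may also need to chown the copied files back to the postgres user, since COPY resets ownership to root.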
(We keep a single Dockerfile for both the initial and snapshot builds, with an empty data/ directory sitting around for the initial build to use.)
Now stick this in snapshot.sh:
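Here’s a sketch of what snapshot.sh does, under a few assumptions: the host’s docker binary and socket are bind-mounted into the throwaway container so it can run docker build against the host daemon, the Dockerfile above sits in the current directory, and the image and path names are placeholders:

```bash
#!/bin/bash
# Usage: ./snapshot.sh <data-container> [image-tag]
set -e

DATA_CONTAINER=$1
IMAGE_TAG=${2:-postgres-data-snapshot}

# Run a throwaway container with the data volume attached (--volumes-from),
# copy the data into a build context, and run docker build from inside it.
# The docker client inside talks to the host daemon via the mounted socket.
docker run --rm \
  --volumes-from "$DATA_CONTAINER" \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v "$(which docker)":/usr/bin/docker \
  -v "$(pwd)/Dockerfile":/snapshot/Dockerfile \
  debian:wheezy \
  bash -c "cp -a /var/lib/postgresql/data /snapshot/data && cd /snapshot && docker build -t $IMAGE_TAG ."
```

Because the build context is assembled inside that container, the COPY data/ ${PGDATA} step sees the real, populated data directory rather than the empty one in the repo.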
If you used the --name flag on docker run earlier, you can snapshot with:
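For example, with the placeholder name from above:

```sh
./snapshot.sh pgdata
```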
I hope that helps. Please let me know if I screwed anything up.