How do I move data into a container? Part 3: Using a ‘data mover container’

This is the third in a series of posts focusing on extending and enhancing the previously ‘Dockerized’ WordPress installation – specifically focused on the movement / migration of data for the newly containerized application.

The idea here is straightforward… We have data in one of two folders for each of our containers (MySQL and WordPress) that needs to be moved or copied into a location accessible by the container at runtime. As specified previously, we need to be able to move / copy the “wp-content” folder into an accessible location for WordPress to use (previous post), and we need to move/copy the SQL backup file for the database for MySQL to import (previous post)… AND we need to be able to make a run-time decision on whether we are doing so with either the ‘test’ data or the ‘production’ data.

One common pattern for this is to use a third, transient container running a script of some kind to perform this task. This container – which I will refer to here as the ‘data mover’ – will have only one job… to copy data into a predefined location upon start, then exit. I will be using a script inside the container to perform this task. This script must be able to accept a couple of variables at runtime – namely the locations of the source files, which depend on whether we are spinning up the containers for ‘test’ or ‘prod’.

I’m not going to bore you with the gory details of the hours I spent tweaking to get this just right – this would be a book instead of a blog post. Suffice it to say I landed on the following as a plan:

  • to set the ‘test’ or ‘prod’ environment variables
    • I will be using compose to spin up the environment
    • I will use a ‘.env file’ during compose execution to deliver environment variables
    • the ‘.env’ file(s) will set the location of the source directory to pull from
    • I will have separate .env files for either ‘test’ or ‘prod’

Here are the two .env files I created – I named them “env.dm.prod” and “env.dm.test”:

cat env.dm.prod

SOURCE_WP_DIR=prod/wp-content
SOURCE_SQL_DIR=prod/mysql-init-files

cat env.dm.test

SOURCE_WP_DIR=test/wp-content
SOURCE_SQL_DIR=test/mysql-init-files
  • to copy / move the files
    • I will create a shell script that inherits the environment variables set by the .env file – as those are read in at runtime
    • the script will use the environment variables to selectively copy / move the correct files (‘test’ or ‘prod’) to a new folder for WordPress and another for MySQL
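
Since everything downstream depends on those two variables, a quick sanity check of an env file before handing it to compose can save a failed run. The following is only a sketch: `check_env_file` is a hypothetical helper that is not part of my setup, and it simply confirms that both variable names appear with a non-empty value.

```sh
#!/bin/sh
# Sketch: confirm an env file defines both variables the data mover expects.
# check_env_file is a hypothetical helper, not part of the original setup.
check_env_file() {
    file="$1"
    [ -f "$file" ] || { echo "missing env file: $file" >&2; return 1; }
    for var in SOURCE_WP_DIR SOURCE_SQL_DIR; do
        # Require the variable name followed by '=' and a non-empty value
        if ! grep -Eq "^${var}=.+" "$file"; then
            echo "$file does not set $var" >&2
            return 1
        fi
    done
    echo "$file looks complete"
}
```

Calling `check_env_file env.dm.test` would print a confirmation, or return a non-zero status if either variable is missing.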

Here is a copy of the shell script – I named it “copysource.sh”:

cat copysource.sh

#!/bin/sh

# SOURCE_WP_DIR must be set via CLI or env file
if [ -z "$SOURCE_WP_DIR" ]; then
    echo "Error: The SOURCE_WP_DIR environment variable is not set."
    exit 1
fi

# SOURCE_SQL_DIR must be set via CLI or env file
if [ -z "$SOURCE_SQL_DIR" ]; then
    echo "Error: The SOURCE_SQL_DIR environment variable is not set."
    exit 1
fi

# Define the target directories
TARGET_WP="/target-wp"
TARGET_SQL="/target-sql"

# Create the target WordPress directory if it does not exist (an existing one is reused)
if [ ! -d "$TARGET_WP" ]; then
    echo "The target directory $TARGET_WP does not exist. Creating it..."
    if ! mkdir "$TARGET_WP"; then
        echo "Error: Failed to create the target directory $TARGET_WP."
        exit 1
    fi
    echo "The target directory $TARGET_WP has been created."
else
    echo "The target directory $TARGET_WP already exists."
    # exit 1
fi

# Create the target MySQL directory if it does not exist (an existing one is reused)
if [ ! -d "$TARGET_SQL" ]; then
    echo "The target directory $TARGET_SQL does not exist. Creating it..."
    if ! mkdir "$TARGET_SQL"; then
        echo "Error: Failed to create the target directory $TARGET_SQL."
        exit 1
    fi
    echo "The target directory $TARGET_SQL has been created."
else
    echo "The target directory $TARGET_SQL already exists."
    # exit 1
fi

# Copy the contents from the source to the target WordPress directory, exit if fail
# (variables are quoted so paths containing spaces do not break the copy)
if ! cp -r "$SOURCE_WP_DIR"/* "$TARGET_WP"/; then
    echo "Error: Failed to copy files from $SOURCE_WP_DIR to $TARGET_WP."
    exit 1
fi
echo "Contents have been successfully copied from $SOURCE_WP_DIR to $TARGET_WP."

# Copy the contents from the source to the target MySQL directory, exit if fail
if ! cp -r "$SOURCE_SQL_DIR"/* "$TARGET_SQL"/; then
    echo "Error: Failed to copy files from $SOURCE_SQL_DIR to $TARGET_SQL."
    exit 1
fi
echo "Contents have been successfully copied from $SOURCE_SQL_DIR to $TARGET_SQL."

… it’s probably longer than it needs to be, but I have some error checking built in. Basically the script:

  • checks for the existence of the environment variables
  • checks to see if the target directories already exist
  • creates the target directories
  • copies the source data for WordPress and MySQL from the correct source folder based on the environment variables set

I could have added some logic in there to optionally forcibly overwrite the contents of the target directory if it already exists, but I chose not to… perhaps a future enhancement.
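
For what it's worth, here is a rough sketch of what that enhancement might look like. Both `FORCE_OVERWRITE` and `copy_into` are hypothetical names I am making up for illustration; neither exists in copysource.sh as shown. The idea is to empty the target first when the flag is set, then copy the source contents in.

```sh
#!/bin/sh
# Sketch of an optional overwrite step. FORCE_OVERWRITE and copy_into are
# hypothetical names; the original copysource.sh has neither.
copy_into() {
    src="$1"
    dst="$2"
    mkdir -p "$dst" || return 1
    if [ "${FORCE_OVERWRITE:-0}" = "1" ]; then
        # Empty the target, dotfiles included, but keep the directory itself,
        # since it may be a mounted volume that cannot be removed.
        rm -rf -- "$dst"/* "$dst"/.[!.]* "$dst"/..?* || return 1
    fi
    # "src/." copies the directory contents, hidden files included
    cp -r "$src"/. "$dst"/
}
```

With the flag unset, existing files in the target are left alongside the new ones; with `FORCE_OVERWRITE=1`, the target ends up containing only what is in the source.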

Next I had to build a Dockerfile. This is a short Dockerfile, as the container will have a single purpose – copy some files and then exit.

cat dockerfile-dm

FROM alpine:latest
WORKDIR /stage
COPY . .
RUN chmod +x copysource.sh
ENTRYPOINT [ "/stage/copysource.sh" ]
VOLUME [ "/target-wp" ]
VOLUME [ "/target-sql" ]

This Dockerfile simply creates a directory called “stage” and changes to that directory, copies the contents of the current host directory in, changes permissions on the script to allow it to run, sets the script as the entrypoint, and declares the two target directories as volumes.

You may notice that the image has to copy EVERYTHING in first (both the test and the prod data) before the script can selectively copy / move the contents of one of them to the correct folder. I am sure there is a way to make this more efficient and the image smaller, but I haven’t gotten that far yet.

Lastly, I need a compose file to tie it all together.

cat compose-dm.yml

version: "3.8"

services:
  datamover:
    image: alpine:latest
    build:
      context: ./
      dockerfile: ./dockerfile-dm
      tags:
        - "datamover:v1"
        - "thomastwyman557/datamover:v1"
    # quoted so YAML does not read it as the boolean false
    restart: "no"
    environment:
      SOURCE_WP_DIR:
      SOURCE_SQL_DIR:
    volumes:
      - wpcontent:/target-wp
      - sqlinit:/target-sql

  db:
    image: mysql:8.3.0
    restart: always
    environment:
      MYSQL_DATABASE: wordpress
      MYSQL_USER: wordpress
      MYSQL_PASSWORD_FILE: /run/secrets/db_password
      MYSQL_ROOT_PASSWORD_FILE: /run/secrets/db_root_password
    depends_on:
      - datamover
    volumes:
      - db:/var/lib/mysql
      - sqlinit:/docker-entrypoint-initdb.d
    secrets:
      - db_password
      - db_root_password

  wordpress:
    image: wordpress:6.4.3-php8.2-apache
    restart: always
    ports:
      - 80:80
    environment:
      WORDPRESS_DB_HOST: db
      WORDPRESS_DB_USER: wordpress
      WORDPRESS_DB_NAME: wordpress
      WORDPRESS_DB_PASSWORD_FILE: /run/secrets/db_password
    volumes:
      - wordpress:/var/www/html
      - wpcontent:/var/www/html/wp-content
      - ./uploads.ini:/usr/local/etc/php/conf.d/uploads.ini
    secrets:
      - db_password
    depends_on:
      - datamover
      - db

secrets:
  db_password:
    file: ./run/secrets/db_password.txt
  db_root_password:
    file: ./run/secrets/db_root_password.txt

volumes:
  wordpress:
  wpcontent:
  db:
  sqlinit:

This took me the most time, as there are a couple nuances in here… firstly, in the section for the datamover:

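  • I had to add a ‘build’ section to ensure the image would be built at runtime
    • you can see I added the name of the dockerfile as well as tags to the resulting image
    • because the service also sets “image: alpine:latest”, the built image is tagged with that name locally too, as the build output further down shows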
  • I used “restart: no” to ensure the container does not restart after exit.
  • I had to declare the variables used in the .env file and copysource.sh script to ensure they are present at runtime (“SOURCE_WP_DIR” and “SOURCE_SQL_DIR”).
  • Finally, I am declaring a couple volumes:
    • wpcontent:/target-wp
      • this maps the container’s “target-wp” directory to a new volume “wpcontent”
      • this new “wpcontent” volume will be used by the WordPress container later
    • sqlinit:/target-sql
      • this maps the container’s “target-sql” directory to a new volume “sqlinit”
      • this new “sqlinit” volume will be used by the MySQL container later

Next, in the section for the MySQL server:

  • you will notice I call out the required variables for MySQL server as per the documentation
  • you may also notice I am using secrets to avoid exposing passwords in the compose file itself (in a production environment I might choose to use something like Hashicorp Vault for secrets management)
  • I am using the “depends_on” parameter to indicate MySQL shouldn’t start until the datamover starts
  • finally I am declaring a couple volumes – one is for the database itself (“db”), but the other is the “sqlinit” volume, which I am mapping to docker-entrypoint-initdb.d.
    • you may recall, the “sqlinit” volume was populated by the data mover above, and contains the SQL backup file for either the ‘test’ or ‘prod’ database
    • The “docker-entrypoint-initdb.d” directory is a special directory: the official MySQL image runs any “.sql” scripts it finds there when the database is first initialized (the Postgres and Mongo images use the same convention).
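
One thing worth checking before relying on that directory is whether the source folder actually holds something the MySQL entrypoint will pick up. As far as I know, the official image processes at least `.sh`, `.sql` and `.sql.gz` files, and only when the data directory is empty (that is, on first initialization). The sketch below lists the matching files in a folder; `list_init_files` is a hypothetical helper name, and the extension list is my assumption based on the image documentation.

```sh
#!/bin/sh
# Sketch: list files in a directory that the MySQL image's init step should
# pick up. list_init_files is a hypothetical helper; the extension list is an
# assumption (at least .sh, .sql and .sql.gz are documented).
list_init_files() {
    dir="$1"
    found=0
    for f in "$dir"/*; do
        case "$f" in
            *.sh|*.sql|*.sql.gz)
                # Skip directories or an unmatched glob pattern
                [ -f "$f" ] || continue
                basename "$f"
                found=1
                ;;
        esac
    done
    # Succeed only if at least one init file was found
    [ "$found" -eq 1 ]
}
```

Running `list_init_files test/mysql-init-files` should print the backup file name, and return a non-zero status if there is nothing for MySQL to import.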

Finally in the section for WordPress:

  • you will notice I call out the required variables for WordPress server as per the documentation
  • you may also notice I am using secrets to avoid exposing passwords in the compose file itself (in a production environment I might choose to use something like Hashicorp Vault for secrets management)
  • I am using the “depends_on” parameter to indicate WordPress shouldn’t start until the datamover and MySQL containers start
  • Finally, I am declaring a few volumes:
    • wordpress:/var/www/html
      • This maps a new volume “wordpress” to the container’s /var/www/html folder – this is the recommended volume to hold the WordPress installation outside the container so it will persist
    • wpcontent:/var/www/html/wp-content
      • This maps the ‘wpcontent’ volume created earlier to the container’s ‘wp-content’ folder, which holds the updated test or prod data for my posts, images, etc.
    • ./uploads.ini:/usr/local/etc/php/conf.d/uploads.ini
      • This is a custom uploads.ini file that allows me to raise the default upload limit… out of the box the limit is 2MB (PHP’s default upload_max_filesize), which is too small for my purposes.
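
I have not shown the contents of uploads.ini here, so treat the following as an illustrative sketch rather than my actual file. It writes a minimal ini file using the standard PHP directives `upload_max_filesize` and `post_max_size`; the `write_uploads_ini` helper and the 64M default are assumptions chosen for the example.

```sh
#!/bin/sh
# Sketch: generate a minimal uploads.ini. write_uploads_ini is a hypothetical
# helper and the 64M default is an assumed example value.
write_uploads_ini() {
    limit="${1:-64M}"
    path="${2:-uploads.ini}"
    # post_max_size should be at least as large as upload_max_filesize
    cat > "$path" <<EOF
file_uploads = On
upload_max_filesize = ${limit}
post_max_size = ${limit}
EOF
}
```

Calling `write_uploads_ini 128M` in the project folder would produce a file that the bind mount above then places into the PHP conf.d directory.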

Lastly, the compose file instantiates the necessary secrets and volumes required by the containers.

To test this, all I should need to do is run a docker compose command with the correct compose file and env file… Let’s test it out!

docker compose -f compose-dm.yml --env-file env.dm.test up -d --build

This looks convoluted, but that’s my fault. This is essentially “docker compose up,” but I am using a custom-named compose file, a custom-named .env file, and I am specifying that a build should be run. Here are the results:
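
Since the only thing that changes between environments is the env file name, the command could be wrapped in a small helper. This is just a sketch: `compose_up_cmd` is a hypothetical function, and it prints the command instead of running it so it can be reviewed first.

```sh
#!/bin/sh
# Sketch: build the compose command for a given environment.
# compose_up_cmd is a hypothetical helper; it only prints the command.
compose_up_cmd() {
    case "$1" in
        test|prod) ;;
        *) echo "usage: compose_up_cmd test|prod" >&2; return 1 ;;
    esac
    echo "docker compose -f compose-dm.yml --env-file env.dm.$1 up -d --build"
}
```

`compose_up_cmd test` prints the exact command used above, and `eval "$(compose_up_cmd prod)"` would run the production variant. Anything other than ‘test’ or ‘prod’ is rejected.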


[+] Building 3.1s (9/9) FINISHED docker:desktop-linux
=> [datamover internal] load build definition from dockerfile-dm 0.0s
=> => transferring dockerfile: 266B 0.0s
=> [datamover internal] load metadata for docker.io/library/alpine:latest 0.0s
=> [datamover internal] load .dockerignore 0.0s
=> => transferring context: 34B 0.0s
=> [datamover 1/4] FROM docker.io/library/alpine:latest 0.0s
=> [datamover internal] load build context 0.2s
=> => transferring context: 1.19MB 0.2s
=> [datamover 2/4] WORKDIR /stage 0.0s
=> [datamover 3/4] COPY . . 1.8s
=> [datamover 4/4] RUN chmod +x copysource.sh 0.1s
=> [datamover] exporting to image 1.0s
=> => exporting layers 0.9s
=> => writing image sha256:1061057e705c2a08df2737bf0089c521775e85f55898271d2e9e7731c263c55d 0.0s
=> => naming to docker.io/library/alpine:latest 0.0s
=> => naming to docker.io/library/datamover:v1 0.0s
=> => naming to docker.io/thomastwyman557/datamover:v1 0.0s
[+] Running 3/8
⠼ Network wp-migrate_default Created 0.4s
⠼ Volume "wp-migrate_wpcontent" Created 0.4s
⠼ Volume "wp-migrate_sqlinit" Created 0.4s
⠸ Volume "wp-migrate_db" Created 0.4s
⠸ Volume "wp-migrate_wordpress" Created 0.4s
✔ Container wp-migrate-datamover-1 Started 0.1s
✔ Container wp-migrate-db-1 Started 0.3s
✔ Container wp-migrate-wordpress-1 Started 0.3s
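
The output only says the data mover “Started”, not that it finished its copy, so it is worth confirming how it exited. For anyone without Docker Desktop, a sketch of a command-line check follows. `datamover_succeeded` is a hypothetical helper; the container name is taken from the output above, and `docker inspect --format` is used to read the container’s state.

```sh
#!/bin/sh
# Sketch: report whether the data mover container exited cleanly.
# datamover_succeeded is a hypothetical helper; the default container name
# comes from the compose output above.
datamover_succeeded() {
    container="${1:-wp-migrate-datamover-1}"
    code=$(docker inspect --format '{{.State.ExitCode}}' "$container") || return 1
    status=$(docker inspect --format '{{.State.Status}}' "$container") || return 1
    echo "status=$status exit_code=$code"
    # Success means the container has stopped and the script returned 0
    [ "$status" = "exited" ] && [ "$code" = "0" ]
}
```

The script’s own messages can be read with `docker compose -f compose-dm.yml logs datamover`.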

If we look in Docker Desktop we see the following:

Notice that the data mover container has exited, but the WordPress and MySQL containers are running. If we click into the data mover and look at the logs, we see the following:

So it appears as if data was successfully copied… we should be able to check the volumes as well:

All our volumes are there. If we click into the “wpcontent” volume we can see my uploads from this month:

If we go to our browser –

Success! We can see our test blog! Docker Desktop should show us active responses from the WordPress server as well:

Finally, let’s spin down our environment, and turn it back up with production data:

docker compose -f compose-dm.yml --env-file env.dm.test down -v

[+] Running 8/8
✔ Container wp-migrate-wordpress-1 Removed 1.2s
✔ Container wp-migrate-db-1 Removed 0.8s
✔ Container wp-migrate-datamover-1 Removed 0.0s
✔ Volume wp-migrate_wordpress Removed 0.0s
✔ Volume wp-migrate_db Removed 0.0s
✔ Volume wp-migrate_sqlinit Removed 0.0s
✔ Volume wp-migrate_wpcontent Removed 0.0s
✔ Network wp-migrate_default Removed 0.0s

You will notice I used the “-v” parameter to also delete the volumes… in my case since I am switching between production and test data in the same environment, I don’t actually want the volumes to persist right now. Now all I need to do is change one parameter in the command to get my production environment:

docker compose -f compose-dm.yml --env-file env.dm.prod up -d --build

And a quick check in my browser shows the following:

Finally, just to prove my point, if we examine the wpcontent volume one more time, we can see posts and uploads going back 10+ years – a complete copy of my production blog:

That’s it! In my next post we will look at how to do this using multi-stage builds and compose. See you there!
