Ongoing physical backups with Docker & WAL-E

deprecation

This section describes a feature that is deprecated in TimescaleDB. We strongly recommend that you do not use this feature in a production environment. If you need more information, contact the support team.

To make TimescaleDB use the WAL-E sidecar for archiving, the two containers need to share a network. Create a Docker network, then launch TimescaleDB with archiving turned on, attached to that network. When you launch TimescaleDB, explicitly set the location of the write-ahead log (POSTGRES_INITDB_WALDIR) and the data directory (PGDATA) so that you can share them with the WAL-E sidecar. Both must reside in a Docker volume; by default, a volume is created for /var/lib/postgresql/data. When TimescaleDB has started, you can log in and create tables and data.

  1. Create the Docker network that the two containers share. The commands in this section assume that it is named timescaledb-net.

  2. Launch TimescaleDB, with archiving turned on:

    docker run \
      --name timescaledb \
      --network timescaledb-net \
      -e POSTGRES_PASSWORD=insecure \
      -e POSTGRES_INITDB_WALDIR=/var/lib/postgresql/data/pg_wal \
      -e PGDATA=/var/lib/postgresql/data/pg_data \
      timescale/timescaledb:latest-pg10 postgres \
      -cwal_level=archive \
      -carchive_mode=on \
      -carchive_command="/usr/bin/wget wale/wal-push/%f -O -" \
      -carchive_timeout=600 \
      -ccheckpoint_timeout=700 \
      -cmax_wal_senders=1
  3. Connect to TimescaleDB in the running container with psql:

    docker exec -it timescaledb psql -U postgres
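
Step 1 above does not show a command. The following is a minimal sketch of one way to create the network, assuming it is named timescaledb-net as in the docker run command above. The helper name ensure_network is hypothetical, and the inspect-before-create check is only a convenience so that the step can be rerun safely:

```shell
#!/bin/sh
# Hypothetical helper: create a Docker network only if it does not exist yet.
ensure_network() {
  net="$1"
  # `docker network inspect` exits non-zero when the network is unknown.
  if docker network inspect "$net" >/dev/null 2>&1; then
    echo "network $net already exists"
  else
    docker network create "$net" >/dev/null && echo "network $net created"
  fi
}

# Usage (network name taken from the docker run commands in this section):
#   ensure_network timescaledb-net
```

Run the helper before launching either container, because both docker run commands attach to the network by name.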

The WAL-E Docker image runs a web endpoint that accepts WAL-E commands through an HTTP API. This allows PostgreSQL to communicate with the WAL-E sidecar over the internal network to trigger archiving. You can also use the container to invoke WAL-E directly. The Docker image accepts the standard WAL-E environment variables to configure the archiving backend, so you can archive to storage services such as AWS S3. See the WAL-E documentation for more details.

  1. Start the WAL-E container with the information it needs to reach the database container. In this example, the container is named wale and runs the timescale/timescaledb-wale image:

    docker run \
      --name wale \
      --network timescaledb-net \
      --volumes-from timescaledb \
      -v ~/backups:/backups \
      -e WALE_LOG_DESTINATION=stderr \
      -e PGWAL=/var/lib/postgresql/data/pg_wal \
      -e PGDATA=/var/lib/postgresql/data/pg_data \
      -e PGHOST=timescaledb \
      -e PGPASSWORD=insecure \
      -e PGUSER=postgres \
      -e WALE_FILE_PREFIX=file://localhost/backups \
      timescale/timescaledb-wale:latest
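  2. Start the base backup by running WAL-E’s backup-push command inside the sidecar container, passing it the data directory (/var/lib/postgresql/data/pg_data).

    Alternatively, you can start the backup using the sidecar’s HTTP endpoint. This requires exposing the sidecar’s port 80 on the Docker host by mapping it to an open port, for example by adding -p 8080:80 to the docker run command for the sidecar. In this example, it is mapped to port 8080:

    curl http://localhost:8080/backup-push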
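
The command for starting the backup inside the sidecar is not shown above. One plausible form, assuming the sidecar container is named wale and that the wal-e executable is on the image's PATH (the restore step later in this section invokes wal-e in the same image), is to run WAL-E's backup-push subcommand against the data directory. The helper name push_base_backup is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: run a WAL-E base backup inside the sidecar container.
# $1 = sidecar container name, $2 = PostgreSQL data directory inside it.
# Both defaults are the values used elsewhere in this section.
push_base_backup() {
  container="${1:-wale}"
  pgdata="${2:-/var/lib/postgresql/data/pg_data}"
  # docker exec inherits the environment the container was started with,
  # so WAL-E sees WALE_FILE_PREFIX, PGHOST, PGUSER, and PGPASSWORD.
  docker exec "$container" wal-e backup-push "$pgdata"
}

# Usage:
#   push_base_backup wale /var/lib/postgresql/data/pg_data
```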

You should take base backups at regular intervals (we recommend daily) to minimize the amount of WAL that must be replayed and to make recoveries faster. To make a new base backup, re-trigger it as shown here, either manually or on a schedule. If you run TimescaleDB on Kubernetes, there is built-in support for scheduling cron jobs that can invoke base backups using the WAL-E container’s HTTP API.
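
For scheduling outside Kubernetes, a small wrapper around the HTTP endpoint can be run from cron. This is a sketch only. It assumes that the sidecar's port 80 is published on host port 8080, as in the curl example above, and that the endpoint signals failure with an HTTP error status, which is not confirmed here. The function name, retry count, and pause are illustrative choices:

```shell
#!/bin/sh
# Hypothetical wrapper: ask the sidecar for a base backup, retrying on failure.
# $1 = endpoint URL, $2 = max attempts, $3 = seconds to pause between attempts.
trigger_base_backup() {
  url="${1:-http://localhost:8080/backup-push}"
  attempts="${2:-3}"
  pause="${3:-30}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail makes curl exit non-zero on HTTP error responses.
    if curl --fail --silent --show-error "$url"; then
      return 0
    fi
    echo "backup trigger failed (attempt $i of $attempts)" >&2
    i=$((i + 1))
    # Pause only if another attempt follows.
    if [ "$i" -le "$attempts" ]; then
      sleep "$pause"
    fi
  done
  return 1
}

# Example crontab entry for a daily run at 02:00, assuming this file is saved
# as /usr/local/bin/base-backup.sh with a final line that calls the function:
#   0 2 * * * /usr/local/bin/base-backup.sh
```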

To recover the database instance from the backup archive, create a new TimescaleDB container, and restore the database and configuration files from the base backup. Then you can relaunch the sidecar and the database.

  1. Create the Docker container:

    docker create \
      --name timescaledb-recovered \
      --network timescaledb-net \
      -e POSTGRES_PASSWORD=insecure \
      -e POSTGRES_INITDB_WALDIR=/var/lib/postgresql/data/pg_wal \
      -e PGDATA=/var/lib/postgresql/data/pg_data \
      timescale/timescaledb:latest-pg10 postgres
  2. Restore the database files from the base backup:

    docker run -it --rm \
      -v ~/backups:/backups \
      --volumes-from timescaledb-recovered \
      -e WALE_LOG_DESTINATION=stderr \
      -e WALE_FILE_PREFIX=file://localhost/backups \
      timescale/timescaledb-wale:latest wal-e \
      backup-fetch /var/lib/postgresql/data/pg_data LATEST
  3. Recreate the configuration files. These are backed up from the original database instance:

  4. Create a recovery.conf file that tells PostgreSQL how to recover:

    docker run -it --rm \
      --volumes-from timescaledb-recovered \
      timescale/timescaledb:latest-pg10 \
      sh -c 'echo "restore_command='\''/usr/bin/wget wale/wal-fetch/%f -O -'\''" > /var/lib/postgresql/data/pg_data/recovery.conf'
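
The nested quoting in the last command is easy to mistype. As a sanity check, this sketch writes the same line to a scratch file on the host, so that you can see exactly what should end up in recovery.conf. The temporary directory is purely illustrative and stands in for the container's data directory:

```shell
#!/bin/sh
# Write the restore_command line to a scratch file and print it.
scratch="$(mktemp -d)"
conf="$scratch/recovery.conf"

# This is the inner command that `sh -c` receives in the step above,
# after the outer shell has resolved the '\'' escapes.
echo "restore_command='/usr/bin/wget wale/wal-fetch/%f -O -'" > "$conf"

# Prints: restore_command='/usr/bin/wget wale/wal-fetch/%f -O -'
cat "$conf"
```

The value must stay wrapped in single quotes inside the file, and %f is left as-is for PostgreSQL to replace with the name of the WAL file it needs.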

When you have recovered the data and the configuration files, and have created a recovery configuration file, you can relaunch the sidecar. You might need to remove the old sidecar container first, because the container name wale is reused. With the sidecar running, PostgreSQL can fetch and replay the last WAL segments that might be missing from the base backup. Then you can relaunch the database and check that recovery was successful.

  1. Relaunch the WAL-E sidecar:

    docker run \
      --name wale \
      --network timescaledb-net \
      -v ~/backups:/backups \
      --volumes-from timescaledb-recovered \
      -e WALE_LOG_DESTINATION=stderr \
      -e PGWAL=/var/lib/postgresql/data/pg_wal \
      -e PGDATA=/var/lib/postgresql/data/pg_data \
      -e PGHOST=timescaledb \
      -e PGPASSWORD=insecure \
      -e PGUSER=postgres \
      -e WALE_FILE_PREFIX=file://localhost/backups \
      timescale/timescaledb-wale:latest
  2. Relaunch the TimescaleDB Docker container:

    docker start timescaledb-recovered
  3. Verify that the database started up and recovered successfully, for example by checking the output of docker logs timescaledb-recovered.
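
No command is shown for the verification step. One hedged way to check, assuming the container is named timescaledb-recovered and that you connect as the postgres user as elsewhere in this section, is to ask PostgreSQL whether it is still in recovery. The built-in function pg_is_in_recovery() returns f once archive recovery has finished. The helper name check_recovered is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: report whether the recovered instance has left recovery.
# $1 = container name. Returns 0 when finished, 1 when still recovering,
# and 2 when the query could not be run at all.
check_recovered() {
  container="${1:-timescaledb-recovered}"
  # -t and -A make psql print only the bare value: "t" or "f".
  state="$(docker exec "$container" psql -U postgres -tAc 'SELECT pg_is_in_recovery()')" || return 2
  if [ "$state" = "f" ]; then
    echo "recovery finished"
  else
    echo "still in recovery"
    return 1
  fi
}

# Usage:
#   check_recovered timescaledb-recovered
```

If the helper reports that recovery is still in progress, wait and run it again. A return code of 2 usually means that the server is not accepting connections yet.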