How to Back Up and Restore PostgreSQL in Docker

Database backups are a critical component of any robust application infrastructure. When running PostgreSQL within Docker containers, backup procedures require specific approaches that differ from traditional database management.

This guide will walk you through the most effective methods for backing up and restoring PostgreSQL databases in containerized environments. Whether managing a small project or enterprise-level databases, understanding these techniques will help ensure your data remains secure and recoverable in any scenario.

Why Traditional Backup Approaches May Not Work with Dockerized PostgreSQL

Standard Docker volume backups (e.g., creating a tar archive of the data volumes) often fail because PostgreSQL does not automatically detect changes to its data directory once it is running. Restoring data files in this way does not trigger the necessary initialization routines, resulting in the database engine not recognizing the restored files.

Read: How to install and setup Docker on Ubuntu 22.04

Method 1: Using Native PostgreSQL Tools (Recommended)

Creating a Full Database Backup

For a complete backup of all databases, users, and permissions:

docker exec -t your-db-container pg_dumpall -c -U postgres > dump_$(date +%Y-%m-%d_%H_%M_%S).sql

For large databases, compress the output:

docker exec -t your-db-container pg_dumpall -c -U postgres | gzip > dump_$(date +%Y-%m-%d_%H_%M_%S).sql.gz

Or for even better compression:

docker exec -t your-db-container pg_dumpall -c -U postgres | brotli --best > dump_$(date +%Y-%m-%d_%H_%M_%S).sql.br

Backing Up a Single Database

If you need to back up a specific database:

docker exec -t your-db-container pg_dump --no-owner -U postgres your-database-name > dump_$(date +%Y-%m-%d_%H_%M_%S).sql

Restoring From a Backup

To restore from an uncompressed backup:

cat your_dump.sql | docker exec -i your-db-container psql -U postgres

For compressed backups:

gunzip < your_dump.sql.gz | docker exec -i your-db-container psql -U postgres

If restoring to a specific database:

cat your_dump.sql | docker exec -i your-db-container psql -U postgres -d your-database-name

Read: Quick Guide to Installing PostgreSQL on Ubuntu 22.04

Common Issues with the Pipe Method

For very large databases, piping the data via cat might lead to out-of-memory errors. In such cases:

# Copy the dump into the container
docker cp local/path/to/db.dump CONTAINER_ID:/db.dump

# Access the container shell
docker exec -it CONTAINER_ID bash

# Restore from within the container
pg_restore -U postgres -d DATABASE_NAME --no-owner -1 /db.dump
    

Method 2: Using Data Containers and Volume Management

Backup Process

docker run --rm \
   --user root \
   --volumes-from your-db-container \
   -v /backup/location:/backup \
   ubuntu tar cvf /backup/db.tar /var/lib/postgresql/data
    

Restoration Process

Stop your PostgreSQL container, restore the data, then restart it:

docker stop your-db-container

docker run --rm \
   --user root \
   --volumes-from your-db-container \
   -v /backup/location:/backup \
   ubuntu bash -c "cd /var && tar xvf /backup/db.tar --strip 1"

docker start your-db-container
    

Read: How to clean up unused Docker containers, images and volumes

Method 3: Automated Backups with a Dedicated Container

For production environments, automate backups using a dedicated backup container. An example using docker-compose:

services:
  db:
    image: postgres:14
    container_name: your-db-container
    # other configuration...

  pgbackups:
    container_name: postgres-backup
    image: prodrigestivill/postgres-backup-local
    restart: always
    volumes:
      - ./backup:/backups
    links:
      - db:db
    depends_on:
      - db
    environment:
      - POSTGRES_HOST=db
      - POSTGRES_DB=your_database
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=your_password
      - POSTGRES_EXTRA_OPTS=-Z9 --schema=public --blobs
      - SCHEDULE=@daily
      - BACKUP_KEEP_DAYS=7
      - BACKUP_KEEP_WEEKS=4
      - BACKUP_KEEP_MONTHS=6
      - HEALTHCHECK_PORT=81
    

This setup creates daily backups, applies retention policies, and stores backups in a mounted volume accessible from the host.

Best Practices for PostgreSQL Backups in Docker

  • Test your backups regularly: Schedule periodic restoration tests to ensure backup validity.
  • Store backups externally: Move backups off the Docker host to safeguard against host failures.
  • Monitor backup size and duration: Track any unexpected changes that could indicate issues.
  • Use compression for large databases: Save storage space and improve transfer speeds.
  • Document your backup/restore procedures: Ensure your team is prepared for emergency recoveries.
  • Consider incremental backup strategies: For very large databases, use tools like WAL-E or pgBackRest.
  • Explicitly use authentication flags: Enhance portability and reduce dependency on container configurations.
  • Include restoration commands in documentation: Provide exact procedures for recovering data.

Troubleshooting Common Issues

  • Authentication Failures: Specify the user in Docker commands (e.g., --user postgres).
  • Encoding Issues on Windows: Use file copy methods rather than piping to handle character encoding.
  • Container Access Issues: Confirm you are executing commands in the correct environment.
  • Out of Memory During Restoration: For large dumps, use the file copy method or increase system memory.

Advanced Topic: Point-in-Time Recovery

For critical production databases, consider implementing point-in-time recovery (PITR) using PostgreSQL’s write-ahead logs (WAL). This involves:

  • Configuring WAL archiving in your PostgreSQL container.
  • Setting up a process to back up WAL files.
  • Restoring a base backup and replaying WAL files.

This advanced technique minimizes data loss but requires detailed planning and configuration.

Conclusion

Properly backing up and restoring PostgreSQL databases in Docker environments is essential for data safety and business continuity. The native PostgreSQL tools (pg_dump, pg_dumpall, and psql) generally offer the most reliable approach, especially for moderate-sized databases. For automated backups, dedicated containers streamline the process and ensure consistent schedules.

By following the techniques in this guide, you can build a robust backup strategy that protects your data and enhances your disaster recovery capabilities.

FAQ

How often should I back up my PostgreSQL database in Docker?

The backup frequency depends on your data change rate and acceptable data loss. Critical production databases may require hourly backups, while development databases might be backed up daily or weekly.

Can I restore a backup to a different PostgreSQL version?

You can generally restore to a newer version but not to an older one. Logical backups (using pg_dump/pg_dumpall) are recommended for version upgrades.

How do I restore a specific table rather than the entire database?

Use pg_dump with table-specific options to back up just that table, then restore it using psql targeted to your database.

Will my backup include stored procedures and triggers?

Yes, pg_dump and pg_dumpall include schema objects like procedures, triggers, and functions by default.

How can I verify my backup is valid without restoring it?

You can use pg_restore --list to view the contents of a backup file or restore to a temporary database to verify integrity.

Does Docker volume backup include WAL files for point-in-time recovery?

Standard volume backups include WAL files present in the data directory, but not necessarily archived WAL files if external archiving is configured.

How do I handle backing up very large databases (100GB+)?

For large databases, consider incremental backup tools like pgBackRest or split your backup strategy to include physical backups with WAL archiving.

Can I automate backup verification?

Yes, you can script restoration tests to temporary databases and run validation queries to ensure data integrity.

 

 


If you like the content, we would appreciate your support by buying us a coffee. Thank you so much for your visit and support.

 

Marianne elanotta

Marianne is a graduate in communication technologies and enjoys sharing the latest technological advances across various fields. Her programming skills include Java OO and Javascript, and she prefers working on open-source operating systems. In her free time, she enjoys playing chess and computer games with her two children.

Leave a Reply