Almost all docker-compose files look something like this:
```yaml
services:
  service_name:
    image: author/project:latest
    container_name: service_name
    volumes:
      - service_data:/app/data/
volumes:
  service_data:
```
Yes, this makes the data persist, but it creates a directory with an auto-generated name inside /var/lib/docker/volumes/
This makes it really hard to take actual ownership of the service's data (for example, to create backups or to migrate to another host).
Why is it standard practice to use this instead of having a directory mounted inside at the same level you have your docker-compose.yml?
Like this: `./service_data:/app/data`
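Spelled out in full, the bind-mount variant of the compose file above would look like this (note that no top-level `volumes:` section is needed):

```yaml
services:
  service_name:
    image: author/project:latest
    container_name: service_name
    volumes:
      - ./service_data:/app/data   # bind mount: data lives next to docker-compose.yml
```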
I think the primary reason is that Docker was designed for larger installs than what most of you are running.
If you’re doing a giant k8s, with a bunch of pods that come and go dynamically, portability, consistency, and management are key. And for the most part you’re not handling long-term persistent data the same way as you do when you’re just installing containers on your home lab or development environment.
But since most people don’t have these constraints, bind mounts make a hell of a lot more sense if you’re only running one server in one location.
Literally 90% of my compose looks like this.
But volumes make it easier with permissions, as the maintainer doesn’t have to provide any information to noobs like me.
It’s not “standard practice”, it’s the least problematic way to distribute example templates for mass consumption without needing extra configuration steps from the user. It is assumed that anyone looking to run something in production would follow best practices and harden where needed.
I’m not a huge Docker expert, but I recently spun up a tandoor…dev, and their config instructions explicitly point out a couple of mounts that have to be volumes and cannot be binds.
Docker’s own docs are at https://docs.docker.com/engine/storage/volumes/ — my tl;dr is that volumes are faster, can be shared by multiple containers, and can point at a remote (NFS/CIFS) target.
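As a sketch of that remote-target capability, a named volume can be backed by an NFS export through the local driver’s options (the server address and export path below are hypothetical):

```yaml
volumes:
  media_data:
    driver: local
    driver_opts:
      type: nfs
      o: addr=192.168.1.50,rw,nfsvers=4   # hypothetical NFS server
      device: ":/export/media"            # hypothetical export path
```

Containers then reference `media_data` like any other named volume, and Docker handles mounting the share.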
I’d guess that maintainers use the volume structure to let Docker handle the details of creating and maintaining the mount, rather than put it on the user, who may be spinning up their first-ever Docker container and may make all kinds of naive mistakes.
Named volumes are often the default because there is no chance of them conflicting with other services or containers running on the system.
Say you deployed two different docker compose apps each with their own MariaDB. With named volumes there is zero chance of those conflicting (at least from the filesystem perspective).
This also facilitates easier cleanup. The app’s documentation can just say “docker compose down -v” and they are done, instead of listing a bunch of directories that need to be cleaned up.
Those lingering directories can also cause problems for users that might have wanted a clean start when their app is broken, but with a bind mount that broken database schema won’t have been deleted for them when they start up the services again.
All that said, I very much agree that when you go to deploy a docker service you should consider changing the named volumes to standard bind mounts for a couple of reasons.
- When running production applications I don’t want the volumes to be so easy to clean up. A little extra protection from accidental deletion is handy.
- The default location for named volumes doesn’t work well with more advanced partitioning strategies, e.g. if you want your database volume on a different partition than your static web content.
- Old reason, and maybe more user preference at this point: back before the docker overlay2 storage driver had matured we used the btrfs driver instead, and occasionally Docker would break and we would need to wipe out the entire /var/lib/docker btrfs filesystem. So I just personally want to keep anything persistent out of that directory.
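For the partitioning point above, the fix is a plain bind mount pointing at wherever the other partition is mounted (the paths here are hypothetical):

```yaml
services:
  db:
    image: mariadb:11
    volumes:
      - /mnt/fast-ssd/mariadb:/var/lib/mysql   # hypothetical mount point on a separate partition
```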
So basically application writers should use named volumes to simplify the documentation/installation/maintenance/cleanup of their applications.
Systems administrators running those applications should know and understand the docker compose file well enough to change those settings to make them production-ready for their environment. Reading through it and making those changes ends up being part of learning how the containers are structured in the first place.
I change them all to bind mounts. Managed volumes are where data goes to die; if it’s not in my file tree, I’ll forget it.
All of mine use bind mounts so I can just tar-gz the whole deploy folder for backups and migrations. For volumes that connect to remote shares (SMB, NFS, etc) I use named volumes and let Docker take care of their lifecycle.
If named docker volumes would let me specify the local filesystem location, I’d use them. As-is, I rarely do.
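For what it’s worth, Docker’s local volume driver can actually pin a named volume to a specific host path via bind options; a sketch, assuming a hypothetical path that must already exist on the host:

```yaml
volumes:
  service_data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /srv/service_data   # hypothetical host path; must exist before `docker compose up`
```

This gives you the named-volume lifecycle while keeping the data at a location you control.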
I tend to change volumes to bind mounts. Makes it easier to backup or move the service.
Might want to avoid using relative paths with bind mounts and declare the full path instead. It has caused me headaches before.
For me it really depends on the use case. A lot of times I want persistence but don’t really care to access the data outside of the container. So rather than using the extra brainpower to make up folders myself and ensure paths don’t change, I just let Docker handle those details for me. Also, I use Podman a fair amount, and it seems to be more troublesome when it comes to bind mounts.
I assume it’s because it reduces the possibility of other processes outside of the linked containers accessing the files (so security and stability).
Why would it reduce it?
If you want to secure it, use SELinux and add :Z, which truly eliminates the possibility.
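On SELinux hosts, the `:Z` suffix tells Docker to relabel the bind-mounted content with a private label so only that container can access it (`:z` is the shared variant); a sketch with a hypothetical service:

```yaml
services:
  app:
    image: author/project:latest
    volumes:
      - ./service_data:/app/data:Z   # SELinux private label; content is container-exclusive
```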
Easier cleanup
😏
I do that for data I want to persist but don’t care about backing up (e.g. caches).