Python: Docker volumes -- where is my SQLite database file?
The Python application in a Docker image writes some data to a SQLite database. Stop the container and run the image again, and the data are no longer there! A volume must be specified when running an image to persist the data. But where is the SQLite database file, on both Windows 10 and Linux? We discuss volumes, and where they live on disk, for both operating systems.
In this post, we're using the images created in Python: Docker image build -- save to and load from *.tar files. In a nutshell, these images contain a Python project which uses a SQLite database. The same images are built for both Windows 10 Pro ( 64-bit, x64-based ) and linux/arm64 ( AArch64 ), which is my Synology DS218 box. We're going to look at Docker volumes on both operating systems.
The code used to build Docker images in this post can be cloned with:
git clone -b v1.0.2 https://github.com/behai-nguyen/flask-restx-demo.git
Please also note that the repository cloned above includes a test HTML page which uses jQuery, test_client_app\jquery-ajax\TreeAPIClient.html. Just copy it to a web site or a virtual web directory and run it from there; for the API URL, use http://ip-address:port/api/v1/trees.
Table of contents
- Environments
- Docker Desktop uses WSL 2 based engine
- Docker volumes on Windows 10 Pro
- Docker volumes on Synology DS218 ( linux/arm64 )
- --volume / -v or --mount option?
- Concluding remarks
Environments
- Synology DS218 -- DSM 7.1-42661 Update 3.
- Windows 10 Pro -- version 10.0.19044 build 19044. System name is DESKTOP-7BA02KU.
- Windows Subsystem for Linux ( WSL 2 ) -- running Ubuntu 20.04.4 LTS (Focal Fossa).
- Windows “docker” CLI -- version 20.10.12, build e91ed57.
- Windows Docker Desktop -- version 4.4.3. The latest version is 4.10.1.
- Synology DS218 -- it's accessed via its device name omphalos-nas-01 instead of its IP address.
Docker Desktop uses WSL 2 based engine
I installed Windows Subsystem for Linux ( WSL 2 ) prior to Docker Desktop and all associated CLIs. I think I installed Docker Desktop using all recommended options. When the installer picked the WSL 2 based engine, I agreed, even though I was not at all sure what it meant:
I have found out that when we run Docker images using Docker volumes on my Windows 10 Pro, Docker volumes actually live on WSL 2.
That means, in this post, for Windows 10 Pro, Docker volumes discussions are in the context of WSL 2 based engine only.
Docker volumes on Windows 10 Pro
If we don't specify a valid volume when running, SQLite data persists only for as long as the container is active. When we stop the container, the data is lost; the next time we run the same image, we will have no data.
Official documents related to volumes:
Docker volume options
To use volumes, we must use either the -v option, short for --volume, or the --mount option. For example:
C:\>docker run -v datavolume:/flask_restx_demo -d --publish 8000:8000 --rm flask-restx-demo
C:\>docker run --mount source=datavolume,target=/flask_restx_demo -d --publish 8000:8000 --rm flask-restx-demo
The above two ( 2 ) commands cause the containers to use the same volume. We can use the Swagger UI page http://localhost:8000/api/v1/ui to enter and to query data; or the client test page http://localhost/work/TreeAPIClient.html, for API URL use http://localhost:8000/api/v1/trees.
Change some data. Stop the container. Start again using either command -- regardless of the test client, we should see the data from the previous run.
“datavolume” and “flask_restx_demo” values
Consider the below two ( 2 ) commands:
C:\>docker run -v datavolume:/xyz -d --publish 8000:8000 --rm flask-restx-demo
C:\>docker run --mount source=datavolume,target=/abc -d --publish 8000:8000 --rm flask-restx-demo
They'll start and run successfully, but the data will not be persisted. That is, if we stop the container and re-run the same command, any data previously entered will no longer exist. Docker does create the volume and mounts it at /xyz or /abc, but the application does not write its database there, so for our purpose -v datavolume:/xyz and --mount source=datavolume,target=/abc are not valid volumes.
❶ datavolume -- in the context of this post, this value will become a sub-directory on the file system as we shall see later. And we should be free to specify anything we see fit, as long as it's unique.
❷ flask_restx_demo, xyz and abc -- in the context of this post, they play the same role: the path inside the container where the volume is mounted. The official documents listed above explain this, but I could not fully understand it. My experiments show that this value must match the value specified for WORKDIR in the Dockerfile. For this image, it is:
WORKDIR /flask_restx_demo
Please see Dockerfile on GitHub. So that means only flask_restx_demo is valid. I arrived at this observation by building and running another image with a different value for WORKDIR.
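One possible explanation for this observation is that the application opens its SQLite file through a relative path; this is an assumption, not something confirmed from the project's code. A relative path resolves against the process's current working directory, which is what WORKDIR sets inside the container. The sketch below illustrates only that mechanism: the file name demo.db, the function name, and the temporary directory standing in for WORKDIR are all hypothetical.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def create_db_in_cwd(workdir: str, db_name: str = "demo.db") -> Path:
    """Open a SQLite database by relative name while `workdir` is the cwd."""
    previous = os.getcwd()
    os.chdir(workdir)  # stands in for WORKDIR in the Dockerfile
    try:
        conn = sqlite3.connect(db_name)  # relative path: resolved against cwd
        conn.execute("CREATE TABLE IF NOT EXISTS t (id INTEGER PRIMARY KEY)")
        conn.commit()
        conn.close()
        return Path(db_name).resolve()
    finally:
        os.chdir(previous)  # always restore the original working directory


with tempfile.TemporaryDirectory() as workdir:
    db_path = create_db_in_cwd(workdir)
    # Compare the directory holding the new file with the working directory.
    print(db_path.parent == Path(workdir).resolve())
```

If the application behaves this way, a volume mounted anywhere other than the working directory never receives the database file, which would be consistent with what /xyz and /abc show above.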
Further to the above, these two ( 2 ) commands:
C:\>docker run --mount source=datavolume,target=/flask_restx_demo -d --publish 8000:8000 --rm flask-restx-demo
C:\>docker run --mount source=behaivolume,target=/flask_restx_demo -d --publish 8000:8000 --rm flask-restx-demo
will result in two ( 2 ) valid and independent volumes.
The next section, Name container and “flask_restx_demo”, should illustrate flask_restx_demo a bit further.
Name container and “flask_restx_demo”
We can start a container with a specific name via the --name option, see Docker run reference | Name (--name):
D:\>docker run --name restx-demo --mount source=datavolume,target=/flask_restx_demo --publish 8000:8000 --rm flask-restx-demo
restx-demo should be listed by the command:
D:\>docker ps -a
We can inspect restx-demo using:
D:\>docker inspect restx-demo
The output is very long, but we're interested in the following extracted sections:
...
"Mounts": [
    {
        "Type": "volume",
        "Source": "datavolume",
        "Target": "/flask_restx_demo"
    }
],
...
"Mounts": [
    {
        "Type": "volume",
        "Name": "datavolume",
        "Source": "/var/lib/docker/volumes/datavolume/_data",
        "Destination": "/flask_restx_demo",
        "Driver": "local",
        "Mode": "z",
        "RW": true,
        "Propagation": ""
    }
],
...
"Image": "flask-restx-demo",
"Volumes": null,
"WorkingDir": "/flask_restx_demo",
"Entrypoint": null,
"OnBuild": null,
"Labels": {}
...
Do the values of properties Target, Destination and WorkingDir suggest any relationship to WORKDIR /flask_restx_demo in Dockerfile?
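The comparison can also be made programmatically. Below is a minimal sketch that reads the JSON printed by docker inspect and reports which volumes are mounted exactly at the working directory. The SAMPLE string is a trimmed, hand-assembled excerpt based on the output above; nesting WorkingDir under Config follows the usual docker inspect layout, and the function name is hypothetical.

```python
import json
from typing import List

# Trimmed excerpt modelled on the `docker inspect restx-demo` output above.
SAMPLE = """
[
  {
    "Mounts": [
      {
        "Type": "volume",
        "Name": "datavolume",
        "Source": "/var/lib/docker/volumes/datavolume/_data",
        "Destination": "/flask_restx_demo"
      }
    ],
    "Config": {
      "Image": "flask-restx-demo",
      "WorkingDir": "/flask_restx_demo"
    }
  }
]
"""


def volumes_on_workdir(inspect_json: str) -> List[str]:
    """Return names of volumes mounted exactly at the container's WorkingDir."""
    container = json.loads(inspect_json)[0]  # docker inspect prints a JSON array
    workdir = container["Config"]["WorkingDir"]
    return [
        mount.get("Name", mount["Source"])
        for mount in container.get("Mounts", [])
        if mount["Destination"].rstrip("/") == workdir.rstrip("/")
    ]


print(volumes_on_workdir(SAMPLE))
```

On a live system, the input could be the saved output of docker inspect restx-demo. An empty list would suggest that no volume covers the working directory.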
We discuss
"Source": "/var/lib/docker/volumes/datavolume/_data",
in the next section Docker volumes on disk.
Docker volumes on disk
On Windows 10, Docker Desktop lists volumes, and individual volume data as per screen capture below:
List volumes using CLI:
D:\>docker volume ls
At the time of this post, there was only one:
D:\>docker volume ls
DRIVER VOLUME NAME
local datavolume
We can inspect a volume to get more detail on it, with:
D:\>docker volume inspect datavolume
[
    {
        "CreatedAt": "2022-07-28T13:33:51Z",
        "Driver": "local",
        "Labels": null,
        "Mountpoint": "/var/lib/docker/volumes/datavolume/_data",
        "Name": "datavolume",
        "Options": null,
        "Scope": "local"
    }
]
I didn't understand what the Mountpoint property was about, only that it is a Unix path:
"Mountpoint": "/var/lib/docker/volumes/datavolume/_data",
The answer provided by user craftsmannadeem in Locating data volumes in Docker Desktop (Windows) helped me. But within WSL 2, with root privileges, I don't see anything under /mnt/wsl/docker-desktop-data/, please see the screen capture below:
However, if I paste this \\wsl$\docker-desktop-data\version-pack-data\community\docker to Windows File Explorer, I can see its content:
-- Are /mnt/wsl/docker-desktop-data/ and \\wsl$\docker-desktop-data\ referring to the same location? Or am I barking up the wrong tree?
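Whatever the answer to that question, the correspondence observed on this machine between the Mountpoint value and the File Explorer path can be written down as a small helper. This is only a sketch: the prefix pair is the one seen in this post with this Docker Desktop version, other versions may use a different layout, and the function name is hypothetical.

```python
import json

# Prefix pair observed in this post; other Docker Desktop versions may differ.
LINUX_ROOT = "/var/lib/docker"
WSL_UNC_ROOT = r"\\wsl$\docker-desktop-data\version-pack-data\community\docker"


def mountpoint_to_unc(volume_inspect_json: str) -> str:
    """Map the Mountpoint of the first volume to a Windows File Explorer path."""
    mountpoint = json.loads(volume_inspect_json)[0]["Mountpoint"]
    if not mountpoint.startswith(LINUX_ROOT + "/"):
        raise ValueError(f"unexpected Mountpoint: {mountpoint}")
    relative = mountpoint[len(LINUX_ROOT):]  # e.g. "/volumes/datavolume/_data"
    return WSL_UNC_ROOT + relative.replace("/", "\\")


# Shortened form of the `docker volume inspect datavolume` output shown earlier.
sample = (
    '[{"Mountpoint": "/var/lib/docker/volumes/datavolume/_data",'
    ' "Name": "datavolume"}]'
)
print(mountpoint_to_unc(sample))
```

The printed path can be pasted into Windows File Explorer in the same way as the path above.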
Drilling down to volumes | datavolume | _data, the entries are somewhat similar to those displayed in Docker Desktop above:
The data entries in flask_restx_demo.db match those retrieved by Swagger UI.
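A quick way to make this kind of check without a database browser is to count the rows in each table. The sketch below uses only Python's built-in sqlite3 module and opens the file read-only; the function name is hypothetical. It assumes the database file has first been copied out of the volume to a local folder, since SQLite's file: URIs are not expected to work with \\wsl$ network-style paths.

```python
import sqlite3
from pathlib import Path


def table_row_counts(db_file: str) -> dict:
    """Return {table_name: row_count} for every user table in a SQLite file."""
    # mode=ro opens the file read-only, so the data is never modified.
    uri = Path(db_file).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
            )
        ]
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()
```

Calling table_row_counts() on a local copy of flask_restx_demo.db returns a dictionary that can be compared against what Swagger UI reports.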
Docker volumes on Synology DS218 ( linux/arm64 )
I already loaded the image flask-restx-demo-arm64 in the post mentioned at the beginning:
Run it:
$ sudo docker run --network=host --mount source=datavolume,target=/flask_restx_demo -v "/run/docker.sock:/var/run/docker.sock" --rm flask-restx-demo-arm64
To recap, the Swagger UI URL is http://omphalos-nas-01:8000/api/v1/ui; the test client page is http://localhost/work/TreeAPIClient.html, where the API URL is http://omphalos-nas-01:8000/api/v1/trees.
Let's inspect the volume:
$ sudo docker volume inspect datavolume
We've seen similar output in the previous section:
[
    {
        "CreatedAt": "2022-07-26T11:13:59+10:00",
        "Driver": "local",
        "Labels": null,
        "Mountpoint": "/volume1/docker/volumes/datavolume/_data",
        "Name": "datavolume",
        "Options": null,
        "Scope": "local"
    }
]
We're interested in Mountpoint as before:
"Mountpoint": "/volume1/docker/volumes/datavolume/_data",
Where does /volume1/docker come from? Please see Synology DS218: unsupported Docker installation and usage… | Mount more disk space -- basically, it's Docker's root folder which I've manually configured.
Let's see the Mountpoint content:
The entries are similar to those in Windows 10 Pro -- but the directory path is much easier to understand and to locate 🙈
The Portainer administrative UI also shows volume entries:
--volume / -v or --mount option?
The official docker run document, in the section Add bind mounts or volumes using the --mount flag, states:
Even though there is no plan to deprecate --volume, usage of --mount is recommended.
Personally, I find --mount much clearer than -v / --volume. I'll be using the --mount option from now on.
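To show how the two notations relate, here is a minimal sketch that rewrites a -v value as the equivalent --mount value. It covers only the forms used in this post, a named volume and an absolute Linux host path, plus an optional ro suffix. Windows drive-letter paths and other options are deliberately left out, and the function name is hypothetical.

```python
def volume_spec_to_mount(spec: str) -> str:
    """Translate a `-v SOURCE:TARGET[:ro]` value into a `--mount` value.

    Only Linux-style specs are handled; Windows drive-letter paths
    (which contain an extra colon) are out of scope for this sketch.
    """
    parts = spec.split(":")
    if len(parts) not in (2, 3):
        raise ValueError(f"unsupported volume spec: {spec!r}")
    source, target = parts[0], parts[1]
    # An absolute host path means a bind mount; a bare name is a named volume.
    mount_type = "bind" if source.startswith("/") else "volume"
    fields = [f"type={mount_type}", f"source={source}", f"target={target}"]
    if len(parts) == 3:
        if parts[2] != "ro":
            raise ValueError(f"unsupported option: {parts[2]!r}")
        fields.append("readonly")
    return ",".join(fields)


print(volume_spec_to_mount("datavolume:/flask_restx_demo"))
print(volume_spec_to_mount("/run/docker.sock:/var/run/docker.sock"))
```

The explicit type= key is optional for named volumes, which is why the commands earlier in this post work with only source and target.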
Concluding remarks
I do hope I haven't made any mistakes in this post.
I do also understand that the issues discussed in this post are only the basics of Docker volumes. I did read through the official documents, but they have not fully sunk in yet.
Thank you for reading... and I hope you find this post of some help.