Unable to `mount` overlayfs in Docker container when inside an LXC with a ZFS pool
I'm currently working to improve performance and throughput of our automation infrastructure, most of which is a combination of Bash/Shell scripts, Python scripts, Docker, Jenkins, etc. We use Yocto to build embedded Linux distributions for specialized hardware and we have a Docker image to define/run our build environment/process.
Because of how our Docker containers work, using the `-v` option to bind-mount the host file system into the container, there are race conditions whenever you want to run parallel jobs. To help remedy this, I'm using a Bash script to automate the setup of an overlay file system. That allows me to transparently present the environment to the Docker containers in the way they expect it to be, without them "realizing" that there are overlays underneath to prevent actual data races.
This was tested on several Linux systems, including my laptop (Ubuntu 20.04) and several virtual machines (Ubuntu 20.04), without issues. However, I noticed that when the Docker containers exist inside an LXC-based system container (Ubuntu 20.04 with an `ext4` file system), the `mount` command executed from inside the Docker container fails. (The Docker image runs Ubuntu 14.04.5 LTS.)
The question boils down to this: how can I successfully run the `mount` command from within a Docker container that is itself running within an LXC-based container, so that the Docker container can set up and use the overlay filesystem?
One of the servers I manage hosts several Jenkins nodes inside LXC-based system containers. All of the LXC-based Jenkins nodes are running Ubuntu 20.04 LTS, exist within the same ZFS Pool, and are kept up-to-date. (For environment details, please see the end of this post.)
The overlay setup step was written to execute as part of the "startup" process when the Docker container is launched. The launch command looks basically as follows (with some actual data omitted):
```
$ docker run --rm -it --privileged \
    -e USER_MOUNT_CACHE_OVERLAY=1 \
    -v <host work directory path>:/home/workdir \
    -v <host Git repositories path>:/home/localRepos \
    <image>:<tag> \
    bash
```
The Bash script uses `mktemp` to create the (work and read-write) directories that will be used for the overlay. A manual example of the `mount` command being used is:
```
$ sudo mount -t overlay sstate-cache \
    -o lowerdir=sstate-cache,upperdir=overlayfs/cache-rw,workdir=overlayfs/cache-work \
    sstate-cache
```
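The setup step can be sketched roughly as follows (a minimal sketch, not the actual script; the directory names here are hypothetical, and the real script derives its paths from the build environment):

```shell
#!/bin/sh
# Minimal sketch of the overlay setup step; directory names are hypothetical.
set -eu

base=$(mktemp -d)   # scratch area for the overlay pieces, created via mktemp
mkdir -p "$base/lower" "$base/cache-rw" "$base/cache-work" "$base/merged"

# upperdir and workdir must be writable and on an overlayfs-compatible filesystem
if [ "$(id -u)" -eq 0 ] && \
   mount -t overlay overlay \
       -o "lowerdir=$base/lower,upperdir=$base/cache-rw,workdir=$base/cache-work" \
       "$base/merged" 2>/dev/null; then
    echo "overlay mounted at $base/merged"
    umount "$base/merged"
else
    echo "skipped the mount (needs root and a compatible upperdir filesystem)"
fi
```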
When I do this on my laptop or any other non-LXC node, everything works fine. However, when the Docker container running the `mount` command exists inside an LXC node, this error shows up:
```
mount: wrong fs type, bad option, bad superblock on /home/workdir/sstate-cache,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail or so
```
The exit code returned is `32`, which the `mount` man page simply documents as "mount failure". The contents of `/var/log/syslog` don't seem to have anything relevant. The `dmesg` command shows these:
```
overlayfs: filesystem on '/var/lib/docker/check-overlayfs-support309123547/upper' not supported as upperdir
overlayfs: filesystem on '/var/lib/docker/check-overlayfs-support422513534/upper' not supported as upperdir
[...]
overlayfs: filesystem on 'overlayfs/cache-rw' not supported as upperdir
```
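Those `dmesg` lines suggest the kernel is rejecting the filesystem backing the upperdir, not the mount syntax. One way to narrow this down is to check which filesystem actually backs the candidate upperdir: inside containers `/etc/fstab` can be misleading, but `stat -f` reports the live superblock type. If it reports `zfs`, that would explain the error; to my understanding, overlayfs on kernels of this era does not accept ZFS as an upperdir (OpenZFS only gained upperdir support much later, in the 2.2 release). The path below is a stand-in; in the failing container I would check `overlayfs/cache-rw` and `/var/lib/docker`:

```shell
# Print the filesystem type that actually backs a path, regardless of what
# /etc/fstab inside the container claims. /tmp is a stand-in path here.
stat -f -c %T /tmp
df -T /tmp
```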
I've been trying to fix this since last week, but I still have no idea why this error would show up nor how to fix it. Many search results have not been relevant to my specific case.
Some Things I've Tried
I found that Docker requires the `--privileged` option in order to allow the `mount` command to work, so that's the reason it's there. This fixed the original mount issue on my laptop and other VMs. (For the LXC nodes, it simply prevents a Docker crash; you'd see Go stack traces otherwise.)
But LXD/LXC has its own security options. I had already set its `security.nesting` option to `true` a few years back to let Docker containers run, and that has not been an issue. I also tried making the LXC container itself privileged with:
```
$ lxc config set <node> security.privileged true
```
Here `<node>` is the name of the LXC node, but it made no difference. Note that replacing or destroying the ZFS pool and/or the LXC itself are not valid options.
Remarks (Could Be Wrong)
While the file system of the LXC-based node is `ext4`, as can be confirmed by looking at the filesystem table,

```
$ cat /etc/fstab
LABEL=cloudimg-rootfs   /        ext4   defaults        0 0
```
the entire LXC-based container is stored in a ZFS pool. A few years ago, I had enabled ZFS compression on the physical host, which should have been completely transparent not only to the LXCs, but also to the Docker containers. However, I observed issues with the `du` command, where it would calculate incorrect disk-usage results, which then caused other parts of our build process to fail.
While I can't be certain, and however unlikely this may be (I have no way to test/verify it), I keep asking myself whether some other ZFS options could be affecting this. It seems more likely to me that existing LXC options might do the trick, but I'm not sure which ones those could be.
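One option I'm considering (an untested sketch, not something I've verified in this setup) is to sidestep the backing filesystem entirely by hosting the upperdir and workdir on tmpfs, which overlayfs accepts as an upperdir, at the cost of the overlay's writes living in RAM:

```shell
# Untested sketch: back the overlay's upper/work directories with tmpfs
# instead of the (possibly ZFS-backed) container filesystem.
scratch=$(mktemp -d)
if [ "$(id -u)" -eq 0 ] && mount -t tmpfs tmpfs "$scratch" 2>/dev/null; then
    echo "tmpfs mounted at $scratch"
fi
mkdir -p "$scratch/cache-rw" "$scratch/cache-work"
ls "$scratch"
```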
I already took a look at this question elsewhere, but I've not found any similar error messages.
Environment Details (Host, LXCs, Docker)
Operating System (Physical Host)
```
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.1 LTS
Release:        20.04
Codename:       focal
```
ZFS Version (Physical Host)
```
$ zfs -V
zfs-0.8.3-1ubuntu12.4
zfs-kmod-0.8.3-1ubuntu12.4
```
LXC Version (Physical Host, Snap Package)
```
$ lxc --version
4.7
```
Docker Version (Inside LXC, Jenkins Node)
```
$ docker --version
Docker version 19.03.12, build 48a66213fe
```
Operating System (Inside Docker Container)
```
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 14.04.5 LTS
Release:        14.04
Codename:       trusty
```