Running docker inside an unprivileged LXC container on Proxmox#


TL;DR

This is a brief description of the setup process for running Docker in unprivileged LXC containers on Proxmox. There are two primary sources: a post on Reddit [1] and a more general discussion [2] on linuxcontainers.org.


Motivation

Docker containers can be useful, even though Proxmox LXC containers offer a similar set of functions.

For example, I prefer Docker over LXC when a project ships an official docker-compose.yml and recommends it in its documentation.

However, there is some confusion about running Docker inside Proxmox.

Several sources suggest that Docker can only be run inside a full VM, or a privileged LXC container, with full access to the host system.

Usually, this will be the wrong approach.

Full VMs in Proxmox consume reserved system resources such as CPU and memory. An unprivileged LXC container, however, shares the available resources with all other containers on the host.

This means that if the total available memory on the hypervisor is 32 GB, it is entirely possible to create several LXC containers and make 32 GB of memory available to each of them. The total available memory will be shared.

When to not use Docker in unprivileged LXC

Full VMs are officially recommended for Docker over unprivileged containers. One of the main reasons is that VMs are fully virtualized, whereas LXC containers run all of their processes on the kernel of the host (the hypervisor). Unprivileged containers use a combination of AppArmor rules and uid mapping to prevent malicious access to the host, but if you are doing serious production work, or you know that your Docker tools may be insecure, use a VM instead of LXC.


Prepare Proxmox#


On Proxmox, the overlay and aufs Kernel modules must be enabled to support Docker-LXC-Nesting.

echo -e "overlay\naufs" >> /etc/modules-load.d/modules.conf

Reboot Proxmox and verify that the modules are active:

lsmod | grep -E 'overlay|aufs'
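The `lsmod` check above only prints what is loaded, so a missing module is easy to overlook. As a rough sketch, the helper below lists the required modules that are absent. The function name `missing_modules` is made up for this example, and it assumes the module list is in `/proc/modules` format (module name in the first column).

```shell
#!/bin/sh
# Report which of the required kernel modules are missing from a module list.
# Usage: missing_modules FILE MODULE...
# FILE is expected to be in /proc/modules format (module name in column 1).
# "missing_modules" is a hypothetical helper name, not a Proxmox tool.
missing_modules() {
    list_file=$1
    shift
    for mod in "$@"; do
        # Compare the module name exactly against the first column.
        if ! awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$list_file"; then
            echo "$mod"
        fi
    done
}
```

On the Proxmox host this could be called as `missing_modules /proc/modules overlay aufs`. Empty output means both modules are loaded.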

Create an unprivileged LXC container#


Follow the Proxmox docs to create an unprivileged LXC container, either through the web UI or using the shell.

For example, I used the settings that appear in the 100.conf listing further down.

This LXC container config will be stored at:

/etc/pve/lxc/100.conf

Open this config and add:

features: keyctl=1,nesting=1

Afterwards, the 100.conf will look similar to this:

arch: amd64
cores: 2
features: keyctl=1,nesting=1
hostname: docker
memory: 4096
nameserver: 192.168.40.1
net0: name=eth0,bridge=vmbr1,firewall=1,gw=192.168.40.1,ip=192.168.40.9/24,tag=40,type=veth
ostype: debian
rootfs: storagedocker:100/vm-100-disk-0.raw,size=60G
searchdomain: local.mytld.com
startup: order=1
swap: 4096
unprivileged: 1
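Before starting the container, it can help to confirm that the features line really made it into the config. A minimal sketch follows; the helper name `has_feature` is hypothetical, and it assumes the single comma-separated `features:` line shown in the listing above.

```shell
#!/bin/sh
# Check that a Proxmox LXC config enables a given feature (e.g. nesting).
# Usage: has_feature CONFIG_FILE FEATURE
# "has_feature" is a hypothetical helper name used for illustration only.
has_feature() {
    # Take the "features:" line, split it on commas, look for FEATURE=1.
    grep '^features:' "$1" \
        | sed 's/^features:[[:space:]]*//' \
        | tr ',' '\n' \
        | grep -qx "$2=1"
}
```

For example, `has_feature /etc/pve/lxc/100.conf nesting && echo "nesting enabled"`. Instead of editing the file by hand, the same features can usually be set with `pct set 100 --features keyctl=1,nesting=1` on the Proxmox host.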

Setup Docker in LXC#


Now, log in to the newly created LXC container via ssh.

Optionally install sudo:

apt install sudo

Set the time zone. In unprivileged containers, use:

dpkg-reconfigure tzdata

Install Docker. This is from the docs.

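# Refresh the package lists and upgrade installed packages
apt-get update && apt-get upgrade
# Prerequisites for adding the Docker repository
apt-get install apt-transport-https ca-certificates curl gnupg2 software-properties-common -y
# Add Docker's GPG key (apt-key is deprecated on newer Debian releases;
# check the current Docker docs if this step fails)
curl -fsSL https://download.docker.com/linux/debian/gpg | apt-key add -
# Add the stable Docker repository for this Debian release
add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/debian \
   $(lsb_release -cs) \
   stable"
# Install Docker
apt-get update
apt-get install docker-ce -y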

Change the storage driver to overlay2.

echo -e '{\n  "storage-driver": "overlay2"\n}' >> /etc/docker/daemon.json
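Note that `>>` appends, so running the command above against an existing daemon.json would leave two JSON objects in the file, which is not valid JSON. A slightly more defensive sketch is shown below. The function name `write_daemon_json` is made up, and it assumes you would rather merge settings by hand than overwrite an existing file.

```shell
#!/bin/sh
# Write a minimal daemon.json, refusing to touch an existing file.
# Usage: write_daemon_json [PATH]   (PATH defaults to /etc/docker/daemon.json)
# "write_daemon_json" is a hypothetical helper name used for illustration only.
write_daemon_json() {
    target=${1:-/etc/docker/daemon.json}
    if [ -e "$target" ]; then
        echo "refusing to overwrite existing $target" >&2
        return 1
    fi
    mkdir -p "$(dirname "$target")"
    printf '{\n  "storage-driver": "overlay2"\n}\n' > "$target"
}
```

After writing the file, `systemctl restart docker` followed by `docker info --format '{{.Driver}}'` should report the storage driver that is actually in use.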

Note

Watch out if your network uses subnets in the 192.168.0.0/16 range. This range is among the address pools that Docker may select for a default network. See issue #37823.

It is possible to remove 192.168.0.0 from this list, by updating daemon.json, e.g.:

{
    "storage-driver": "overlay2",
    "bip": "193.168.1.5/24",
    "default-address-pools":
    [
        {"base":"172.17.0.0/16","size":24}
    ]
}
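To see whether any existing Docker network already sits in the 192.168.0.0/16 range, the subnets can be filtered with a small helper. This is only a sketch: `conflicting_subnets` is a made-up name, and it expects one CIDR subnet per line on stdin.

```shell
#!/bin/sh
# Print every subnet from stdin that lies inside 192.168.0.0/16.
# Input: one CIDR subnet per line, e.g. 192.168.16.0/20
# "conflicting_subnets" is a hypothetical helper name used for illustration only.
conflicting_subnets() {
    grep -E '^192\.168\.[0-9]+\.[0-9]+/(1[6-9]|2[0-9]|3[0-2])$'
}
```

One possible way to feed it is `docker network inspect --format '{{range .IPAM.Config}}{{.Subnet}}{{end}}' $(docker network ls -q) | conflicting_subnets`. Empty output means no Docker network currently uses that range.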

Optionally install docker-compose. Follow the docs.

Test Docker#


Restart the LXC container and test the Docker setup.

systemctl status docker
docker run hello-world

Hello from Docker!

Yay!

Conclusions#


Now, what is neat about this setup is that it is entirely possible to have several LXC containers that run separate Docker systems.

For example, I use the above Docker LXC for hosting stable services in the local service network VLAN.

In another LXC container, I have Docker set up for experimental containers, with quick access to docker system prune --all && docker volume prune.

The performance overhead of this Docker-LXC nesting is negligible, since all resources are shared and Docker containers consume next to no resources while they are idle.

Caveats#

Special container permissions#

Except for one case (see below), I have not had any issues with this setup in over a year of running several unprivileged LXC containers, each with its own Docker host.

Gitlab Docker: open /proc/sys/kernel/domainname: permission denied

Keep an eye out for Docker setups that require access to special system resources. The only time this happened to me was with the GitLab Docker image, and it was easy to solve. [3]

GitLab tried to modify the sysctl domainname, which is not allowed in unprivileged LXC containers. Removing hostname from the docker-compose.yml solved this issue.
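To find the offending key before starting a stack, a compose file can be scanned for `hostname:` entries. This is a minimal sketch: `find_hostname_keys` is a hypothetical name, and the plain-text match does not parse YAML, so it may also flag commented-out or unrelated lines.

```shell
#!/bin/sh
# List lines in a compose file that set a container hostname.
# Usage: find_hostname_keys COMPOSE_FILE
# Output format: LINE_NUMBER:matching line (as printed by grep -n).
# "find_hostname_keys" is a hypothetical helper name used for illustration only.
find_hostname_keys() {
    grep -nE '^[[:space:]]*hostname:' "$1"
}
```

Running `find_hostname_keys docker-compose.yml` prints each match with its line number. No output means there is nothing to remove.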

ZFS#

Under ZFS, the read/write performance of Docker inside LXC may be significantly reduced. Since I am using RAID 1 storage mounted directly from the host, this issue does not apply to me.

Currently, with RAID 1, I am not protected against bit rot, so my plan is to switch to ZFS. Then I will give this a try.


  1. The core of this content appeared in a post on reddit, April 19, 2020. 

  2. The core of the setup comes from a general discussion on linuxcontainers.org

  3. Gitlab domainname issue on unpriviliged LXC #743#issuecomment-860164507