What exactly is a Container?

The concept of containers is rooted in Linux. Check out this RedHat blog about the history of containers. When people talk about containers, they are more or less talking about Linux containers.

However, the Linux Kernel doesn’t have a native object that represents a “container”. From the perspective of the kernel, containers are just processes. But what makes these processes special?

The best way to look at the properties of a process in a container is to look at some demos with the help of Docker, a tool that can create and run containers.

Filesystem Isolation

Firstly, a process in a container has an isolated view of the filesystem. In the demo below, we created a container based on the ubuntu image.

If we navigate to the root directory via cd /, we notice that the root filesystem of the process in a container is not the same one as the root filesystem on the host system. Modifying the root filesystem within the container will have no impact on the host system.

docker run -it ubuntu bash
cd /
ls
# bin  boot  dev  etc  home  lib  media  mnt  opt  proc
#  root  run  sbin  srv  sys  tmp  usr  var

# host system
cd /
ls
# bin    dev   lib         mnt   opt   run   srv       tmp
# boot   etc   lost+found  proc  sbin  swapfile  usr
# cdrom  home  media       root  snap  sys       var

The new root filesystem comes from the ubuntu image. A docker image is an executable file. A docker image is made up of filesystems layered over each other. These layers form the base for a container’s root filesystem.

Pid Isolation

Processes in a container have an isolated view of other processes running on the host. In the example below, if we perform ps -a -u to list all processes in the container, we only see the process running bash and ps -a -u. However, if we perform ps -a -u on the host system, we see a lot more processes.

Furthermore, in the example below the process perceives its pid as 1. However, from the perspective of the host system, the process running bash is 6098.

docker run -it ubuntu bash
ps -a -u
# USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
# root           1  0.2  0.0   4136  3200 pts/0    Ss   07:19   0:00 bash
# root           9  0.0  0.0   6412  2432 pts/0    R+   07:19   0:00 ps -a -u
echo $$
# 1

# host system
ps -a -u
# USER     PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
...
root       6098  0.0  0.0   4136  3200 pts/0    Ss+  15:19   0:00 bash

User ID Isolation

Processes in a container have an isolated view of things like user IDs and group IDs. This enables a process to run as different users inside and outside the container.

In the example below, we enable the user namespace via --userns-remap=default. The process in the container perceives its uid as 0. But if we look at the user corresponding to the process from the host system, the user is 165536.

sudo dockerd --userns-remap=default
sudo docker run -it --rm busybox /bin/sh
id
# uid=0(root) gid=0(root) groups=0(root),10(wheel)

# host system
ps -a -u
# USER        PID   %CPU %MEM    VSZ   RSS TTY     STAT  START TIME  COMMAND
# ...
# 165536     14154  0.0  0.0   3984  1920 pts/0    Ss+  14:33   0:00 /bin/sh

Resource Restriction

In Docker, you can constrain resources that the container can access. For example, you can limit the amount of memory the process can take, the number of CPUs the container can run on, etc. Check out Docker’s doc for the full list of resources that can be constrained.

As an example, here is how you can limit the container to have a memory limit of 128 mb.

docker run -it --memory 128m ubuntu bash

Secret behind Containers

The secret behind how a container can provide the isolation properties demonstrated above boils down to the following Linux primitives:

  • Namespaces
  • Capabilities
  • cgroups

We will cover these in greater detail throughout the blog!