Containers fail to start, and describing the pod shows the following Docker error:
Error response from daemon: OCI runtime create failed: container_linux.go:348: starting container process caused "process_linux.go:301: running exec setns process for init caused \"exit status 40\"": unknown
This error may indicate memory allocation problems on the host, even if the system appears to have enough resources when checking memory usage with standard Linux
Such failures are also often accompanied by page allocation errors that can be seen in the kernel log (
[Tue Mar 31 20:08:11 2020] runc:[1:CHILD]: page allocation failure: order:8, mode:0xc0d0
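To help interpret such a log line, the sketch below extracts the allocation order from it and converts the order to a size. An order-N request asks the kernel for 2^N physically contiguous pages, which can fail on a fragmented host even when plenty of memory is free in total. The helper names `alloc_failure_order` and `order_bytes` are hypothetical and used only for illustration; the 4096-byte page size in the example is an assumption that holds on typical x86_64 systems.

```shell
# Hypothetical helper: print the allocation order found in kernel log
# lines read from stdin (one number per matching line).
alloc_failure_order() {
    sed -n 's/.*page allocation failure: order:\([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: size in bytes of an order-N allocation.
# $1 = order, $2 = page size in bytes (defaults to the host page size).
order_bytes() {
    page_size="${2:-$(getconf PAGESIZE)}"
    echo $(( (1 << $1) * page_size ))
}

# Example, assuming 4096-byte pages:
#   order_bytes 8 4096    # prints 1048576 (1 MiB of contiguous memory)
```

On a Linux host, `/proc/buddyinfo` lists the number of free blocks per order for each memory zone, so low counts in the higher-order columns would be consistent with this kind of failure.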
In addition, another symptom of the issue may be high kubelet memory usage:
 PID USER PR NI    VIRT    RES   SHR S %CPU %MEM     TIME+ COMMAND
2496 root 20  0 13.454g 9.362g 36472 S 10.9  9.9 949:57.99 kubelet
The issue was observed on RHEL-based systems running older Linux kernels, such as
There are a couple of possible workarounds for the issue.
Upgrading the kernel to a newer version is one of them. In the example mentioned above, after upgrading the kernel from
3.10.0-1127.8.2.el7 a user stopped seeing memory issues.
Another possible workaround is to set the
vm.zone_reclaim_mode kernel parameter to 1, if it is currently set to 0, to enable a more aggressive memory reclaim mode so that the system can reclaim memory from cached memory:
$ sysctl -w vm.zone_reclaim_mode=1
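Since the workaround applies only when the parameter is currently 0, it may help to check the value before changing it. The following is a minimal sketch, not a definitive procedure; the function names are hypothetical, and the write must be run as root.

```shell
# Hypothetical helper: succeed only when the current value is 0,
# which is the case in which the workaround applies.
should_enable_zone_reclaim() {
    [ "$1" = "0" ]
}

# Hypothetical helper: read the current mode and switch it to 1 if needed.
apply_zone_reclaim_workaround() {
    current="$(sysctl -n vm.zone_reclaim_mode)" || return 1
    if should_enable_zone_reclaim "$current"; then
        sysctl -w vm.zone_reclaim_mode=1   # requires root privileges
    else
        echo "vm.zone_reclaim_mode is already $current; leaving it unchanged"
    fi
}
```

Calling `apply_zone_reclaim_workaround` on the affected node changes the running kernel only; the setting does not survive a restart on its own.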
Note: To make the kernel parameter persist across node restarts, set it via a conf file in
The following Red Hat KB article provides more information about the issue and the workarounds mentioned above.