Skip to main content

Preventing Container Escape

Container escape occurs when an attacker exploits security flaws to break out of a container and gain access to the host system. This can lead to privilege escalation, data theft, lateral movement, and full cluster compromise.

To mitigate container escape risks, Kubernetes administrators should enforce strict security controls at the container and pod level.


1. Disable Privileged Containers

Required knowledge for the CKS certification.

Running a container with privileged: true grants it full access to the host system, increasing the risk of escape. Attackers can leverage this setting to mount the host’s filesystem, manipulate system processes, or escalate privileges.

Secure Configuration

apiVersion: v1
kind: Pod
metadata:
name: secure-pod
spec:
containers:
- name: app-container
image: secure-image
securityContext:
privileged: false

Why It Matters

  • Prevents access to kernel modules and sensitive host files.
  • Restricts direct interaction with the host’s networking and devices.

2. Disable Host Namespace Sharing

Host namespaces expose the container to the host’s processes, network, and mount points, allowing attackers to manipulate system resources.

Secure Configuration

apiVersion: v1
kind: Pod
metadata:
name: secure-pod
spec:
hostPID: false
hostIPC: false
hostNetwork: false
containers:
- name: app-container
image: secure-image

Why It Matters

  • Prevents the container from accessing host processes and system files.
  • Reduces the risk of privilege escalation through shared namespaces.

3. Use Seccomp to Restrict Syscalls

Required knowledge for the CKS certification.

Seccomp (Secure Computing Mode) filters system calls available to containers, reducing the attack surface.

Secure Configuration with Seccomp

In Kubernetes 1.19 and later, seccomp profiles can be configured via the seccompProfile field in securityContext.

apiVersion: v1
kind: Pod
metadata:
name: seccomp-pod
spec:
securityContext:
seccompProfile:
type: RuntimeDefault
containers:
- name: app-container
image: secure-image

Why It Matters

  • Blocks dangerous syscalls used for privilege escalation.
  • Prevents exploitation of kernel vulnerabilities.

4. Drop Unnecessary Linux Capabilities

Required knowledge for the CKS certification.

By default, containers have a set of capabilities that allow interactions with system resources. Reducing these capabilities minimizes the potential attack surface.

Secure Configuration

apiVersion: v1
kind: Pod
metadata:
name: secure-pod
spec:
containers:
- name: app-container
image: secure-image
securityContext:
capabilities:
drop:
- ALL

Why It Matters

  • Removes privileges that could be exploited for escape.
  • Ensures the container operates with the minimum necessary privileges.

5. Enforce Pod Security Standards

Required knowledge for the CKS certification.

Use Pod Security Admission (PSA) or PodSecurityPolicies (PSP) (deprecated in Kubernetes 1.21+) to enforce security restrictions that block high-risk configurations.

Example: Enforcing a Restricted Security Policy

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
name: restricted-psp
spec:
privileged: false
runAsUser:
rule: MustRunAsNonRoot
allowPrivilegeEscalation: false
seLinux:
rule: RunAsAny
fsGroup:
rule: MustRunAs
readOnlyRootFilesystem: true

For newer Kubernetes versions using Pod Security Admission, apply a restricted policy:

apiVersion: v1
kind: Namespace
metadata:
name: secure-namespace
labels:
pod-security.kubernetes.io/enforce: restricted

Why It Matters

  • Ensures that pods run with the least privilege.
  • Blocks dangerous configurations like root access and writable root filesystems.

6. Restrict Host Filesystem Access

Containers should not have access to the host’s filesystem, especially directories like /proc, /sys, and /var/run/docker.sock.

Key Directories That Must Be Protected

PathRisk
/procGrants access to host processes, kernel settings, and system info. /proc/1/root exposes the host’s root filesystem.
/sysAllows modification of kernel parameters and hardware configurations.
/var/run/docker.sockProvides full control over Docker and allows spawning new privileged containers.
/devExposes host devices and can be abused to gain raw disk access.
/etcContains system-wide configuration files, including sensitive credentials.
/rootContains the root user’s home directory, which may include SSH keys and configuration files.

Stronger Secure Pod Configuration

To truly restrict host filesystem access, use a combination of:

  • Read-only root filesystem
  • Explicitly disabling hostPath volumes
  • Using AppArmor, Seccomp, and SELinux policies
  • Denying all unnecessary mounts
apiVersion: v1
kind: Pod
metadata:
name: secure-pod
spec:
containers:
- name: app-container
image: secure-image
securityContext:
readOnlyRootFilesystem: true
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
seccompProfile:
type: RuntimeDefault
appArmorProfile: runtime/default
volumeMounts:
- name: tmp
mountPath: /tmp
volumes:
- name: tmp
emptyDir: {}

Why It Matters

  • Prevents modifying critical system filesreadOnlyRootFilesystem: true
  • Blocks privilege escalation techniquesallowPrivilegeEscalation: false
  • Reduces attack surface by dropping all capabilitiescapabilities.drop: ALL
  • Restricts syscalls with SeccompseccompProfile: RuntimeDefault
  • Applies additional restrictions with AppArmorappArmorProfile: runtime/default
  • Prevents unnecessary filesystem mounts → No hostPath or sensitive volume mounts

7. Implement Container Sandboxing

Required knowledge for the CKS certification.

Using sandboxed runtimes like Kata Containers, gVisor, or Firecracker adds additional isolation layers between the container and the host.

Example: Running a Pod with gVisor

apiVersion: v1
kind: Pod
metadata:
name: sandboxed-pod
spec:
runtimeClassName: gvisor
containers:
- name: app-container
image: secure-image

Why It Matters

  • Prevents attackers from escaping the container to the host.
  • Creates additional isolation layers for security.

Conclusion

Preventing container escape is essential for Kubernetes security. By disabling privileged containers, enforcing security profiles, restricting filesystem access, and using sandboxed runtimes, administrators can significantly reduce the risk of container breakout attacks.