Troubleshooting is the art of turning frustration into curiosity and chaos into clarity …
Introduction
Running applications as a non-root user is a crucial security practice, especially in containerized environments like Kubernetes. While Kubernetes provides the securityContext and its runAsUser field to simplify this, implementing it can sometimes lead to unexpected challenges. In this article, I share my journey of troubleshooting and overcoming these hurdles, ensuring that the application runs securely without elevated privileges. Whether you're encountering similar challenges or just beginning to explore the runAsUser configuration, this article offers practical insights and solutions to strengthen your skills in implementing security for Kubernetes Pods. I encourage you to read through to the end for a comprehensive understanding.
The Situation
Realizing the importance of running applications as a non-root user, and the complications this can create when pursuing business compliance, I decided to test the securityContext feature in Kubernetes with an nginx-based application. By default, the container from the nginx image runs as root. To change that behaviour, I set it to run as user ID 400 (a random UID) by adding the following securityContext under the container spec:
spec:
  containers:
  - image: nginx
    securityContext:
      runAsUser: 400
    name: app1
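For reference, here is a minimal sketch of the full Deployment this fragment sits in; the name, namespace, labels, and replica count are my assumptions, based on the kubectl output shown later:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app1
  namespace: proda
spec:
  replicas: 3
  selector:
    matchLabels:
      app: app1
  template:
    metadata:
      labels:
        app: app1
    spec:
      containers:
      - image: nginx
        name: app1
        securityContext:
          runAsUser: 400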
I expected the pod to come up, but it decided not to. It complains of permission issues: the container does not have permission to access certain files while starting up.
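To see the failure, the usual sequence is to check the pod status and then pull its logs; the pod name below is a placeholder for the generated name:
kubectl get pods -n proda
kubectl describe pod <pod-name> -n proda
kubectl logs <pod-name> -n proda
The logs show what went wrong: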
The directory /usr/share/nginx/html is not mounted.
Therefore, over-writing the default index.html file with some useful information:
tee: /usr/share/nginx/html/index.html: Permission denied
Praqma Network MultiTool (with NGINX) - app1-59b5fff48c-645xb - 10.42.3.20 - HTTP: 80 , HTTPS: 443
========================= IMPORTANT ==============================
/docker/entrypoint.sh: line 26: can't create /usr/share/nginx/html/index.html: Permission denied
cat: can't open '/root/press-release.md': Permission denied
==================================================================
nginx: [alert] could not open error log file: open() "/var/lib/nginx/logs/error.log" failed (13: Permission denied)
2025/01/09 05:17:03 [warn] 1#1: the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:3
2025/01/09 05:17:03 [emerg] 1#1: cannot load certificate "/certs/server.crt": BIO_new_file() failed (SSL: error:0200100D:system library:fopen:Permission denied:fopen('/certs/server.crt','r') error:2006D002:BIO routines:BIO_new_file:system lib)
kubectl logs app1-5bffc684c4-gmlhj -n proda
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: info: can not modify /etc/nginx/conf.d/default.conf (read-only file system?)
/docker-entrypoint.sh: Sourcing /docker-entrypoint.d/15-local-resolvers.envsh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh
/docker-entrypoint.sh: Configuration complete; ready for start up
2025/01/10 06:08:40 [warn] 1#1: the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:2
nginx: [warn] the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:2
2025/01/10 06:08:40 [emerg] 1#1: mkdir() "/var/cache/nginx/client_temp" failed (13: Permission denied)
nginx: [emerg] mkdir() "/var/cache/nginx/client_temp" failed (13: Permission denied)
kubectl logs app1-5bffc684c4-gmlhj -n proda
Defaulted container "app1" out of: app1, debugger-nk2jx (ephem), debugger-42pkr (ephem), debugger-4vblr (ephem)
Error from server: Get "https://192.168.1.7:10250/containerLogs/proda/app1-5bffc684c4-gmlhj/app1": proxy error from 127.0.0.1:6443 while dialing 192.168.1.7:10250, code 502: 502 Bad Gateway
➜ kubectl debug -n proda -it app1-5bffc684c4-gmlhj --image=busybox
Defaulting debug container name to debugger-nk2jx.
If you don't see a command prompt, try pressing enter.
warning: couldn't attach to pod/app1-5bffc684c4-gmlhj, falling back to streaming logs: error dialing backend: proxy error from 127.0.0.1:6443 while dialing 192.168.1.7:10250, code 502: 502 Bad Gateway
Error from server: Get "https://192.168.1.7:10250/containerLogs/proda/app1-5bffc684c4-gmlhj/debugger-nk2jx": proxy error from 127.0.0.1:6443 while dialing 192.168.1.7:10250, code 502: 502 Bad Gateway
I notice errors mentioning the ENTRYPOINT, permission issues, failures to open files, and so on. I also notice that for some pods it complains about a 502 Bad Gateway. WHY? This could be an unrelated issue. Maybe the images do not have the corresponding UID? But why does it complain about a Bad Gateway after some time? Let's understand the workflow when we run a kubectl logs command.
- kubectl performs basic validation, gets the Kubernetes API endpoint, user token, and certs, and constructs an HTTP request to the API server.
- The API server performs authentication and authorization, and retrieves the Pod information from etcd. From this information it learns which node the pod is currently running on.
- It then proxies the log request to the kubelet on that node. The request looks like the URL in the error log; it is sent by the API server's internal proxy:
https://192.168.1.7:10250/containerLogs/proda/app1-5bffc684c4-gmlhj/app1
- The kubelet on the node receives the request and verifies that the pod and container exist on that node. In our case the container might be down, hence the 502 response from the kubelet to the API server? Or it could be an issue with the kubelet or the node itself; a missing container should have returned a 404 instead of a 502. This needs to be verified (see the checks sketched after this list).
- If the container exists, the kubelet connects to the container runtime and streams the logs from the container log location.
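One way to verify the kubelet side directly is to probe it from a machine that can reach the node; this is a sketch, with the address and port taken from the error message above. Even an authentication error here would prove network reachability, whereas the 502 suggests the API server could not reach the kubelet at all:
# Probe the kubelet's HTTPS port directly (may return 401 without credentials)
curl -k https://192.168.1.7:10250/healthz
# On the node itself, check the kubelet service and its recent logs
systemctl status kubelet
journalctl -u kubelet --since "10 minutes ago"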
My suspicion of a kubelet issue on the node side came true after I ran the pod on another node: it never complained about the 502 gateway issue there. I drained the problem node and deleted it, along these lines:
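A typical drain-and-delete sequence; the node name is a placeholder:
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
kubectl delete node <node-name>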
Focusing on the main issue (the permission errors), let's see how we can fix it.
- Find the existing UID in the application image and use it in the securityContext. Ideally, the application the container runs should already have a dedicated UID. Let me run the image locally and get the details from /etc/passwd:
➜ docker run -it --rm nginx /bin/bash
root@cdb7b7a8cb62:/# grep nginx /etc/passwd
nginx:x:101:101:nginx user:/nonexistent:/bin/false
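The same lookup also works as a one-off command, without an interactive shell; the official image's entrypoint execs non-nginx commands directly, so grep runs as-is:
docker run --rm nginx grep nginx /etc/passwd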
I am updating the UID to 101 in the pod definition and running it.
spec:
  containers:
  - image: nginx
    securityContext:
      runAsUser: 101
Still getting permission issues:
nginx: [warn] the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:2
2025/01/10 06:08:40 [emerg] 1#1: mkdir() "/var/cache/nginx/client_temp" failed (13: Permission denied)
nginx: [emerg] mkdir() "/var/cache/nginx/client_temp" failed (13: Permission denied)
It is complaining about permission issues on the /var/cache/nginx folder, as well as a warning about the user directive in the config. Before fixing this, I wanted to test what changes when I enable the securityContext at the Pod level instead -> spec.template.spec.securityContext
spec:
  securityContext:
    runAsUser: 101
  containers:
  - image: nginx
    #securityContext:
    #  runAsUser: 101
    #  runAsGroup: 101
➜ kubectl apply -f deploy.yaml -n proda
deployment.apps/app1 configured
➜ kubectl get pods -n proda
NAME READY STATUS RESTARTS AGE
app1-694d5bc75f-5pjs5 1/1 Running 0 3s
app1-694d5bc75f-bg46q 1/1 Running 0 5s
app1-694d5bc75f-2t78l 1/1 Running 0 5s
The pods are up now, but I observe that the main process runs as root while its child processes run as the nginx user (UID 101):
root@app1-694d5bc75f-2t78l:/# ps -aef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 00:51 ? 00:00:00 nginx: master process nginx -g daemon off;
nginx 30 1 0 00:51 ? 00:00:00 nginx: worker process
nginx 31 1 0 00:51 ? 00:00:00 nginx: worker process
root 32 0 0 00:51 pts/0 00:00:00 bash
This is not what I want. Let me also try the fsGroup and runAsGroup options in the securityContext. While fsGroup affects filesystem group ownership for mounted volumes, runAsGroup sets the primary group ID for container processes:
spec:
  securityContext:
    runAsUser: 101
    runAsGroup: 101
    fsGroup: 101
  containers:
  - image: nginx
I noticed that the container still started as the root user. Can we fix this by adjusting the user directive in the nginx config file? Let's add the following config file for nginx.
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-config
  namespace: proda
data:
  nginx.conf: |
    user nginx;
    events {
      worker_connections 1024;
    }
    http {
      server {
        listen 8080;
        location / {
          root /usr/share/nginx/html;
          index index.html;
        }
      }
    }
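Applying and verifying the ConfigMap; the filename is a placeholder for wherever you saved it:
kubectl apply -f nginx-configmap.yaml -n proda
kubectl get configmap nginx-config -n proda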
Note that I have added a user directive and changed the default port to 8080. I also adjusted the Deployment resource to mount the nginx ConfigMap as a volume:
...
volumeMounts:
- name: nginx-config
  mountPath: /etc/nginx/nginx.conf
  subPath: nginx.conf
volumes:
- name: nginx-config
  configMap:
    name: nginx-config
...
➜ kubectl exec -it app1-6c86f8bbfc-2v76n -n proda -- id
uid=0(root) gid=0(root) groups=0(root)
It is still starting as the root user, and the nginx directories are still owned by root.
➜ kubectl exec -it app1-6c86f8bbfc-2v76n -n proda -- ls -ld /var/cache/nginx
drwxr-xr-x 1 root root 4096 Jan 11 01:08 /var/cache/nginx
Now let me try the powers of an init container. I can change the permissions of the folders using an initContainer:
initContainers:
- name: fix-permissions
  image: busybox
  command:
  - sh
  - -c
  - |
    chown -R 101:101 /var/cache/nginx
    chown -R 101:101 /usr/share/nginx/html
  volumeMounts:
  - name: nginx-cache
    mountPath: /var/cache/nginx
  - name: nginx-html
    mountPath: /usr/share/nginx/html
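The nginx-cache and nginx-html volumes this snippet mounts are not shown above; a minimal sketch, assuming plain emptyDir volumes defined at the Pod level and mounted into the main container at the same paths:
volumes:
- name: nginx-cache
  emptyDir: {}
- name: nginx-html
  emptyDir: {}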
➜ kubectl exec -it app1-98758c996-2nn5h -n proda -- ls -ld /var/cache/nginx
drwxrwxrwx 7 nginx nginx 4096 Jan 11 02:31 /var/cache/nginx
Okay, so now the folder permissions are fixed. But the pod still starts as the root user. Is there a better way? How can I start the process as the nginx user, so that all folders come up with its permissions? The ideal solution would be to update the image's Dockerfile to include a USER directive, but here we are looking for a solution within the Kubernetes resources.
The container runtime starts the container as per the ENTRYPOINT of the container image. Can we adjust that a little, so that it starts as the nginx user? I am going to try overriding it within Kubernetes by starting a shell and running nginx inside it as an argument. Let's try the following:
containers:
- image: nginx
  command:
  - /bin/sh
  args:
  - -c
  - |
    chown -R 101:101 /var/cache/nginx /var/run /usr/share/nginx/html && \
    nginx -g "daemon off;"
  securityContext:
    runAsUser: 101
    runAsGroup: 101
The problem here is that I am asking the pod to run as the nginx user while trying to change folder permissions on /var/run, which is restricted to the root user. It requires root privileges to change permissions on the /var/run folder. I think we can leverage an initContainer to change the folder permissions and let the main container just start nginx.
containers:
- image: nginx
  command:
  - /bin/sh
  args:
  - -c
  - |
    nginx -g "daemon off;"
  securityContext:
    runAsUser: 101
    runAsGroup: 101
  name: app1
  volumeMounts:
  - name: nginx-config
    mountPath: /etc/nginx/nginx.conf
    subPath: nginx.conf
  - name: nginx-cache
    mountPath: /var/cache/nginx
  - name: nginx-html
    mountPath: /usr/share/nginx/html
  - name: nginx-run
    mountPath: /var/run
initContainers:
- name: fix-permissions
  image: busybox
  command:
  - sh
  - -c
  - |
    mkdir -p /var/run && \
    touch /var/run/nginx.pid && \
    chown -R 101:101 /var/run
    chown -R 101:101 /var/cache/nginx
    chown -R 101:101 /usr/share/nginx/html
  volumeMounts:
  - name: nginx-cache
    mountPath: /var/cache/nginx
  - name: nginx-html
    mountPath: /usr/share/nginx/html
  - name: nginx-run
    mountPath: /var/run
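For completeness, here is the Pod-level volumes block this manifest references; a sketch assuming emptyDir volumes for the writable paths, plus the ConfigMap created earlier:
volumes:
- name: nginx-config
  configMap:
    name: nginx-config
- name: nginx-cache
  emptyDir: {}
- name: nginx-html
  emptyDir: {}
- name: nginx-run
  emptyDir: {}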
This worked like a charm.
➜ kubectl get pods -n proda
NAME READY STATUS RESTARTS AGE
app1-5f9cf87669-f5jp8 1/1 Running 0 3h39m
app1-5f9cf87669-jzhtr 1/1 Running 0 3h39m
app1-5f9cf87669-w6m9h 1/1 Running 0 3h39m
app1-5f9cf87669-z8j2r 1/1 Running 0 3h39m
➜ kubectl exec -it app1-5f9cf87669-f5jp8 -n proda -c app1 -- id
uid=101(nginx) gid=101(nginx) groups=101(nginx)
➜ kubectl exec -it app1-5f9cf87669-f5jp8 -n proda -c app1 -- ls -ld /usr/share/nginx/html
drwxrwxrwx 2 nginx nginx 4096 Jan 11 08:35 /usr/share/nginx/html
Now, you might argue that running the initContainer as a root user is necessary to change folder permissions. However, it's important to remember that an initContainer is a special type of container in a Pod that executes before any main container starts. Its primary purpose is to perform initialization tasks such as setting up configurations, fixing file permissions, or ensuring dependencies are ready. The unique advantage of initContainers lies in their ability to share volumes mounted at the Pod level, and the main containers start only after the initContainer has successfully completed its tasks.
Situation Under Control?
Now that we have achieved our objective: for production environments, I would suggest fixing this at the image level. In the Dockerfile below, I am using the USER directive to make sure the container starts as the nginx user.
FROM nginx:latest
COPY nginx.conf /etc/nginx/nginx.conf
# Give the nginx user ownership of its writable paths. The PID-file lines are
# an addition I would suggest, since nginx writes /var/run/nginx.pid at startup.
RUN mkdir -p /var/cache/nginx && chown -R nginx:nginx /var/cache/nginx \
    && touch /var/run/nginx.pid && chown nginx:nginx /var/run/nginx.pid
USER nginx
This was tested in a local desktop docker setup.
➜ docker build -t nginx-local:latest .
➜ securitycontext docker run -it --rm nginx-local id
uid=101(nginx) gid=101(nginx) groups=101(nginx)
➜ securitycontext docker run -it --rm nginx-local ls -ld /var/cache/nginx
drwxr-xr-x 1 nginx nginx 4096 Nov 26 16:44 /var/cache/nginx
We can push this image and use it as the Pod's image, without any initContainers.
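A minimal sketch of how the container spec simplifies with the custom image; the registry path is a placeholder, and the securityContext now only reinforces what the image already declares:
containers:
- name: app1
  image: registry.example.com/nginx-local:latest
  securityContext:
    runAsUser: 101
    runAsGroup: 101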
Conclusion
During this troubleshooting exercise, we gained hands-on experience configuring and running containers as a non-root user using the securityContext
feature in Kubernetes. Each challenge provided valuable insights as we implemented solutions to overcome various barriers. Key aspects we addressed included understanding how kubectl logs
operates, leveraging the capabilities of initContainers
, overriding the ENTRYPOINT
set by container images within Kubernetes Pod definitions, and ultimately creating a custom Docker image with the USER
directive baked into it.
Hope you enjoyed this article. If you liked it, you can follow my publication for future articles; that gives me the motivation to share more. If you have any comments, please write to me. https://medium.com/@asishmm https://devopsforyou.com/
Thank You !