MoisesGSalas
left a comment
Hi @timmc-edx, sorry for the late response.
I left a question in my comment. I'm mostly thinking about running codejail in a Kubernetes cluster with multiple independent openedx instances.
* The ``NPROC`` limit constrains the ability of the *current* process to
  create new threads and processes, but the usage count (how many processes
  already exist) is the sum across *all* processes with the same UID, even in
  other containers on the same host where the UID may be mapped to a different
  username. This constraint also applies to the app user due to how the
  rlimits are applied. Even if UIDs are chosen so they aren't used by other
  software on the host, multiple codejail sandbox processes on the same host
  will share this usage pool and can reduce each other's ability to create
  processes. In this situation, ``NPROC`` will need to be set higher than it
  would be for a single codejail instance taking a single request at a time.
So if I'm getting this right: if the app user spawns multiple sandboxes (for example, the codejail service handling multiple requests), the process pool is shared between them. And not only that, but the same pool is also shared across different containers on the same host. Is that correct? If so, when one codejail instance runs alongside other instances and I set NPROC to a low value, it might always fail?
That's correct, yes. It's a fundamental limit of how rlimit operates. One option would be to ensure that your codejail pods are spread out over several hosts (using Kubernetes' anti-affinity mechanism). Also see the notes here on how to choose UIDs for the app and sandbox users: https://github.com/openedx/codejail-service/blob/main/docs/deployment.rst#app-user-uid
I think a longer-term solution would be to replace the current codejail mechanism with something that spins up a container per execution (giving better memory confinement) and that also uses systemd's virtual-user mechanism (which creates an ephemeral user with a randomized UID, for better NPROC isolation).
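To make the shared NPROC pool concrete, here is a minimal Python sketch of the accounting described above. It is a toy model, not codejail code: the `Sandbox` class, the function names, the pod names, and UID 2000 are all hypothetical, and the sizing rule (tasks per execution times concurrent executions) is only an assumed starting point, not a documented formula.

```python
from dataclasses import dataclass


@dataclass
class Sandbox:
    """One sandbox on a host (hypothetical model, not a codejail type)."""
    name: str
    uid: int    # UID the sandboxed code runs as
    tasks: int  # processes/threads it currently owns


def can_create_task(nproc_limit: int, uid: int, sandboxes: list[Sandbox]) -> bool:
    """Model the NPROC check: usage is summed over every task with the
    same UID on the host, regardless of which container owns it."""
    in_use = sum(s.tasks for s in sandboxes if s.uid == uid)
    return in_use < nproc_limit


def nproc_for_shared_pool(tasks_per_execution: int, max_concurrent: int) -> int:
    """Assumed sizing rule: leave room for every concurrent execution
    that shares the UID on one host."""
    return tasks_per_execution * max_concurrent


# Three pods scheduled on the same host, all using sandbox UID 2000.
host = [Sandbox("pod-a", 2000, 2), Sandbox("pod-b", 2000, 2), Sandbox("pod-c", 2000, 2)]

print(can_create_task(5, 2000, host))      # False: 6 tasks already count against a limit of 5
print(can_create_task(5, 2000, host[:1]))  # True: only 2 tasks in use on this host
```

In this model, a limit that is comfortable for one sandbox fails as soon as the pods are co-located, which is why spreading pods across hosts or raising `NPROC` both help.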