25 changes: 19 additions & 6 deletions README.rst
@@ -265,12 +265,25 @@ Limitations
* Sandbox isolation is achieved via AppArmor confinement. Codejail facilitates
this, but cannot isolate execution without the use of AppArmor.
* Resource limits can only be constrained using the mechanisms that Linux's
rlimit makes available. While rlimit can limit the size of any one file that
a process can create, and can limit the number of files it has open at any
one time, it cannot limit the total number of files written, and therefore
cannot limit the total number of bytes written across *all* files.
A partial mitigation is to constrain the max execution time. (All files
written in the sandbox will be deleted at end of execution, in any case.)
rlimit makes available. Some notable deficiencies:

* While rlimit's ``FSIZE`` can limit the size of any one file that
a process can create, and can limit the number of files it has open at any
one time, it cannot limit the total number of files written, and therefore
cannot limit the total number of bytes written across *all* files.
A partial mitigation is to constrain the max execution time. (All files
written in the sandbox will be deleted at end of execution, in any case.)
* The ``NPROC`` limit constrains the ability of the *current* process to
create new threads and processes, but the usage count (how many processes
already exist) is the sum across *all* processes with the same UID, even in
other containers on the same host where the UID may be mapped to a different
username. This constraint also applies to the app user due to how the
rlimits are applied. Even if UIDs are chosen so they aren't used by other
software on the host, multiple codejail sandbox processes on the same host
will share this usage pool and can reduce each other's ability to create
processes. In this situation, ``NPROC`` will need to be set higher than it
would be for a single codejail instance taking a single request at a time.
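
To make the ``FSIZE`` caveat above concrete, here is a minimal sketch using Python's standard `resource` module. It is not codejail code: the helper name `bytes_written_under_fsize` and the sizes are made up for illustration, and it assumes a POSIX host where the hard `RLIMIT_FSIZE` is at least as large as the requested limit. It temporarily lowers the soft limit for the current process, writes several files that each fit under the limit, and reports their combined size.

```python
import os
import resource
import tempfile


def bytes_written_under_fsize(limit_bytes, file_count):
    """Write `file_count` files of `limit_bytes` each while RLIMIT_FSIZE is
    set to `limit_bytes`, and return the total number of bytes on disk."""
    soft, hard = resource.getrlimit(resource.RLIMIT_FSIZE)
    # Lower only the soft limit so it can be restored afterwards.
    resource.setrlimit(resource.RLIMIT_FSIZE, (limit_bytes, hard))
    try:
        total = 0
        with tempfile.TemporaryDirectory() as tmpdir:
            for i in range(file_count):
                path = os.path.join(tmpdir, f"file{i}")
                # Each file is exactly at the per-file limit, so every write succeeds.
                with open(path, "wb") as f:
                    f.write(b"x" * limit_bytes)
                total += os.path.getsize(path)
        return total
    finally:
        resource.setrlimit(resource.RLIMIT_FSIZE, (soft, hard))


# Five files of 1 KiB each under a 1 KiB per-file limit.
print(bytes_written_under_fsize(1024, 5))
```

The total grows with the number of files even though no single file exceeds the limit, which is why bounding execution time is only a partial mitigation.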
Comment on lines +276 to +285
Contributor

So if I'm getting this right: if the app user spawns multiple sandboxes (for example, the codejail service handling multiple requests), the process pool will be shared between them. And not only that, the same pool will also be shared across different containers on the same host. Is that correct? If so, when one codejail instance is running alongside other instances and I set NPROC to a low value, it might always fail?

Contributor Author

That's correct, yes. It's a fundamental limit of how rlimit operates. One option would be to ensure that your codejail pods are spread out over several hosts (using Kubernetes' anti-affinity mechanism). Also see the notes here on how to choose UIDs for the app and sandbox users: https://github.com/openedx/codejail-service/blob/main/docs/deployment.rst#app-user-uid

I think a longer term solution would be to replace the current codejail mechanism with something that spins up a container per execution (giving better memory confinement) and that also uses systemd's virtual-user mechanism (which creates an ephemeral user with randomized UID, for better NPROC isolation).
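
As a rough illustration of the sizing consequence discussed in this thread, the sketch below estimates an `NPROC` value when several sandbox executions share one UID on a host. The helper `shared_nproc_limit` is hypothetical, and the simple multiplication is an assumed upper-bound heuristic, not something codejail computes.

```python
def shared_nproc_limit(per_execution: int, concurrent_executions: int) -> int:
    """Estimate an NPROC value for a host where `concurrent_executions`
    sandboxes run under the same UID, each needing up to `per_execution`
    processes or threads. Assumption: the needs simply add up."""
    if per_execution < 1 or concurrent_executions < 1:
        raise ValueError("both arguments must be positive")
    return per_execution * concurrent_executions


# The default of 15 per execution, with 4 executions at once on one host.
print(shared_nproc_limit(15, 4))  # prints 60
```

The resulting value could then be passed to `set_limit("NPROC", ...)` from `codejail/jail_code.py`. Processes owned by the same UID in other containers on the host also count against it, so the real headroom may be smaller than this estimate suggests.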


* Sandboxes do not have strong isolation from each other. Under proper
configuration, untrusted code should not be able to discover other actively
running code executions, but if this assumption is violated then one sandbox
12 changes: 8 additions & 4 deletions codejail/jail_code.py
@@ -88,7 +88,8 @@ def is_configured(command):
"VMEM": 0,
# Size of files creatable, in bytes, defaulting to nothing can be written.
"FSIZE": 0,
# The number of processes and threads to allow.
# The number of processes and threads to allow for the sandbox user (total
# across entire host).
"NPROC": 15,
# Whether to use a proxy process or not. None means use an environment
# variable to decide. NOTE: using a proxy process is NOT THREAD-SAFE, only
@@ -127,14 +128,17 @@ def set_limit(limit_name, value):
* `"FSIZE"`: the maximum size of files creatable by the jailed code,
in bytes. The default is 0 (no files may be created).

* `"NPROC"`: the maximum number of process or threads creatable by the
jailed code. The default is 15.
* `"NPROC"`: the maximum number of processes or threads allowed for
jailed code across the entire host (combined across all instances
in all containers). This includes processes owned by the same UID
in containers where that UID is mapped to a different username.
The default is 15.

* `"PROXY"`: 1 to use a proxy process, 0 to not use one. This isn't
really a limit, sorry about that.

Limits are process-wide, and will affect all future calls to jail_code.
Providing a limit of 0 will disable that limit.
Providing a limit of 0 will disable that limit, unless otherwise specified.

"""
LIMITS[limit_name] = value
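
The "0 disables the limit, unless otherwise specified" rule in the docstring can be sketched as follows. This is not codejail's actual implementation: the helper `rlimit_pairs` is hypothetical, the mapping of `VMEM` to `RLIMIT_AS` is an assumption, and so is treating an `NPROC` of 0 as "no limit". It only shows how a `LIMITS`-style dict might translate into arguments for `resource.setrlimit`, with `FSIZE` as the documented exception where 0 is a real limit.

```python
import resource


def rlimit_pairs(limits):
    """Return (resource id, (soft, hard)) pairs for a LIMITS-style dict."""
    pairs = []
    vmem = limits.get("VMEM", 0)
    if vmem:  # 0 disables the address-space limit
        pairs.append((resource.RLIMIT_AS, (vmem, vmem)))
    # FSIZE is the exception: 0 is enforced and means no file may be written.
    fsize = limits.get("FSIZE", 0)
    pairs.append((resource.RLIMIT_FSIZE, (fsize, fsize)))
    nproc = limits.get("NPROC", 0)
    if nproc:  # assumption: 0 leaves NPROC unconstrained
        pairs.append((resource.RLIMIT_NPROC, (nproc, nproc)))
    return pairs


# Default values taken from the diff above.
defaults = {"VMEM": 0, "FSIZE": 0, "NPROC": 15}
for res, bounds in rlimit_pairs(defaults):
    print(res, bounds)
```

A child process would typically apply each pair with `resource.setrlimit(res, bounds)` before executing untrusted code. Because `RLIMIT_NPROC` is checked against the per-UID process count, the value 15 here is shared by every process running as the sandbox user on the host.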