In addition to the already existing option to enter jail mode (which is
set indefinitely), there should be a less restrictive option that should
allow exiting jail mode when doing the execve syscall.
This option will be useful for programs that need this kind of security
layer only in their runtime, but they're meant to actually initiate
another program in the end.
In all instances, it should be clear that the jailing of a process is
ending when the process exits.
This is a preparation before introducing another option to set a process
as jailed until it calls the execve syscall.
The whole concept of Jails was far more complicated than I actually want
it to be, so let's reduce the complexity of how it works from now on.
Please note that we always leaked the attach count of a Jail object in
the fork syscall if it failed midway.
Instead, we should have attach to the jail just before registering the
new Process, so we don't need to worry about unsuccessful Process
creation.
The reduction of complexity in regard to jails means that instead of
relying on jails to provide PID isolation, we could simplify the whole
idea of them to be a simple SetOnce, and let the ProcessList (now called
ScopedProcessList) to be responsible for this type of isolation.
Therefore, we apply the following changes to do so:
- We make the Jail concept no longer a class of its own. Instead, we
simplify the idea of being jailed to a simple ProtectedValues boolean
flag. This means that we no longer check of matching jail pointers
anywhere in the Kernel code.
To set a process as jailed, a new prctl option was added to set a
Kernel SetOnce boolean flag (so it cannot change ever again).
- We provide Process & Thread methods to iterate over process lists.
A process can either iterate on the global process list, or if it's
attached to a scoped process list, then only over that list.
This essentially replaces the need of checking the Jail pointer of a
process when iterating over process lists.
We add a prctl option which would be called once after the dynamic
loader has finished to do text relocations before calling the actual
program entry point.
This change makes it much more obvious when we are allowed to change
a region protection access from being writable to executable.
The dynamic loader should be able to do this, but after a certain point
it is obvious that such mechanism should be disabled.
These syscalls are not necessary on their own, and they give the false
impression that a caller could set or get the thread name of any process
in the system, which is not true.
Therefore, move the functionality of these syscalls to be options in the
prctl syscall, which makes it abundantly clear that these operations
could only occur from a running thread in a process that sees other
threads in that process only.
It makes much more sense to have these actions being performed via the
prctl syscall, as they both require 2 plain arguments to be passed to
the syscall layer, and in contrast to most syscalls, we don't get in
these removed syscalls an automatic representation of Userspace<T>, but
two FlatPtr(s) to perform casting on them in the prctl syscall which is
suited to what has been done in the removed syscalls.
Also, it makes sense to have these actions in the prctl syscall, because
they are strongly related to the process control concept of the prctl
syscall.
Instead of using a special case of the annotate_mapping syscall, let's
introduce a new prctl option to disallow further annotations of Regions
as new syscall Region(s).