Liav A. 2a4a096e0f Kernel+runc: Make unshare syscalls more fd-oriented
Instead of creating a new resource that has its own ID number and
working with it directly, we now create a file that describes the
unshared resource, configure it with ioctl calls, and enter it only at
the end. The resource is therefore created during that last call, rather
than at the point of "attaching" to it as in the previous method.

A resource can be entered for the current program execution, after the
exec syscall, or both.
This allows userspace to create a resource and attach to it only in the
new program, which makes it easier to do cleanups or track the new
process from outside the created container.
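
The flow described above can be sketched as a small user-space model.
This is only an illustration under stated assumptions, not the kernel
interface from this commit: `UnshareDescription`, `EnterMode`,
`ToyKernel` and `Process` are hypothetical names, and the real ioctl
requests are not shown here.

```cpp
// Toy user-space model only. Every name below is hypothetical and does
// not correspond to the actual syscalls or ioctl requests.
#include <cassert>
#include <optional>
#include <string>

// When the entered resource should take effect.
enum class EnterMode { Now, AfterExec, Both };

// Stands in for the file that describes a not-yet-created resource.
struct UnshareDescription {
    std::string hostname;   // example setting, as if applied by an ioctl
    bool created { false }; // becomes true only when enter() runs
};

struct Process {
    std::optional<int> current_resource;    // used by the running program
    std::optional<int> resource_after_exec; // picked up by the next program
};

struct ToyKernel {
    int next_id { 1 };

    // Final step: create the resource and attach it as requested.
    int enter(UnshareDescription& description, Process& process, EnterMode mode)
    {
        // Assumption of this model: a description is entered once.
        assert(!description.created);
        int id = next_id++;
        description.created = true;
        if (mode == EnterMode::Now || mode == EnterMode::Both)
            process.current_resource = id;
        if (mode == EnterMode::AfterExec || mode == EnterMode::Both)
            process.resource_after_exec = id;
        return id;
    }

    // exec: the new program picks up any pending resource.
    void exec(Process& process)
    {
        if (process.resource_after_exec) {
            process.current_resource = process.resource_after_exec;
            process.resource_after_exec.reset();
        }
    }
};
```

With `EnterMode::AfterExec`, the calling program stays outside the new
resource and only the program started by exec runs inside it.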

It should be noted that until this commit, we entered a resource without
detaching the old one, which leaked the old resource's attach counter.
While this bug didn't have severe effects, it was obvious that proper
userspace cleanup code added later on wouldn't work in that situation
anyway. This commit therefore changes the way we work: entering a
resource now means **replacing** the current one.
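
The difference between the leaking and the replacing behaviour can be
sketched with a toy counter model. This is an assumption-laden
illustration, not the kernel code: `Resource`, `Process`, `enter_leaky`
and `enter_replacing` are hypothetical names.

```cpp
// Toy model only. All names are hypothetical, not taken from the kernel.
#include <cassert>

struct Resource {
    int attach_count { 0 };
};

struct Process {
    Resource* attached { nullptr };
};

// Behaviour before the fix, as described above: the new resource is
// attached, but the old one is never detached.
void enter_leaky(Process& process, Resource& incoming)
{
    incoming.attach_count++;
    process.attached = &incoming; // the old resource keeps a stale count
}

// Behaviour after the fix: entering replaces the current resource.
void enter_replacing(Process& process, Resource& incoming)
{
    incoming.attach_count++;
    if (process.attached) {
        assert(process.attached->attach_count > 0);
        process.attached->attach_count--; // detach the old resource
    }
    process.attached = &incoming;
}
```

In the leaky variant the old resource's counter never returns to zero,
so cleanup code that waits for a zero count would never run.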

These changes open an opportunity to extend runc into a container
manager rather than just a launcher of a containerized environment,
which makes it possible to do all sorts of nice cleanups and tracking of
containers' states.
2026-03-14 11:45:37 +01:00