Commit Graph

443 Commits

Author SHA1 Message Date
Sönke Holz
7b0553eb13 Kernel/x86: Add missing rcx + r11 clobber to Syscall::invoke with 4 args
The 4-arg version was missing those.
2024-08-21 08:17:17 -04:00
Liav A.
cb10f70394 Kernel: Change internal handling of filesystem-specific options
Instead of using a raw `KBuffer` and letting each implementation to
populating the specific flags on its own, we change things so we only
let each FileSystem implementation to validate the flag and its value
but then store it in a HashMap which its key is the flag name and
the value is a special new class called `FileSystemSpecificOption`
which wraps around `AK::Variant<...>`.

This approach has multiple advantages over the previous:
- It allows runtime inspection of what the user has set on a `MountFile`
  description for a specific filesystem.
- It ensures accidental overriding of filesystem specific option that
  was already set is not possible
- It removes ugly casting of a `KBuffer` contents to a strongly-typed
  values. Instead, a strongly-typed `AK::Variant` is used which ensures
  we always get a value without doing any casting.

Please note that we have removed support for ASCII string-oriented flags
as there were no actual use cases, and supporting such type would make
`FileSystemSpecificOption` more complicated unnecessarily for now.
2024-08-03 20:35:06 +02:00
Liav A.
0e6624dc86 Kernel: Introduce the unshare syscall family
These 2 syscalls are responsible for unsharing resources in the system,
such as hostname, VFS root contexts and process lists.

Together with an appropriate userspace implementation, these syscalls
could be used for creating a sandbox environment (containers) for user
programs.
2024-07-21 11:44:23 +02:00
Liav A.
4370bbb3ad Kernel+Userland: Introduce the copy_mount syscall
This new syscall will be used by the upcoming runc (run-container)
utility.

In addition to that, this syscall allows userspace to neatly copy RAMFS
instances to other places, which was not possible in the past.
2024-07-21 11:44:23 +02:00
Liav A.
dd59fe35c7 Kernel+Userland: Reduce jails to be a simple boolean flag
The whole concept of Jails was far more complicated than I actually want
it to be, so let's reduce the complexity of how it works from now on.
Please note that we always leaked the attach count of a Jail object in
the fork syscall if it failed midway.
Instead, we should have attach to the jail just before registering the
new Process, so we don't need to worry about unsuccessful Process
creation.

The reduction of complexity in regard to jails means that instead of
relying on jails to provide PID isolation, we could simplify the whole
idea of them to be a simple SetOnce, and let the ProcessList (now called
ScopedProcessList) to be responsible for this type of isolation.

Therefore, we apply the following changes to do so:
- We make the Jail concept no longer a class of its own. Instead, we
  simplify the idea of being jailed to a simple ProtectedValues boolean
  flag. This means that we no longer check of matching jail pointers
  anywhere in the Kernel code.
  To set a process as jailed, a new prctl option was added to set a
  Kernel SetOnce boolean flag (so it cannot change ever again).
- We provide Process & Thread methods to iterate over process lists.
  A process can either iterate on the global process list, or if it's
  attached to a scoped process list, then only over that list.
  This essentially replaces the need of checking the Jail pointer of a
  process when iterating over process lists.
2024-07-21 11:44:23 +02:00
Liav A.
91c87c5b77 Kernel+Userland: Prepare for considering VFSRootContext when mounting
Expose some initial interfaces in the mount-related syscalls to select
the desired VFSRootContext, by specifying the VFSRootContext index
number.

For now there's still no way to create a different VFSRootContext, so
the only valid IDs are -1 (for currently attached VFSRootContext) or 1
for the first userspace VFSRootContext.
2024-07-21 11:44:23 +02:00
Liav A.
e89726562b Kernel: Remove the AllMiceDevice class
This device was a short-lived solution to allow userspace (WindowServer)
to easily support hotplugging mouse devices with presumably very small
modifications on userspace side.

Now that we have a proper mechanism to propagate hotplug events from the
DeviceMapper program to any program that needs to get such events, we no
longer need this device, so let's remove it.
2024-07-06 21:42:32 +02:00
Liav A.
a1a2470c1e Documentation: Add a small guide on how to allocate new major numbers
Contributors that want to add new types of character or block devices
must follow these guidelines in order to keep everything intact.
2024-07-06 21:42:32 +02:00
Liav A.
16244c490a Kernel: Allocate all device major numbers within one known header file
We used to allocate major numbers quite randomly, with no common place
to look them up if needed.
This commit is changing that by placing all major number allocations
under a new C++ namespace, in the API/MajorNumberAllocation.h file.

We also add the foundations of what is needed before we can publish this
information (allocated numbers for block and char devices) to userspace.
2024-07-06 21:42:32 +02:00
Liav A.
7f5a2c1466 Kernel: Register block and character devices in separate HashMaps
Instead of putting everything in one hash map, let's distinguish between
the devices based on their type.

This change makes the devices semantically separated, and is considered
a preparation before we could expose a comprehensive list of allocations
per major numbers and their purpose.
2024-07-06 21:42:32 +02:00
implicitfield
4574a8c334 Kernel+LibC+LibCore: Implement mknodat(2) 2024-05-14 22:30:39 +02:00
Liav A.
15ddc1f17a Kernel+Userland: Reject W->X prot region transition after a prctl call
We add a prctl option which would be called once after the dynamic
loader has finished to do text relocations before calling the actual
program entry point.

This change makes it much more obvious when we are allowed to change
a region protection access from being writable to executable.
The dynamic loader should be able to do this, but after a certain point
it is obvious that such mechanism should be disabled.
2024-05-14 12:41:51 -06:00
Timothy Flynn
ec492a1a08 Everywhere: Run clang-format
The following command was used to clang-format these files:

    clang-format-18 -i $(find . \
        -not \( -path "./\.*" -prune \) \
        -not \( -path "./Base/*" -prune \) \
        -not \( -path "./Build/*" -prune \) \
        -not \( -path "./Toolchain/*" -prune \) \
        -not \( -path "./Ports/*" -prune \) \
        -type f -name "*.cpp" -o -name "*.mm" -o -name "*.h")

There are a couple of weird cases where clang-format now thinks that a
pointer access in an initializer list, e.g. `m_member(ptr->foo)`, is a
lambda return statement, and it puts spaces around the `->`.
2024-04-24 16:50:01 -04:00
Sönke Holz
243d7003a2 Kernel+LibC+LibELF: Move TLS handling to userspace
This removes the allocate_tls syscall and adds an archctl option to set
the fs_base for the current thread on x86-64, since you can't set that
register from userspace. enter_thread_context loads the fs_base for the
next thread on each context switch.
This also moves tpidr_el0 (the thread pointer register on AArch64) to
the register state, so it gets properly saved/restored on context
switches.

The userspace TLS allocation code is kept pretty similar to the original
kernel TLS code, aside from a couple of style changes.

We also have to add a new argument "tls_pointer" to
SC_create_thread_params, as we otherwise can't prevent race conditions
between setting the thread pointer register and signal handling code
that might be triggered before the thread pointer was set, which could
use TLS.
2024-04-19 16:46:47 -06:00
Sönke Holz
57f4f8caf8 Kernel+LibC: Introduce new archctl syscall
This syscall will be used for architecture-specific operations.
2024-04-19 16:46:47 -06:00
Dan Klishch
5ed7cd6e32 Everywhere: Use east const in more places
These changes are compatible with clang-format 16 and will be mandatory
when we eventually bump clang-format version. So, since there are no
real downsides, let's commit them now.
2024-04-19 06:31:19 -04:00
Liav A
0734de9f9a Kernel+Userland: Add mount MS_SRCHIDDEN option
Either we mount from a loop device or other source, the user might want
to obfuscate the given source for security reasons, so this option will
ensure this will happen.
If passed during a mount, the source will be hidden when reading from
the /sys/kernel/df node.
2024-03-13 15:33:47 -06:00
Liav A
5dcf03ad9a Kernel/Devices: Introduce the LoopDevice device
This device is a block device that allows a user to effectively treat an
Inode as a block device.

The static construction method is given an OpenFileDescription reference
but validates that:
- The description has a valid custody (so it's not some arbitrary file).
  Failing this requirement will yield EINVAL.
- The description custody points to an Inode which is a regular file, as
  we only support (seekable) regular files. Failing this requirement
  will yield ENOTSUP.

LoopDevice can be used to mount a regular file on the filesystem like
other supported types of (physical) block devices.
2024-03-13 15:33:47 -06:00
Timothy Flynn
4b777397b5 Kernel: Define bitwise operations for KeyModifier
This type is designed to be use as a flag. Define bitwise operations for
convenience.
2024-03-06 07:46:18 +01:00
implicitfield
a69e113d49 Kernel: Add a definition for the FAT32 FSInfo structure 2024-02-24 15:54:52 -07:00
implicitfield
8b77737f8e Kernel: Move FAT structure definitions to Kernel/API
We need to be able to include these definitions from userspace as the
upcoming mkfs.fat utility will depend on these.
2024-02-24 15:54:52 -07:00
Nico Weber
4409b33145 AK: Make IndexSequence use size_t
This makes it possible to use MakeIndexSequqnce in functions like:

    template<typename T, size_t N>
    constexpr auto foo(T (&a)[N])

This means AK/StdLibExtraDetails.h must now include AK/Types.h
for size_t, which means AK/Types.h can no longer include
AK/StdLibExtras.h (which arguably it shouldn't do anyways),
which requires rejiggering some things.

(IMHO Types.h shouldn't use AK::Details metaprogramming at all.
FlatPtr doesn't necessarily have to use Conditional<> and ssize_t could
maybe be in its own header or something. But since it's tangential to
this PR, going with the tried and true "lift things that cause the
cycle up to the top" approach.)
2024-02-11 18:53:00 +01:00
Jelle Raaijmakers
015622bc22 Kernel: Set correct KeyCode count
The underlying value of the `KeyCode::Key_*` enum values starts at 0,
so we should add 1 to `Key_Menu` to get the correct count.
2024-01-14 15:06:37 -07:00
hanaa12G
7abda6a36f Kernel: Add new sysconf option _SC_GETGR_R_SIZE_MAX 2024-01-06 04:59:50 -07:00
Liav A
c8f27d7cb8 Kernel+Userland: Implement support for PS2 scan code set 2
This scan code set is more advanced than the basic scan code set 1, and
is required to be supported for some bare metal hardware that might not
properly enable the PS2 first port translation in the i8042 controller.

LibWeb can now also generate bindings for keyboard events like the Pause
key, as well as other function keys (such as Right Alt, etc).

The logic for handling scan code sets is implemented by the PS2 keyboard
driver and is abstracted from the main HID KeyboardDevice code which
only handles "standard" KeyEvent(s).
2024-01-04 10:38:03 -07:00
Andrew Kaster
d3025668a4 Revert "Kernel+Userland: Implement support for PS2 scan code set 2"
This reverts commit 61a385fc01.

The commit broke the shift and caps lock key from working.
2023-12-29 22:02:19 +01:00
Liav A
61a385fc01 Kernel+Userland: Implement support for PS2 scan code set 2
This scan code set is more advanced than the basic scan code set 1, and
is required to be supported for some bare metal hardware that might not
properly enable the PS2 first port translation in the i8042 controller.

LibWeb can now also generate bindings for keyboard events like the Pause
key, as well as other function keys (such as Right Alt, etc).

The logic for handling scan code sets is implemented by the PS2 keyboard
driver and is abstracted from the main HID KeyboardDevice code which
only handles "standard" KeyEvent(s).
2023-12-29 16:40:59 +01:00
Liav A
b89cc81674 Kernel/HID: Expose character map index in the KeyEvent structure
This will be used later on by WindowServer so it will not use the
scancode, which will represent the actual character index in the
keyboard mapping when using scan code set 2.
2023-12-29 16:40:59 +01:00
Idan Horowitz
519214697b Kernel: Mark sys$getsockname as not needing the big process lock
This syscall does not access any big process lock protected resources.
2023-12-26 19:20:21 +01:00
Idan Horowitz
ed5406e47d Kernel: Mark sys$getpeername as not needing the big process lock
This syscall does not access any big process lock protected resources.
2023-12-26 19:20:21 +01:00
Idan Horowitz
24a60c5a10 Kernel: Mark sys$ioctl as not needing the big process lock
This syscall does not access any big process lock protected resources.
2023-12-26 19:20:21 +01:00
Idan Horowitz
d63667dbf1 Kernel: Mark sys$kill_thread as not needing the big process lock
This syscall does not access any big process lock protected resources.
2023-12-26 19:20:21 +01:00
Idan Horowitz
b44628c1fb Kernel: Mark sys$join_thread as not needing the big process lock
This syscall does not access any big process lock protected resources.
2023-12-26 19:20:21 +01:00
Idan Horowitz
82e6090f47 Kernel: Mark sys$detach_thread as not needing the big process lock
This syscall does not access any big process lock protected resources.
2023-12-26 19:20:21 +01:00
Idan Horowitz
b49a0e2c61 Kernel: Mark sys$create_thread as not needing the big process lock
Now that the master TLS region is spinlock protected, this syscall does
not access any big process lock protected resources.
2023-12-26 19:20:21 +01:00
Idan Horowitz
6a4b93b3e0 Kernel: Protect processes' master TLS with a fine-grained spinlock
This moves it out of the scope of the big process lock, and allows us
to wean some syscalls off it, starting with sys$allocate_tls.
2023-12-26 19:20:21 +01:00
Sönke Holz
fd8858ace2 Kernel/riscv64: Add RISC-V Syscall API 2023-11-10 15:51:31 -07:00
Romain Chardiny
61ac554a34 Kernel/Net: Implement TCP_NODELAY 2023-11-08 09:31:54 +01:00
Liav A
26f96d2a42 Kernel+Userland: Add option for duration of /dev/beep producing sound 2023-11-03 15:19:33 +01:00
Liav A
1b00618fd9 Kernel+Userland: Replace the beep syscall with the new /dev/beep device
There's no need to have separate syscall for this kind of functionality,
as we can just have a device node in /dev, called "beep", that allows
writing tone generation packets to emulate the same behavior.

In addition to that, we remove LibC sysbeep function, as this function
was never being used by any C program nor it was standardized in any
way.
Instead, we move the userspace implementation to LibCore.
2023-11-03 15:19:33 +01:00
Liav A
50429d3b22 LibC+Kernel: Move GPU-related API methods to a LibC header file
The Kernel/API directory in general shouldn't include userspace code,
but structure definitions that both are shared between the Kernel and
userspace.

All users of the ioctl API obviously use LibC so LibC is the most common
and shared library for the affected programs.
2023-09-15 11:05:25 -06:00
Liav A
8fe74c7d57 LibC+Kernel: Move device-files related methods to a LibC header file
The Kernel/API directory in general shouldn't include userspace code,
but structure definitions that both are shared between the Kernel and
userspace.

LibC is the most appropriate place for these methods as they're already
included in the sys/sysmacros.h file to create a set of convenient
macros for these methods.
2023-09-15 11:05:25 -06:00
Jakub Berkop
54e79aa1d9 Kernel+ProfileViewer: Display additional filesystem events 2023-09-09 11:26:51 -06:00
Liav A
39c93f63c8 Kernel: Move FileSystem/DeviceFileTypes.h => API/DeviceFileTypes.h
This file will be used by userspace code later on, so let's move to the
API directory.
2023-09-07 11:50:50 -06:00
Liav A
1c0aa51684 Kernel+Userland: Remove the {get,set}_thread_name syscalls
These syscalls are not necessary on their own, and they give the false
impression that a caller could set or get the thread name of any process
in the system, which is not true.

Therefore, move the functionality of these syscalls to be options in the
prctl syscall, which makes it abundantly clear that these operations
could only occur from a running thread in a process that sees other
threads in that process only.
2023-08-25 11:51:52 +02:00
Seal Sealy
1262a7d142 Kernel: Alias MAXNAMLEN to NAME_MAX
MAXNAMLEN is the BSD name for NAME_MAX, as used by some programs.
2023-08-18 11:43:19 +02:00
Lucas CHOLLET
cd0fe4bb48 Kernel: Mark sys$poll as not needing the big lock 2023-08-01 05:35:26 +02:00
Timothy Flynn
f798e43ea8 Kernel: Add a key code modifier to detect the number pad
This is analagous to how Qt exposes whether the number pad was used for
a key press.
2023-07-09 06:32:20 +02:00
Liav A
23a7ccf607 Kernel+LibCore+LibC: Split the mount syscall into multiple syscalls
This is a preparation before we can create a usable mechanism to use
filesystem-specific mount flags.
To keep some compatibility with userland code, LibC and LibCore mount
functions are kept being usable, but now instead of doing an "atomic"
syscall, they do multiple syscalls to perform the complete procedure of
mounting a filesystem.

The FileBackedFileSystem IntrusiveList in the VFS code is now changed to
be protected by a Mutex, because when we mount a new filesystem, we need
to check if a filesystem is already created for a given source_fd so we
do a scan for that OpenFileDescription in that list. If we fail to find
an already-created filesystem we create a new one and register it in the
list if we successfully mounted it. We use a Mutex because we might need
to initiate disk access during the filesystem creation, which will take
other mutexes in other parts of the kernel, therefore making it not
possible to take a spinlock while doing this.
2023-07-02 01:04:51 +02:00
Liav A
8142f7b196 Kernel: Mark sys$get_dir_entries as not needing the big lock
After examination of all overriden Inode::traverse_as_directory methods
it seems like proper locking is already existing everywhere, so there's
no need to take the big process lock anymore, as there's no access to
shared process structures anyway.
2023-05-27 10:58:58 +02:00