This essentially reverts 5ada38f9c3.
Previously, two threads could end up trying to allocate a committed
page at once, possibly resulting in a panic because we tried to
allocate more pages than committed.
Another problem was that a thread could incorrectly think that the page
fault was already handled. This can happen if the thread handling the
page fault already set the physical page slot to the newly allocated
page, but didn't remap the page yet. We check if a page fault was
already processed based on the physical page slot contents.
This issue is not causing problems currently, since thinking a page
fault was already handled and incorrectly returning will still work
eventually when the other thread is done remapping the page.
However, a future commit will add extra assertions checking that page
faults were already handled appropriately if we couldn't find a reason
for the fault. These assertions would trip on this.
Prevent these issues by taking the lock for a longer amount of time.
There might be a better solution to this, but that would likely require
more complex code changes.
Also modify the code in handle_fault() a bit to avoid using should_cow()
for zero faults. The checks in should_cow() can refer to a different
physical page if the page fault was handled immediately after the check.
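A minimal sketch of the widened lock scope, using illustrative names
(SpinlockLocker, physical_page_slot, remap_page) rather than the exact
kernel API:

    // Sketch only: names are illustrative, not the real SerenityOS API.
    // The VMObject lock is now held across the slot check, the page
    // allocation, and the remap, so a second thread can never observe
    // the slot set while the page is not yet mapped.
    PageFaultResponse handle_zero_fault_locked(Region& region, size_t page_index)
    {
        SpinlockLocker locker(region.vmobject().lock());

        auto& page_slot = region.physical_page_slot(page_index);
        if (!page_slot.is_null())
            return PageFaultResponse::Continue; // Fully handled by another thread.

        page_slot = allocate_committed_physical_page();
        region.remap_page(page_index); // Still under the lock.
        return PageFaultResponse::Continue;
    }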
Instead of having to maintain two different page fault handler
implementations, let's unify the two by using the more generic RISC-V
implementation.
The RISC-V implementation doesn't depend on the processor providing the
reason why a page fault occurred.
We don't need to know whether it's a NotPresent or ProtectionViolation
fault to handle it correctly, as we already have enough metadata.
This causes us to print a more useful error message than "Unexpected
page fault".
Additionally, this change will be necessary in a future commit, which
expects us to handle all reasons for page faults exhaustively.
This matches the x86-64 and AArch64 behavior.
This required moving the is_instruction_fetch() check before the
is_read() check, since is_read() is now also true for instruction fetch
page faults.
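A sketch of the resulting check order, with illustrative names; the
point is that the instruction fetch check must run before the read
check now that is_read() also covers fetches:

    // Sketch only: illustrative names. An instruction fetch fault also
    // reports as a read, so classify fetches first.
    if (fault.is_instruction_fetch() && !region.is_executable())
        return PageFaultResponse::ShouldCrash;
    if (fault.is_read() && !region.is_readable())
        return PageFaultResponse::ShouldCrash;
    if (fault.is_write() && !region.is_writable())
        return PageFaultResponse::ShouldCrash;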
... and always use it for inode write faults.
In the RISC-V page fault implementation, we previously only used this
function if `should_dirty_on_write()` returned true. If it didn't, we'd
use `handle_inode_fault()`. But this shouldn't be necessary, since
`handle_dirty_on_write_fault()` already handles cases where the page
isn't mapped yet.
This should now also cause a page to be immediately marked as dirty
if the first access to it was a write. Previously, this would have
caused two page faults: one for loading the inode page and one for
marking it dirty.
This code in the x86-64 and AArch64 page fault handlers should be
unreachable: Region::map_individual_page_impl() always maps all
non-null physical pages, so we should never get a PageNotPresent
fault if the page slot is set to a non-null value.
Lazy committed pages are implemented by mapping them read-only, so a
write access to them will result in a ProtectionViolation fault, which
will call very similar code in Region::handle_zero_fault().
Similarly, in the RISC-V version, we already handle lazy committed page
faults by calling Region::handle_zero_fault() a couple lines earlier
if the page fault was generated by a write access and the region is
writable.
The stricter W^X protection introduced by af3d3c5c4a was accidentally
broken by 5194ab59b5, since it didn't set the shadow permission bits
to the initial Region permissions.
The shift operations were originally introduced in af3d3c5 to
record the permission bits set for a Region, but they have since been
replaced by the method introduced in 5194ab59b, so the shift
operations can be removed.
Clang compiles that builtin to an abort for freestanding environments
because RISC-V does not have an instruction to flush the instruction
cache for all harts.
We don't support SMP on RISC-V currently, so simply use a `fence.i` for
now.
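The single-hart flush boils down to one instruction; a minimal sketch:

    // fence.i synchronizes the instruction and data streams on the
    // current hart only. Once SMP is supported, remote harts would
    // need an SBI remote fence or an IPI instead.
    static inline void flush_instruction_cache()
    {
        asm volatile("fence.i" ::: "memory");
    }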
This replaces all usages of Cacheable::Yes with MemoryType::Normal and
Cacheable::No with either MemoryType::NonCacheable or MemoryType::IO,
depending on the context.
The Page{Directory,Table}::set_cache_disabled function has therefore
also been replaced with a more appropriate set_memory_type function.
Adding a memory_type "getter" would not be as easy, as some
architectures may not support all memory types, so getting the memory
type again may be a lossy conversion. The is_cache_disabled function
was never used, so simply remove it altogether.
There is no difference between MemoryType::NonCacheable and
MemoryType::IO on x86 for now.
Other architectures currently don't respect the MemoryType at all.
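A self-contained sketch of the mapping described above;
PTE_CACHE_DISABLE and arch_pte_bits_for() are illustrative stand-ins
for the per-architecture page table code:

    #include <cstdint>

    enum class MemoryType {
        Normal,       // Cacheable RAM (previously Cacheable::Yes).
        NonCacheable, // Uncached RAM, e.g. DMA buffers (previously Cacheable::No).
        IO,           // Device MMIO (previously Cacheable::No).
    };

    // Illustrative: on x86, the PCD bit (bit 4) disables caching, so
    // NonCacheable and IO currently end up with the same PTE bits.
    constexpr uint64_t PTE_CACHE_DISABLE = 1ull << 4;

    constexpr uint64_t arch_pte_bits_for(MemoryType type)
    {
        switch (type) {
        case MemoryType::Normal:
            return 0;
        case MemoryType::NonCacheable:
        case MemoryType::IO:
            return PTE_CACHE_DISABLE;
        }
        return 0;
    }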
Writes to SharedInodeVMObjects could cause a Protection Violation if a
page was marked as dirty by a different process.
This happened due to a combination of two things:
* handle_dirty_on_write_fault() was skipped if a page was already marked
as dirty
* when a page was marked as dirty, only the Region that caused the page
fault was remapped
This commit:
* fixes the crash by making handle_fault() stop checking if a page was
marked dirty before running handle_dirty_on_write_fault()
* modifies handle_dirty_on_write_fault() so that it always marks the
page as dirty and remaps the page (this avoids a 2nd bug that was
never hit due to the 1st bug)
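A sketch of the fixed handler, with illustrative names; the two fixes
above correspond to the unconditional dirty-marking and the
all-regions remap:

    // Sketch only: illustrative names. Mark dirty and remap
    // unconditionally; even if the page is already dirty, another
    // process's PTE may still map it read-only and must be updated.
    PageFaultResponse Region::handle_dirty_on_write_fault(size_t page_index)
    {
        auto& inode_vmobject = static_cast<InodeVMObject&>(vmobject());
        inode_vmobject.set_page_dirty(page_index, true);
        inode_vmobject.for_each_region([&](Region& region) {
            region.remap_vmobject_page(page_index);
        });
        return PageFaultResponse::Continue;
    }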
This commit introduces VMObject::remap_regions_single_page(). This
method remaps a single page in all regions associated with a VMObject.
This is intended to be a more efficient replacement for remap_regions()
in cases where only a single page needs to be remapped.
This commit also updates the cow page fault handling code to use this
new method.
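The new method presumably has roughly this shape (the name follows the
commit; the body and the contains_vmobject_page() helper are a hedged
guess, not the actual implementation):

    // Remap just one page in every region that maps this VMObject,
    // instead of remapping each region wholesale via remap_regions().
    void VMObject::remap_regions_single_page(size_t page_index)
    {
        for_each_region([&](Region& region) {
            if (region.contains_vmobject_page(page_index))
                region.remap_vmobject_page(page_index);
        });
    }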
Writes to a MAP_SHARED | MAP_ANONYMOUS mmap region were not visible to
other processes sharing the mmap region. This was happening because the
page fault handler was not remapping the VMObject's m_regions after
allocating a new page.
This commit fixes the problem by calling remap_regions() after assigning
a new page to the VMObject in the page fault handler. This remapping
only occurs for shared Regions.
This commit makes the following minor changes to handle_zero_fault():
* cleans up a call to static_cast(), replacing it with a reference (a
future commit will also use this reference).
* replaces a call to vmobject() with the new reference mentioned above.
* moves the definition of already_handled to inside the block where
already_handled is used.
AddressSpace::try_allocate_split_region() was updating the cow map of
new_region based on the cow map of source_region.
The problem is that both new_region and source_region reference the
same vmobject and the same cow map, so these cow map updates didn't
actually change anything.
This commit:
* removes the cow map updates from try_allocate_split_region()
* removes Region::set_should_cow() since it is no longer used
InodeVMObjects now track dirty and clean pages. This tracking is used
by the msync and purge syscalls.
Dirty page tracking works using the following rules:
* when a new InodeVMObject is made, all pages are marked clean.
* writes to clean InodeVMObject pages will cause a page fault,
the fault handler will mark the page as dirty.
* writes to dirty InodeVMObject pages do not cause page faults.
* if msync is called, only dirty pages are flushed to storage (and
marked clean).
* if purge syscall is called, only clean pages are discarded.
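A self-contained model of these rules, using std::vector in place of
the kernel's Bitmap; all names are illustrative:

    #include <cstddef>
    #include <vector>

    struct InodeVMObjectModel {
        // One flag per page; a freshly created InodeVMObject is all clean.
        std::vector<bool> dirty;

        explicit InodeVMObjectModel(std::size_t page_count)
            : dirty(page_count, false)
        {
        }

        // Page fault handler: the first write to a clean page marks it
        // dirty (and remaps it writable, so further writes don't fault).
        void on_write_fault(std::size_t page_index) { dirty[page_index] = true; }

        // msync: flush only dirty pages, then mark them clean.
        template<typename FlushFn>
        void msync(FlushFn flush)
        {
            for (std::size_t i = 0; i < dirty.size(); ++i) {
                if (dirty[i]) {
                    flush(i);
                    dirty[i] = false;
                }
            }
        }

        // purge: discard only clean pages; dirty pages hold unflushed data.
        template<typename DiscardFn>
        void purge(DiscardFn discard)
        {
            for (std::size_t i = 0; i < dirty.size(); ++i) {
                if (!dirty[i])
                    discard(i);
            }
        }
    };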
As MMIO is placed at fixed physical addresses and does not need to be
backed by real RAM physical pages, there's no need to use PhysicalPage
instances to track its pages.
This results in slightly fewer allocations, but more importantly makes
MMIO addresses that end up above the normal RAM ranges work, as 64-bit
PCI BARs usually do.
Instead, rewrite the region page fault handling code to not use
PageFault::type() on RISC-V.
I split Region::handle_fault() into a separate RISC-V-specific
implementation, as I am not sure we cover all page fault handling edge
cases by relying solely on MM's own region metadata.
We should probably also take the processor-provided page fault reason
into account, if we decide to merge these two implementations in the
future.
This moves the KString, KBuffer, DoubleBuffer, KBufferBuilder, IOWindow,
UserOrKernelBuffer and ScopedCritical classes to the Kernel/Library
subdirectory.
Also, move the panic and assertions handling code to that directory.
Previously we had a race condition in the page fault handling: We were
relying on the affected Region staying alive while handling the page
fault, but this was not actually guaranteed, as an munmap from another
thread could result in the region being removed concurrently.
This commit closes that hole by extending the lifetime of the region
affected by the page fault until the handling of the page fault is
complete. This is achieved by maintaining a pseudo-reference count on
the region which counts the number of in-progress page faults being
handled on this region, and extending the lifetime of the region while
this counter is non-zero.
Since both the increment of the counter by the page fault handler and
the spin loop waiting for it to reach 0 during Region destruction are
serialized using the appropriate AddressSpace spinlock, eventual
progress is guaranteed: As soon as the region is removed from the tree
no more page faults on the region can start.
And similarly correctness is ensured: The counter is incremented under
the same lock, so any page faults that are being handled will have
already incremented the counter before the region is deallocated.
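A self-contained model of the scheme, using std::mutex and std::atomic
in place of the kernel's spinlocks; names are illustrative:

    #include <atomic>
    #include <mutex>

    std::mutex address_space_lock; // Stands in for the AddressSpace spinlock.

    struct RegionModel {
        std::atomic<unsigned> in_progress_faults { 0 };
    };

    void begin_page_fault(RegionModel& region)
    {
        // Incremented under the same lock that serializes region removal,
        // so a region found in the tree cannot be freed before it is counted.
        std::lock_guard guard(address_space_lock);
        region.in_progress_faults.fetch_add(1, std::memory_order_acquire);
    }

    void end_page_fault(RegionModel& region)
    {
        region.in_progress_faults.fetch_sub(1, std::memory_order_release);
    }

    void remove_region(RegionModel& region)
    {
        {
            std::lock_guard guard(address_space_lock);
            // ... remove the region from the address space tree here;
            // afterwards, no new page fault can find it and start counting.
        }
        // Spin until all in-progress page faults have drained; only then
        // is it safe to deallocate the region.
        while (region.in_progress_faults.load(std::memory_order_acquire) != 0) {
            // The kernel would relax/yield the CPU here.
        }
    }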
The handling of page tables is very architecture-specific, so it belongs
in the Arch directory. Some parts were already architecture-specific;
however, this commit moves the rest of the PageDirectory class into the
Arch directory.
While we're here, the aarch64/PageDirectory.{h,cpp} files are updated to
be aarch64 specific, by renaming some members and removing x86_64
specific code.
These instances were detected by searching for files that include
AK/Memory.h, but don't match the regex:
\b(fast_u32_copy|fast_u32_fill|secure_zero|timing_safe_compare)\b
This regex is pessimistic, so there might be more files that don't
actually use any memory function.
In theory, one might use LibCpp to detect things like this
automatically, but let's do this one step at a time.
This step would ideally not have been necessary (it increases the
amount of refactoring and templating necessary, which in turn increases
build times), but it gives us a couple of nice properties:
- SpinlockProtected inside Singleton (a very common combination) can now
obtain any lock rank just via the template parameter. It was not
previously possible to do this with SingletonInstanceCreator magic.
- SpinlockProtected's lock rank is now mandatory; this is the majority
of cases and allows us to see where we're still missing proper ranks.
- The type already informs us what lock rank a lock has, which aids code
readability and (possibly, if gdb cooperates) lock mismatch debugging.
- The rank of a lock can no longer be dynamic, which is not something we
wanted in the first place (or made use of). Locks randomly changing
their rank sounds like a disaster waiting to happen.
- In some places, we might be able to statically check that locks are
taken in the right order (with the right lock rank checking
implementation) as rank information is fully statically known.
This refactoring further exposes the fact that Mutex has no lock rank
capabilities, which is not fixed here.
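A sketch of the resulting shape, with simplified stand-ins for the
kernel's Spinlock and LockRank; illustrative, not the exact
declarations:

    enum class LockRank { None, MemoryManager, Process, Thread };

    template<LockRank Rank>
    class Spinlock {
        // Locking elided in this sketch.
    };

    // The rank is part of the type: mandatory, statically known, and
    // impossible to change at runtime.
    template<typename T, LockRank Rank>
    class SpinlockProtected {
    public:
        template<typename Callback>
        decltype(auto) with(Callback callback)
        {
            // Would acquire m_lock for the duration of the callback.
            return callback(m_value);
        }

    private:
        T m_value {};
        Spinlock<Rank> m_lock {};
    };

    // A Singleton can now pass the rank purely through template
    // parameters, e.g.:
    //   Singleton<SpinlockProtected<GlobalData, LockRank::None>> s_the;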
According to Dr. POSIX, we should allow mmap to be called on inodes
even on ranges that currently don't map to any actual data. Trying to
read or write to those ranges should result in SIGBUS being sent to the
thread that performed the violating memory access.
To implement this restriction, we simply check if the result of
read_bytes on an Inode returns 0, which means we have nothing valid to
map to the program, hence it should receive a SIGBUS in that case.
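A sketch of the check, assuming illustrative names for the fault
handler's types (PageFaultResponse::BusError is presumed to be
translated to SIGBUS by the caller):

    // Sketch only: a zero-byte read means the page lies entirely past
    // the inode's data, so there is nothing valid to map.
    auto nread_or_error = inode.read_bytes(offset_in_inode, PAGE_SIZE, buffer, nullptr);
    if (nread_or_error.is_error())
        return PageFaultResponse::ShouldCrash;
    if (nread_or_error.value() == 0)
        return PageFaultResponse::BusError; // Deliver SIGBUS to the faulting thread.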
Globally shared MemoryManager state is now kept in a GlobalData struct
and wrapped in SpinlockProtected.
A small set of members are left outside the GlobalData struct as they
are only set during boot initialization, and then remain constant.
This allows us to access those members without taking any locks.
I believe this to be safe, as the main thing that LockRefPtr provides
over RefPtr is safe copying from a shared LockRefPtr instance. I've
inspected the uses of RefPtr<PhysicalPage> and it seems they're all
guarded by external locking. Some of it is less obvious, but this is
an area where we're making continuous headway.
This allows sys$mprotect() to honor the original readable & writable
flags of the open file description as they were at the point we did the
original sys$mmap().
IIUC, this is what Dr. POSIX wants us to do:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/mprotect.html
Also, remove the bogus and racy "W^X" checking we did against mappings
based on their current inode metadata. If we want to do this, we can do
it properly. For now, it was not only racy, but also did blocking I/O
while holding a spinlock.
We were holding the MM lock across all of the region unmapping code.
This was previously necessary since the quickmaps used during unmapping
required holding the MM lock.
Now that it's no longer necessary, we can leave the MM lock alone here.
You're still required to disable interrupts though, as the mappings are
per-CPU. This exposed the fact that our CR3 lookup map is insufficiently
protected (but we'll address that in a separate commit.)
Until now, our kernel has reimplemented a number of AK classes to
provide automatic internal locking:
- RefPtr
- NonnullRefPtr
- WeakPtr
- Weakable
This patch renames the Kernel classes so that they can coexist with
the original AK classes:
- RefPtr => LockRefPtr
- NonnullRefPtr => NonnullLockRefPtr
- WeakPtr => LockWeakPtr
- Weakable => LockWeakable
The goal here is to eventually get rid of the Lock* classes in favor of
using external locking.
As soon as we've saved CR2 (the faulting address), we can re-enable
interrupt processing. This should make the kernel more responsive under
heavy fault loads.
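A sketch of the ordering on the fault path; read_cr2() and sti() stand
in for the corresponding instructions:

    // Sketch only: illustrative names.
    void page_fault_entry(RegisterState& regs)
    {
        // CR2 holds the faulting address and is clobbered by the next
        // page fault, so capture it before anything else...
        auto fault_address = read_cr2();
        // ...then interrupts can be re-enabled; the remaining handling
        // may be slow and should not run with interrupts masked.
        sti();
        handle_page_fault(fault_address, regs);
    }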
Region::physical_page() now takes the VMObject lock while accessing the
physical pages array, and returns a RefPtr<PhysicalPage>. This ensures
that the array access is safe.
Region::physical_page_slot() now VERIFY()'s that the VMObject lock is
held by the caller. Since we're returning a reference to the physical
page slot in the VMObject's physical page array, this is the best we
can do here.
We really only need the VMObject lock when accessing the physical pages
array, so once we have a strong pointer to the physical page we want to
remap, we can give up the VMObject lock.
This fixes a deadlock I encountered while building DOOM on SMP.
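A sketch of the narrowed lock scope described above, with illustrative
names:

    // Take the VMObject lock only while touching the physical pages
    // array; return a strong pointer so the page stays alive afterwards.
    RefPtr<PhysicalPage> Region::physical_page(size_t index) const
    {
        SpinlockLocker locker(vmobject().lock());
        return vmobject().physical_pages()[first_page_index() + index];
    }

    // Caller sketch: the lock is released before the remap work, which
    // is what resolves the SMP deadlock mentioned above.
    auto page = region.physical_page(page_index);
    if (page)
        region.remap_vmobject_page(page_index);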
When handling a page fault, we only need to remap the faulting region in
the current process. There's no need to traverse *all* regions that map
the same VMObject and remap them cross-process as well.
Those other regions will get remapped lazily by their own page fault
handlers eventually. Or maybe they won't and we avoided some work. :^)