For some context, write_bytes_locked used to simply bail out before
writing any data if there weren't enough blocks to cover the entire size
of an inode before 1bf7f99a.
We're not actually restoring that behavior here, since computing the
amount of blocks to be allocated would get exceedingly complex,
considering that there could always be holes in between already
allocated blocks.
Instead, this simply makes allocate_blocks() bail out properly if there
aren't any free blocks left.
Fixes#24980
In Ext2FSInode::compute_block_list_impl(), each call to
process_block_array() creates a new ByteBuffer, which leads to a
kmalloc() call. The ByteBuffer is then discarded when
process_block_array() exits, leading to a kfree() call.
This leads to repeated kmalloc() and kfree() calls as ByteBuffers are
created and destroyed each time process_block_array() is called.
This commit makes it so that only 1 ByteBuffer is created for each level
of inode indirect block (so only 3 ByteBuffers are created at most).
These ByteBuffers are re-used on each call to process_block_array().
This reduces the number of kmalloc() and kfree() calls during
compute_block_list_impl(), especially for larger files.
Instead of using a raw `KBuffer` and letting each implementation to
populating the specific flags on its own, we change things so we only
let each FileSystem implementation to validate the flag and its value
but then store it in a HashMap which its key is the flag name and
the value is a special new class called `FileSystemSpecificOption`
which wraps around `AK::Variant<...>`.
This approach has multiple advantages over the previous:
- It allows runtime inspection of what the user has set on a `MountFile`
description for a specific filesystem.
- It ensures accidental overriding of filesystem specific option that
was already set is not possible
- It removes ugly casting of a `KBuffer` contents to a strongly-typed
values. Instead, a strongly-typed `AK::Variant` is used which ensures
we always get a value without doing any casting.
Please note that we have removed support for ASCII string-oriented flags
as there were no actual use cases, and supporting such type would make
`FileSystemSpecificOption` more complicated unnecessarily for now.
The superblock of an ext2 filesystem is always found on the storage
device at offset 1024. This 1024 number was hardcoded in the Ext2FS
code.
This commit:
* adds a constexpr to replace the hardcoded 1024 values
* removes a comment about one of the the hardcoded 1024 values which is
now umnecessary
With this change, we no longer preallocate blocks when an inode's size
is updated, and instead only allocate the minimum amount of blocks when
the inode is actually written to.
This is a large commit, since this is essentially a complete rewrite of
the key low-level functions that handle reading/writing blocks. This is,
however, a necessary prerequisite of being able to write holes.
The previous version of `flush_block_list()` (along with its numerous
helper functions) was entirely reliant on all blocks being sequential.
In contrast to the previous implementation, the new version
of `flush_block_list()` simply writes out the difference between the old
block list and the new block list by calculating the correct indirect
block(s) to update based on the relevant block's logical index.
`compute_block_list()` has also been rewritten, since the estimated
amount of meta blocks was incorrectly calculated for files with holes as
a result of the estimated amount of blocks being a function of the file
size. Since it isn't possible to accurately compute the shape of the
block list without traversing it, we no longer try to perform such a
computation, and instead simply search through all of the allocated
indirect blocks.
`compute_block_list_with_meta_blocks()` has also been removed in favor
of the new `compute_meta_blocks()`, since meta blocks are fundamentally
distinct from data blocks due to there being no mapping between any
logical block index and the physical block index.
Since we now only store blocks that are actually allocated, it is
entirely valid for the block list to be empty, so this commit lifts the
restrictions on accessing inodes with an empty block list.
This device is a block device that allows a user to effectively treat an
Inode as a block device.
The static construction method is given an OpenFileDescription reference
but validates that:
- The description has a valid custody (so it's not some arbitrary file).
Failing this requirement will yield EINVAL.
- The description custody points to an Inode which is a regular file, as
we only support (seekable) regular files. Failing this requirement
will yield ENOTSUP.
LoopDevice can be used to mount a regular file on the filesystem like
other supported types of (physical) block devices.
Such operation is almost equivalent to writing on an Inode, so lock the
Inode m_inode_lock exclusively.
All FileSystem Inode implementations then override a new method called
truncate_locked which should implement the actual truncating.
When the FileSystem does a sync, it gathers up all the inodes with
dirty metadata into a vector. The inode mutex is not held while
checking the inode dirty bit, which can lead to a kernel panic
due to concurrent inode modifications.
Fixes: #21796
Previously, attempting to update an ext2 inode with a UID or GID
larger than 65535 would overflow. We now write the high bits of UIDs
and GIDs to the same place that Linux does within the `osd2` struct.
We first must flush the superblock through the BlockBasedFileSystem
methods properly and only then clear the DiskCache pointer, to prevent a
possible kernel panic due to nullptr dereference.
Since this is the block size that file system drivers *should* set,
let's name it the logical block size, just like most file systems such
as ext2 already do anyways.
This never was a logical block size, it always was a device specific
block size. Ideally the block size would change in accordance to
whatever the driver wants to use, but that is a change for the future.
For now, let's get rid of this confusing naming.
This also makes it easier to understand and reference where these
(sometimes rather arbitrary) calculations come from.
This also fixes a bug where group_index_from_block_index assumed 1KiB
blocks.
This is a preparation before we can create a usable mechanism to use
filesystem-specific mount flags.
To keep some compatibility with userland code, LibC and LibCore mount
functions are kept being usable, but now instead of doing an "atomic"
syscall, they do multiple syscalls to perform the complete procedure of
mounting a filesystem.
The FileBackedFileSystem IntrusiveList in the VFS code is now changed to
be protected by a Mutex, because when we mount a new filesystem, we need
to check if a filesystem is already created for a given source_fd so we
do a scan for that OpenFileDescription in that list. If we fail to find
an already-created filesystem we create a new one and register it in the
list if we successfully mounted it. We use a Mutex because we might need
to initiate disk access during the filesystem creation, which will take
other mutexes in other parts of the kernel, therefore making it not
possible to take a spinlock while doing this.
This has KString, KBuffer, DoubleBuffer, KBufferBuilder, IOWindow,
UserOrKernelBuffer and ScopedCritical classes being moved to the
Kernel/Library subdirectory.
Also, move the panic and assertions handling code to that directory.
The contents of the directory inode could change if we are not taking so
we must take the m_inode_lock to prevent corruption when reading the
directory contents.
"Wherever applicable" = most places, actually :^), especially for
networking and filesystem timestamps.
This includes changes to unzip, which uses DOSPackedTime, since that is
changed for the FAT file systems.
That's what this class really is; in fact that's what the first line of
the comment says it is.
This commit does not rename the main files, since those will contain
other time-related classes in a little bit.
There was only one permanent storage location for these: as a member
in the Mount class.
That member is never modified after Mount initialization, so we don't
need to worry about races there.
A lot of places were relying on AK/Traits.h to give it strnlen, memcmp,
memcpy and other related declarations.
In the quest to remove inclusion of LibC headers from Kernel files, deal
with all the fallout of this included-everywhere header including less
things.
Because the ".." entry in a directory is a separate inode, if a
directory is renamed to a new location, then we should update this entry
the point to the new parent directory as well.
Co-authored-by: Liav A <liavalb@gmail.com>