Commit graph

572 commits

Author SHA1 Message Date
Nicholas Baron aa4d41fe2c
AK+Kernel+LibELF: Remove the need for IteratorDecision::Continue
By constraining two implementations, the compiler will select the best
fitting one. All this will require is duplicating the implementation and
simplifying for the `void` case.

This constraining also informs both the caller and compiler by passing
the callback parameter types as part of the constraint
(e.g.: `IterationFunction<int>`).

Some `for_each` functions in LibELF only take functions which return
`void`. This is a minimal correctness check, as it removes one way for a
function to incompletely do something.

There seems to be a possible idiom where inside a lambda, a `return;` is
the same as `continue;` in a for-loop.
2021-05-16 10:36:52 +01:00
Andreas Kling 4d429ba9ea Kernel: Unbreak profiling all processes
Regressed in 8a4cc735b9.
We stopped generating "process created" when enabling profiling,
which led to Profiler getting confused about the missing events.
2021-05-15 21:25:54 +02:00
Liav A 8a4cc735b9 Kernel: Don't use the profile timer if we don't have a timer to assign 2021-05-15 18:08:41 +02:00
Gunnar Beutner 4ab9d8736b Kernel: Make perf_event() work for global profiles
Previously calls to perf_event() would end up in a process-specific
perfcore file even though global profiling was enabled. This changes
the behavior for perf_event() so that these events are stored into
the global profile instead.
2021-05-15 16:28:18 +02:00
Brian Gianforcaro ede1483e48 Kernel: Make Process creation APIs OOM safe
This change looks more involved than it actually is. This simply
reshuffles the previous Process constructor and splits out the
parts which can fail (resource allocation) into separate methods
which can be called from a factory method. The factory is then
used everywhere instead of the constructor.
2021-05-15 09:01:32 +02:00
Andreas Kling 16221305ad LibELF: Remove sketchy use of "undefined" ELF::Image::Section
We were using ELF::Image::section(0) to indicate the "undefined"
section, when what we really wanted was just Optional<Section>.

So let's use Optional instead. :^)
2021-05-15 00:17:55 +02:00
Mart G e7310ba45a Kernel+LibC: Add fstatat
The function fstatat can do the same thing as the stat and lstat
functions. However, it can be passed the file descriptor of a directory
which will be used when as the starting point for relative paths. This
is contrary to stat and lstat which use the current working directory as
the starting for relative paths.
2021-05-14 23:32:10 +02:00
Gunnar Beutner 8614d18956 Kernel: Use a separate timer for profiling the system
This updates the profiling subsystem to use a separate timer to
trigger CPU sampling. This timer has a higher resolution (1000Hz)
and is independent from the scheduler. At a later time the
resolution could even be made configurable with an argument for
sys$profiling_enable() - but not today.
2021-05-14 00:35:57 +02:00
Andreas Kling e46343bf9a Kernel: Make UserOrKernelBuffer R/W helpers return KResultOr<size_t>
This makes error propagation less cumbersome (and also exposed some
places where we were not doing it.)
2021-05-13 23:28:40 +02:00
Brian Gianforcaro 0d50d3ed1e Kernel: Make InodeWatcher::crate API OOM safe 2021-05-13 16:21:53 +02:00
Brian Gianforcaro 51ceb172b9 Kernel: Replace make<T>() with adopt_own_if_nonnull() in sys$module_load 2021-05-13 16:21:53 +02:00
Brian Gianforcaro 956314f0a1 Kernel: Make Process::start_tracing_from API OOM safe
Modify the API so it's possible to propagate error on OOM failure.
NonnullOwnPtr<T> is not appropriate for the ThreadTracer::create() API,
so switch to OwnPtr<T>, use adopt_own_if_nonnull() to handle creation.
2021-05-13 16:21:53 +02:00
sin-ack fe5ca6ca27 Kernel: Implement multi-watch InodeWatcher :^)
This patch modifies InodeWatcher to switch to a one watcher, multiple
watches architecture.  The following changes have been made:

- The watch_file syscall is removed, and in its place the
  create_iwatcher, iwatcher_add_watch and iwatcher_remove_watch calls
  have been added.
- InodeWatcher now holds multiple WatchDescriptions for each file that
  is being watched.
- The InodeWatcher file descriptor can be read from to receive events on
  all watched files.

Co-authored-by: Gunnar Beutner <gunnar@beutner.name>
2021-05-12 22:38:20 +02:00
Gunnar Beutner 22ebd754d3 Kernel: Fix loading ELF images without PT_INTERP
Previously we'd try to load ELF images which did not have
an interpreter set with an incorrect load offset of 0, i.e. way
outside of the part of the address space where we'd expect either
the dynamic loader or the user's executable to reside.

This fixes the problem by using get_load_offset for both executables
which have an interpreter set and those which don't. Notably this
allows us to actually successfully execute the Loader.so binary:

courage:~ $ /usr/lib/Loader.so
You have invoked `Loader.so'. This is the helper program for programs
that use shared libraries. Special directives embedded in executables
tell the kernel to load this program.

This helper program loads the shared libraries needed by the program,
prepares the program to run, and runs it. You do not need to invoke
this helper program directly.
courage:~ $
2021-05-10 20:39:08 +02:00
Brian Gianforcaro 0b7395848a Kernel: Plumb OOM propagation through Custody factory
Modify the Custody::create(..) API so it has the ability to propagate
OOM back to the caller.
2021-05-10 11:55:52 +02:00
Brian Gianforcaro 8bf4201f50 Kernel: Move process creation perf events to PerformanceManager 2021-05-07 15:35:23 +02:00
Brian Gianforcaro ccdcb6a635 Kernel: Add PerformanceManager static class, move perf event APIs there
The current method of emitting performance events requires a bit of
boiler plate at every invocation, as well as having to ignore the
return code which isn't used outside of the perf event syscall. This
change attempts to clean that up by exposing high level API's that
can be used around the code base.
2021-05-07 15:35:23 +02:00
Brian Gianforcaro 11306d7121
Kernel: Modify TimeManagement::current_time(..) API so it can't fail. (#6869)
The fact that current_time can "fail" makes its use a bit awkward.
All callers in the Kernel are trusted besides syscalls, so assert
that they never get there, and make sure all current callers perform
validation of the clock_id with TimeManagement::is_valid_clock_id().

I have fuzzed this change locally for a bit to make sure I didn't
miss any obvious regression.
2021-05-05 18:51:06 +02:00
Brian Gianforcaro 234c6ae32d Kernel: Change Inode::{read/write}_bytes interface to KResultOr<ssize_t>
The error handling in all these cases was still using the old style
negative values to indicate errors. We have a nicer solution for this
now with KResultOr<T>. This change switches the interface and then all
implementers to use the new style.
2021-05-02 13:27:37 +02:00
Sahan Fernando bd563f0b3c Kernel: Make processes start with a 16-byte-aligned stack 2021-05-01 20:08:35 +02:00
Brian Gianforcaro a678851b41 Kernel: Harden sys$setgroups Vector usage against OOM 2021-05-01 09:10:30 +02:00
Itamar 6bbd2ebf83 Kernel+LibELF: Support initializing values of TLS data
Previously, TLS data was always zero-initialized.

To support initializing the values of TLS data, sys$allocate_tls now
receives a buffer with the desired initial data, and copies it to the
master TLS region of the process.

The DynamicLinker gathers the initial TLS image and passes it to
sys$allocate_tls.

We also now require the size passed to sys$allocate_tls to be
page-aligned, to make things easier. Note that this doesn't waste memory
as the TLS data has to be allocated in separate pages anyway.
2021-04-30 18:47:39 +02:00
Itamar 373e8bcbc7 Kernel: Give a name to the Master TLS region allocation 2021-04-30 18:47:39 +02:00
Gunnar Beutner 3c0355a398 Kernel: Accepted socket file descriptors should not inherit flags
For example Linux accepts an additional argument for flags in accept4()
that let the user specify what flags they want. However, by default
accept() should not inherit those flags from the listener socket.
2021-04-30 11:43:19 +02:00
Jesse Buhagiar 60cdbc9397 Kernel/LibC: Implement setreuid 2021-04-30 11:35:17 +02:00
Brian Gianforcaro a8765fa673 Kernel: Harden sys$select Vector usage against OOM.
Theoretically the append should never fail as we have in-line storage
of FD_SETSIZE, which should always be enough. However I'm planning on
removing the non-try variants of AK::Vector when compiling in kernel
mode in the future, so this will need to go eventually. I suppose it
also protects against some unforeseen bug where we we can append more
than FD_SETSIZE items.
2021-04-29 20:31:15 +02:00
Brian Gianforcaro 0ca668f59c Kernel: Harden sys$munmap Vector usage against OOM.
Theoretically the append should never fail as we have in-line storage
of 2, which should be enough. However I'm planning on removing the
non-try variants of AK::Vector when compiling in kernel mode in the
future, so this will need to go eventually. I suppose it also protects
against some unforeseen bug where we we can append more than 2 items.
2021-04-29 20:31:15 +02:00
Brian Gianforcaro 569c5a8922 Kernel: Harden sys$purge Vector usage against OOM.
sys$purge() is a bit unique, in that it is probably in the systems
advantage to attempt to limp along if we hit OOM while processing
the vmobjects to purge. This change modifies the algorithm to observe
OOM and continue trying to purge any previously visited VMObjects.
2021-04-29 20:31:15 +02:00
Brian Gianforcaro b3096276bb Kernel: Harden sys$poll Vector usage against OOM. 2021-04-29 20:31:15 +02:00
Brian Gianforcaro 119b7be249 Kernel: Harden sys$execve Vector usage against OOM. 2021-04-29 20:31:15 +02:00
Brian Gianforcaro 454d2fd42a Kernel: Harden sys$readv / sys$writev Vector usage against OOM. 2021-04-29 20:31:15 +02:00
Brian Gianforcaro cd29eb7867 Kernel: Harden sys$sendmsg / sys$recvmsg Vector usage against OOM. 2021-04-29 20:31:15 +02:00
Gunnar Beutner d9ee2c6a89 Kernel: Avoid overrunning the user-specified buffers in select() 2021-04-28 23:05:10 +02:00
Gunnar Beutner aa792062cb Kernel+LibC: Implement the socketpair() syscall 2021-04-28 14:19:45 +02:00
Jelle Raaijmakers b630e39fbb Kernel: Check futex command if realtime clock is used 2021-04-27 09:19:55 +02:00
Gunnar Beutner 659507696c Kernel: Fix incorrect argument for thread_exit events 2021-04-26 23:26:58 +02:00
Gunnar Beutner 1c02848e54 Kernel: Log thread exits for global profiles 2021-04-26 23:26:58 +02:00
Gunnar Beutner afeee35cbf Kernel: Avoid calling characters() where not necessary 2021-04-26 23:26:58 +02:00
Gunnar Beutner eb798d5538 Kernel+Profiler: Improve profiling subsystem
This turns the perfcore format into more a log than it was before,
which lets us properly log process, thread and region
creation/destruction. This also makes it unnecessary to dump the
process' regions every time it is scheduled like we did before.

Incidentally this also fixes 'profile -c' because we previously ended
up incorrectly dumping the parent's region map into the profile data.

Log-based mmap support enables profiling shared libraries which
are loaded at runtime, e.g. via dlopen().

This enables profiling both the parent and child process for
programs which use execve(). Previously we'd discard the profiling
data for the old process.

The Profiler tool has been updated to not treat thread IDs as
process IDs anymore. This enables support for processes with more
than one thread. Also, there's a new widget to filter which
process should be displayed.
2021-04-26 17:13:55 +02:00
Linus Groh dbe72fd962 Everywhere: Remove empty line after function body opening curly brace 2021-04-25 20:20:00 +02:00
Brian Gianforcaro 8d6e9fad40 Kernel: Remove the now defunct LOCKER(..) macro. 2021-04-25 09:38:27 +02:00
Jelle Raaijmakers 54f5b1346c Kernel: Support null act argument for sigaction syscall
Userspace can provide a null argument for the `act` argument to the
`sigaction` syscall to not set any new behavior. This is described
here:

https://pubs.opengroup.org/onlinepubs/007904875/functions/sigaction.html

Without this fix, the `copy_from_user(...)` invocation on `user_act`
fails and makes the syscall return early.
2021-04-24 23:00:28 +02:00
Andreas Kling b91c49364d AK: Rename adopt() to adopt_ref()
This makes it more symmetrical with adopt_own() (which is used to
create a NonnullOwnPtr from the result of a naked new.)
2021-04-23 16:46:57 +02:00
Maciej Zygmanowski 6efcc2fc99 Kernel: Don't allow to kill kernel processes
The protection was only for SIGKILL before.
2021-04-23 13:26:02 +02:00
Brian Gianforcaro 1682f0b760 Everything: Move to SPDX license identifiers in all files.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.

See: https://spdx.dev/resources/use/#identifiers

This was done with the `ambr` search and replace tool.

 ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
2021-04-22 11:22:27 +02:00
Gunnar Beutner db3fd11646 Kernel: Remove requirement for the thread entitlement for the futex syscall
GCC inserts calls to pthread_mutex_lock when compiling C++ code with
threads enabled.
2021-04-20 21:08:17 +02:00
Brian Gianforcaro 4ed682aebc Kernel: Add a syscall to clear the profiling buffer
While profiling all processes the profile buffer lives forever.
Once you have copied the profile to disk, there's no need to keep it
in memory. This syscall surfaces the ability to clear that buffer.
2021-04-19 18:30:37 +02:00
FalseHonesty 3123ffb19d Kernel: Add ptrace commands for reading/writing the debug registers
This adds PT_PEEKDEBUG and PT_POKEDEBUG to allow for reading/writing
the debug registers, and updates the Kernel's debug handler to read the
new information from the debug status register.
2021-04-18 17:02:40 +02:00
Gunnar Beutner f033416893 Kernel+LibC: Clean up how assertions work in the kernel and LibC
This also brings LibC's abort() function closer to the spec.
2021-04-18 11:11:15 +02:00
Gunnar Beutner cf13fa57cd Kernel: Allow system calls from the dynamic loader
Previously the dynamic loader would become unused after it had invoked
the program's entry function. However, in order to support exceptions
and - at a later point - dlfcn functionality we need to call back
into the dynamic loader at runtime.

Because the dynamic loader has a static copy of LibC it'll attempt to
invoke syscalls directly from its text segment. For this to work the
executable region for the dynamic loader needs to have syscalls enabled.
2021-04-18 10:55:25 +02:00
Linus Groh 2b0c361d04 Everywhere: Fix a bunch of typos 2021-04-18 10:30:03 +02:00
Gunnar Beutner c3ee70591a Kernel: Read the ELF header from the inode rather than the mapped pages
Reading from the mapping doesn't work when the text segment has a non-zero
offset because in that case the first mapped page doesn't contain the ELF
header.
2021-04-14 13:12:52 +02:00
Gunnar Beutner 2d91761cf6 Kernel: Make sure the offset stays the same when using mremap()
When using mmap() on a file with a non-zero offset subsequent
calls to mremap() would incorrectly reset the offset to zero.
2021-04-14 13:12:52 +02:00
Idan Horowitz 2c93123daf Kernel: Replace process' regions vector with a Red Black tree
This should provide some speed up, as currently searches for regions
containing a given address were performed in O(n) complexity, while
this container allows us to do those in O(logn).
2021-04-12 18:03:44 +02:00
Idan Horowitz 497c759ab7 Kernel: Remove old region from process' regions vector before splitting
This does not affect functionality right now, but it means that the
regions vector will now never have any overlapping regions, which will
allow the use of balance binary search trees instead of a vector in the
future. (since they require keys to be exclusive)
2021-04-12 18:03:44 +02:00
Liav A 8e3e3a71cb Kernel: Introduce a new HID subsystem
The end goal of this commit is to allow to boot on bare metal with no
PS/2 device connected to the system. It turned out that the original
code relied on the existence of the PS/2 keyboard, so VirtualConsole
called it even though ACPI indicated the there's no i8042 controller on
my real machine because I didn't plug any PS/2 device.
The code is much more flexible, so adding HID support for other type of
hardware (e.g. USB HID) could be much simpler.

Briefly describing the change, we have a new singleton called
HIDManagement, which is responsible to initialize the i8042 controller
if exists, and to enumerate its devices. I also abstracted a bit
things, so now every Human interface device is represented with the
HIDDevice class. Then, there are 2 types of it - the MouseDevice and
KeyboardDevice classes; both are responsible to handle the interface in
the DevFS.

PS2KeyboardDevice, PS2MouseDevice and VMWareMouseDevice classes are
responsible for handling the hardware-specific interface they are
assigned to. Therefore, they are inheriting from the IRQHandler class.
2021-04-03 11:57:23 +02:00
Itamar ba0df27653 Kernel: Support write() after setting O_APPEND on a non-seekable file
Previously, Process::do_write would error if the O_APPEND flag was set
on a non-seekable file.

Other systems (such as Linux) seem to be OK with doing this, so we now
do not attempt to seek to the end the file if it's not seekable.
2021-03-29 19:56:54 +02:00
Andreas Kling 497c4c3858 Kernel: Let's also not reverse the blocking flag for FIONBIO.. 2021-03-29 08:59:22 +02:00
Andreas Kling 0f270afda8 Kernel: Let's allow unsetting non-blocking mode with FIONBIO as well
Thanks to almightyhydra for pointing this out! :^)
2021-03-29 08:58:13 +02:00
Andreas Kling 5f71bf0cc7 Kernel+LibC: Implement sys$ioctl() FIONBIO
This is another (older) way of making a file descriptor non-blocking.
2021-03-28 17:50:08 +02:00
Hendiadyoin1 0d934fc991 Kernel::CPU: Move headers into common directory
Alot of code is shared between i386/i686/x86 and x86_64
and a lot probably will be used for compatability modes.
So we start by moving the headers into one Directory.
We will probalby be able to move some cpp files aswell.
2021-03-21 09:35:23 +01:00
Itamar 2365e06b12 Kernel: Set TLS-related members of Process after loading static program
We previously ignored these values in the return value of
load_elf_object, which causes us to not allocate a TLS region for
statically-linked programs.
2021-03-19 22:55:53 +01:00
Andreas Kling d48666489c Kernel: Make FileDescription::seek() return KResultOr<off_t>
This exposed a bunch of places where errors were not propagated,
so this patch is forced to deal with them as well.
2021-03-19 10:44:25 +01:00
Jean-Baptiste Boric 6698fd84ff Kernel: Refactor storage stack with u64 as mmap offset 2021-03-19 09:15:19 +01:00
Jean-Baptiste Boric 7a079f7780 LibC+Kernel: Switch off_t to 64 bits 2021-03-17 23:22:42 +01:00
thatdutchguy 569d6d47fe Kernel: sysconf(_SC_CLK_TCK): Use TimeManagement::ticks_per_second() 2021-03-16 21:56:47 +01:00
thatdutchguy 10e3e8f6d4 Kernel: Add _SC_CLK_TCK to sysconf.
Unbreaks the hatari port.
2021-03-16 21:56:47 +01:00
Andreas Kling a166a65eff Kernel: Don't return -EFOO when return type is KResultOr<...> 2021-03-15 09:09:04 +01:00
Hendiadyoin1 eba3fa5e72 Kernel: Make munmap more posix compliant
In case someone tries to unmap a not mapped region (fallback) we should
not return an error, but silently do nothing
2021-03-13 10:00:46 +01:00
Hendiadyoin1 b7f1171a1c Kernel: munmap multiple regions at a time
This implements a fallback to munmap that unmaps multiple regions at a
time, with splitting some when needed.

The way it is implemented is possibly not optimal, due to it searching
without looking into the cache
2021-03-13 10:00:46 +01:00
Andreas Kling 423ed53396 Kernel: Fix rounding of PT_LOAD mappings in sys$execve()
We were not rounding the mappings down/up correctly, which could lead
to executables missing the last 4 KB of text and/or data.
2021-03-12 17:26:24 +01:00
Andreas Kling 73e06a1983 Kernel: Convert klog() => AK::Format in a handful of places 2021-03-12 15:22:35 +01:00
Andreas Kling 49a0f40ff0 Kernel: Inherit the dumpable flag on sys$fork()
This regressed at some point recently. All children were non-dumpable
until manually opting into it.
2021-03-11 14:35:37 +01:00
Andreas Kling 1608ef37d8 Kernel: Move process termination status/signal into protected data 2021-03-11 14:24:08 +01:00
Andreas Kling b7b7a48c66 Kernel: Move process signal trampoline address into protected data 2021-03-11 14:21:49 +01:00
Andreas Kling 08e0e2eb41 Kernel: Move process umask into protected data :^) 2021-03-11 14:21:49 +01:00
Andreas Kling 90c0f9664e Kernel: Don't keep protected Process data in a separate allocation
The previous architecture had a huge flaw: the pointer to the protected
data was itself unprotected, allowing you to overwrite it at any time.

This patch reorganizes the protected data so it's part of the Process
class itself. (Actually, it's a new ProcessBase helper class.)

We use the first 4 KB of Process objects themselves as the new storage
location for protected data. Then we make Process objects page-aligned
using MAKE_ALIGNED_ALLOCATED.

This allows us to easily turn on/off write-protection for everything in
the ProcessBase portion of Process. :^)

Thanks to @bugaevc for pointing out the flaw! This is still not perfect
but it's an improvement.
2021-03-11 14:21:49 +01:00
Andreas Kling de6c5128fd Kernel: Move process pledge promises into protected data 2021-03-10 22:50:00 +01:00
Andreas Kling 3d27269f13 Kernel: Move process parent PID into protected data :^) 2021-03-10 22:30:02 +01:00
Andreas Kling d677a73b0e Kernel: Move process extra_gids into protected data :^) 2021-03-10 22:30:02 +01:00
Andreas Kling cbcf891040 Kernel: Move select Process members into protected memory
Process member variable like m_euid are very valuable targets for
kernel exploits and until now they have been writable at all times.

This patch moves m_euid along with a whole bunch of other members
into a new Process::ProtectedData struct. This struct is remapped
as read-only memory whenever we don't need to write to it.

This means that a kernel write primitive is no longer enough to
overwrite a process's effective UID, you must first unprotect the
protected data where the UID is stored. :^)
2021-03-10 22:30:02 +01:00
Andreas Kling 84725ef3a5 Kernel+UserspaceEmulator: Add sys$emuctl() system call
This returns ENOSYS if you are running in the real kernel, and some
other result if you are running in UserspaceEmulator.

There are other ways we could check if we're inside an emulator, but
it seemed easier to just ask. :^)
2021-03-09 08:58:26 +01:00
Brian Gianforcaro 5f6ab77352 Kernel: Add bitwise operators for Thread::FileBlocker::BlockFlags enum
Switch to using type-safe bitwise operators for the BlockFlags class,
this cleans up a lot of boilerplate casts which are necessary when the
enum is declared as `enum class`.
2021-03-08 18:47:40 +01:00
Ben Wiederhake 501952852c Kernel: Fix pointer over/underflow in create_thread
The expression
    (u8*)params.m_stack_location + stack_size
… causes UBSan to spit out the warning
    KUBSAN: addition of unsigned offset to 0x00000002 overflowed to 0xb0000003
… even though there is no actual overflow happening here.
This can be reproduced by running:
    $ syscall create_thread 0 [ 0 0 0 0 0xb0000001 2 ]
Technically, this is a true-positive: The C++-reference is incredibly strict
about pointer-arithmetic:
    > A pointer to non-array object is treated as a pointer to the first element
    > of an array with size 1. […] [A]ttempts to generate a pointer that isn't
    > pointing at an element of the same array or one past the end invoke
    > undefined behavior.
    https://en.cppreference.com/w/cpp/language/operator_arithmetic
Frankly, this feels silly. So let's just use FlatPtr instead.

Found by fuzz-syscalls. Undocumented bug.

Note that FlatPtr is an unsigned type, so
    user_esp.value() - 4
is defined even if we end up with a user_esp of 0 (this can happen for example
when params.m_stack_size = 0 and params.m_stack_location = 0). The result would
be a Kernelspace-pointer, which would then be immediately flagged by
'MM.validate_user_stack' as invalid, as intended.
2021-03-07 17:31:25 +01:00
Andreas Kling a819eb5016 Kernel: Skip TLB flushes while cloning regions in sys$fork()
Since we know for sure that the virtual memory regions in the new
process being created are not being used on any CPU, there's no need
to do TLB flushes for every mapped page.
2021-03-03 22:57:45 +01:00
Andreas Kling d96a44a738 Kernel: Avoid transient kmalloc heap allocations in sys$select()
Dynamic Vector allocations in sys$select() were showing up in the
full-system profile and since there will never be more than FD_SETSIZE
file descriptors to worry about, we can confidently add enough inline
capacity to this Vector that it never has to kmalloc.

To compensate for the increased stack usage, reduce the size of the
FDInfo struct while we're here. :^)
2021-03-03 20:37:23 +01:00
Andreas Kling 5e7abea31e Kernel+Profiler: Capture metadata about all profiled processes
The perfcore file format was previously limited to a single process
since the pid/executable/regions data was top-level in the JSON.

This patch moves the process-specific data into a top-level array
named "processes" and we now add entries for each process that has
been sampled during the profile run.

This makes it possible to see samples from multiple threads when
viewing a perfcore file with Profiler. This is extremely cool! :^)
2021-03-02 22:38:06 +01:00
Andreas Kling ea500dd3e3 Kernel: Start work on full system profiling :^)
The superuser can now call sys$profiling_enable() with PID -1 to enable
profiling of all running threads in the system. The perf events are
collected in a global PerformanceEventBuffer (currently 32 MiB in size.)

The events can be accessed via /proc/profile
2021-03-02 22:38:06 +01:00
Andreas Kling b425c2602c Kernel: Better handling of allocation failure in profiling
If we can't allocate a PerformanceEventBuffer to store the profiling
events, we now fail sys$profiling_enable() and sys$perf_event()
with ENOMEM instead of carrying on with a broken buffer.
2021-03-02 22:38:06 +01:00
Ben Wiederhake 5c15ca7b84 Kernel: Make sockets use AK::Time 2021-03-02 08:36:08 +01:00
Ben Wiederhake 336303bda4 Kernel: Make kgettimeofday use AK::Time 2021-03-02 08:36:08 +01:00
Ben Wiederhake c040e64b7d Kernel: Make TimeManagement use AK::Time internally
I don't dare touch the multi-threading logic and locking mechanism, so it stays
timespec for now. However, this could and should be changed to AK::Time, and I
bet it will simplify the "increment_time_since_boot()" code.
2021-03-02 08:36:08 +01:00
Ben Wiederhake 2b6546c40a Kernel: Make Thread use AK::Time internally
This commit is very invasive, because Thread likes to take a pointer and write
to it. This means that translating between timespec/timeval/Time would have been
more difficult than just changing everything that hands a raw pointer to Thread,
in bulk.
2021-03-02 08:36:08 +01:00
Ben Wiederhake 8598240193 Kernel: Sanitize all user-supplied timeval's/timespec's
This also removes a bunch of unnecessary EINVAL. Most of them weren't even
recommended by POSIX.
2021-03-02 08:36:08 +01:00
Andreas Kling 4d006de2b9 Kernel: Fix build with IO_DEBUG 2021-03-01 16:07:50 +01:00
Andreas Kling 272c2e6ec5 Kernel: Use Userspace<T> in sys${munmap,mprotect,madvise,msyscall}() 2021-03-01 15:53:33 +01:00
Andreas Kling bebceaa32c Kernel: Use Userspace<T> in sys$select() 2021-03-01 15:07:01 +01:00
Andreas Kling a1a82c1d95 Kernel: Use Userspace<T> in sys$get_dir_entries() 2021-03-01 15:04:31 +01:00
Andreas Kling b5f32be577 Kernel: Use Userspace<T> in sys$get_stack_bounds() 2021-03-01 14:50:36 +01:00
Andreas Kling 122c7b6cbb Kernel: Use Userspace<T> in sys$write() 2021-03-01 14:35:06 +01:00
Andreas Kling 6a6eb8844a Kernel: Use Userspace<T> in sys$sigaction()
fuzz-syscalls found a bunch of unaligned accesses into struct sigaction
via this syscall. This patch fixes that issue by porting the syscall
to Userspace<T> which we should have done anyway. :^)

Fixes #5500.
2021-03-01 14:06:20 +01:00
Andreas Kling ac71775de5 Kernel: Make all syscall functions return KResultOr<T>
This makes it a lot easier to return errors since we no longer have to
worry about negating EFOO errors and can just return them flat.
2021-03-01 13:54:32 +01:00
Andreas Kling 4aa58aaab5 Kernel: Don't disable interrupts while exiting a thread or process
This was another vestige from a long time ago, when exiting a thread
would mutate global data structures that were only protected by the
interrupt flag.
2021-02-25 19:36:36 +01:00
Andreas Kling 8eeb8db2ed Kernel: Don't disable interrupts while dealing with a process crash
This was necessary in the past when crash handling would modify
various global things, but all that stuff is long gone so we can
simplify crashes by leaving the interrupt flag alone.
2021-02-25 19:36:36 +01:00
Andreas Kling 8129f3da52 Kernel: Move SMAP disabler RAII helper to its own file
Added this in a new directory called Kernel/Arch/x86/ where stuff
that applies to both i386 and x86_64 can live.
2021-02-25 17:25:34 +01:00
Andreas Kling 8f70528f30 Kernel: Take some baby steps towards x86_64
Make more of the kernel compile in 64-bit mode, and make some things
pointer-size-agnostic (by using FlatPtr.)

There's a lot of work to do here before the kernel will even compile.
2021-02-25 16:27:12 +01:00
Andreas Kling c11511a0ab Kernel: Move sys$sigaction() implementation inside ARCH(i386) 2021-02-25 11:33:06 +01:00
Andreas Kling 53c6c29158 Kernel: Tighten some typing in Arch/i386/CPU.h
Use more appropriate types for some things.
2021-02-25 11:32:27 +01:00
Brian Gianforcaro 303620ea85 Kernel: Fix pointer overflow in create_thread
KUBSAN found this overflow from syscall fuzzing.

Fixes #5498
2021-02-24 15:14:13 +01:00
Andreas Kling ce1775d81d Kernel: Oops, fix broken sys$uname() function definition 2021-02-24 14:42:38 +01:00
Andreas Kling a48d54dfc5 Kernel: Don't dereference untrusted userspace pointer in sys$uname()
Instead of writing to the userspace utsname struct one field at a time,
build up a utsname on the kernel stack and copy it out to userspace
once it's finished. This is both simpler and gets validity checking
built-in for free.

Found by KUBSAN! :^)

Fixes #5499.
2021-02-24 14:37:36 +01:00
Andreas Kling 5d180d1f99 Everywhere: Rename ASSERT => VERIFY
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)

Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.

We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.
2021-02-23 20:56:54 +01:00
Brian Gianforcaro d934e77522 Kernel: Use copy_n_from_user in sys$setgroups to check for overflow 2021-02-21 17:12:01 +01:00
Brian Gianforcaro 4743afeaf4 Kernel: Use already computed nfds_checked value when copying from user mode.
- We've already computed the number of fds * sizeof(pollfd), so use it
  instead of needlessly doing it again.

- Use fds_copy.data() instead off address of indexing the vector.
2021-02-21 17:12:01 +01:00
Brian Gianforcaro 1c0e2947d7 Kernel: Use copy_n_from_user in sys$setkeymap 2021-02-21 17:12:01 +01:00
Brian Gianforcaro 26bba8e100 Kernel: Populate ELF::AuxilaryValue::Platform from Processor object.
Move this to the processor object so it can easily be implemented
when Serenity is compiled for a different architecture.
2021-02-21 17:06:24 +01:00
Brian Gianforcaro a977cdd9ac Kernel: Remove unneeded Thread::set_default_signal_dispositions
The `default_signal_action(u8 signal)` function already has the
full mapping. The only caveat being that now we need to make
sure the thread constructor and clear_signals() method do the work
of resetting the m_signal_action_data array, instead or relying on
the previous logic in set_default_signal_dispositions.
2021-02-21 12:54:39 +01:00
Andreas Kling 84b2d4c475 Kernel: Add "map_fixed" pledge promise
This is a new promise that guards access to mmap() with MAP_FIXED.

Fixed-address mappings are rarely used, but can be useful if you are
trying to groom the process address space for malicious purposes.

None of our programs need this at the moment, as the only user of
MAP_FIXED is DynamicLoader, but the fixed mappings are constructed
before the process has had a chance to pledge anything.
2021-02-21 01:08:48 +01:00
Andreas Kling 6e83be67b8 Kernel: Release ptrace lock in exec before stopping due to PT_TRACE_ME
If we have a tracer process waiting for us to exec, we need to release
the ptrace lock before stopping ourselves, since otherwise the tracer
will block forever on the lock.

Fixes #5409.
2021-02-19 12:13:54 +01:00
Andreas Kling eb92ec3149 Kernel: Factor out mmap & friends range expansion to a helper function
sys$mmap() and related syscalls must pad to the nearest page boundary
below the base address *and* above the end address of the specified
range. Since we have to do this in many places, let's make a helper.
2021-02-18 18:04:58 +01:00
Andreas Kling 55a9a4f57a Kernel: Use KResult a bit more in sys$execve() 2021-02-18 09:37:33 +01:00
Andreas Kling 5a595ef134 Kernel: Use dbgln_if() in sys$fork() 2021-02-17 15:34:32 +01:00
Andreas Kling 575c7ed414 Kernel: Make sys$msyscall() EFAULT on non-user address
Fixes #5361.
2021-02-16 11:32:00 +01:00
Ben Wiederhake fbb85f9b2f Kernel: Refuse excessively long iovec list, also in readv
This bug is a good example why copy-paste code should eventually be eliminated
from the code base: Apparently the code was copied from read.cpp before
c6027ed7cc, so the same bug got introduced here.

To recap: A malicious program can ask the Kernel to prepare sys-ing to
a huge amount of iovecs. The Kernel must first copy all the vector locations
into 'vecs', and before that allocates an arbitrary amount of memory:
    vecs.resize(iov_count);
This can cause Kernel memory exhaustion, triggered by any malicious userland
program.
2021-02-15 22:09:01 +01:00
AnotherTest 4519950266 Kernel+LibC: Add the _SC_GETPW_R_SIZE_MAX sysconf enum
It just returns 4096 :P
2021-02-15 17:32:56 +01:00
AnotherTest a3a7ab83c4 Kernel+LibC: Implement readv
We already had writev, so let's just add readv too.
2021-02-15 17:32:56 +01:00
Andreas Kling 68e3616971 Kernel: Forked children should inherit the signal trampoline address
Fixes #5347.
2021-02-14 18:38:46 +01:00
Andreas Kling 6ee499aeb0 Kernel: Round old address/size in sys$mremap() to page size multiples
Found by fuzz-syscalls. :^)
2021-02-14 13:15:05 +01:00
Andreas Kling e47bffdc8c Kernel: Add some bits of randomness to the userspace stack pointer
This patch adds a random offset between 0 and 4096 to the initial
stack pointer in new processes. Since the stack has to be 16-byte
aligned, the bottom bits can't be randomized.

Yet another thing to make things less predictable. :^)
2021-02-14 11:53:49 +01:00
Andreas Kling 4188373020 Kernel: Fix TOCTOU in syscall entry region validation
We were doing stack and syscall-origin region validations before
taking the big process lock. There was a window of time where those
regions could then be unmapped/remapped by another thread before we
proceed with our syscall.

This patch closes that window, and makes sys$get_stack_bounds() rely
on the fact that we now know the userspace stack pointer to be valid.

Thanks to @BenWiederhake for spotting this! :^)
2021-02-14 11:47:14 +01:00
Ben Wiederhake c0692f1f95 Kernel: Avoid magic number in sys$poll 2021-02-14 10:57:33 +01:00
Andreas Kling cc341c95aa Kernel: Panic on sys$get_stack_bounds() in stack-less process 2021-02-14 10:51:18 +01:00
Andreas Kling 781d29a337 Kernel+Userland: Give sys$recvfd() an options argument for O_CLOEXEC
@bugaevc pointed out that we shouldn't be setting this flag in
userspace, and he's right of course.
2021-02-14 10:39:48 +01:00
Andreas Kling 09b1b09c19 Kernel: Assert if rounding-up-to-page-size would wrap around to 0
If we try to align a number above 0xfffff000 to the next multiple of
the page size (4 KiB), it would wrap around to 0. This is most likely
never what we want, so let's assert if that happens.
2021-02-14 10:01:50 +01:00
Andreas Kling 1593219a41 Kernel: Map signal trampoline into each process's address space
The signal trampoline was previously in kernelspace memory, but with
a special exception to make it user-accessible.

This patch moves it into each process's regular address space so we
can stop supporting user-allowed memory above 0xc0000000.
2021-02-14 01:33:17 +01:00
Andreas Kling ffdfbf1dba Kernel: Fix wrong sizeof() type in sys$execve() argument overflow check 2021-02-14 00:15:01 +01:00
Andreas Kling c877612211 Kernel: Round down base of partial ranges provided to munmap/mprotect
We were failing to round down the base of partial VM ranges. This led
to split regions being constructed that could have a non-page-aligned
base address. This would then trip assertions in the VM code.

Found by fuzz-syscalls. :^)
2021-02-13 01:49:44 +01:00
Andreas Kling 62f0f73bf0 Kernel: Limit the number of file descriptors sys$poll() can handle
Just slap an arbitrary limit on there so we don't panic if somebody
asks us to poll 1 fajillion fds.

Found by fuzz-syscalls. :^)
2021-02-13 01:18:03 +01:00
Andreas Kling 7551090056 Kernel: Round up ranges to page size multiples in munmap and mprotect
This prevents passing bad inputs to RangeAllocator who then asserts.

Found by fuzz-syscalls. :^)
2021-02-13 01:18:03 +01:00
Ben Wiederhake 546cdde776 Kernel: clock_nanosleep's 'flags' is not a bitset
This had the interesting effect that most, but not all, non-zero values
were interpreted as an absolute value.
2021-02-13 00:40:31 +01:00
Ben Wiederhake e1db8094b6 Kernel: Avoid casting arbitrary user-controlled int to enum
This caused a load-invalid-value warning by KUBSan.

Found by fuzz-syscalls. Can be reproduced by running this in the Shell:

    $ syscall waitid [ 1234 ]
2021-02-13 00:40:31 +01:00
Ben Wiederhake c6027ed7cc Kernel: Refuse excessively long iovec list
If a program attempts to write from more than a million different locations,
there is likely shenaniganery afoot! Refuse to write to prevent kmem exhaustion.

Found by fuzz-syscalls. Can be reproduced by running this in the Shell:

    $ syscall writev 1 [ 0 ] 0x08000000
2021-02-13 00:40:31 +01:00
Ben Wiederhake 987b7f7917 Kernel: Forbid empty and whitespace-only process names
Those only exist to confuse the user anyway.

Found while using fuzz-syscalls.
2021-02-13 00:40:31 +01:00
Ben Wiederhake 1e630fb78a Kernel: Avoid creating unkillable processes
Found by fuzz-syscalls. Can be reproduced by running this in the Shell:

    $ syscall exit_thread

This leaves the process in the 'Dying' state but never actually removes it.

Therefore, avoid this scenario by pretending to exit the entire process.
2021-02-13 00:40:31 +01:00
Andreas Kling 1ef43ec89a Kernel: Move get_interpreter_load_offset() out of Process class
This is only used inside the sys$execve() implementation so just make
it a execve.cpp local function.
2021-02-12 16:30:29 +01:00
Andreas Kling 1f277f0bd9 Kernel: Convert all *Builder::appendf() => appendff() 2021-02-09 19:18:13 +01:00
Andreas Kling 4ff0f971f7 Kernel: Prevent execve/ptrace race
Add a per-process ptrace lock and use it to prevent ptrace access to a
process after it decides to commit to a new executable in sys$execve().

Fixes #5230.
2021-02-08 23:05:41 +01:00
Andreas Kling 4b7b92c201 Kernel: Remove two unused fields from sys$execve's LoadResult 2021-02-08 22:31:03 +01:00
Andreas Kling 0d7af498d7 Kernel: Move ShouldAllocateTls enum from Process to execve.cpp 2021-02-08 22:24:37 +01:00
Andreas Kling b1c9f93fa3 Kernel: Skip generic region lookup in sys$futex and sys$get_stack_bounds
Just ask the process space directly instead of using the generic region
lookup that also checks for kernel regions.
2021-02-08 22:23:29 +01:00
Andreas Kling f39c2b653e Kernel: Reorganize ptrace implementation a bit
The generic parts of ptrace now live in Kernel/Syscalls/ptrace.cpp
and the i386 specific parts are moved to Arch/i386/CPU.cpp
2021-02-08 19:34:41 +01:00
Andreas Kling 45231051e6 Kernel: Set the dumpable flag before switching spaces in sys$execve() 2021-02-08 19:15:42 +01:00
Andreas Kling d746639171 Kernel: Remove outdated code to dump memory layout after exec load 2021-02-08 19:07:29 +01:00
Andreas Kling f1b5def8fd Kernel: Factor address space management out of the Process class
This patch adds Space, a class representing a process's address space.

- Each Process has a Space.
- The Space owns the PageDirectory and all Regions in the Process.

This allows us to reorganize sys$execve() so that it constructs and
populates a new Space fully before committing to it.

Previously, we would construct the new address space while still
running in the old one, and encountering an error meant we had to do
tedious and error-prone rollback.

Those problems are now gone, replaced by what's hopefully a set of much
smaller problems and missing cleanups. :^)
2021-02-08 18:27:28 +01:00
AnotherTest 09a43969ba Everywhere: Replace dbgln<flag>(...) with dbgln_if(flag, ...)
Replacement made by `find Kernel Userland -name '*.h' -o -name '*.cpp' | sed -i -Ee 's/dbgln\b<(\w+)>\(/dbgln_if(\1, /g'`
2021-02-08 18:08:55 +01:00
Andreas Kling b466ede1ea Kernel: Make sure we can allocate kernel stack before creating thread
Wrap thread creation in a Thread::try_create() helper that first
allocates a kernel stack region. If that allocation fails, we propagate
an ENOMEM error to the caller.

This avoids the situation where a thread is half-constructed, without a
valid kernel stack, and avoids having to do messy cleanup in that case.
2021-02-07 19:27:00 +01:00
Andreas Kling d4dd4a82bb Kernel: Don't allow sys$msyscall() on non-mmap regions 2021-02-02 20:16:13 +01:00
Andreas Kling 823186031d Kernel: Add a way to specify which memory regions can make syscalls
This patch adds sys$msyscall() which is loosely based on an OpenBSD
mechanism for preventing syscalls from non-blessed memory regions.

It works similarly to pledge and unveil, you can call it as many
times as you like, and when you're finished, you call it with a null
pointer and it will stop accepting new regions from then on.

If a syscall later happens and doesn't originate from one of the
previously blessed regions, the kernel will simply crash the process.
2021-02-02 20:13:44 +01:00
Ben Wiederhake cbee0c26e1 Kernel+keymap+KeyboardMapper: New pledge for getkeymap 2021-02-01 09:54:32 +01:00
Ben Wiederhake a2c21a55e1 Kernel+LibKeyboard: Enable querying the current keymap 2021-02-01 09:54:32 +01:00
Andreas Kling 6e4e3a7612 Kernel: Remove pledge exception for sys$getsockopt() with SO_PEERCRED
We had an exception that allowed SOL_SOCKET + SO_PEERCRED on local
socket to support LibIPC's PID exchange mechanism. This is no longer
needed so let's just remove the exception.
2021-01-31 09:29:27 +01:00
Andreas Kling 4d777a9bf4 Kernel: Allow changing thread names with the "stdio" promise
It's useful for programs to change their thread names to say something
interesting about what they are working on. Let's not require "thread"
for this since single-threaded programs may want to do it without
pledging "thread".
2021-01-30 23:38:57 +01:00
Andreas Kling 90343eeaeb Revert "Kernel: Return -ENOTDIR for non-directory mount target"
This reverts commit b7b09470ca.

Mounting a file on top of a file is a valid thing we support.
2021-01-30 13:52:12 +01:00
Andreas Kling 123c37e1c0 Kernel: Fix mix-up between MAP_STACK/MAP_ANONYMOUS in prot validation 2021-01-30 10:30:17 +01:00
Andreas Kling e55ef70e5e Kernel: Remove "has made executable exception for dynamic loader" flag
As Idan pointed out, this flag is actually not needed, since we don't
allow transitioning from previously-executable to writable anyway.
2021-01-30 10:06:52 +01:00
Andreas Kling d0c5979d96 Kernel: Add "prot_exec" pledge promise and require it for PROT_EXEC
This prevents sys$mmap() and sys$mprotect() from creating executable
memory mappings in pledged programs that don't have this promise.

Note that the dynamic loader runs before pledging happens, so it's
unaffected by this.
2021-01-29 18:56:34 +01:00
Andreas Kling 51df44534b Kernel: Disallow mapping anonymous memory as executable
This adds another layer of defense against introducing new code into a
running process. The only permitted way of doing so is by mmapping an
open file with PROT_READ | PROT_EXEC.

This does make any future JIT implementations slightly more complicated
but I think it's a worthwhile trade-off at this point. :^)
2021-01-29 14:52:34 +01:00
Andreas Kling af3d3c5c4a Kernel: Enforce W^X more strictly (like PaX MPROTECT)
This patch adds enforcement of two new rules:

- Memory that was previously writable cannot become executable
- Memory that was previously executable cannot become writable

Unfortunately we have to make an exception for text relocations in the
dynamic loader. Since those necessitate writing into a private copy
of library code, we allow programs to transition from RW to RX under
very specific conditions. See the implementation of sys$mprotect()'s
should_make_executable_exception_for_dynamic_loader() for details.
2021-01-29 14:52:27 +01:00
Linus Groh dbbc378fb2 Kernel: Return -ENOTBLK for non-block device Ext2FS mount source
When mounting an Ext2FS, a block device source is required. All other
filesystem types are unaffected, as most of them ignore the source file
descriptor anyway.

Fixes #5153.
2021-01-29 08:45:56 +01:00
Linus Groh b7b09470ca Kernel: Return -ENOTDIR for non-directory mount target
The absence of this check allowed silly things like this:

    # touch file
    # mount /dev/hda file
2021-01-29 08:45:56 +01:00
Sahan Fernando 6876b9a514 Kernel: Prevent mmap-ing as both fixed and randomized 2021-01-29 07:45:00 +01:00
Jorropo 22b0ff05d4
Kernel: sys$mmap PAGE_ROUND_UP size before calling allocate_randomized (#5154)
`allocate_randomized` assert an already sanitized size but `mmap` were
just forwarding whatever the process asked so it was possible to
trigger a kernel panic from an unpriviliged process just by asking some
randomly placed memory and a size non alligned with the page size.
This fixes this issue by rounding up to the next page size before
calling `allocate_randomized`.

Fixes #5149
2021-01-28 22:36:20 +01:00
Andreas Kling b6937e2560 Kernel+LibC: Add MAP_RANDOMIZED flag for sys$mmap()
This can be used to request random VM placement instead of the highly
predictable regular mmap(nullptr, ...) VM allocation strategy.

It will soon be used to implement ASLR in the dynamic loader. :^)
2021-01-28 16:23:38 +01:00
Andreas Kling e67402c702 Kernel: Remove Range "valid" state and use Optional<Range> instead
It's easier to understand VM ranges if they are always valid. We can
simply use an empty Optional<Range> to encode absence when needed.
2021-01-27 21:14:42 +01:00
Andreas Kling 5ab27e4bdc Kernel: sys$mmap() without MAP_FIXED should consider address a hint
If we can't use that specific address, it's still okay to put it
anywhere else in VM.
2021-01-27 21:14:42 +01:00
asynts 7cf0c7cc0d Meta: Split debug defines into multiple headers.
The following script was used to make these changes:

    #!/bin/bash
    set -e

    tmp=$(mktemp -d)

    echo "tmp=$tmp"

    find Kernel \( -name '*.cpp' -o -name '*.h' \) | sort > $tmp/Kernel.files
    find . \( -path ./Toolchain -prune -o -path ./Build -prune -o -path ./Kernel -prune \) -o \( -name '*.cpp' -o -name '*.h' \) -print | sort > $tmp/EverythingExceptKernel.files

    cat $tmp/Kernel.files | xargs grep -Eho '[A-Z0-9_]+_DEBUG' | sort | uniq > $tmp/Kernel.macros
    cat $tmp/EverythingExceptKernel.files | xargs grep -Eho '[A-Z0-9_]+_DEBUG' | sort | uniq > $tmp/EverythingExceptKernel.macros

    comm -23 $tmp/Kernel.macros $tmp/EverythingExceptKernel.macros > $tmp/Kernel.unique
    comm -1 $tmp/Kernel.macros $tmp/EverythingExceptKernel.macros > $tmp/EverythingExceptKernel.unique

    cat $tmp/Kernel.unique | awk '{ print "#cmakedefine01 "$1 }' > $tmp/Kernel.header
    cat $tmp/EverythingExceptKernel.unique | awk '{ print "#cmakedefine01 "$1 }' > $tmp/EverythingExceptKernel.header

    for macro in $(cat $tmp/Kernel.unique)
    do
        cat $tmp/Kernel.files | xargs grep -l $macro >> $tmp/Kernel.new-includes ||:
    done
    cat $tmp/Kernel.new-includes | sort > $tmp/Kernel.new-includes.sorted

    for macro in $(cat $tmp/EverythingExceptKernel.unique)
    do
        cat $tmp/Kernel.files | xargs grep -l $macro >> $tmp/Kernel.old-includes ||:
    done
    cat $tmp/Kernel.old-includes | sort > $tmp/Kernel.old-includes.sorted

    comm -23 $tmp/Kernel.new-includes.sorted $tmp/Kernel.old-includes.sorted > $tmp/Kernel.includes.new
    comm -13 $tmp/Kernel.new-includes.sorted $tmp/Kernel.old-includes.sorted > $tmp/Kernel.includes.old
    comm -12 $tmp/Kernel.new-includes.sorted $tmp/Kernel.old-includes.sorted > $tmp/Kernel.includes.mixed

    for file in $(cat $tmp/Kernel.includes.new)
    do
        sed -i -E 's/#include <AK\/Debug\.h>/#include <Kernel\/Debug\.h>/' $file
    done

    for file in $(cat $tmp/Kernel.includes.mixed)
    do
        echo "mixed include in $file, requires manual editing."
    done
2021-01-26 21:20:00 +01:00
Linus Groh e7183cc762 Kernel: Don't drop pledge()'d promises/execpromises when passing nullptr
When passing nullptr for either promises or execpromises to pledge(),
the expected behaviour is to not change their current value at all - we
were accidentally resetting them to 0, effectively dropping previously
pledge()'d promises.
2021-01-26 18:18:01 +01:00
Andreas Kling c7858622ec Kernel: Update process promise states on execve() and fork()
We now move the execpromises state into the regular promises, and clear
the execpromises state.

Also make sure to duplicate the promise state on fork.

This fixes an issue where "su" would launch a shell which immediately
crashed due to not having pledged "stdio".
2021-01-26 15:26:37 +01:00
Andreas Kling 1e25d2b734 Kernel: Remove allocate_region() functions that don't take a Range
Let's force callers to provide a VM range when allocating a region.
This makes ENOMEM error handling more visible and removes implicit
VM allocation which felt a bit magical.
2021-01-26 14:13:57 +01:00
Linus Groh 629180b7d8 Kernel: Support pledge() with empty promises
This tells the kernel that the process wants to use pledge, but without
pledging anything - effectively restricting it to syscalls that don't
require a certain promise. This is part of OpenBSD's pledge() as well,
which served as basis for Serenity's.
2021-01-25 23:22:21 +01:00
Andreas Kling ab14b0ac64 Kernel: Hoist VM range allocation up to sys$mmap() itself
Instead of letting each File subclass do range allocation in their
mmap() override, do it up front in sys$mmap().

This makes us honor alignment requests for file-backed memory mappings
and simplifies the code somwhat.
2021-01-25 18:57:06 +01:00
asynts eea72b9b5c Everywhere: Hook up remaining debug macros to Debug.h. 2021-01-25 09:47:36 +01:00
asynts 8465683dcf Everywhere: Debug macros instead of constexpr.
This was done with the following script:

    find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec sed -i -E 's/dbgln<debug_([a-z_]+)>/dbgln<\U\1_DEBUG>/' {} \;

    find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec sed -i -E 's/if constexpr \(debug_([a-z0-9_]+)/if constexpr \(\U\1_DEBUG/' {} \;
2021-01-25 09:47:36 +01:00
asynts acdcf59a33 Everywhere: Remove unnecessary debug comments.
It would be tempting to uncomment these statements, but that won't work
with the new changes.

This was done with the following commands:

    find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec awk -i inplace '$0 !~ /\/\/#define/ { if (!toggle) { print; } else { toggle = !toggle } } ; $0 ~/\/\/#define/ { toggle = 1 }' {} \;

    find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec awk -i inplace '$0 !~ /\/\/ #define/ { if (!toggle) { print; } else { toggle = !toggle } } ; $0 ~/\/\/ #define/ { toggle = 1 }' {} \;
2021-01-25 09:47:36 +01:00
asynts 1a3a0836c0 Everywhere: Use CMake to generate AK/Debug.h.
This was done with the help of several scripts, I dump them here to
easily find them later:

    awk '/#ifdef/ { print "#cmakedefine01 "$2 }' AK/Debug.h.in

    for debug_macro in $(awk '/#ifdef/ { print $2 }' AK/Debug.h.in)
    do
        find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec sed -i -E 's/#ifdef '$debug_macro'/#if '$debug_macro'/' {} \;
    done

    # Remember to remove WRAPPER_GERNERATOR_DEBUG from the list.
    awk '/#cmake/ { print "set("$2" ON)" }' AK/Debug.h.in
2021-01-25 09:47:36 +01:00
Andreas Kling f5d916a881 Kernel: Make sys$anon_create() fail if size == 0
An empty anonymous file is useless since it cannot be resized anyway,
so let's not support creating it.
2021-01-25 09:36:42 +01:00
Luke 50a2cb38e5 Kernel: Fix two error codes being returned as positive in Process::exec
This made the assertion on line 921 think it was a successful exec, when it wasn't.

Fixes #5084
2021-01-24 01:06:24 +01:00
asynts 1c1e577a5e Everywhere: Deprecate dbg(). 2021-01-23 16:46:26 +01:00
Andreas Kling c32176db27 Kernel: Don't preserve set-uid bit in open() and bind() modes
For some reason we were keeping the bits 04777 in file modes. That
doesn't seem right and I can't think of a reason why the set-uid bit
should be allowed to slip through.
2021-01-23 16:45:05 +01:00
Andreas Kling 54f421e170 Kernel: Clear coredump metadata on exec()
If for some reason a process wants to exec after saving some coredump
metadata, we should just throw away the data.
2021-01-23 09:41:11 +01:00
asynts 7b0a1a98d9 Everywhere: Replace a bundle of dbg with dbgln.
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.
2021-01-22 22:14:30 +01:00
asynts a348ab55b0 Everywhere: Replace a bundle of dbg with dbgln.
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.
2021-01-22 22:14:30 +01:00
Andreas Kling 2cd07c6212 Kernel+Userland: Remove "dns" pledge promise alias
This was just an alias for "unix" that I added early on back when there
was some belief that we might be compatible with OpenBSD. We're clearly
never going to be compatible with their pledges so just drop the alias.
2021-01-22 19:39:44 +01:00
Andreas Kling 19d3f8cab7 Kernel+LibC: Turn errno codes into a strongly typed enum
..and allow implicit creation of KResult and KResultOr from ErrnoCode.
This means that kernel functions that return those types can finally
do "return EINVAL;" and it will just work.

There's a handful of functions that still deal with signed integers
that should be converted to return KResults.
2021-01-20 23:20:02 +01:00
Linus Groh 2cc3d68615 Kernel+LibC: Add _SC_TTY_NAME_MAX 2021-01-18 22:28:56 +01:00
Tom 1d621ab172 Kernel: Some futex improvements
This adds support for FUTEX_WAKE_OP, FUTEX_WAIT_BITSET, FUTEX_WAKE_BITSET,
FUTEX_REQUEUE, and FUTEX_CMP_REQUEUE, as well well as global and private
futex and absolute/relative timeouts against the appropriate clock. This
also changes the implementation so that kernel resources are only used when
a thread is blocked on a futex.

Global futexes are implemented as offsets in VMObjects, so that different
processes can share a futex against the same VMObject despite potentially
being mapped at different virtual addresses.
2021-01-17 20:30:31 +01:00
Andreas Kling 992f513ad2 Kernel: Limit exec arguments and environment to 1/8th of stack each
This sort-of matches what some other systems do and seems like a
generally sane thing to do instead of allowing programs to spawn a
child with a nearly full stack.
2021-01-17 18:29:56 +01:00
Andreas Kling 1730c23775 Kernel: Remove a bunch of no-longer-necessary SmapDisablers
We forgot to remove the automatic SMAP disablers after fixing up all
this code to not access userspace memory directly. Let's lock things
down at last. :^)
2021-01-17 15:03:07 +01:00
Andreas Kling bf0719092f Kernel+Userland: Remove shared buffers (shbufs)
All users of this mechanism have been switched to anonymous files and
passing file descriptors with sendfd()/recvfd().

Shbufs got us where we are today, but it's time we say good-bye to them
and welcome a much more idiomatic replacement. :^)
2021-01-17 09:07:32 +01:00
Andreas Kling 05dbfe9ab6 Kernel: Remove sys$shbuf_seal() and userland wrappers
There are no remaining users of this syscall so let it go. :^)
2021-01-17 00:18:01 +01:00