Commit graph

723 commits

Author SHA1 Message Date
Andreas Kling cbcf891040 Kernel: Move select Process members into protected memory
Process member variable like m_euid are very valuable targets for
kernel exploits and until now they have been writable at all times.

This patch moves m_euid along with a whole bunch of other members
into a new Process::ProtectedData struct. This struct is remapped
as read-only memory whenever we don't need to write to it.

This means that a kernel write primitive is no longer enough to
overwrite a process's effective UID, you must first unprotect the
protected data where the UID is stored. :^)
2021-03-10 22:30:02 +01:00
Andreas Kling 84725ef3a5 Kernel+UserspaceEmulator: Add sys$emuctl() system call
This returns ENOSYS if you are running in the real kernel, and some
other result if you are running in UserspaceEmulator.

There are other ways we could check if we're inside an emulator, but
it seemed easier to just ask. :^)
2021-03-09 08:58:26 +01:00
Andreas Kling b425c2602c Kernel: Better handling of allocation failure in profiling
If we can't allocate a PerformanceEventBuffer to store the profiling
events, we now fail sys$profiling_enable() and sys$perf_event()
with ENOMEM instead of carrying on with a broken buffer.
2021-03-02 22:38:06 +01:00
Ben Wiederhake 336303bda4 Kernel: Make kgettimeofday use AK::Time 2021-03-02 08:36:08 +01:00
Ben Wiederhake 05d5e3fad9 Kernel: Remove duplicative kgettimeofday(timeval&) function 2021-03-02 08:36:08 +01:00
Ben Wiederhake c040e64b7d Kernel: Make TimeManagement use AK::Time internally
I don't dare touch the multi-threading logic and locking mechanism, so it stays
timespec for now. However, this could and should be changed to AK::Time, and I
bet it will simplify the "increment_time_since_boot()" code.
2021-03-02 08:36:08 +01:00
Andreas Kling 272c2e6ec5 Kernel: Use Userspace<T> in sys${munmap,mprotect,madvise,msyscall}() 2021-03-01 15:53:33 +01:00
Andreas Kling bebceaa32c Kernel: Use Userspace<T> in sys$select() 2021-03-01 15:07:01 +01:00
Andreas Kling a1a82c1d95 Kernel: Use Userspace<T> in sys$get_dir_entries() 2021-03-01 15:04:31 +01:00
Andreas Kling b5f32be577 Kernel: Use Userspace<T> in sys$get_stack_bounds() 2021-03-01 14:50:36 +01:00
Andreas Kling 122c7b6cbb Kernel: Use Userspace<T> in sys$write() 2021-03-01 14:35:06 +01:00
Andreas Kling 6a6eb8844a Kernel: Use Userspace<T> in sys$sigaction()
fuzz-syscalls found a bunch of unaligned accesses into struct sigaction
via this syscall. This patch fixes that issue by porting the syscall
to Userspace<T> which we should have done anyway. :^)

Fixes #5500.
2021-03-01 14:06:20 +01:00
Andreas Kling ac71775de5 Kernel: Make all syscall functions return KResultOr<T>
This makes it a lot easier to return errors since we no longer have to
worry about negating EFOO errors and can just return them flat.
2021-03-01 13:54:32 +01:00
Linus Groh e265054c12 Everywhere: Remove a bunch of redundant 'AK::' namespace prefixes
This is basically just for consistency, it's quite strange to see
multiple AK container types next to each other, some with and some
without the namespace prefix - we're 'using AK::Foo;' a lot and should
leverage that. :^)
2021-02-26 16:59:56 +01:00
Andreas Kling 8eeb8db2ed Kernel: Don't disable interrupts while dealing with a process crash
This was necessary in the past when crash handling would modify
various global things, but all that stuff is long gone so we can
simplify crashes by leaving the interrupt flag alone.
2021-02-25 19:36:36 +01:00
Andreas Kling 8f70528f30 Kernel: Take some baby steps towards x86_64
Make more of the kernel compile in 64-bit mode, and make some things
pointer-size-agnostic (by using FlatPtr.)

There's a lot of work to do here before the kernel will even compile.
2021-02-25 16:27:12 +01:00
Andreas Kling 5d180d1f99 Everywhere: Rename ASSERT => VERIFY
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)

Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.

We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.
2021-02-23 20:56:54 +01:00
Andreas Kling 84b2d4c475 Kernel: Add "map_fixed" pledge promise
This is a new promise that guards access to mmap() with MAP_FIXED.

Fixed-address mappings are rarely used, but can be useful if you are
trying to groom the process address space for malicious purposes.

None of our programs need this at the moment, as the only user of
MAP_FIXED is DynamicLoader, but the fixed mappings are constructed
before the process has had a chance to pledge anything.
2021-02-21 01:08:48 +01:00
Andreas Kling 55a9a4f57a Kernel: Use KResult a bit more in sys$execve() 2021-02-18 09:37:33 +01:00
AnotherTest a3a7ab83c4 Kernel+LibC: Implement readv
We already had writev, so let's just add readv too.
2021-02-15 17:32:56 +01:00
Andreas Kling 781d29a337 Kernel+Userland: Give sys$recvfd() an options argument for O_CLOEXEC
@bugaevc pointed out that we shouldn't be setting this flag in
userspace, and he's right of course.
2021-02-14 10:39:48 +01:00
Andreas Kling 1593219a41 Kernel: Map signal trampoline into each process's address space
The signal trampoline was previously in kernelspace memory, but with
a special exception to make it user-accessible.

This patch moves it into each process's regular address space so we
can stop supporting user-allowed memory above 0xc0000000.
2021-02-14 01:33:17 +01:00
Andreas Kling 1ef43ec89a Kernel: Move get_interpreter_load_offset() out of Process class
This is only used inside the sys$execve() implementation so just make
it a execve.cpp local function.
2021-02-12 16:30:29 +01:00
Andreas Kling 4ff0f971f7 Kernel: Prevent execve/ptrace race
Add a per-process ptrace lock and use it to prevent ptrace access to a
process after it decides to commit to a new executable in sys$execve().

Fixes #5230.
2021-02-08 23:05:41 +01:00
Andreas Kling 0d7af498d7 Kernel: Move ShouldAllocateTls enum from Process to execve.cpp 2021-02-08 22:24:37 +01:00
Andreas Kling 8bda30edd2 Kernel: Move memory statistics helpers from Process to Space 2021-02-08 22:23:29 +01:00
Andreas Kling f1b5def8fd Kernel: Factor address space management out of the Process class
This patch adds Space, a class representing a process's address space.

- Each Process has a Space.
- The Space owns the PageDirectory and all Regions in the Process.

This allows us to reorganize sys$execve() so that it constructs and
populates a new Space fully before committing to it.

Previously, we would construct the new address space while still
running in the old one, and encountering an error meant we had to do
tedious and error-prone rollback.

Those problems are now gone, replaced by what's hopefully a set of much
smaller problems and missing cleanups. :^)
2021-02-08 18:27:28 +01:00
Andreas Kling cf5ab665e0 Kernel: Remove unused Process::for_each_thread_in_coredump() 2021-02-08 18:27:28 +01:00
Ben Wiederhake 0a2304ba05 Everywhere: Fix weird includes 2021-02-08 18:03:57 +01:00
Andreas Kling 5c1c82cd33 Kernel: Remove unused function Process::backtrace() 2021-02-07 19:27:00 +01:00
Andreas Kling b1813e5dae Kernel: Remove some unused declarations from Process 2021-02-07 19:27:00 +01:00
Andreas Kling 823186031d Kernel: Add a way to specify which memory regions can make syscalls
This patch adds sys$msyscall() which is loosely based on an OpenBSD
mechanism for preventing syscalls from non-blessed memory regions.

It works similarly to pledge and unveil, you can call it as many
times as you like, and when you're finished, you call it with a null
pointer and it will stop accepting new regions from then on.

If a syscall later happens and doesn't originate from one of the
previously blessed regions, the kernel will simply crash the process.
2021-02-02 20:13:44 +01:00
Ben Wiederhake cbee0c26e1 Kernel+keymap+KeyboardMapper: New pledge for getkeymap 2021-02-01 09:54:32 +01:00
Ben Wiederhake a2c21a55e1 Kernel+LibKeyboard: Enable querying the current keymap 2021-02-01 09:54:32 +01:00
Andreas Kling d0c5979d96 Kernel: Add "prot_exec" pledge promise and require it for PROT_EXEC
This prevents sys$mmap() and sys$mprotect() from creating executable
memory mappings in pledged programs that don't have this promise.

Note that the dynamic loader runs before pledging happens, so it's
unaffected by this.
2021-01-29 18:56:34 +01:00
Andreas Kling 5ff355c0cd Kernel: Generate coredump backtraces from "threads for coredump" list
This broke with the change that gave each process a list of its own
threads. Since threads are removed slightly earlier from that list
during process teardown, we're not able to use it for generating
coredump backtraces. Fortunately we have the "threads for coredump"
list for just this purpose. :^)
2021-01-28 08:41:18 +01:00
Tom 8104abf640 Kernel: Remove colonel special-case from Process::for_each_thread
Since each Process now has its own list of threads, we don't need
to treat colonel any different anymore. This also means that it
reports all kernel threads, not just the idle threads.
2021-01-28 08:14:12 +01:00
Tom ac3927086f Kernel: Keep a list of threads per Process
This allow us to iterate only the threads of the process.
2021-01-27 22:48:41 +01:00
Tom 03a9ee79fa Kernel: Implement thread priority queues
Rather than walking all Thread instances and putting them into
a vector to be sorted by priority, queue them into priority sorted
linked lists as soon as they become ready to be executed.
2021-01-27 22:48:41 +01:00
Andreas Kling e67402c702 Kernel: Remove Range "valid" state and use Optional<Range> instead
It's easier to understand VM ranges if they are always valid. We can
simply use an empty Optional<Range> to encode absence when needed.
2021-01-27 21:14:42 +01:00
Tom 21d288a10e Kernel: Make Thread::current smp-safe
Change Thread::current to be a static function and read using the fs
register, which eliminates a window between Processor::current()
returning and calling a function on it, which can trigger preemption
and a move to a different processor, which then causes operating
on the wrong object.
2021-01-27 21:12:24 +01:00
Andreas Kling c7858622ec Kernel: Update process promise states on execve() and fork()
We now move the execpromises state into the regular promises, and clear
the execpromises state.

Also make sure to duplicate the promise state on fork.

This fixes an issue where "su" would launch a shell which immediately
crashed due to not having pledged "stdio".
2021-01-26 15:26:37 +01:00
Andreas Kling 1e25d2b734 Kernel: Remove allocate_region() functions that don't take a Range
Let's force callers to provide a VM range when allocating a region.
This makes ENOMEM error handling more visible and removes implicit
VM allocation which felt a bit magical.
2021-01-26 14:13:57 +01:00
Linus Groh 629180b7d8 Kernel: Support pledge() with empty promises
This tells the kernel that the process wants to use pledge, but without
pledging anything - effectively restricting it to syscalls that don't
require a certain promise. This is part of OpenBSD's pledge() as well,
which served as basis for Serenity's.
2021-01-25 23:22:21 +01:00
Linus Groh 678919e9c1 Kernel: Set "pledge_violation" coredump metadata in REQUIRE_PROMISE()
Similar to LibC storing an assertion message before aborting, process
death by pledge violation now sets a "pledge_violation" key with the
respective pledge name as value in its coredump metadata, which the
CrashReporter will then show.
2021-01-20 21:01:15 +01:00
Tom 1d621ab172 Kernel: Some futex improvements
This adds support for FUTEX_WAKE_OP, FUTEX_WAIT_BITSET, FUTEX_WAKE_BITSET,
FUTEX_REQUEUE, and FUTEX_CMP_REQUEUE, as well well as global and private
futex and absolute/relative timeouts against the appropriate clock. This
also changes the implementation so that kernel resources are only used when
a thread is blocked on a futex.

Global futexes are implemented as offsets in VMObjects, so that different
processes can share a futex against the same VMObject despite potentially
being mapped at different virtual addresses.
2021-01-17 20:30:31 +01:00
Andreas Kling bf0719092f Kernel+Userland: Remove shared buffers (shbufs)
All users of this mechanism have been switched to anonymous files and
passing file descriptors with sendfd()/recvfd().

Shbufs got us where we are today, but it's time we say good-bye to them
and welcome a much more idiomatic replacement. :^)
2021-01-17 09:07:32 +01:00
Andreas Kling 05dbfe9ab6 Kernel: Remove sys$shbuf_seal() and userland wrappers
There are no remaining users of this syscall so let it go. :^)
2021-01-17 00:18:01 +01:00
Andreas Kling b818cf898e Kernel+Userland: Remove sys$shbuf_allow_all() and userland wrappers
Nobody is using globally shared shbufs anymore, so let's remove them.
2021-01-16 22:43:03 +01:00
Ben Wiederhake ea5825f2c9 Kernel+LibC: Make sys$getcwd truncate the result silently
This gives us the superpower of knowing the ideal buffer length if it fails.
See also https://github.com/SerenityOS/serenity/discussions/4357
2021-01-16 22:40:53 +01:00
Andreas Kling 01c2480eb3 Kernel+LibC+WindowServer: Remove unused thread/process boost mechanism
The priority boosting mechanism has been broken for a very long time.
Let's remove it from the codebase and we can bring it back the day
someone feels like implementing it in a working way. :^)
2021-01-16 14:52:04 +01:00
Andreas Kling 43109f9614 Kernel: Remove unused syscall sys$minherit()
This is no longer used. We can bring it back the day we need it.
2021-01-16 14:52:04 +01:00
Andreas Kling de31e82f97 Kernel: Remove sys$shbuf_set_volatile() and userland wrappers
There are no remaining users of this syscall so let's remove it! :^)
2021-01-16 14:52:04 +01:00
Linus Groh 1ccc2e6482 Kernel: Store process arguments and environment in coredumps
Currently they're only pushed onto the stack but not easily accessible
from the Process class, so this adds a Vector<String> for both.
2021-01-15 23:26:47 +01:00
Linus Groh 057ae36e32 Kernel: Prevent threads from being destructed between die() and finalize()
Killing remaining threads already happens in Process::die(), but
coredumps are only written in Process::finalize(). We need to keep a
reference to each of those threads to prevent them from being destructed
between those two functions, otherwise coredumps will only ever contain
information about the last remaining thread.

Fixes the underlying problem of #4778, though the UI will need
refinements to not show every thread's backtrace mashed together.
2021-01-15 23:26:47 +01:00
Andreas Kling 64b0d89335 Kernel: Make Process::allocate_region*() return KResultOr<Region*>
This allows region allocation to return specific errors and we don't
have to assume every failure is an ENOMEM.
2021-01-15 19:10:30 +01:00
Andreas Kling fb4993f067 Kernel: Add anonymous files, created with sys$anon_create()
This patch adds a new AnonymousFile class which is a File backed by
an AnonymousVMObject that can only be mmap'ed and nothing else, really.

I'm hoping that this can become a replacement for shbufs. :^)
2021-01-15 13:56:47 +01:00
Andreas Kling f03800cee3 Kernel: Add dedicated "ptrace" pledge promise
The vast majority of programs don't ever need to use sys$ptrace(),
and it seems like a high-value system call to prevent a compromised
process from using.

This patch moves sys$ptrace() from the "proc" promise to its own,
new "ptrace" promise and updates the affected apps.
2021-01-11 22:32:59 +01:00
Andreas Kling 5c73c1bff8 Kernel: Don't dump perfcore for non-dumpable processes
Fixes #4904
2021-01-11 18:53:45 +01:00
Andreas Kling 5dafb72370 Kernel+Profiler: Make profiling per-process and without core dumps
This patch merges the profiling functionality in the kernel with the
performance events mechanism. A profiler sample is now just another
perf event, rather than a dedicated thing.

Since perf events were already per-process, this now makes profiling
per-process as well.

Processes with perf events would already write out a perfcore.PID file
to the current directory on death, but since we may want to profile
a process and then let it continue running, recorded perf events can
now be accessed at any time via /proc/PID/perf_events.

This patch also adds information about process memory regions to the
perfcore JSON format. This removes the need to supply a core dump to
the Profiler app for symbolication, and so the "profiler coredump"
mechanism is removed entirely.

There's still a hard limit of 4MB worth of perf events per process,
so this is by no means a perfect final design, but it's a nice step
forward for both simplicity and stability.

Fixes #4848
Fixes #4849
2021-01-11 11:36:00 +01:00
Itamar f259d96871 Kernel: Avoid collision between dynamic loader and main program
When loading non position-independent programs, we now take care not to
load the dynamic loader at an address that collides with the location
the main program wants to load at.

Fixes #4847.
2021-01-10 22:04:43 +01:00
Itamar 40a8159c62 Kernel: Plumb the elf header of the main program down to Process::load
This will enable us to take the desired load address of non-position
independent programs into account when randomizing the load address
of the dynamic loader.
2021-01-10 22:04:43 +01:00
asynts 019c9eb749 Everywhere: Replace a bundle of dbg with dbgln.
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.
2021-01-09 21:11:09 +01:00
Andreas Kling d991658794 Kernel+LibC: Tidy up assertion failures with a dedicated syscall
This patch adds sys$abort() which immediately crashes the process with
SIGABRT. This makes assertion backtraces a lot nicer by removing all
the gunk that otherwise happens between __assertion_failed() and
actually crashing from the SIGABRT.
2021-01-04 21:57:30 +01:00
Tom 901ef3f1c8 Kernel: Specify default memory order for some non-synchronizing Atomics 2021-01-04 19:13:52 +01:00
Linus Groh 0571a17f57 Kernel+LibELF: Store termination signal in coredump ProcessInfo 2021-01-03 22:12:42 +01:00
William Marlow 747e8de96a Kernel+Loader.so: Allow dynamic executables without an interpreter
Commit a3a9016701 removed the PT_INTERP header
from Loader.so which cleaned up some kernel code in execve. Unfortunately
it prevents Loader.so from being run as an executable
2021-01-03 19:45:16 +01:00
Andreas Kling 5dae85afe7 Kernel: Pass "shared" flag to Region constructor
Before this change, we would sometimes map a region into the address
space with !is_shared(), and then moments later call set_shared(true).

I found this very confusing while debugging, so this patch makes us pass
the initial shared flag to the Region constructor, ensuring that it's in
the correct state by the time we first map the region.
2021-01-02 16:57:31 +01:00
Tom 476f17b3f1 Kernel: Merge PurgeableVMObject into AnonymousVMObject
This implements memory commitments and lazy-allocation of committed
memory.
2021-01-01 23:43:44 +01:00
Tom c3451899bc Kernel: Add MAP_NORESERVE support to mmap
Rather than lazily committing regions by default, we now commit
the entire region unless MAP_NORESERVE is specified.

This solves random crashes in low-memory situations where e.g. the
malloc heap allocated memory, but using pages that haven't been
used before triggers a crash when no more physical memory is available.

Use this flag to create large regions without actually committing
the backing memory. madvise() can be used to commit arbitrary areas
of such regions after creating them.
2021-01-01 23:43:44 +01:00
Linus Groh 91332515a6 Kernel: Add sys$set_coredump_metadata() syscall
This can be used by applications to store information (key/value pairs)
likely useful for debugging, which will then be embedded in the coredump.
2020-12-30 16:28:27 +01:00
Andreas Kling 30dbe9c78a Kernel+LibC: Add a very limited sys$mremap() implementation
This syscall can currently only remap a shared file-backed mapping into
a private file-backed mapping.
2020-12-29 02:20:43 +01:00
Andreas Kling 0e2b7f9c9a Kernel: Remove the per-process icon_id and sys$set_process_icon()
This was a goofy kernel API where you could assign an icon_id (int) to
a process which referred to a global shbuf with a 16x16 icon bitmap
inside it.

Instead of this, programs that want to display a process icon now
retrieve it from the process executable instead.
2020-12-27 01:16:56 +01:00
Andreas Kling 21ccbc2167 Kernel: Expose process executable paths in /proc/all 2020-12-27 01:16:56 +01:00
AnotherTest 7b5aa06702 Kernel: Allow 'elevating' unveil permissions if implicitly inherited from '/'
This can happen when an unveil follows another with a path that is a
sub-path of the other one:
```c++
unveil("/home/anon/.config/whoa.ini", "rw");
unveil("/home/anon", "r"); // this would fail, as "/home/anon" inherits
                           // the permissions of "/", which is None.
```
2020-12-26 16:10:04 +01:00
AnotherTest a9184fcb76 Kernel: Implement unveil() as a prefix-tree
Fixes #4530.
2020-12-26 11:54:54 +01:00
Andreas Kling 82f86e35d6 Kernel+LibC: Introduce a "dumpable" flag for processes
This new flag controls two things:
- Whether the kernel will generate core dumps for the process
- Whether the EUID:EGID should own the process's files in /proc

Processes are automatically made non-dumpable when their EUID or EGID is
changed, either via syscalls that specifically modify those ID's, or via
sys$execve(), when a set-uid or set-gid program is executed.

A process can change its own dumpable flag at any time by calling the
new sys$prctl(PR_SET_DUMPABLE) syscall.

Fixes #4504.
2020-12-25 19:35:55 +01:00
Andreas Kling 89d3b09638 Kernel: Allocate new main thread stack before committing to exec
If the allocation fails (e.g ENOMEM) we want to simply return an error
from sys$execve() and continue executing the current executable.

This patch also moves make_userspace_stack_for_main_thread() out of the
Thread class since it had nothing in particular to do with Thread.
2020-12-25 16:22:01 +01:00
Andreas Kling 2f1712cc29 Kernel: Move ELF auxiliary vector building out of Process class
Process had a couple of members whose only purpose was holding on to
some temporary data while building the auxiliary vector. Remove those
members and move the vector building to a free function in execve.cpp
2020-12-25 15:23:35 +01:00
Andreas Kling 40e9edd798 LibELF: Move AuxiliaryValue into the ELF namespace 2020-12-25 14:48:30 +01:00
Andreas Kling d7ad082afa Kernel+LibELF: Stop doing ELF symbolication in the kernel
Now that the CrashDaemon symbolicates crashes in userspace, let's take
this one step further and stop trying to symbolicate userspace programs
in the kernel at all.
2020-12-25 01:03:46 +01:00
Andreas Kling 8e79bde2b7 Kernel: Move KBufferBuilder to the fallible KBuffer API
KBufferBuilder::build() now returns an OwnPtr<KBuffer> and can fail.
Clients of the API have been updated to handle that situation.
2020-12-18 19:22:26 +01:00
Itamar b4842d33bb Kernel: Generate a coredump file when a process crashes
When a process crashes, we generate a coredump file and write it in
/tmp/coredumps/.

The coredump file is an ELF file of type ET_CORE.
It contains a segment for every userspace memory region of the process,
and an additional PT_NOTE segment that contains the registers state for
each thread, and a additional data about memory regions
(e.g their name).
2020-12-14 23:05:53 +01:00
Itamar efe4da57df Loader: Stabilize loader & Use shared libraries everywhere :^)
The dynamic loader is now stable enough to be used everywhere in the
system - so this commit does just that.
No More .a Files, Long Live .so's!
2020-12-14 23:05:53 +01:00
Itamar 9ca1a0731f Kernel: Support TLS allocation from userspace
This adds an allocate_tls syscall through which a userspace process
can request the allocation of a TLS region with a given size.

This will be used by the dynamic loader to allocate TLS for the main
executable & its libraries.
2020-12-14 23:05:53 +01:00
Itamar 5b87904ab5 Kernel: Add ability to load interpreter instead of main program
When the main executable needs an interpreter, we load the requested
interpreter program, and pass to it an open file decsriptor to the main
executable via the auxiliary vector.

Note that we do not allocate a TLS region for the interpreter.
2020-12-14 23:05:53 +01:00
Tom c455fc2030 Kernel: Change wait blocking to Process-only blocking
This prevents zombies created by multi-threaded applications and brings
our model back to closer to what other OSs do.

This also means that SIGSTOP needs to halt all threads, and SIGCONT needs
to resume those threads.
2020-12-12 21:28:12 +01:00
Tom 4bbee00650 Kernel: disown should unblock any potential waiters
This is necessary because if a process changes the state to Stopped
or resumes from that state, a wait entry is created in the parent
process. So, if a child process does this before disown is called,
we need to clear those entries to avoid leaking references/zombies
that won't be cleaned up until the former parent exits.

This also should solve an even more unlikely corner case where another
thread is waiting on a pid that is being disowned by another thread.
2020-12-12 21:28:12 +01:00
Tom 4c1e27ec65 Kernel: Use TimerQueue for SIGALRM 2020-12-02 13:02:04 +01:00
Tom 046d6855f5 Kernel: Move block condition evaluation out of the Scheduler
This makes the Scheduler a lot leaner by not having to evaluate
block conditions every time it is invoked. Instead evaluate them as
the states change, and unblock threads at that point.

This also implements some more waitid/waitpid/wait features and
behavior. For example, WUNTRACED and WNOWAIT are now supported. And
wait will now not return EINTR when SIGCHLD is delivered at the
same time.
2020-11-30 13:17:02 +01:00
Tom 6a620562cc Kernel: Allow passing a thread argument for new kernel threads
This adds the ability to pass a pointer to kernel thread/process.
Also add the ability to use a closure as thread function, which
allows passing information to a kernel thread more easily.
2020-11-30 13:17:02 +01:00
Sergey Bugaev 098070b767 Kernel: Add unveil('b')
This is a new "browse" permission that lets you open (and subsequently list
contents of) directories underneath the path, but not regular files or any other
types of files.
2020-11-23 18:37:40 +01:00
Nico Weber 323e727a4c Kernel+LibC: Add adjtime(2)
Most systems (Linux, OpenBSD) adjust 0.5 ms per second, or 0.5 us per
1 ms tick. That is, the clock is sped up or slowed down by at most
0.05%.  This means adjusting the clock by 1 s takes 2000 s, and the
clock an be adjusted by at most 1.8 s per hour.

FreeBSD adjusts 5 ms per second if the remaining time adjustment is
>= 1 s (0.5%) , else it adjusts by 0.5 ms as well. This allows adjusting
by (almost) 18 s per hour.

Since Serenity OS can lose more than 22 s per hour (#3429), this
picks an adjustment rate up to 1% for now. This allows us to
adjust up to 36s per hour, which should be sufficient to adjust
the clock fast enough to keep up with how much time the clock
currently loses. Once we have a fancier NTP implementation that can
adjust tick rate in addition to offset, we can think about reducing
this.

adjtime is a bit old-school and most current POSIX-y OSs instead
implement adjtimex/ntp_adjtime, but a) we have to start somewhere
b) ntp_adjtime() is a fairly gnarly API. OpenBSD's adjfreq looks
like it might provide similar functionality with a nicer API. But
before worrying about all this, it's probably a good idea to get
to a place where the kernel APIs are (barely) good enough so that
we can write an ntp service, and once we have that we should write
a way to automatically evaluate how well it keeps the time adjusted,
and only then should we add improvements ot the adjustment mechanism.
2020-11-10 19:03:08 +01:00
Tom 838d9fa251 Kernel: Make Thread refcounted
Similar to Process, we need to make Thread refcounted. This will solve
problems that will appear once we schedule threads on more than one
processor. This allows us to hold onto threads without necessarily
holding the scheduler lock for the entire duration.
2020-09-27 19:46:04 +02:00
Nico Weber b36a2d6686 Kernel+LibC+UserspaceEmulator: Mostly add recvmsg(), sendmsg()
The implementation only supports a single iovec for now.
Some might say having more than one iovec is the main point of
recvmsg() and sendmsg(), but I'm interested in the control message
bits.
2020-09-17 17:23:01 +02:00
Nico Weber c9a3a5b488 Kernel: Use Userspace<> for sys$writev 2020-09-15 20:20:38 +02:00
Tom c8d9f1b9c9 Kernel: Make copy_to/from_user safe and remove unnecessary checks
Since the CPU already does almost all necessary validation steps
for us, we don't really need to attempt to do this. Doing it
ourselves doesn't really work very reliably, because we'd have to
account for other processors modifying virtual memory, and we'd
have to account for e.g. pages not being able to be allocated
due to insufficient resources.

So change the copy_to/from_user (and associated helper functions)
to use the new safe_memcpy, which will return whether it succeeded
or not. The only manual validation step needed (which the CPU
can't perform for us) is making sure the pointers provided by user
mode aren't pointing to kernel mappings.

To make it easier to read/write from/to either kernel or user mode
data add the UserOrKernelBuffer helper class, which will internally
either use copy_from/to_user or directly memcpy, or pass the data
through directly using a temporary buffer on the stack.

Last but not least we need to keep syscall params trivial as we
need to copy them from/to user mode using copy_from/to_user.
2020-09-13 21:19:15 +02:00
Tom 0fab0ee96a Kernel: Rename Process::is_ring0/3 to Process::is_kernel/user_process
Since "rings" typically refer to code execution and user processes
can also execute in ring 0, rename these functions to more accurately
describe what they mean: kernel processes and user processes.
2020-09-10 19:57:15 +02:00
asynts ec1080b18a Refactor: Replace usages of FixedArray with Vector. 2020-09-08 14:01:21 +02:00
AnotherTest 688e54eac7 Kernel: Distinguish between new and old process groups with equal pgids
This does not add any behaviour change to the processes, but it ties a
TTY to an active process group via TIOCSPGRP, and returns the TTY to the
kernel when all processes in the process group die.
Also makes the TTY keep a link to the original controlling process' parent (for
SIGCHLD) instead of the process itself.
2020-08-19 21:21:34 +02:00
Brian Gianforcaro 8e97de2df9 Kernel: Use Userspace<T> for the recvfrom syscall, and Socket implementation
This fixes a bunch of unchecked kernel reads and writes, seems like they
would might exploitable :). Write of sockaddr_in size to any address you
please...
2020-08-19 21:05:28 +02:00
Brian Gianforcaro 9f9b05ba0f Kernel: Use Userspace<T> for the sendto syscall, and Socket implementation
Note that the data member is of type ImmutableBufferArgument, which has
no Userspace<T> usage. I left it alone for now, to be fixed in a future
change holistically for all usages.
2020-08-19 21:05:28 +02:00
Andreas Kling 9ddd540ca9 Kernel: Bump process thread count to a 32-bit value
We should support more than 65535 threads, after all. :^)
2020-08-17 18:05:35 +02:00
Andreas Kling 65f2270232 Kernel+LibC+UserspaceEmulator: Bring back sys$dup2()
This is racy in userspace and non-racy in kernelspace so let's keep
it in kernelspace.

The behavior change where CLOEXEC is preserved when dup2() is called
with (old_fd == new_fd) was good though, let's keep that.
2020-08-15 11:11:34 +02:00
Andreas Kling bf247fb45f Kernel+LibC+UserspaceEmulator: Remove sys$dup() and sys$dup2()
We can just implement these in userspace, so yay two less syscalls!
2020-08-15 01:30:22 +02:00
Brian Gianforcaro 0e627b0273 Kernel: Use Userspace<T> for the exit_thread syscall
Userspace<void*> is a bit strange here, as it would appear to the
user that we intend to de-refrence the pointer in kernel mode.

However I think it does a good join of illustrating that we are
treating the void* as a value type,  instead of a pointer type.
2020-08-10 12:52:15 +02:00
Brian Gianforcaro d3847b3489 Kernel: Use Userspace<T> for the join_thread syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro e8917cc5f3 Kernel: Use Userspace<T> for the chroot syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro 20e2a5c111 Kernel: Use Userspace<T> for the module_unload syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro c4927ceb08 Kernel: Use Userspace<T> for the module_load syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro b5a2a215f6 Kernel: Use Userspace<T> for the getrandom syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro c8ae244ab8 Kernel: Use Userspace<T> for the shbuf_get syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro e073f2b59e Kernel: Use Userspace<T> for the get_thread_name syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro 9652b0ae2b Kernel: Use Userspace<T> for the set_thread_name syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro 0e20a6df0a Kernel: Use Userspace<T> for the connect syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro 8bd9dbc220 Kernel: Use Userspace<T> for the accept syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro 02660b5d60 Kernel: Use Userspace<T> for the bind syscall, and implementation 2020-08-10 12:52:15 +02:00
Brian Gianforcaro 2bac7190c8 Kernel: Use Userspace<T> for the chmod syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro 82bf6e8133 Kernel: Use Userspace<T> for the umount syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro 317800324c Kernel: Use Userspace<T> for the unlink syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro ecfe20efd2 Kernel: Use Userspace<T> for the sigpending syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro fbb26b28b9 Kernel: Use Userspace<T> for the sigprocmask syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro 431145148e Kernel: Use Userspace<T> for the fstat syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro 8dd78201a4 Kernel: Use Userspace<T> for the uname syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro cfedd62b5c Kernel: Use Userspace<T> for the sethostname syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro 1d9554f470 Kernel: Use Userspace<T> for the gethostname syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro b069d757a3 Kernel: Use Userspace<T> for the clock_settime syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro 1be6145fdf Kernel: Modifiy clock_settime timespec argument to const
The timeppec paramter is read only, and should be const.
2020-08-10 12:52:15 +02:00
Brian Gianforcaro b4d04fd8d1 Kernel: Use Userspace<T> for the clock_gettime syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro 48d9f3c2e6 Kernel: Use Userspace<T> for the getresgid syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro 3ca18a88d7 Kernel: Use Userspace<T> for the getresuid syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro 7943655838 Kernel: Use Userspace<T> for the times syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro e7728ca8fd Kernel: Use Userspace<T> for the getgroups syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro 80011cd62d Kernel: Use Userspace<T> for the setgroups syscall 2020-08-10 12:52:15 +02:00
Brian Gianforcaro 0f42463eab Kernel: Use Userspace<T> for the execve syscall 2020-08-10 12:52:15 +02:00
Ben Wiederhake bee08a4b9f Kernel: More PID/TID typing 2020-08-10 11:51:45 +02:00
Ben Wiederhake 7bdf54c837 Kernel: PID/PGID typing
This compiles, and fixes two bugs:
- setpgid() confusion (see previous commit)
- tcsetpgrp() now allows to set a non-empty process group even if
  the group leader has already died. This makes Serenity slightly
  more POSIX-compatible.
2020-08-10 11:51:45 +02:00
Ben Wiederhake f5744a6f2f Kernel: PID/TID typing
This compiles, and contains exactly the same bugs as before.
The regex 'FIXME: PID/' should reveal all markers that I left behind, including:
- Incomplete conversion
- Issues or things that look fishy
- Actual bugs that will go wrong during runtime
2020-08-10 11:51:45 +02:00
Brian Gianforcaro 35c745ca54 Kernel: Use Userspace<T> for the realpath syscall 2020-08-07 16:18:36 +02:00
Brian Gianforcaro 30b2c0dc85 Kernel: Use Userspace<T> for the getsockopt syscall and Socket interface
The way getsockopt is implemented for socket types requires us to push
down Userspace<T> using into those interfaces. This change does so, and
utilizes proper copy implementations instead of the kind of haphazard
pointer dereferencing that was occurring there before.
2020-08-07 16:18:36 +02:00
Brian Gianforcaro 6920d5f423 Kernel: Use Userspace<T> for the setsockopt syscall 2020-08-07 16:18:36 +02:00
Brian Gianforcaro 8fa46bcb7d Kernel: Use Userspace<T> for the getsockname syscall 2020-08-07 16:18:36 +02:00
Brian Gianforcaro dc75ea9fc2 Kernel: Use Userspace<T> for the getpeername syscall 2020-08-07 16:18:36 +02:00
Brian Gianforcaro 0db669a9d2 Kernel: Use Userspace<T> for the chown syscall 2020-08-07 16:18:36 +02:00
Brian Gianforcaro 7e7ee2ec94 Kernel: Use Userspace<T> for the mount syscall 2020-08-07 16:18:36 +02:00
Andreas Kling ddab7ab693 Kernel: Store TTY's foreground process as a WeakPtr<Process>
This ensures that we don't leave a stale PGID assigned to the TTY after
the process exits, which would make PID recycling attacks possible.
2020-08-06 11:17:53 +02:00
Brian Gianforcaro 7e6fbef8db Kernel: Partial usage of Userspace<T> for the poll syscall
This change mostly converts poll to Userspace<T> with the caveat
of the fds member of SC_poll_params. It's current usage is a bit
too gnarly for me to take on right now, this appears to need a lot
more love.

In addition to enlightening the syscall to use Userspace<T>, I've
also re-worked most of the handling to use validate_read_and_copy
instead of just directly de-referencing the user pointer. We also
appeared to be missing a re-evaluation of the fds array after the
thread block is awoken.
2020-08-06 10:22:44 +02:00
Brian Gianforcaro fa666f6897 Kernel: Use Userspace<T> for the futex syscall
Utilizie Userspace<T> for the syscall argument itself, as well
as internally in the SC_futex_params struct.

We were double validating the SC_futex_params.timeout validation,
that was removed as well.
2020-08-05 13:03:50 +02:00
Brian Gianforcaro 7490ea9449 Kernel + LibPthread: Use Userspace<T> in the create_thread syscall 2020-08-05 09:36:53 +02:00
Brian Gianforcaro 337e8f98cd Kernel: Use Userspace<T> for the rename syscall 2020-08-05 09:36:53 +02:00
Brian Gianforcaro c1541f4a61 Kernel: Use Userspace<T> for the mknod syscall 2020-08-05 09:36:53 +02:00
Brian Gianforcaro d949b2a367 Kernel: Use Userspace<T> for the set_mmap_name syscall 2020-08-05 09:36:53 +02:00
Brian Gianforcaro 7449921f53 Kernel: Use Userspace<T> for the readlink syscall 2020-08-05 09:36:53 +02:00
Brian Gianforcaro 901dae0227 Kernel: Use Userspace<T> for the mmap syscall 2020-08-05 09:36:53 +02:00
Brian Gianforcaro 74d3b202f1 Kernel: Use Userspace<T> for the waitid syscall 2020-08-05 09:36:53 +02:00
Brian Gianforcaro 84035e1035 Kernel: Use Userspace<T> for the clock_nanosleep syscall 2020-08-05 09:36:53 +02:00
Brian Gianforcaro baa070afb8 Kernel: Use Userspace<T> for the gettimeofday syscall 2020-08-05 09:36:53 +02:00
Brian Gianforcaro 1eeaed31c2 Kernel: Use Userspace<T> for the open syscall 2020-08-05 09:36:53 +02:00
Andreas Kling 58feebeed2 Kernel+LibC: Tidy up sys$ttyname() and sys$ptsname()
- Remove goofy _r suffix from syscall names.
- Don't take a signed buffer size.
- Use Userspace<T>.
- Make TTY::tty_name() return a String instead of a StringView.
2020-08-04 18:17:16 +02:00
Andreas Kling 7de831efc6 Kernel+LibC: Add sys$disown() for disowning child processes
This syscall allows a parent process to disown a child process, setting
its parent PID to 0.

Unparented processes are automatically reaped by the kernel upon exit,
and no sys$waitid() is required. This will make it much nicer to do
spawn-and-forget which is common in the GUI environment.
2020-08-04 18:17:16 +02:00
Andreas Kling b139fb9f38 Kernel: Use Userspace<T> in sys$link() and sys$symlink() 2020-08-03 18:40:28 +02:00
Brian Gianforcaro 2242f69cd6 Kernel: Use Userspace<T> in unveil syscall 2020-08-02 20:54:17 +02:00
Brian Gianforcaro 9db5a1b92f Kernel: Use Userspace<T> in sched_getparam syscall 2020-08-02 20:53:48 +02:00
Tom 538b985487 Kernel: Remove ProcessInspectionHandle and make Process RefCounted
By making the Process class RefCounted we don't really need
ProcessInspectionHandle anymore. This also fixes some race
conditions where a Process may be deleted while still being
used by ProcFS.

Also make sure to acquire the Process' lock when accessing
regions.

Last but not least, there's no reason why a thread can't be
scheduled while being inspected, though in practice it won't
happen anyway because the scheduler lock is held at the same
time.
2020-08-02 17:15:11 +02:00
Tom 5bbf6ed46b Kernel: Fix some crashes due to missing locks
We need to hold m_lock when accessing m_regions.
2020-08-02 17:15:11 +02:00
Andreas Kling e526fa572a Kernel: Convert some more syscalls to Userspace<T>
These are really straightforward when all the helpers just work.
2020-08-02 11:01:00 +02:00
Brian Gianforcaro 2a74c59dec Kernel: Use Userspace<T> in pledge syscall 2020-08-02 10:56:43 +02:00
Brian Gianforcaro ba4cf59d04 Kernel: Use Userspace<T> in setkeymap syscall 2020-08-02 10:56:33 +02:00
Brian Gianforcaro 10e912d68c Kernel: Use Userspace<T> in sched_setparam syscall
Note: I switched from copying the single element out of the sched_param
struct, to copy struct it self as it is identical in functionality.
This way the types match up nicer with the Userpace<T> api's and it
conforms to the conventions used in other syscalls.
2020-08-02 10:55:38 +02:00
Brian Gianforcaro 1209bf82c1 Kernel: Use Userspace<T> in ptrace syscall 2020-08-02 00:29:04 +02:00
Andreas Kling 8d4d1c7457 Kernel: Use Userspace<T> in more syscalls 2020-08-01 11:37:40 +02:00
Andreas Kling 628b3badfb Kernel+AK: Add and use Userspace<T>::unsafe_userspace_ptr()
Since we already have the type information in the Userspace template,
it was a bit silly to cast manually everywhere. Just add a sufficiently
scary-sounding getter for a typed pointer.

Thanks @alimpfard for pointing out that I was being silly with tossing
out the type.

In the future we may want to make this API non-public as well.
2020-07-31 20:56:48 +02:00
Andreas Kling 180207062c Kernel: Use Userspace<T> in sys$utime()
And again, another helper overload.
2020-07-31 16:38:47 +02:00
Andreas Kling 62a4099581 Kernel: Use Userspace<T> in sys$getcwd() and sys$chdir()
Add more validation helper overloads as we go. :^)
2020-07-31 16:34:47 +02:00
Andreas Kling 314dbc10d4 Kernel: Use Userspace<T> for sys$read() and sys$stat()
Add validation helper overloads as needed.
2020-07-31 16:28:37 +02:00
Andreas Kling be7add690d Kernel: Rename region_from_foo() => find_region_from_foo()
Let's emphasize that these functions actually go out and find regions.
2020-07-30 23:52:28 +02:00
Andreas Kling 2e2de125e5 Kernel: Turn Process::FileDescriptionAndFlags into a proper class 2020-07-30 23:50:31 +02:00
Andreas Kling 949aef4aef Kernel: Move syscall implementations out of Process.cpp
This is something I've been meaning to do for a long time, and here we
finally go. This patch moves all sys$foo functions out of Process.cpp
and into files in Kernel/Syscalls/.

It's not exactly one syscall per file (although it could be, but I got
a bit tired of the repetitive work here..)

This makes hacking on individual syscalls a lot less painful since you
don't have to rebuild nearly as much code every time. I'm also hopeful
that this makes it easier to understand individual syscalls. :^)
2020-07-30 23:40:57 +02:00
Andreas Kling b5f54d4153 Kernel+LibC: Add sys$set_process_name() for changing the process name 2020-07-27 19:10:18 +02:00
Nico Weber 4eb967b5eb LibC+Kernel: Start implementing sysconf
For now, only the non-standard _SC_NPROCESSORS_CONF and
_SC_NPROCESSORS_ONLN are implemented.

Use them to make ninja pick a better default -j value.
While here, make the ninja package script not fail if
no other port has been built yet.
2020-07-15 00:07:20 +02:00
Andrew Kaster f96b827990 Kernel+LibELF: Expose ELF Auxiliary Vector to Userspace
The AT_* entries are placed after the environment variables, so that
they can be found by iterating until the end of the envp array, and then
going even further beyond :^)
2020-07-07 10:38:54 +02:00
Andreas Kling 11c4a28660 Kernel: Move headers intended for userspace use into Kernel/API/ 2020-07-04 17:22:23 +02:00
Tom e373e5f007 Kernel: Fix signal delivery
When delivering urgent signals to the current thread
we need to check if we should be unblocked, and if not
we need to yield to another process.

We also need to make sure that we suppress context switches
during Process::exec() so that we don't clobber the registers
that it sets up (eip mainly) by a context switch. To be able
to do that we add the concept of a critical section, which are
similar to Process::m_in_irq but different in that they can be
requested at any time. Calls to Scheduler::yield and
Scheduler::donate_to will return instantly without triggering
a context switch, but the processor will then asynchronously
trigger a context switch once the critical section is left.
2020-07-03 19:32:34 +02:00
Tom 16783bd14d Kernel: Turn Thread::current and Process::current into functions
This allows us to query the current thread and process on a
per processor basis
2020-07-01 12:07:01 +02:00
Andreas Kling d4195672b7 Kernel+LibC: Add sys$recvfd() and sys$sendfd() for fd passing
These new syscalls allow you to send and receive file descriptors over
a local domain socket. This will enable various privilege separation
techniques and other good stuff. :^)
2020-06-24 23:08:09 +02:00
Nico Weber d2684a8645 LibC+Kernel: Implement ppoll
ppoll() is similar() to poll(), but it takes its timeout
as timespec instead of as int, and it takes an additional
sigmask parameter.

Change the sys$poll parameters to match ppoll() and implement
poll() in terms of ppoll().
2020-06-23 14:12:20 +02:00
Nico Weber dd53e070c5 Kernel+LibC: Remove setreuid() / setregid() again
It looks like they're considered a bad idea, so let's not add
them before we need them. I figured it's good to have them in
git history if we ever do need them though, hence the add/remove
dance.
2020-06-18 23:19:16 +02:00
Nico Weber a38754d9f2 Kernel+LibC: Implement seteuid() and friends!
Add seteuid()/setegid() under _POSIX_SAVED_IDS semantics,
which also requires adding suid and sgid to Process, and
changing setuid()/setgid() to honor these semantics.

The exact semantics aren't specified by POSIX and differ
between different Unix implementations. This patch makes
serenity follow FreeBSD. The 2002 USENIX paper
"Setuid Demystified" explains the differences well.

In addition to seteuid() and setegid() this also adds
setreuid()/setregid() and setresuid()/setresgid(), and
the accessors getresuid()/getresgid().

Also reorder uid/euid functions so that they are the
same order everywhere (namely, the order that
geteuid()/getuid() already have).
2020-06-18 23:19:16 +02:00
Andreas Kling 0609eefd57 Kernel: Add "setkeymap" pledge promise 2020-06-18 22:19:36 +02:00
Sergey Bugaev a77405665f Kernel: Fix overflow in Process::validate_{read,write}_typed()
Userspace could pass us a large count to overflow the check. I'm not enough of a
haxx0r to write an actual exploit though.
2020-05-31 21:38:50 +02:00
Sergey Bugaev cddaeb43d3 Kernel: Introduce "sigaction" pledge
You now have to pledge "sigaction" to change signal handlers/dispositions. This
is to prevent malicious code from messing with assertions (and segmentation
faults), which are normally expected to instantly terminate the process but can
do other things if you change signal disposition for them.
2020-05-26 14:35:10 +02:00
Andreas Kling b3736c1b1e Kernel: Use a FlatPtr for the "argument" to ioctl()
Since it's often used to pass pointers, it should really be a FlatPtr.
2020-05-23 15:25:43 +02:00
Andreas Kling f7a75598bb Kernel: Remove Process::any_thread()
This was a holdover from the old times when each Process had a special
main thread with TID 0. Using it was a total crapshoot since it would
just return whichever thread was first on the process's thread list.

Now that I've removed all uses of it, we don't need it anymore. :^)
2020-05-16 12:40:15 +02:00
Andreas Kling 0e7f85c24a Kernel: Sending a signal to a process now goes to the main thread
Instead of falling back to the suspicious "any_thread()" mechanism,
just fail with ESRCH if you try to kill() a PID that doesn't have a
corresponding TID.
2020-05-16 12:33:48 +02:00
Andreas Kling 21d5f4ada1 Kernel: Absorb LibBareMetal back into the kernel
This was supposed to be the foundation for some kind of pre-kernel
environment, but nobody is working on it right now, so let's move
everything back into the kernel and remove all the confusion.
2020-05-16 12:00:04 +02:00
Andreas Kling 2dc051c866 Kernel: Remove sys$getdtablesize()
I'm not sure why this was a syscall. If we need this we can add it in
LibC as a wrapper around sysconf(_SC_OPEN_MAX).
2020-05-16 11:34:01 +02:00
Andreas Kling 3a92d0828d Kernel: Remove the "kernel info page" used for fast gettimeofday()
We stopped using gettimeofday() in Core::EventLoop a while back,
in favor of clock_gettime() for monotonic time.

Maintaining an optimization for a syscall we're not using doesn't make
a lot of sense, so let's go back to the old-style sys$gettimeofday().
2020-05-16 11:33:59 +02:00
Andreas Kling 5bfd893292 Kernel+Userland: Add "settime" pledge promise for setting system time
We now require the "settime" promise from pledged processes who want to
change the system time.
2020-05-08 22:54:17 +02:00
Andreas Kling 042b1f6814 Kernel: Propagate failure to commit VM regions in more places
Ultimately we should not panic just because we can't fully commit a VM
region (by populating it with physical pages.)

This patch handles some of the situations where commit() can fail.
2020-05-08 21:47:08 +02:00
Andreas Kling 6fe83b0ac4 Kernel: Crash the current process on OOM (instead of panicking kernel)
This patch adds PageFaultResponse::OutOfMemory which informs the fault
handler that we were unable to allocate a necessary physical page and
cannot continue.

In response to this, the kernel will crash the current process. Because
we are OOM, we can't symbolicate the crash like we normally would
(since the ELF symbolication code needs to allocate), so we also
communicate to Process::crash() that we're out of memory.

Now we can survive "allocate 300 MB" (only the allocate process dies.)
This is definitely not perfect and can easily end up killing a random
innocent other process who happened to allocate one page at the wrong
time, but it's a *lot* better than panicking on OOM. :^)
2020-05-06 22:28:23 +02:00