Commit graph

241 commits

Author SHA1 Message Date
Hendiadyoin1 7ca3d413f7 Kernel: Pull apart CPU.h
This does not introduce any functional changes.
2021-06-24 00:38:23 +02:00
Gunnar Beutner 3c3a1726df Kernel: Make sure threads which don't do any syscalls are terminated
Steps to reproduce:

$ cat loop.c
int main() { for (;;); }
$ gcc -o loop loop.c
$ ./loop

Terminating this process wasn't previously possible because we only
checked whether the thread should be terminated on syscall exit.
2021-06-19 12:55:00 +02:00
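Conceptually, the fix checks the thread's "should die" flag on every return to user mode rather than only on syscall exit. A minimal sketch of that idea (illustrative names, not the actual SerenityOS code):

    // Minimal sketch, not the actual kernel code; should_die and die_if_needed
    // are illustrative names.
    struct Thread {
        bool should_die { false };
        void die_if_needed(); // finalizes the thread; does not return if it dies
    };

    // Checking on the way out of *any* trap (e.g. the timer interrupt) catches
    // threads that spin in user space without ever making a syscall.
    void exit_trap(Thread& current_thread)
    {
        if (current_thread.should_die)
            current_thread.die_if_needed();
        // ...restore user-mode state and return from the interrupt...
    }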
Gunnar Beutner 3c2a6a25da Kernel: Don't finalize a thread while it still has code running
After marking a thread for death we might end up finalizing the thread
while it still has code to run, e.g. via:

Thread::block -> Thread::dispatch_one_pending_signal
-> Thread::dispatch_signal -> Process::terminate_due_to_signal
-> Process::die -> Process::kill_all_threads -> Thread::set_should_die

This marks the thread for death. It isn't destroyed at this point
though.

The scheduler then gets invoked via:

Thread::block -> Thread::relock_process

At that point we still have a registered blocker on the stack frame
which belongs to Thread::block. Thread::relock_process drops the
critical section which allows the scheduler to run.

When the thread is then scheduled out the scheduler sets the thread
state to Thread::Dying which allows the finalizer to destroy the Thread
object and its associated resources including the kernel stack.

This probably also affects objects other than blockers which rely
on their destructor to be run; however, the problem was most noticeable
with blockers because they are allocated on the stack of the dying
thread and cause an access violation when another thread touches the
blocker which belonged to the now-dead thread.

Fixes #7823.
2021-06-06 15:58:48 +02:00
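The use-after-free described above can be pictured as a blocker object living on the dying thread's kernel stack; once the finalizer frees that stack, any pointer another thread still holds to the blocker dangles. A rough illustration (hypothetical helpers, not the real kernel types):

    // Rough illustration of the lifetime hazard; register_blocker(),
    // unregister_blocker() and yield_to_scheduler() are hypothetical.
    struct Blocker {
        bool unblocked { false };
    };

    void register_blocker(Blocker*);
    void unregister_blocker(Blocker*);
    void yield_to_scheduler();

    void block_current_thread()
    {
        Blocker blocker;              // lives on this thread's kernel stack
        register_blocker(&blocker);   // another thread may now hold this pointer
        yield_to_scheduler();         // if the thread is finalized while scheduled out,
                                      // its kernel stack -- and `blocker` with it -- is
                                      // freed, so anyone who later touches the blocker
                                      // reads freed memory
        unregister_blocker(&blocker);
    }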
Gunnar Beutner 01c75e3a34 Kernel: Don't log profile data before/after the process/thread lifetime
There were a few cases where we could end up logging profiling events
before or after the associated process or thread exists in the profile:

After enabling profiling we might end up with CPU samples before we
had a chance to synthesize process/thread creation events.

After a thread exits we would still log associated kmalloc/kfree
events. Instead we now just ignore those events.
2021-05-30 19:03:03 +02:00
Ben Wiederhake a7c265f341 Everywhere: Sort out superfluous QuickSort.h imports
They were sorta unneeded. :^)
2021-05-29 23:41:54 +01:00
Brian Gianforcaro 6830963321 Kernel: Validate we don't hold s_mm_lock during context switch
Since `s_mm_lock` is a RecursiveSpinlock, if a kernel thread gets
preempted while accidentally holding the lock during switch_context,
another thread running on the same processor could end up manipulating
the state of the memory manager even though it should not be able to:
the lock will just bump its recursion count and keep going.

This appears to be the root cause of weird bugs like #7359, where
page protection magically appears to be wrong during execution.

To avoid these cases, let's guard against this specific unfortunate
case and make sure it can never go unnoticed again.

The assert was Tom's idea to help debug this, so I am going to tag him
as co-author of this commit.

Co-Authored-By: Tom <tomut@yahoo.com>
2021-05-25 10:35:41 +02:00
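The guard amounts to asserting, at the top of the context-switch path, that the current processor does not own s_mm_lock. A hedged sketch (the exact spelling of the lock query is assumed):

    // Sketch only; own_lock() is an assumed accessor on the recursive spinlock.
    // Placed at the start of the context-switch path:
    VERIFY(!s_mm_lock.own_lock());
    // A RecursiveSpinlock would happily re-enter on the same processor, so holding
    // it across switch_context would let the next thread mutate MM state unnoticed.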
Gunnar Beutner 277f333b2b Kernel: Add support for profiling kmalloc()/kfree() 2021-05-19 22:51:42 +02:00
Gunnar Beutner 8b2ace0326 Kernel: Track performance events for context switches 2021-05-19 22:51:42 +02:00
Liav A 99eab4667a Kernel: Print scheduler state to the display console 2021-05-16 19:58:33 +02:00
Nicholas Baron aa4d41fe2c AK+Kernel+LibELF: Remove the need for IteratorDecision::Continue
By constraining two implementations, the compiler will select the
best-fitting one. All this requires is duplicating the implementation
and simplifying it for the `void` case.

This constraining also informs both the caller and compiler by passing
the callback parameter types as part of the constraint
(e.g.: `IterationFunction<int>`).

Some `for_each` functions in LibELF only take functions which return
`void`. This is a minimal correctness check, as it removes one way for a
function to incompletely do something.

There seems to be a possible idiom where inside a lambda, a `return;` is
the same as `continue;` in a for-loop.
2021-05-16 10:36:52 +01:00
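The same technique can be shown outside AK with standard C++20 concepts: two for_each overloads constrained on the callback's return type, so the compiler picks the right one and a void lambda's bare `return;` behaves like `continue;`. (Standalone sketch using std::vector and std::same_as rather than the AK/LibELF types.)

    #include <concepts>
    #include <vector>

    enum class IterationDecision { Continue, Break };

    template<typename F, typename T>
    concept IterationFunction = requires(F f, T t) {
        { f(t) } -> std::same_as<IterationDecision>;
    };

    template<typename F, typename T>
    concept VoidFunction = requires(F f, T t) {
        { f(t) } -> std::same_as<void>;
    };

    // Callback returns IterationDecision: the loop may stop early.
    template<typename T, IterationFunction<T> F>
    void for_each(std::vector<T> const& values, F callback)
    {
        for (auto const& value : values) {
            if (callback(value) == IterationDecision::Break)
                return;
        }
    }

    // Callback returns void: no dummy `return IterationDecision::Continue;` needed.
    template<typename T, VoidFunction<T> F>
    void for_each(std::vector<T> const& values, F callback)
    {
        for (auto const& value : values)
            callback(value);
    }

    int main()
    {
        std::vector<int> numbers { 1, 2, 3, 4 };
        for_each(numbers, [](int n) {
            if (n == 3)
                return IterationDecision::Break;
            return IterationDecision::Continue;
        });
        for_each(numbers, [](int n) {
            if (n == 2)
                return; // inside a void lambda, a bare `return;` acts like `continue;`
        });
        return 0;
    }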
Gunnar Beutner 8614d18956 Kernel: Use a separate timer for profiling the system
This updates the profiling subsystem to use a separate timer to
trigger CPU sampling. This timer has a higher resolution (1000 Hz)
and is independent of the scheduler. At a later time the
resolution could even be made configurable with an argument for
sys$profiling_enable() - but not today.
2021-05-14 00:35:57 +02:00
Brian Gianforcaro 7463cbdbdb Kernel: Move cpu sample perf event to PerformanceManager 2021-05-07 15:35:23 +02:00
Brian Gianforcaro 64b4e3f34b Kernel: Add Processor::is_bootstrap_processor() function, and use it. (#6871)
The variety of checks for Processor::id() == 0 could use some assistance
in the readability department. This change adds a new function to
represent this check, and replaces the comparison everywhere it's used.
2021-05-05 18:48:26 +02:00
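The helper is, in essence, a named wrapper around the old comparison (sketch; the real declaration may carry extra attributes, and the call site shown is hypothetical):

    // Sketch of the readability helper; CPU #0 is the bootstrap processor (BSP).
    ALWAYS_INLINE static bool is_bootstrap_processor()
    {
        return Processor::id() == 0;
    }

    // Call sites then read:
    //     if (Processor::is_bootstrap_processor())
    //         do_bsp_only_initialization();   // (hypothetical call site)
    // instead of the bare `Processor::id() == 0` comparison.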
Tom ec27cbbb2a Kernel: Store whether a thread is the idle thread in Thread directly
This solves a problem where checking whether a thread is an idle
thread may require iterating all processors if it is not the idle
thread of the current processor.
2021-05-04 16:44:02 +02:00
Gunnar Beutner eb798d5538 Kernel+Profiler: Improve profiling subsystem
This turns the perfcore format into more of a log than it was before,
which lets us properly log process, thread and region
creation/destruction. This also makes it unnecessary to dump the
process' regions every time it is scheduled like we did before.

Incidentally this also fixes 'profile -c' because we previously ended
up incorrectly dumping the parent's region map into the profile data.

Log-based mmap support enables profiling shared libraries which
are loaded at runtime, e.g. via dlopen().

This enables profiling both the parent and child process for
programs which use execve(). Previously we'd discard the profiling
data for the old process.

The Profiler tool has been updated to not treat thread IDs as
process IDs anymore. This enables support for processes with more
than one thread. Also, there's a new widget to filter which
process should be displayed.
2021-04-26 17:13:55 +02:00
Brian Gianforcaro 1682f0b760 Everything: Move to SPDX license identifiers in all files.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.

See: https://spdx.dev/resources/use/#identifiers

This was done with the `ambr` search and replace tool.

 ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
2021-04-22 11:22:27 +02:00
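In practice this replaces the full license text at the top of each file with a short header of the following shape (SerenityOS is BSD-2-Clause licensed; the author line is a placeholder):

    /*
     * Copyright (c) 2021, <author>
     *
     * SPDX-License-Identifier: BSD-2-Clause
     */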
Andreas Kling f4eff7df8f Kernel: Convert String::format() => String::formatted() 2021-04-21 23:49:02 +02:00
Andreas Kling 24dcd99e4b Kernel: Add magic key combo (Alt+Shift+F12) to dump scheduler state
Pressing this combo will dump a list of all threads and their state
to the debug console.

This might be useful to figure out why the system is not responding.
2021-04-18 20:00:10 +02:00
AnotherTest e4412f1f59 AK+Kernel: Make IntrusiveList capable of holding non-raw pointers
This should allow creating intrusive lists that have smart pointers,
while remaining free (compared to the impl before this commit) when
holding raw pointers :^)
As a sidenote, this also adds a `RawPtr<T>` type, which is just
equivalent to `T*`.
Note that this does not actually use such functionality, but is only
expected to pave the way for #6369, to replace NonnullRefPtrVector<T>
with intrusive lists.

As it is with zero-cost things, this makes the interface a bit less nice
by requiring the type name of what an `IntrusiveListNode` holds (and
optionally its container, if not RawPtr), and also requiring the type of
the container (normally `RawPtr`) on the `IntrusiveList` instance.
2021-04-16 22:26:52 +02:00
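The resulting declaration shapes look roughly like this (illustrative only; the exact AK template parameters and member names may differ slightly):

    // Illustrative declaration shapes, not the exact AK spelling.
    struct Thread {
        // The node names what it holds; a smart-pointer container keeps the
        // item alive while it is linked into a list.
        IntrusiveListNode<Thread> m_raw_node;                            // held as RawPtr<Thread>
        IntrusiveListNode<Thread, NonnullRefPtr<Thread>> m_strong_node;  // held as NonnullRefPtr
    };

    // The list instance repeats the container type (normally RawPtr<Thread>):
    IntrusiveList<Thread, RawPtr<Thread>, &Thread::m_raw_node> raw_list;
    IntrusiveList<Thread, NonnullRefPtr<Thread>, &Thread::m_strong_node> strong_list;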
Andreas Kling 5e7abea31e Kernel+Profiler: Capture metadata about all profiled processes
The perfcore file format was previously limited to a single process
since the pid/executable/regions data was top-level in the JSON.

This patch moves the process-specific data into a top-level array
named "processes" and we now add entries for each process that has
been sampled during the profile run.

This makes it possible to see samples from multiple threads when
viewing a perfcore file with Profiler. This is extremely cool! :^)
2021-03-02 22:38:06 +01:00
Andreas Kling ea500dd3e3 Kernel: Start work on full system profiling :^)
The superuser can now call sys$profiling_enable() with PID -1 to enable
profiling of all running threads in the system. The perf events are
collected in a global PerformanceEventBuffer (currently 32 MiB in size.)

The events can be accessed via /proc/profile
2021-03-02 22:38:06 +01:00
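From user space, enabling a whole-system profile then looks roughly like this (a sketch; the LibC wrapper name and signature are assumed):

    // Userland usage sketch; the wrapper name/signature is assumed.
    if (profiling_enable(-1) < 0)    // PID -1: profile every running thread
        perror("profiling_enable");
    // ...run the workload...
    // The collected events can then be read from /proc/profile.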
Andreas Kling 5d180d1f99 Everywhere: Rename ASSERT => VERIFY
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)

Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.

We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.
2021-02-23 20:56:54 +01:00
Andreas Kling 2b2828ae52 Kernel: Slap UNMAP_AFTER_INIT on a bunch more functions
We're now able to unmap 100 KiB of kernel text after init. :^)
2021-02-19 21:42:18 +01:00
Andreas Kling fdf03852c9 Kernel: Slap UNMAP_AFTER_INIT on a whole bunch of functions
There's no real system here; I just added it to various functions
that I don't believe we ever want to call after initialization
has finished.

With these changes, we're able to unmap 60 KiB of kernel text
after init. :^)
2021-02-19 20:23:05 +01:00
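Usage of the attribute is just a macro on the function definition; functions marked this way land in a section whose mappings are removed once the kernel has finished initializing (sketch; the function name below is hypothetical):

    // Sketch of the usage shape.
    UNMAP_AFTER_INIT void setup_interrupt_controllers()
    {
        // ...one-time hardware setup...
    }
    // After init the section containing this function is unmapped, so calling it
    // again faults loudly instead of silently executing stale init code.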
Andreas Kling c5c68bbd84 Kernel: Mark a handful of things in Scheduler.cpp READONLY_AFTER_INIT 2021-02-14 18:12:00 +01:00
Andreas Kling b712345c92 Kernel: Use PANIC() in a bunch of places :^) 2021-02-14 09:36:58 +01:00
AnotherTest 09a43969ba Everywhere: Replace dbgln<flag>(...) with dbgln_if(flag, ...)
Replacement made by `find Kernel Userland -name '*.h' -o -name '*.cpp' | sed -i -Ee 's/dbgln\b<(\w+)>\(/dbgln_if(\1, /g'`
2021-02-08 18:08:55 +01:00
AnotherTest 53ce923e10 Everywhere: Fix obvious dbgln() bugs
This will allow compile-time dbgln() checks to pass.
2021-02-08 18:08:55 +01:00
Tom d5472426ec Kernel: Retire SchedulerData and add Thread lookup table
This allows us to get rid of the thread lists in SchedulerData.
Also, instead of iterating over all threads to find a thread by id,
just use a lookup table. In the rare case of having to iterate over
all threads, just iterate the lookup table.
2021-01-28 17:35:41 +01:00
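The lookup-table idea can be sketched independently of the kernel: a hash map from thread ID to thread pointer replaces the walk over all threads, and the rare iterate-everything case simply walks the map (standalone sketch; std::unordered_map stands in for the kernel's own hash table, and locking is omitted):

    #include <unordered_map>

    struct Thread;
    using ThreadID = int;

    // Global id -> thread table; the kernel would protect this with a lock.
    static std::unordered_map<ThreadID, Thread*> g_threads;

    Thread* find_thread_by_id(ThreadID tid)
    {
        auto it = g_threads.find(tid); // O(1) lookup instead of scanning every thread
        return it != g_threads.end() ? it->second : nullptr;
    }

    template<typename Callback>
    void for_each_thread(Callback callback)
    {
        for (auto& [tid, thread] : g_threads) // the rare "iterate everything" case
            callback(*thread);
    }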
Andreas Kling b72f067f0d Kernel+Userland: Remove unused "effective priority" from threads
This has been merged with the regular Thread::priority field after
the recent changes to the scheduler.
2021-01-28 08:25:53 +01:00
Andreas Kling f2decb6665 Revert "Kernel: Fix Thread::relock_process leaving critical section"
This reverts commit e9e76b8074.

This was causing a noticeable slowdown, and we're not sure that it was
actually necessary.
2021-01-27 23:23:21 +01:00
Tom db1448b21a Kernel: Add a compile-time switch to enable scheduling on all CPUs
This is meant to be temporary only and should be removed once scheduling
on all CPUs is stable.
2021-01-27 22:48:41 +01:00
Tom e9e76b8074 Kernel: Fix Thread::relock_process leaving critical section
We don't want to explicitly enable interrupts when leaving the
critical section to trigger a context switch.
2021-01-27 22:48:41 +01:00
Tom 03a9ee79fa Kernel: Implement thread priority queues
Rather than walking all Thread instances and putting them into
a vector to be sorted by priority, queue them into priority sorted
linked lists as soon as they become ready to be executed.
2021-01-27 22:48:41 +01:00
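The priority queues can be pictured as an array of per-priority lists plus a bitmap of non-empty levels, so choosing the next thread is "find the best non-empty level, pop its head" rather than sorting a vector of all threads. A standalone, simplified sketch (the kernel's own data structure differs in detail; __builtin_ctz is the GCC/Clang trailing-zero-count builtin):

    #include <cstddef>
    #include <cstdint>
    #include <list>

    struct Thread;

    // Simplified standalone sketch: one list per priority level plus a bitmap of
    // non-empty levels, so picking the next thread never needs a sort.
    constexpr std::size_t priority_levels = 32;

    struct ReadyQueues {
        std::uint32_t nonempty_mask { 0 };           // bit i set => queues[i] has threads
        std::list<Thread*> queues[priority_levels];  // index 0 = highest priority

        void enqueue(Thread* thread, std::size_t priority)
        {
            queues[priority].push_back(thread);
            nonempty_mask |= (1u << priority);
        }

        Thread* dequeue_highest()
        {
            if (nonempty_mask == 0)
                return nullptr;                      // nothing is ready to run
            auto priority = static_cast<std::size_t>(__builtin_ctz(nonempty_mask));
            auto& queue = queues[priority];
            Thread* thread = queue.front();
            queue.pop_front();
            if (queue.empty())
                nonempty_mask &= ~(1u << priority);
            return thread;
        }
    };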
Tom c531084873 Kernel: Track processor idle state and wake processors when waking threads
Attempt to wake idle processors to get threads to be scheduled more quickly.
We don't want to wait until the next timer tick if we have processors that
aren't doing anything.
2021-01-27 22:48:41 +01:00
Tom e2f9e557d3 Kernel: Make Processor::id a static function
This eliminates the window between calling Processor::current and
the member function, during which a thread could be moved to another
processor. This is generally not as big a concern as with
Processor::current_thread, but it is also slightly more lightweight.
2021-01-27 21:12:24 +01:00
Tom 21d288a10e Kernel: Make Thread::current smp-safe
Change Thread::current to be a static function that reads through the
fs register. This eliminates the window between Processor::current()
returning and calling a function on it, during which preemption could
move the thread to a different processor and leave us operating on the
wrong object.
2021-01-27 21:12:24 +01:00
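Both of these commits follow the same pattern: replace the two-step "get the current Processor, then call a member on it" with a single read through the per-CPU segment register, so preemption can never split the sequence across processors. A hedged sketch of the idea (read_fs_ptr() and m_current_thread are assumed names; the real kernel code differs):

    // Hedged sketch; read_fs_ptr() and m_current_thread are assumed names for a
    // single fs-relative load and the per-processor field it reads.
    //
    // Unsafe two-step -- preemption between the two calls can migrate the thread
    // to another CPU, leaving us with the wrong processor's state:
    //     Thread* thread = Processor::current().current_thread();
    //
    // Safe single read:
    Thread* Thread::current()
    {
        // One fs-relative load: there is no window between "which processor am I
        // on?" and "read its current thread" for preemption to slip into.
        return reinterpret_cast<Thread*>(
            read_fs_ptr(__builtin_offsetof(Processor, m_current_thread)));
    }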
Tom 33cdc1d2f1 Kernel: Use new Thread::previous_mode to track ticks 2021-01-27 21:12:24 +01:00
Tom 0bd558081e Kernel: Track previous mode when entering/exiting traps
This allows us to determine what the previous mode (user or kernel)
was, e.g. in the timer interrupt. This is used e.g. to determine
whether a signal handler should be set up.

Fixes #5096
2021-01-27 21:12:24 +01:00
asynts 7cf0c7cc0d Meta: Split debug defines into multiple headers.
The following script was used to make these changes:

    #!/bin/bash
    set -e

    tmp=$(mktemp -d)

    echo "tmp=$tmp"

    find Kernel \( -name '*.cpp' -o -name '*.h' \) | sort > $tmp/Kernel.files
    find . \( -path ./Toolchain -prune -o -path ./Build -prune -o -path ./Kernel -prune \) -o \( -name '*.cpp' -o -name '*.h' \) -print | sort > $tmp/EverythingExceptKernel.files

    cat $tmp/Kernel.files | xargs grep -Eho '[A-Z0-9_]+_DEBUG' | sort | uniq > $tmp/Kernel.macros
    cat $tmp/EverythingExceptKernel.files | xargs grep -Eho '[A-Z0-9_]+_DEBUG' | sort | uniq > $tmp/EverythingExceptKernel.macros

    comm -23 $tmp/Kernel.macros $tmp/EverythingExceptKernel.macros > $tmp/Kernel.unique
    comm -1 $tmp/Kernel.macros $tmp/EverythingExceptKernel.macros > $tmp/EverythingExceptKernel.unique

    cat $tmp/Kernel.unique | awk '{ print "#cmakedefine01 "$1 }' > $tmp/Kernel.header
    cat $tmp/EverythingExceptKernel.unique | awk '{ print "#cmakedefine01 "$1 }' > $tmp/EverythingExceptKernel.header

    for macro in $(cat $tmp/Kernel.unique)
    do
        cat $tmp/Kernel.files | xargs grep -l $macro >> $tmp/Kernel.new-includes ||:
    done
    cat $tmp/Kernel.new-includes | sort > $tmp/Kernel.new-includes.sorted

    for macro in $(cat $tmp/EverythingExceptKernel.unique)
    do
        cat $tmp/Kernel.files | xargs grep -l $macro >> $tmp/Kernel.old-includes ||:
    done
    cat $tmp/Kernel.old-includes | sort > $tmp/Kernel.old-includes.sorted

    comm -23 $tmp/Kernel.new-includes.sorted $tmp/Kernel.old-includes.sorted > $tmp/Kernel.includes.new
    comm -13 $tmp/Kernel.new-includes.sorted $tmp/Kernel.old-includes.sorted > $tmp/Kernel.includes.old
    comm -12 $tmp/Kernel.new-includes.sorted $tmp/Kernel.old-includes.sorted > $tmp/Kernel.includes.mixed

    for file in $(cat $tmp/Kernel.includes.new)
    do
        sed -i -E 's/#include <AK\/Debug\.h>/#include <Kernel\/Debug\.h>/' $file
    done

    for file in $(cat $tmp/Kernel.includes.mixed)
    do
        echo "mixed include in $file, requires manual editing."
    done
2021-01-26 21:20:00 +01:00
asynts 8465683dcf Everywhere: Debug macros instead of constexpr.
This was done with the following script:

    find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec sed -i -E 's/dbgln<debug_([a-z_]+)>/dbgln<\U\1_DEBUG>/' {} \;

    find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec sed -i -E 's/if constexpr \(debug_([a-z0-9_]+)/if constexpr \(\U\1_DEBUG/' {} \;
2021-01-25 09:47:36 +01:00
asynts acdcf59a33 Everywhere: Remove unnecessary debug comments.
It would be tempting to uncomment these statements, but that won't work
with the new changes.

This was done with the following commands:

    find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec awk -i inplace '$0 !~ /\/\/#define/ { if (!toggle) { print; } else { toggle = !toggle } } ; $0 ~/\/\/#define/ { toggle = 1 }' {} \;

    find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec awk -i inplace '$0 !~ /\/\/ #define/ { if (!toggle) { print; } else { toggle = !toggle } } ; $0 ~/\/\/ #define/ { toggle = 1 }' {} \;
2021-01-25 09:47:36 +01:00
Andreas Kling 647cfcb641 Kernel: Prune uninteresting kernel frames from profiling samples
Start capturing the sample stacks at the EIP/EBP of the pre-empted
thread instead of capturing EBP in the sampling function itself.
2021-01-17 14:36:53 +01:00
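The shape of such a frame-pointer walk, started from the pre-empted thread's saved EIP/EBP so that the first captured frame is the interrupted code rather than the sampler itself, looks roughly like this (simplified i386-style sketch; the real kernel validates every address it dereferences):

    #include <cstddef>
    #include <cstdint>

    // Simplified i386-style frame-pointer walk. Assumed frame layout:
    //   [ebp + 0] = caller's saved ebp, [ebp + 4] = return address.
    std::size_t capture_stack(std::uint32_t preempted_eip, std::uint32_t preempted_ebp,
                              std::uint32_t* out_frames, std::size_t max_frames)
    {
        std::size_t count = 0;
        if (count < max_frames)
            out_frames[count++] = preempted_eip; // first frame: the interrupted code itself
        std::uint32_t ebp = preempted_ebp;
        while (ebp != 0 && count < max_frames) {
            auto const* frame =
                reinterpret_cast<std::uint32_t const*>(static_cast<std::uintptr_t>(ebp));
            std::uint32_t return_address = frame[1];
            if (return_address == 0)
                break;
            out_frames[count++] = return_address;
            ebp = frame[0]; // follow the saved frame pointer to the caller's frame
        }
        return count;
    }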
asynts 94bb544c33 Everywhere: Replace a bundle of dbg with dbgln.
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.

This commit touches some dbg() calls which are enclosed in macros. This
should be fine because the new constexpr machinery ensures that the
enclosed code actually compiles.
2021-01-16 11:54:35 +01:00
Andreas Kling 5dafb72370 Kernel+Profiler: Make profiling per-process and without core dumps
This patch merges the profiling functionality in the kernel with the
performance events mechanism. A profiler sample is now just another
perf event, rather than a dedicated thing.

Since perf events were already per-process, this now makes profiling
per-process as well.

Processes with perf events would already write out a perfcore.PID file
to the current directory on death, but since we may want to profile
a process and then let it continue running, recorded perf events can
now be accessed at any time via /proc/PID/perf_events.

This patch also adds information about process memory regions to the
perfcore JSON format. This removes the need to supply a core dump to
the Profiler app for symbolication, and so the "profiler coredump"
mechanism is removed entirely.

There's still a hard limit of 4MB worth of perf events per process,
so this is by no means a perfect final design, but it's a nice step
forward for both simplicity and stability.

Fixes #4848
Fixes #4849
2021-01-11 11:36:00 +01:00
asynts 019c9eb749 Everywhere: Replace a bundle of dbg with dbgln.
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.
2021-01-09 21:11:09 +01:00
Tom 476f17b3f1 Kernel: Merge PurgeableVMObject into AnonymousVMObject
This implements memory commitments and lazy-allocation of committed
memory.
2021-01-01 23:43:44 +01:00
Andreas Kling ed5c26d698 AK: Remove custom %w format string specifier
This was a non-standard specifier alias for %04x. This patch replaces
all uses of it with new-style formatting functions instead.
2020-12-25 17:05:05 +01:00
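Call sites therefore moved from the old printf-style specifier to the new formatter syntax, e.g.:

    // Before (non-standard alias for %04x):
    //     String::format("%w", value);
    // After, using the new-style formatter:
    String::formatted("{:04x}", value);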
Andreas Kling c25cf5fb56 Kernel: Panic if we're about to switch to a user thread with IOPL!=0
This is a crude protection against IOPL elevation attacks. If for
any reason we find ourselves about to switch to a user mode thread
with IOPL != 0, we'll now simply panic the kernel.

If this happens, it basically means that something tricked the kernel
into incorrectly modifying the IOPL of a thread, so it's no longer
safe to trust the kernel anyway.
2020-12-23 14:30:10 +01:00
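The check itself amounts to inspecting the IOPL bits (bits 12-13 of EFLAGS) in the outgoing thread's saved flags and panicking if they are set for a user-mode thread. A hedged sketch (names and placement are illustrative; the real check lives in the context-switch/return path):

    #include <cstdint>

    // Illustrative sketch only; PANIC is the kernel's own macro.
    constexpr std::uint32_t eflags_iopl_mask = 0x3000; // IOPL occupies bits 12-13 of EFLAGS

    void verify_no_iopl_elevation(std::uint32_t saved_eflags, bool is_user_thread)
    {
        if (is_user_thread && (saved_eflags & eflags_iopl_mask) != 0) {
            // Something tricked the kernel into raising this thread's IOPL, so the
            // kernel itself can no longer be trusted -- stop immediately.
            PANIC("Switching to user thread with IOPL != 0");
        }
    }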
Tom 5f51d85184 Kernel: Improve time keeping and dramatically reduce interrupt load
This implements a number of changes related to time:
* If a HPET is present, it is now used only as a system timer, unless
  the Local APIC timer is used (in which case the HPET timer will not
  trigger any interrupts at all).
* If a HPET is present, the current time can now be as accurate as the
  chip allows, independently of the system timer. We now query the
  HPET main counter for the current time in CPU #0's system timer
  interrupt and use that as a baseline. If a high-precision time is
  queried, that baseline is combined with a direct query of the HPET
  timer, which should give a much more accurate timestamp at the
  expense of more overhead. For faster timestamps, the coarser value
  based on the last interrupt is returned. This also means that any
  missed interrupts should not cause the time to drift.
* The default system interrupt rate is reduced to about 250 per second.
* Fix calculation of Thread CPU usage by using the number of ticks they
  used rather than the number of times a context switch happened.
* Implement CLOCK_REALTIME_COARSE and CLOCK_MONOTONIC_COARSE and use it
  for most cases where precise timestamps are not needed.
2020-12-21 18:26:12 +01:00
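The coarse/precise split in the second bullet boils down to: CPU #0's system-timer interrupt records an HPET-derived baseline, coarse clocks return that baseline, and precise clocks add the HPET ticks elapsed since it was taken. A simplified sketch (helper names are assumed, not the kernel's real API):

    #include <cstdint>

    // Simplified sketch; helper names are assumed, not the kernel's real API.
    static std::uint64_t s_baseline_ns;          // time derived from the HPET main counter
                                                 // in CPU #0's system-timer interrupt
    static std::uint64_t s_baseline_hpet_ticks;  // HPET reading taken at that same moment

    std::uint64_t hpet_read_main_counter();        // assumed: read the raw HPET main counter
    std::uint64_t hpet_ticks_to_ns(std::uint64_t); // assumed: convert HPET ticks to nanoseconds

    // CLOCK_*_COARSE: cheap, accurate to the last system-timer interrupt.
    std::uint64_t coarse_now_ns() { return s_baseline_ns; }

    // Precise clocks: the baseline plus the HPET ticks elapsed since it was taken.
    // Because the baseline itself comes from the HPET, a missed interrupt only makes
    // the coarse value staler; it cannot make the precise time drift.
    std::uint64_t precise_now_ns()
    {
        std::uint64_t elapsed_ticks = hpet_read_main_counter() - s_baseline_hpet_ticks;
        return s_baseline_ns + hpet_ticks_to_ns(elapsed_ticks);
    }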