Commit graph

442 commits

Author SHA1 Message Date
Gunnar Beutner 01c75e3a34 Kernel: Don't log profile data before/after the process/thread lifetime
There were a few cases where we could end up logging profiling events
before or after the associated process or thread exists in the profile:

After enabling profiling we might end up with CPU samples before we
had a chance to synthesize process/thread creation events.

After a thread exits we would still log associated kmalloc/kfree
events. Instead we now just ignore those events.
2021-05-30 19:03:03 +02:00
Andreas Kling 1123af361d Kernel: Convert Process::get_syscall_path_argument() to KString
This API now returns a KResultOr<NonnullOwnPtr<KString>> and allocation
failures should be propagated everywhere nicely. :^)
2021-05-29 20:18:57 +02:00
Gunnar Beutner 42d667645d Kernel: Make sure we free the thread stack on thread exit
This adds two new arguments to the thread_exit system call which let
a thread unmap an arbitrary VM range on thread exit. LibPthread
uses this functionality to unmap the thread stack.

Fixes #7267.
2021-05-29 15:53:08 +02:00
Gunnar Beutner 95c2166ca9 Kernel: Move sys$munmap functionality into a helper method 2021-05-29 15:53:08 +02:00
Brian Gianforcaro 65d5f81afc Kernel: Make PrivateInodeVMObject factory APIs OOM safe 2021-05-29 09:04:05 +02:00
Andreas Kling 9d801d2345 Kernel: Rename Custody::create() => try_create()
The try_ prefix indicates that this may fail. :^)
2021-05-28 11:23:00 +02:00
Andreas Kling fc9ce22981 Kernel: Use KString for Region names
Replace the AK::String used for Region::m_name with a KString.

This seems beneficial across the board, but as a specific data point,
it reduces time spent in sys$set_mmap_name() by ~50% on test-js. :^)
2021-05-28 09:37:09 +02:00
Tim Schumacher 58bc10b947
Kernel: Make dup2() return the fd even if old & new are the same (#7506) 2021-05-27 21:14:57 +02:00
Gunnar Beutner ad6587424f Kernel: Disable profiling if setting up the buffer or timer failed 2021-05-24 09:10:50 +02:00
Gunnar Beutner 0688e02339 Kernel: Make sure we only log profiling events when m_profiling is true
Previously the process' m_profiling flag was ignored for all event
types other than CPU samples.

The kfree tracing code relies on temporarily disabling tracing during
exec. This didn't work for per-process profiles and would instead
panic.

This updates the profiling code so that the m_profiling flag isn't
ignored.
2021-05-23 23:54:30 +01:00
Gunnar Beutner 7557f2db90 Kernel: Remove an allocation when blocking a thread
When blocking a thread with a timeout we would previously allocate
a Timer object. This removes the allocation for that Timer object.
2021-05-20 09:09:10 +02:00
Brian Gianforcaro bb91bed576 Kernel: Make ProcessGroup::find_or_create API OOM safe
Make ProcessGroup::find_or_create & ProcessGroup::create OOM safe, by
moving to adopt_ref_if_nonnull.
2021-05-20 08:10:07 +02:00
Gunnar Beutner 7dc77bd833 Kernel: Avoid an allocation in sys$poll 2021-05-19 22:51:42 +02:00
Gunnar Beutner 277f333b2b Kernel: Add support for profiling kmalloc()/kfree() 2021-05-19 22:51:42 +02:00
Gunnar Beutner 572bbf28cc Kernel+LibC: Add support for filtering profiling events
This adds the -t command-line argument for the profile tool. Using this
argument you can filter which event types you want in your profile.
2021-05-19 22:51:42 +02:00
Justin 1c3badede3 Kernel: Add statvfs & fstatvfs Syscalls
These syscalls fill a statvfs struct with various data
about the mount on the VFS.
2021-05-19 21:33:29 +02:00
Hendiadyoin1 ef425a02f7 Kernel: Implement mprotect for multiple Regions 2021-05-18 16:50:52 +02:00
Sahan Fernando d0f314b23c Kernel: Fix subtle race condition in sys$write implementation
There is a slight race condition in our implementation of write().
We call File::can_write() before attempting to write to it (blocking if
it returns false). If it returns true, we assume that we can write to
the file, and our code assumes that File::write() cannot possibly fail
by being blocked. There is, however, the rare case where another process
writes to the file and prevents further writes in between the call to
Files::can_write() and File::write() in the first process. This would
result in the first process calling File::write() when it cannot be
written to.

We fix this by adding a mechanism for File::can_write() to signal that
it was blocked, making it the responsibilty of File::write() to check
whether it can write and then finally making sys$write() check if the
write failed due to it being blocked.
2021-05-18 16:33:15 +02:00
Gunnar Beutner 89956cb0d6 Kernel+Userspace: Implement the accept4() system call
Unlike accept() the new accept4() system call lets the caller specify
flags for the newly accepted socket file descriptor, such as
SOCK_CLOEXEC and SOCK_NONBLOCK.
2021-05-17 13:32:19 +02:00
Liav A e6f333ae00 Kernel: Print failed attempt to shutdown the machine
Because we don't parse ACPI AML yet, If we are not able to shut down
the machine with "hacky" emulation methods - halt and print this state
to the users so they know they can shutdown the machine by themselves.
2021-05-17 00:30:40 +01:00
Nicholas Baron aa4d41fe2c
AK+Kernel+LibELF: Remove the need for IteratorDecision::Continue
By constraining two implementations, the compiler will select the best
fitting one. All this will require is duplicating the implementation and
simplifying for the `void` case.

This constraining also informs both the caller and compiler by passing
the callback parameter types as part of the constraint
(e.g.: `IterationFunction<int>`).

Some `for_each` functions in LibELF only take functions which return
`void`. This is a minimal correctness check, as it removes one way for a
function to incompletely do something.

There seems to be a possible idiom where inside a lambda, a `return;` is
the same as `continue;` in a for-loop.
2021-05-16 10:36:52 +01:00
Andreas Kling 4d429ba9ea Kernel: Unbreak profiling all processes
Regressed in 8a4cc735b9.
We stopped generating "process created" when enabling profiling,
which led to Profiler getting confused about the missing events.
2021-05-15 21:25:54 +02:00
Liav A 8a4cc735b9 Kernel: Don't use the profile timer if we don't have a timer to assign 2021-05-15 18:08:41 +02:00
Gunnar Beutner 4ab9d8736b Kernel: Make perf_event() work for global profiles
Previously calls to perf_event() would end up in a process-specific
perfcore file even though global profiling was enabled. This changes
the behavior for perf_event() so that these events are stored into
the global profile instead.
2021-05-15 16:28:18 +02:00
Brian Gianforcaro ede1483e48 Kernel: Make Process creation APIs OOM safe
This change looks more involved than it actually is. This simply
reshuffles the previous Process constructor and splits out the
parts which can fail (resource allocation) into separate methods
which can be called from a factory method. The factory is then
used everywhere instead of the constructor.
2021-05-15 09:01:32 +02:00
Andreas Kling 16221305ad LibELF: Remove sketchy use of "undefined" ELF::Image::Section
We were using ELF::Image::section(0) to indicate the "undefined"
section, when what we really wanted was just Optional<Section>.

So let's use Optional instead. :^)
2021-05-15 00:17:55 +02:00
Mart G e7310ba45a Kernel+LibC: Add fstatat
The function fstatat can do the same thing as the stat and lstat
functions. However, it can be passed the file descriptor of a directory
which will be used when as the starting point for relative paths. This
is contrary to stat and lstat which use the current working directory as
the starting for relative paths.
2021-05-14 23:32:10 +02:00
Gunnar Beutner 8614d18956 Kernel: Use a separate timer for profiling the system
This updates the profiling subsystem to use a separate timer to
trigger CPU sampling. This timer has a higher resolution (1000Hz)
and is independent from the scheduler. At a later time the
resolution could even be made configurable with an argument for
sys$profiling_enable() - but not today.
2021-05-14 00:35:57 +02:00
Andreas Kling e46343bf9a Kernel: Make UserOrKernelBuffer R/W helpers return KResultOr<size_t>
This makes error propagation less cumbersome (and also exposed some
places where we were not doing it.)
2021-05-13 23:28:40 +02:00
Brian Gianforcaro 0d50d3ed1e Kernel: Make InodeWatcher::crate API OOM safe 2021-05-13 16:21:53 +02:00
Brian Gianforcaro 51ceb172b9 Kernel: Replace make<T>() with adopt_own_if_nonnull() in sys$module_load 2021-05-13 16:21:53 +02:00
Brian Gianforcaro 956314f0a1 Kernel: Make Process::start_tracing_from API OOM safe
Modify the API so it's possible to propagate error on OOM failure.
NonnullOwnPtr<T> is not appropriate for the ThreadTracer::create() API,
so switch to OwnPtr<T>, use adopt_own_if_nonnull() to handle creation.
2021-05-13 16:21:53 +02:00
sin-ack fe5ca6ca27 Kernel: Implement multi-watch InodeWatcher :^)
This patch modifies InodeWatcher to switch to a one watcher, multiple
watches architecture.  The following changes have been made:

- The watch_file syscall is removed, and in its place the
  create_iwatcher, iwatcher_add_watch and iwatcher_remove_watch calls
  have been added.
- InodeWatcher now holds multiple WatchDescriptions for each file that
  is being watched.
- The InodeWatcher file descriptor can be read from to receive events on
  all watched files.

Co-authored-by: Gunnar Beutner <gunnar@beutner.name>
2021-05-12 22:38:20 +02:00
Gunnar Beutner 22ebd754d3 Kernel: Fix loading ELF images without PT_INTERP
Previously we'd try to load ELF images which did not have
an interpreter set with an incorrect load offset of 0, i.e. way
outside of the part of the address space where we'd expect either
the dynamic loader or the user's executable to reside.

This fixes the problem by using get_load_offset for both executables
which have an interpreter set and those which don't. Notably this
allows us to actually successfully execute the Loader.so binary:

courage:~ $ /usr/lib/Loader.so
You have invoked `Loader.so'. This is the helper program for programs
that use shared libraries. Special directives embedded in executables
tell the kernel to load this program.

This helper program loads the shared libraries needed by the program,
prepares the program to run, and runs it. You do not need to invoke
this helper program directly.
courage:~ $
2021-05-10 20:39:08 +02:00
Brian Gianforcaro 0b7395848a Kernel: Plumb OOM propagation through Custody factory
Modify the Custody::create(..) API so it has the ability to propagate
OOM back to the caller.
2021-05-10 11:55:52 +02:00
Brian Gianforcaro 8bf4201f50 Kernel: Move process creation perf events to PerformanceManager 2021-05-07 15:35:23 +02:00
Brian Gianforcaro ccdcb6a635 Kernel: Add PerformanceManager static class, move perf event APIs there
The current method of emitting performance events requires a bit of
boiler plate at every invocation, as well as having to ignore the
return code which isn't used outside of the perf event syscall. This
change attempts to clean that up by exposing high level API's that
can be used around the code base.
2021-05-07 15:35:23 +02:00
Brian Gianforcaro 11306d7121
Kernel: Modify TimeManagement::current_time(..) API so it can't fail. (#6869)
The fact that current_time can "fail" makes its use a bit awkward.
All callers in the Kernel are trusted besides syscalls, so assert
that they never get there, and make sure all current callers perform
validation of the clock_id with TimeManagement::is_valid_clock_id().

I have fuzzed this change locally for a bit to make sure I didn't
miss any obvious regression.
2021-05-05 18:51:06 +02:00
Brian Gianforcaro 234c6ae32d Kernel: Change Inode::{read/write}_bytes interface to KResultOr<ssize_t>
The error handling in all these cases was still using the old style
negative values to indicate errors. We have a nicer solution for this
now with KResultOr<T>. This change switches the interface and then all
implementers to use the new style.
2021-05-02 13:27:37 +02:00
Sahan Fernando bd563f0b3c Kernel: Make processes start with a 16-byte-aligned stack 2021-05-01 20:08:35 +02:00
Brian Gianforcaro a678851b41 Kernel: Harden sys$setgroups Vector usage against OOM 2021-05-01 09:10:30 +02:00
Itamar 6bbd2ebf83 Kernel+LibELF: Support initializing values of TLS data
Previously, TLS data was always zero-initialized.

To support initializing the values of TLS data, sys$allocate_tls now
receives a buffer with the desired initial data, and copies it to the
master TLS region of the process.

The DynamicLinker gathers the initial TLS image and passes it to
sys$allocate_tls.

We also now require the size passed to sys$allocate_tls to be
page-aligned, to make things easier. Note that this doesn't waste memory
as the TLS data has to be allocated in separate pages anyway.
2021-04-30 18:47:39 +02:00
Itamar 373e8bcbc7 Kernel: Give a name to the Master TLS region allocation 2021-04-30 18:47:39 +02:00
Gunnar Beutner 3c0355a398 Kernel: Accepted socket file descriptors should not inherit flags
For example Linux accepts an additional argument for flags in accept4()
that let the user specify what flags they want. However, by default
accept() should not inherit those flags from the listener socket.
2021-04-30 11:43:19 +02:00
Jesse Buhagiar 60cdbc9397 Kernel/LibC: Implement setreuid 2021-04-30 11:35:17 +02:00
Brian Gianforcaro a8765fa673 Kernel: Harden sys$select Vector usage against OOM.
Theoretically the append should never fail as we have in-line storage
of FD_SETSIZE, which should always be enough. However I'm planning on
removing the non-try variants of AK::Vector when compiling in kernel
mode in the future, so this will need to go eventually. I suppose it
also protects against some unforeseen bug where we we can append more
than FD_SETSIZE items.
2021-04-29 20:31:15 +02:00
Brian Gianforcaro 0ca668f59c Kernel: Harden sys$munmap Vector usage against OOM.
Theoretically the append should never fail as we have in-line storage
of 2, which should be enough. However I'm planning on removing the
non-try variants of AK::Vector when compiling in kernel mode in the
future, so this will need to go eventually. I suppose it also protects
against some unforeseen bug where we we can append more than 2 items.
2021-04-29 20:31:15 +02:00
Brian Gianforcaro 569c5a8922 Kernel: Harden sys$purge Vector usage against OOM.
sys$purge() is a bit unique, in that it is probably in the systems
advantage to attempt to limp along if we hit OOM while processing
the vmobjects to purge. This change modifies the algorithm to observe
OOM and continue trying to purge any previously visited VMObjects.
2021-04-29 20:31:15 +02:00
Brian Gianforcaro b3096276bb Kernel: Harden sys$poll Vector usage against OOM. 2021-04-29 20:31:15 +02:00
Brian Gianforcaro 119b7be249 Kernel: Harden sys$execve Vector usage against OOM. 2021-04-29 20:31:15 +02:00