This makes PrintfImplementation usable with any sequence, provided that
a 'next element' function can be written for it.
Does not affect the behaviour of printf() and co.
It wasn't actually possible to call
const LogStream& operator<<(const LogStream&, ReadonlyBytes);
because it was shadowed by
template<typename T>
const LogStream& operator<<(const LogStream& stream, Span<T> span);
Not sure how I didn't find this when I added the overload.
It would be possible to use SFINAE to disable the other overload,
however, I think it is better to use a different method entirely because
the output can be very verbose:
void dump_bytes(ReadonlyBytes);
Leverage constexpr and __builtin_ffs for Bitmap::find_first. Also add
a variant Bitmap::find_one_anywhere that can start scanning at a
provided hint.
Also, merge Bitmap::fill_range into the already existing Bitmap::set_range.
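As a rough illustration of the idea (not the actual Bitmap code), a word-wise scan using __builtin_ffs could look like the sketch below. The free functions find_first_set and find_one_anywhere_set, the 32-bit word layout and the wrap-around behaviour of the hint are all assumptions made for this example. The builtin itself is a GCC/Clang extension.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helpers, not the AK::Bitmap API.
// Returns the index of the first set bit, scanning whole 32-bit words
// starting at word `start_word`.
std::optional<size_t> find_first_set(const std::vector<uint32_t>& words, size_t start_word = 0)
{
    for (size_t i = start_word; i < words.size(); ++i) {
        // __builtin_ffs returns one plus the index of the least significant
        // set bit, or zero if no bit is set.
        int bit = __builtin_ffs(static_cast<int>(words[i]));
        if (bit != 0)
            return i * 32 + static_cast<size_t>(bit - 1);
    }
    return std::nullopt;
}

// Assumption: the hint only selects the word to start at, so a bit slightly
// before the hint (in the same word) may still be returned.
std::optional<size_t> find_one_anywhere_set(const std::vector<uint32_t>& words, size_t hint)
{
    // Try from the word containing the hint first, then fall back to a full scan.
    if (auto found = find_first_set(words, hint / 32))
        return found;
    return find_first_set(words);
}
```

The point of the hint is that callers who remember where they last found a bit can skip the words before it.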
The streaming operator doesn't short-circuit. Consider the following
snippet:
void foo(InputStream& stream) {
int a, b;
stream >> a >> b;
}
If the first read fails, the second is called regardless. It should be
well defined what happens in this case: nothing.
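A minimal sketch of that "nothing happens" behaviour, using a made-up ToyInputStream rather than the real InputStream class: once a read has failed, every later operator>> call returns immediately and leaves its argument untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for an input stream over a byte buffer.
class ToyInputStream {
public:
    ToyInputStream(const unsigned char* data, size_t size)
        : m_data(data)
        , m_size(size)
    {
    }

    bool has_error() const { return m_error; }

    ToyInputStream& operator>>(int& value)
    {
        // After an earlier failure, do nothing and leave `value` alone.
        if (m_error)
            return *this;
        if (m_size - m_offset < sizeof(int)) {
            m_error = true;
            return *this;
        }
        std::memcpy(&value, m_data + m_offset, sizeof(int));
        m_offset += sizeof(int);
        return *this;
    }

private:
    const unsigned char* m_data { nullptr };
    size_t m_size { 0 };
    size_t m_offset { 0 };
    bool m_error { false };
};
```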
This is a strcpy()-like method with actually sane semantics:
* It accepts a non-empty buffer along with its size in bytes.
* It copies as much of the string as fits into the buffer.
* It always null-terminates the result.
* It returns, as a non-discardable boolean, whether the whole string has been
copied.
Intended usage looks like this:
bool fits = string.copy_characters_to_buffer(buffer, sizeof(buffer));
and then either
if (!fits) {
fprintf(stderr, "The name does not fit!!11");
return nullptr;
}
or, if you're sure the buffer is large enough,
// I'm totally sure it fits because [reasons go here].
ASSERT(fits);
or if you're feeling extremely adventurous,
(void)fits;
but don't do that, please.
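For reference, the semantics listed above can be sketched as a free function over std::string. This is only an illustration of the contract; the real method lives on the string class and its exact implementation is not shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical free-function version of the method described above.
[[nodiscard]] bool copy_characters_to_buffer(const std::string& string, char* buffer, size_t buffer_size)
{
    // The buffer must be non-empty so there is room for the terminator.
    assert(buffer_size > 0);

    // Copy as much as fits while reserving one byte for the null terminator.
    size_t characters_to_copy = std::min(string.size(), buffer_size - 1);
    std::memcpy(buffer, string.data(), characters_to_copy);
    buffer[characters_to_copy] = '\0';

    // Report whether the whole string was copied.
    return characters_to_copy == string.size();
}
```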
For some weird reason the C++ standard considers char, signed char and
unsigned char *three* different types. On the other hand int is just an
alias for signed int, meaning that int, signed int and unsigned int are
just *two* different types.
https://stackoverflow.com/a/32856568/8746648
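This can be checked at compile time with standard type traits. The overload set named which() below is purely illustrative; it only shows that all three char types can be overloaded side by side.

```cpp
#include <cassert>
#include <type_traits>

// char, signed char and unsigned char are three distinct types.
static_assert(!std::is_same_v<char, signed char>);
static_assert(!std::is_same_v<char, unsigned char>);
static_assert(!std::is_same_v<signed char, unsigned char>);

// int and signed int are the same type, so there are only two int types here.
static_assert(std::is_same_v<int, signed int>);
static_assert(!std::is_same_v<int, unsigned int>);

// Hypothetical overloads: all three may coexist because the types differ.
int which(char) { return 0; }
int which(signed char) { return 1; }
int which(unsigned char) { return 2; }
```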
Before, we had about these occurrence counts:
COPY: 13 without, 33 with
MOVE: 12 without, 28 with
Clearly, 'with' was the preferred way. However, this introduced double-semicolons
all over the place, and caused some warnings to trigger.
This patch *forces* the usage of a semi-colon when calling the macro,
by removing the semi-colon within the macro. (And thus also gets rid
of the double-semicolon.)
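A sketch of the resulting pattern, with a made-up macro name (MAKE_NONCOPYABLE) standing in for the real ones: the macro body ends without a semicolon, so the call site has to provide exactly one.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical macro modelled on the description above.
// Note that the last declaration has no trailing semicolon.
#define MAKE_NONCOPYABLE(c) \
    c(const c&) = delete;   \
    c& operator=(const c&) = delete

class Handle {
    MAKE_NONCOPYABLE(Handle); // The caller supplies the semicolon.

public:
    Handle() = default;
};

static_assert(!std::is_copy_constructible_v<Handle>);
static_assert(!std::is_copy_assignable_v<Handle>);
```

Leaving the semicolon off at the call site is now a compile error instead of being silently accepted.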
The implementation in LibC did a timestamp->day-of-week conversion
which looks like a valuable thing to have. But we only need it in
time_to_tm, where we already computed year/month/day -- so let's
consolidate on the day_of_week function in DateTime (which is
getting extracted to AK).
The JS tests pointed out that the implementation in DateTime
had an off-by-one in the month when doing the leap year check,
so this change fixes that bug.
Specifically:
- post-increment actually implemented pre-increment
- helper-templates that provided operator{+,-,*,/}() couldn't possibly work,
because the interfaces of add (etc.) were incompatible (not taking a Checked<>,
and returning void)
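To make the first point concrete, here is the usual shape of the two increment operators on a toy Counter type (not the Checked<> class itself): pre-increment returns the updated object, post-increment returns a copy of the old value.

```cpp
#include <cassert>

// Hypothetical type used only to show the two operator shapes.
class Counter {
public:
    explicit Counter(int value)
        : m_value(value)
    {
    }

    int value() const { return m_value; }

    // Pre-increment: modify, then return a reference to the updated object.
    Counter& operator++()
    {
        ++m_value;
        return *this;
    }

    // Post-increment: remember the old state, modify, return the old state.
    Counter operator++(int)
    {
        Counter old = *this;
        ++m_value;
        return old;
    }

private:
    int m_value { 0 };
};
```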
Consider the following scenario:
if(condition)
FOO();
else
bar();
Suppose FOO is defined as follows:
#define FOO() { bar(); baz(); }
Then it expands to the following:
if(condition)
// Syntax error, we are not allowed to put a semicolon at the end.
{ bar(); baz(); };
else
bar();
If we define FOO as follows:
#define FOO() do { bar(); baz(); } while(false)
Then it expands to the following:
if(condition)
do { bar(); baz(); } while(false);
else
bar();
Which is correct.
MemoryManager cannot use the Singleton class because
MemoryManager::initialize is called before the global constructors
are run. That caused the Singleton to be re-initialized, causing
it to create another MemoryManager instance.
Fixes #3226
In particular: consistent rounding and extreme values.
Before, rounding was something like 'away from 0.999...', which led to
surprising corner cases in which the value was rounded up.
Now, rounding is always 'down'.
This even works for 0xffffffff, and also for 0xffffffffffffffffULL on 64-bit.
This makes error messages more useful during debugging.
Old:
START Running test compare_views
FAIL: ../AK/Tests/TestStringView.cpp:59: EXPECT_EQ(view1, "foobar") failed
New:
START Running test compare_views
FAIL: ../AK/Tests/TestStringView.cpp:59: EXPECT_EQ(view1, "foobar") failed: LHS="foo", RHS="foobar"
Previously, it would just print something with 'FAIL' to stderr which
would be picked up by CTest. However, some code assumes that
ASSERT_NOT_REACHED() doesn't return, for example:
bool foo(int value) {
switch(value) {
case 0:
return true;
case 1:
return false;
default:
ASSERT_NOT_REACHED();
}
// warning: control reaches end of non-void function
}
Thankfully, this hasn't happened in any other code yet, but it happened
while I was trying something out. Using '==' on two ByteBuffers to check
whether they're equal seemed straight-forward, so I ran into the trap.
This seems to be because ByteBuffer implements 'operator bool', and C++
considers bool to be an integer type. Thus, when trying to find a way to
evaluate '==', it attempts integer promotion, which in turn finds 'operator bool'.
This explains why all non-empty buffers seem to be equal, but different from the
empty one. Also, why comparison seems to be implemented.
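The trap can be reproduced with a toy type. ToyBuffer and SaferBuffer below are made up for this example and are not ByteBuffer; the second one shows one possible way out (an explicit conversion plus a real operator==), which is an assumption about the fix rather than a description of it.

```cpp
#include <cassert>
#include <vector>

// Hypothetical buffer with a non-explicit conversion to bool.
struct ToyBuffer {
    std::vector<unsigned char> bytes;
    operator bool() const { return !bytes.empty(); }
};

// Hypothetical alternative: the conversion is explicit, so '==' no longer
// compiles through bool, and a real content comparison is provided.
struct SaferBuffer {
    std::vector<unsigned char> bytes;
    explicit operator bool() const { return !bytes.empty(); }
    bool operator==(const SaferBuffer& other) const { return bytes == other.bytes; }
};

// Compares two ToyBuffers with '=='. This compiles, but both operands are
// converted to bool first, so only "empty or not" is compared.
bool toy_buffers_look_equal(const ToyBuffer& a, const ToyBuffer& b)
{
    return a == b;
}
```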
clang-format automatically sorts include statements that are in a
'block'. Adding a blank line prevents this. It is crucial that
<AK/TestSuite.h> is included first because it redefines some macros.
Previously, the implementation would produce one Vector<u8> which
would contain the whole decompressed data. That can be a lot and
even exhaust memory.
With these changes it is still necessary to store the whole input data
in one piece (I am working on this next), but the output can be read
block by block. (That's not optimal either because blocks can be
arbitrarily large, but it's good for now.)
This class is similar to BufferStream because it is possible to both
read and write to it. However, it differs in the following ways:
- DuplexMemoryStream keeps a history of 64KiB and discards the rest,
BufferStream always keeps everything around.
- DuplexMemoryStream tracks reading and writing separately; the
following is valid:
DuplexMemoryStream stream;
stream << 42;
int value;
stream >> value;
For BufferStream it would read:
BufferStream stream;
stream << 42;
int value;
stream.seek(0);
stream >> value;
In the future I would like to replace all usages of BufferStream with
InputMemoryStream, OutputMemoryStream (doesn't exist yet) and
DuplexMemoryStream. For now I just add DuplexMemoryStream though.
Fatal errors cannot be handled and lead to an assertion error when the
stream is destroyed. It makes no sense to delay the assertion failure;
instead of setting m_fatal, an assertion should be done directly.
Two changes were made:
1. copy_to() and copy_trimmed_to() now return how many bytes were
copied.
2. The argument was changed to Span<typename RemoveConst<T>::Type>
because the following would not work:
ReadonlyBytes bytes0;
Bytes bytes1;
// Won't work because this calls Span<const u8>::copy_to(Span<u8>)
// but the method was defined as Span<const u8>::copy_to(Span<const u8>)
bytes0.copy_to(bytes1);
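Both changes can be sketched on a toy span type. ToySpan below is an illustration only, with std::remove_const_t standing in for RemoveConst<T>::Type; it is not the AK::Span implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical span type used only to show the signatures.
template<typename T>
class ToySpan {
public:
    static_assert(std::is_trivially_copyable_v<T>, "memcpy needs trivially copyable elements");

    ToySpan(T* data, size_t size)
        : m_data(data)
        , m_size(size)
    {
    }

    T* data() const { return m_data; }
    size_t size() const { return m_size; }

    // The destination element type is non-const even when T is const,
    // so ToySpan<const u8> can copy into ToySpan<u8>.
    size_t copy_to(ToySpan<std::remove_const_t<T>> other) const
    {
        assert(m_size <= other.size());
        std::memcpy(other.data(), m_data, m_size * sizeof(T));
        return m_size; // Number of elements copied.
    }

    // Copies only as much as fits into the destination.
    size_t copy_trimmed_to(ToySpan<std::remove_const_t<T>> other) const
    {
        size_t count = m_size < other.size() ? m_size : other.size();
        std::memcpy(other.data(), m_data, count * sizeof(T));
        return count;
    }

private:
    T* m_data { nullptr };
    size_t m_size { 0 };
};
```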
The Coverity compiler doesn't support C++2a yet, and thus doesn't
even recognize concept keywords. To allow serenity to be built and
analyzed on such compilers, add a fallback under an #ifdef to perform
the same template restriction based on AK::EnableIf<..> meta
programming.
Note: Coverity does seem to (annoyingly) define __cpp_concepts, even
though it doesn't support them, so we need to further check for
__COVERITY__ explicitly.
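A rough sketch of such a fallback, using the standard library's std::enable_if_t in place of AK::EnableIf and a made-up function twice() as the constrained template. The exact feature-test condition is an assumption for this example.

```cpp
#include <cassert>
#include <type_traits>

#if defined(__cpp_concepts) && (__cpp_concepts >= 201907L) && !defined(__COVERITY__)
// Compilers with real concept support take this branch.
template<typename T>
concept IntegralType = std::is_integral_v<T>;

template<IntegralType T>
T twice(T value)
{
    return value + value;
}
#else
// Fallback: the same restriction expressed through enable_if.
template<typename T, typename = std::enable_if_t<std::is_integral_v<T>>>
T twice(T value)
{
    return value + value;
}
#endif
```

Either branch rejects twice(1.5) at compile time and accepts twice(21).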
Windows uses "KB", "MB", "GB" as powers of two.
macOS uses "kB", "MB", "GB" as powers of ten.
"k", "M", "G" are standard SI prefixes that normally refer to powers of
ten.
The IEC introduced "KiB", "MiB", "GiB" to unambiguously refer to
powers of two. It admittedly hasn't caught on that much, but it
does have the advantage that it's unambiguous what it means.
So let's use it for user-visible sizes in SerenityOS.
(Linux does all of the above in different places, depending on app and
toolkit.)
Let's use the one in AK/NumberFormat.h everywhere.
It has slightly different behavior than some of the copies this
removes, but it's probably nice to have uniform human readable
size outputs across the system.
The SI prefixes "k", "M", "G" mean "10^3", "10^6", "10^9".
The IEC prefixes "Ki", "Mi", "Gi" mean "2^10", "2^20", "2^30".
Let's use the correct name, at least in code.
Only changes the name of the constants, no other behavior change.
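For illustration, the two families of constants side by side. The binary names follow the IEC prefixes from the text; the decimal names (kB, MB, GB) are hypothetical and only here to show the difference in magnitude.

```cpp
#include <cassert>
#include <cstdint>

// IEC prefixes: powers of two.
constexpr uint64_t KiB = 1024;
constexpr uint64_t MiB = KiB * KiB;
constexpr uint64_t GiB = KiB * KiB * KiB;

// SI prefixes: powers of ten (hypothetical names for comparison only).
constexpr uint64_t kB = 1000;
constexpr uint64_t MB = kB * kB;
constexpr uint64_t GB = kB * kB * kB;

static_assert(GiB == (uint64_t { 1 } << 30));
// At the "giga" scale the two already differ by roughly 7%.
static_assert(GiB - GB == 73741824);
```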
I originally defined the bytes() method for the String class, because it
made it obvious that it's a span of bytes instead of span of characters.
This commit makes this more consistent by defining a bytes() method when
the type of the span is known to be u8.
Additionally, the cast operator to Bytes is overloaded for ByteBuffer and
such.
This change aims to add support for obscure IPv4 address notations, such as 1.1 (which should be equal to 1.0.0.1), or the hypothetical address 1 (which is equal to 0.0.0.1). This is supported on other platforms as well, such as Linux, Windows, *BSD, and even Haiku.
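A simplified sketch of how the short forms from the examples above could be parsed. The function parse_ipv4 is hypothetical, and it assumes every part is a single octet (0 to 255) with the last part always landing in the final octet; other platforms also accept wider trailing parts, which this sketch does not attempt.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical parser: accepts "d", "a.d", "a.b.d" and "a.b.c.d".
std::optional<std::array<uint8_t, 4>> parse_ipv4(const std::string& text)
{
    std::vector<unsigned> parts;
    unsigned current = 0;
    bool have_digit = false;
    for (char ch : text) {
        if (ch >= '0' && ch <= '9') {
            current = current * 10 + static_cast<unsigned>(ch - '0');
            if (current > 255)
                return std::nullopt;
            have_digit = true;
        } else if (ch == '.' && have_digit) {
            parts.push_back(current);
            current = 0;
            have_digit = false;
        } else {
            return std::nullopt; // Stray character or empty part.
        }
    }
    if (!have_digit)
        return std::nullopt;
    parts.push_back(current);
    if (parts.size() > 4)
        return std::nullopt;

    std::array<uint8_t, 4> octets { 0, 0, 0, 0 };
    // Leading parts fill octets from the left, missing octets stay zero,
    // and the last part always becomes the final octet.
    for (size_t i = 0; i + 1 < parts.size(); ++i)
        octets[i] = static_cast<uint8_t>(parts[i]);
    octets[3] = static_cast<uint8_t>(parts.back());
    return octets;
}
```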
This enables a nice warning in case a function becomes dead code. Also, add forgotten
header to Base64.cpp, which would cause an issue later when we enable -Wmissing-declarations.
This template class allows for easy generation of incompatible numeric types.
This is useful whenever code has to handle heterogeneous data (like meters and
seconds) but the underlying data types are compatible (like int and int).
The motivation comes from the Kernel's inconsistent use of pid_t for process and
thread IDs even though the ID spaces are incompatible, and translating back and forth
is nontrivial.
Other uses could be units (as described above), or incompatible index systems.
A popular use in real life is image manipulation, when there are multiple
coordinate systems.
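The general technique can be sketched with a tag type as the second template parameter. DistinctValue, MetersTag and SecondsTag are names invented for this example and do not reflect the actual template's interface.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical wrapper: the Tag parameter makes each instantiation a
// separate type even when the underlying type T is the same.
template<typename T, typename Tag>
class DistinctValue {
public:
    constexpr explicit DistinctValue(T value)
        : m_value(value)
    {
    }

    constexpr T value() const { return m_value; }

    constexpr bool operator==(const DistinctValue& other) const { return m_value == other.m_value; }
    constexpr DistinctValue operator+(const DistinctValue& other) const
    {
        return DistinctValue(m_value + other.m_value);
    }

private:
    T m_value;
};

struct MetersTag { };
struct SecondsTag { };
using Meters = DistinctValue<int, MetersTag>;
using Seconds = DistinctValue<int, SecondsTag>;

// Both wrap an int, yet neither converts to the other or to a bare int.
static_assert(!std::is_same_v<Meters, Seconds>);
static_assert(!std::is_convertible_v<Meters, Seconds>);
static_assert(!std::is_convertible_v<int, Meters>);
```

With this, Meters(2) + Seconds(3) fails to compile, while Meters(2) + Meters(3) works as expected.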
The symbol name insertion scheme is different from objdump -d's.
Compare the output on Build/Userland/id:
* disasm:
...
_start (08048305-0804836b):
08048305 push ebp
...
08048366 call 0x0000df56
0804836b o16 nop
0804836d o16 nop
0804836f nop
(deregister_tm_clones (08048370-08048370))
08048370 mov eax, 0x080643e0
...
_ZN2AK8Utf8ViewC1ERKNS_6StringE (0805d9b2-0805d9b7):
_ZN2AK8Utf8ViewC2ERKNS_6StringE (0805d9b2-0805d9b7):
0805d9b2 jmp 0x00014ff2
0805d9b7 nop
* objdump -d:
08048305 <_start>:
8048305: 55 push %ebp
...
8048366: e8 9b dc 00 00 call 8056006 <exit>
804836b: 66 90 xchg %ax,%ax
804836d: 66 90 xchg %ax,%ax
804836f: 90 nop
08048370 <deregister_tm_clones>:
8048370: b8 e0 43 06 08 mov $0x80643e0,%eax
...
0805d9b2 <_ZN2AK8Utf8ViewC1ERKNS_6StringE>:
805d9b2: e9 eb f6 ff ff jmp 805d0a2 <_ZN2AK10StringViewC1ERKNS_6StringE>
805d9b7: 90 nop
Differences:
1. disasm can show multiple symbols that cover the same instructions.
I've only seen this happen for C1/C2 (and D1/D2) ctor/dtor pairs,
but it could conceivably happen with ICF as well.
2. disasm separates instructions that do not belong to a symbol with
a newline, so that nop padding isn't shown as part of a function
when it technically isn't.
3. disasm shows symbols that are skipped (due to having size 0)
in parentheses, separated from preceding and following instructions.
When using Userspace<T> there are certain syscalls where being able
to cast between types is needed. You should not be able to easily cast
away the Userspace<T> wrapper, but it's perfectly safe to be able to
cast the internal type that is being wrapped.
This function did a const_cast internally which made the call site look
"safe". This method is removed completely and call sites are replaced
with ByteBuffer::wrap(const_cast<void*>(data), size) which makes the
behaviour obvious.
We should always leak to an observed variable, otherwise
it's an actual leak. This is similar to AK::RefPtr::leak_ref()
which is also marked as [[nodiscard]].
There are use cases where a linked list is useful but it's also worth
the overhead to maintain a count so you can quickly answer queries of
the size of the list.
This is a very cheesy patch and I don't like it, but as Qt Creator does
not grok C++20 concepts yet, this makes it possible to still use syntax
highlighting.
We'll remove this hack the moment it stops being a problem. Note that
it doesn't actually affect the build since we use GCC, not Clang.
It was a bit odd that you could create a Userspace<int> and that
Userspace<int>::ptr() returned an int instead of an int*.
Let's use C++20 concepts to only allow creating Userspace objects with
pointer types. :^)
Since we already have the type information in the Userspace template,
it was a bit silly to cast manually everywhere. Just add a sufficiently
scary-sounding getter for a typed pointer.
Thanks @alimpfard for pointing out that I was being silly with tossing
out the type.
In the future we may want to make this API non-public as well.
This will be used in the kernel to wrap pointers into userspace memory
without convenient direct access. The idea is to use the compiler to
enforce that we don't dereference userspace pointers.
I accidentally wrote `Span<RemoveConst<T>>` when I meant
`Span<RemoveConst<T>::Type>`.
Changing that wouldn't be enough though, this constructor can only be
defined if T is not const, otherwise it would redefine the copy
constructor. This can be avoided by overloading the cast operator.
There's no great advantage to using MMX instructions here on modern
processors, since REP MOVSB/STOSB are optimized in microcode anyway
and tend to run much faster than MMX/SSE/AVX variants.
This also makes it much easier to implement high-level emulation of
memcpy/memset in UserspaceEmulator once we get there. :^)
Fixes #2776.
This fixes, among other things, JSON serialization.
The underlying bug was that 'print_double' defined fraction_length
as a function argument with a default value, whereas
printf_internal *always* provided a value, even if nothing was read.
The 'use 6 by default' logic has been moved to printf_internal instead.
I totally forgot about the C++ basics here. There are three distinct
types: "char", "signed char" and "unsigned char". Whether "char" is
signed or unsigned is implementation specific.
This allows performing an action based on whether something
was actually added or removed without having to look it up
prior to calling set() or remove().
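A sketch of what this enables, using a toy wrapper around std::unordered_set. ToySet and its bool return values are assumptions for illustration; the real containers may report the outcome differently.

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical set type whose mutators report what actually happened.
template<typename T>
class ToySet {
public:
    // Returns true if the value was newly added, false if it was already present.
    bool set(const T& value) { return m_values.insert(value).second; }

    // Returns true if the value was present and has been removed.
    bool remove(const T& value) { return m_values.erase(value) > 0; }

private:
    std::unordered_set<T> m_values;
};
```

A caller can then write "if (set.set(x)) notify_added(x);" (with notify_added being whatever action is needed) without a separate contains() lookup.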
The fact that JsonValues can contain 64-bit values isn't a JavaScript
compatible behavior in the first place, but as long as we're supporting
this, we should make sure it works correctly.
Prior to this, we wrote to the log every time the << operator
was used, which meant that only these parts of the log statement
were serialized. If the thread was preempted, or especially with
multiple CPUs the debug output was hard to decipher. Instead, we
buffer up the log statements. To avoid allocations we'll attempt
to use stack space, which covers most log statements.
This was showing up in Browser profiles, which is silly, so write a new
version that doesn't create a temporary String object.
There are a whole bunch of these and long-term it would be nice to find
a way to share all the very similar logic instead of duplicating it.
- Parsing invalid JSON no longer asserts
Instead of asserting when coming across malformed JSON,
JsonParser::parse now returns an Optional<JsonValue>.
- Disallow trailing commas in JSON objects and arrays
- No longer parse 'undefined', as that is a purely JS thing
- No longer allow non-whitespace after anything consumed by the initial
parse() call. Examples of things that were valid and no longer are:
- undefineddfz
- {"foo": 1}abcd
- [1,2,3]4
- JsonObject.for_each_member now iterates in original insertion order
Get rid of the weird old signature:
- int StringType::to_int(bool& ok) const
And replace it with sensible new signature:
- Optional<int> StringType::to_int() const
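The shape of the new signature can be sketched with std::optional and std::from_chars standing in for the project's own Optional and parsing code. The free function to_int below is hypothetical and only demonstrates the calling convention.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical free-function equivalent of the new signature.
std::optional<int> to_int(std::string_view text)
{
    int value = 0;
    const char* begin = text.data();
    const char* end = begin + text.size();
    auto [ptr, ec] = std::from_chars(begin, end, value);
    // Fail on parse errors and on trailing characters that were not consumed.
    if (ec != std::errc() || ptr != end)
        return std::nullopt;
    return value;
}
```

Call sites no longer need a separate "bool ok" variable; they test the returned optional instead.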
Before this, it has been possible to assign a RefCounted object to another
RefCounted object. Hilariously (or sadly), that copied the refcount among
the other fields, meaning the target value ended up with a wrong refcount.
Ensure this never happens by disallowing copies and moves for RefCounted types.
This fixes all sorts of race conditions, primarily in the kernel, where till
now it's been possible to obtain either double free or use-after-free by
exploiting refcounting races.
I've been using this in the new HTML parser and it makes it much easier
to understand the state of unfinished code branches.
TODO() is for places where it's okay to end up but we need to implement
something there.
ASSERT_NOT_REACHED() is for places where it's not okay to end up, and
something has gone wrong.
The SDL port failed to build because the CMake toolchain file pointed
to the old root. Now the toolchain file assumes that the Root is in
Build/Root.
Additionally, the AK/ and Kernel/ headers need to be installed in the
root too.
.. and make travis run it.
I renamed check-license-headers.sh to check-style.sh and expanded it so
that it now also checks for the presence of "#pragma once" in .h files.
It also checks the presence of a (single) blank line above and below the
"#pragma once" line.
I also added "#pragma once" to all the files that need it: even the ones
we are not checking.
I also added/removed blank lines in order to make the script not fail.
I also ran clang-format on the files I modified.
And move canonicalized_path() to a static method on LexicalPath.
This is to make it clear that FileSystemPath/canonicalized_path() only
perform *lexical* canonicalization.
The CMake runner looks at the return code if you don't set
the pattern. Since the AK test suite setup doesn't use return
codes, we were missing test failures.
FileSystemPath::has_extension was jumping through hoops and allocating
memory to do a case insensitive comparison needlessly. Extend the
existing String::ends_with method to allow the caller to specify the
case sensitivity required.
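A minimal sketch of an allocation-free ends_with with a case sensitivity parameter, written against std::string_view. The enum and function names mirror the description but are assumptions, not the actual String API.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

enum class CaseSensitivity {
    CaseSensitive,
    CaseInsensitive,
};

// Hypothetical free function: compares the tail of `text` against `suffix`
// in place, without building lowercase copies of either string.
bool ends_with(std::string_view text, std::string_view suffix,
    CaseSensitivity case_sensitivity = CaseSensitivity::CaseSensitive)
{
    if (suffix.size() > text.size())
        return false;
    std::string_view tail = text.substr(text.size() - suffix.size());
    if (case_sensitivity == CaseSensitivity::CaseSensitive)
        return tail == suffix;
    for (size_t i = 0; i < suffix.size(); ++i) {
        // Cast to unsigned char first, as std::tolower requires.
        int a = std::tolower(static_cast<unsigned char>(tail[i]));
        int b = std::tolower(static_cast<unsigned char>(suffix[i]));
        if (a != b)
            return false;
    }
    return true;
}
```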
Previously, passing a fragment string ("#section3") to the complete_url
method would result in a URL that looked like
"file:///home/anon/www/#section3" which was obviously incorrect. Now the
result looks like "file:///home/anon/www/afrag.html#section3".
We shouldn't just drop leading ..-s for relative paths. At the same time,
we should handle paths like
../foo/../../bar
correctly: the first .. after the foo cancels out the foo, but the second
one should get treated as a leading one and not get dropped.
Note that since this path resolution is purely lexical, it's never going to be
completely correct with respect to symlinks and other filesystem magic. It is
better not to use it when dealing with files.
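The rule for ".." can be sketched as follows. canonicalize_lexically is a hypothetical stand-alone function, and returning "." for an empty result is an assumption of this sketch rather than documented behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical purely lexical canonicalization; never touches the filesystem.
std::string canonicalize_lexically(const std::string& path)
{
    bool is_absolute = !path.empty() && path[0] == '/';
    std::vector<std::string> parts;
    size_t start = 0;
    while (start <= path.size()) {
        size_t end = path.find('/', start);
        if (end == std::string::npos)
            end = path.size();
        std::string part = path.substr(start, end - start);
        start = end + 1;
        if (part.empty() || part == ".")
            continue;
        if (part == "..") {
            // A ".." cancels the previous real component, if there is one.
            if (!parts.empty() && parts.back() != "..") {
                parts.pop_back();
                continue;
            }
            // At the root of an absolute path, ".." has nowhere to go.
            if (is_absolute)
                continue;
            // Otherwise it is a leading ".." of a relative path: keep it.
        }
        parts.push_back(part);
    }
    std::string result = is_absolute ? "/" : "";
    for (size_t i = 0; i < parts.size(); ++i) {
        if (i != 0)
            result += '/';
        result += parts[i];
    }
    if (result.empty())
        result = "."; // Assumption: an empty relative path means "here".
    return result;
}
```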