ECMA-402 doesn't explicitly handle a note in the TR-35 spec related to
expanding field lengths based on user-provided options. Instead, it
assumes the "implementation defined" locale data includes the possible
values.
LibUnicode does not generate every possible combination of field lengths
in its implementation of TR-35's "Missing Skeleton Fields", because the
number of generated patterns would grow out of control. Instead, it's
much simpler to handle this difference at runtime.
Other implementations unconditionally initialize [[pattern12]] from
[[pattern]] regardless of whether [[pattern]] has an hour pattern of h11
or h12. LibUnicode does not do this. So when InitializeDateTimeFormat
defaults the hour cycle to the locale's preferred hour cycle, if the
best format didn't have an equivalent hour pattern, [[pattern12]] will
be empty.
TR-35 describes how to combine date, time, and available formats with
date-time format patterns to generate more available format patterns:
https://unicode.org/reports/tr35/tr35-dates.html#Missing_Skeleton_Fields
Use these steps to generate ~400 new patterns for each calendar. These
are required for ECMA-402's BasicFormatMatcher to produce reasonable
results.
Similar to NumberFormat, replace the segments of date-time patterns with
partitions that can be split at runtime. Also generate the pattern style
fields for e.g. era, day, hour, etc.
Add unique storage for parsed CalendarPattern structures to ensure only
one copy of each structure is generated.
This doesn't have any impact on libunicode.so with the current generated
data. Rather, this prevents the amount of generated data from needlessly
growing astronomically once date-time patterns are fully parsed. There
will be 173,459 patterns parsed, of which only 22,495 (about 12%) are
unique. This change will save a few MB, and will also help compilation
times.
Currently, there's only a handful of entries in these arrays, so it is
not a huge deal to generate them inline with the struct that holds them.
But they will each soon contain a few hundred entries. Generate them out
of line for easier viewing in the generated code.
Add unique storage for parsed NumberFormat structures to ensure only one
copy of each structure is generated. Reduces libunicode.so on x86 from
13.2 MB to 11.4 MB.
UniqueStringStorage is used to ensure only one copy of a string will be
generated, and interested parties store just an index into the generated
storage. Generalize this class to allow any* type to be stored uniquely.
* To actually be storable, the type must have both an AK::Format and an
AK::Traits overload available.
In particular, strace can now stomach memory errors while copying
invalid strings.
Example with a valid string:
dbgputstr("95.976 traceme(38:38) Well, Hello Friends!") = 55
Example with an invalid string:
dbgputstr(Error(errno=14){0x00012345, 678b}) = -14 EFAULT
(ANSI escapes removed for readability.)
Returning 'result.error().code()' erroneously creates an
ErrorOr<FlatPtr> of the positive errno code, which breaks our
error-returning convention.
This seems to be due to a forgotten minus-sign during the refactoring in
9e51e295cf. This latent bug was never
discovered, because currently the error-handling paths are rarely
exercised.
This necessarily introduces some usages (and benefits!) of the new
ErrorOr<> pattern. To keep commits atomic, I do not yet rewrite the
entire program to use ErrorOr<> correctly.
Also, remove incomplete, superfluous check.
Incomplete, because only the byte at the provided address was checked;
this misses the last bytes of the "jerk page".
Superfluous, because it is already correctly checked by peek_user_data
(which calls copy_from_user).
The caller/tracer should not typically attempt to read non-userspace
addresses, we don't need to "hot-path" it either.
Note that the return type for the non-const method error() changed. This
is most likely an accident, hidden by the fact that ErrorType typically
is Error.
This makes it an error to not do something with a returned smart
pointer, which should help prevent mistakes. In cases where you do need
to ignore the value, casting to void will placate the compiler.
I did have to add comments to disable clang-format on a couple of lines,
where it wanted to format the code like this:
```c++
private : NonnullRefPtr() = delete;
```
The synchronous call returns a NonnullOwnPtr that we don't use, so we
have to cast to prevent a compiler warning once smart pointers become
[[nodiscard]].
These ones all manage their storage internally, whereas the WebContent
and ImageDecoder ones require the caller to manage their lifetime. This
distinction is not obvious to the user without looking through the code,
so an API that makes this clearer would be nice.