Commit graph

892 commits

Author SHA1 Message Date
Timothy Flynn 3b7f5af042 LibUnicode: Generate primary and secondary number grouping sizes
Most locales have a single grouping size (the number of integer digits
to be written before inserting a grouping separator). However some have
a primary and secondary size. We parse the primary size as the size used
for the least significant integer digits, and the secondary size for the
most significant.
2021-11-14 10:35:19 +00:00
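
The grouping rule described in 3b7f5af042 can be illustrated with a short Python sketch. This is not LibUnicode's generated code: the function name and its parameters are hypothetical, and the sizes used in the examples assume a locale whose pattern looks like "#,##,##0" (primary size 3, secondary size 2). Minimum grouping digits and other TR-35 rules are ignored.

```python
def group_integer_digits(digits: str, primary: int, secondary: int,
                         separator: str = ",") -> str:
    """Insert grouping separators into a string of integer digits.

    `primary` is the size of the least significant group; `secondary` is the
    size of every group to its left (the more significant digits).
    """
    if len(digits) <= primary:
        return digits

    # Peel off the least significant group first.
    head, tail = digits[:-primary], digits[-primary:]
    groups = [tail]

    # Split the remaining, more significant digits by the secondary size.
    while head:
        groups.append(head[-secondary:])
        head = head[:-secondary]

    return separator.join(reversed(groups))


print(group_integer_digits("1234567", 3, 3))  # 1,234,567
print(group_integer_digits("1234567", 3, 2))  # 12,34,567
```

A locale with a single grouping size is the case where both sizes are equal.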
Timothy Flynn c65dea64bd LibJS+LibUnicode: Don't remove {currency} keys in GetNumberFormatPattern
In order to implement Intl.NumberFormat.prototype.formatToParts, do not
replace {currency} keys in the format pattern before ECMA-402 tells us
to. Otherwise, the array returned by formatToParts will not contain the
expected currency key.

Early replacement was done to avoid resolving the currency display more
than once, as it involves a couple of round trips to search through
LibUnicode data. So this adds a non-standard method to NumberFormat to
do this resolution and cache the result.

Another side effect of this change is that LibUnicode must replace unit
format patterns of the form "{0} {1}" during code generation. These were
previously skipped during code generation because LibJS would just
replace the keys with the currency display at runtime. But now that the
currency display injection is delayed, any {0} or {1} keys in the format
pattern will cause PartitionNumberPattern to abort.
2021-11-13 19:01:25 +00:00
Timothy Flynn a701ed52fc LibJS+LibUnicode: Fully implement currency number formatting
Currencies are a bit strange; the layout of currency data in the CLDR is
not particularly compatible with what ECMA-402 expects. For example, the
currency format in the "en" and "ar" locales for the Latin script are:

    en: "¤#,##0.00"
    ar: "¤\u00A0#,##0.00"

Note how the "ar" locale has a non-breaking space after the currency
symbol (¤), but "en" does not. This does not mean that this space will
appear in the "ar"-formatted string, nor does it mean that a space won't
appear in the "en"-formatted string. This is a runtime decision based on
the currency display chosen by the user ("$" vs. "USD" vs. "US dollar")
and other rules in the Unicode TR-35 spec.

ECMA-402 shies away from the nuances here with "implementation-defined"
steps. LibUnicode will store the data parsed from the CLDR however it is
presented; making decisions about spacing, etc. will occur at runtime
based on user input.
2021-11-13 11:52:45 +00:00
Timothy Flynn e9493a2cd5 LibUnicode: Ensure UnicodeNumberFormat is aware of default content
For example, there isn't a unique set of data for the en-US locale;
rather, it defaults to the data for the en locale. See this commit for
much more detail: 357c97dfa8
2021-11-13 11:52:45 +00:00
Timothy Flynn 9421d5c0cf LibUnicode: Generate currency unit-pattern number formats
These are used when formatting a number as currency with a display
option of "name" (e.g. for USD, the name is "US Dollars" in en-US).

These patterns appear in the CLDR in a different manner than other
number formats that are pluralized. They are of the form "{0} {1}",
therefore do not undergo subpattern replacements.
2021-11-13 11:52:45 +00:00
Timothy Flynn 39e031c4dd LibJS+LibUnicode: Generate all styles of currency localizations
Currently, LibUnicode is only parsing and generating the "long" style of
currency display names. However, the CLDR contains "short" and "narrow"
forms as well that need to be handled. Parse these, and update LibJS to
actually respect the "style" option provided by the user for displaying
currencies with Intl.DisplayNames.

Note: There are some discrepancies between the engines on how style is
handled. In particular, running:

new Intl.DisplayNames('en', {type:'currency', style:'narrow'}).of('usd')

Gives:

  SpiderMonkey: "USD"
  V8: "US Dollar"
  LibJS: "$"

And running:

new Intl.DisplayNames('en', {type:'currency', style:'short'}).of('usd')

Gives:

  SpiderMonkey: "$"
  V8: "US Dollar"
  LibJS: "$"

My best guess is V8 isn't handling style, and just returning the long
form (which is what LibJS did before this commit). And SpiderMonkey can
handle some styles, but if they don't have a value for the requested
style, they fall back to the canonicalized code passed into of().
2021-11-13 11:52:45 +00:00
Timothy Flynn 6cfd63e5bd LibUnicode: Parse numbers in number formats a bit more leniently
The parser was previously expecting number sections within a pattern to
start with "#", but they may also begin with "0".
2021-11-13 11:52:45 +00:00
Daniel Bertalan fe1726521a Meta: Resolve cyclic dependency between LibPthread and libc++
libc++ uses a Pthread condition variable in one of its initialization
functions. This means that Pthread forwarding has to be set up in LibC
before libc++ can be initialized. Also, because LibPthread is written in
C++, (at least some) parts of the C++ standard library have to be linked
against it.

This is a circular dependency, which means that the order in which these
two libraries' initialization functions are called is undefined. In some
cases, libc++ will come first, which will then trigger an assert due to
the missing Pthread forwarding.

This issue isn't necessarily unique to LibPthread, as all libraries that
libc++ depends on exhibit the same circular dependency issue.

The reason why this issue didn't affect the GNU toolchain is that
libstdc++ is always linked statically. If we were to change that, I
believe that we would run into the same issue.
2021-11-13 11:15:33 +00:00
Andreas Kling b189c88ec2 Fuzzers: Use ImageDecoders instead of load_FORMAT_from_memory() wrappers 2021-11-13 00:55:07 +01:00
Timothy Flynn 1f2ac0ab41 LibUnicode: Move number formatting code generator to UnicodeNumberFormat 2021-11-12 20:46:38 +00:00
Timothy Flynn 04e6b43f05 LibUnicode: Move (soon-to-be) common code out of GenerateUnicodeLocale
The data used for number formatting is going to grow quite a bit when
the cldr-units package is parsed. To prevent the generated UnicodeLocale
file from growing outrageously large, the number formatting data can go
into its own file. To prepare for this, move code that will be common
between the generators for UnicodeLocale and UnicodeNumberFormat to the
utility header.
2021-11-12 20:46:38 +00:00
Ali Mohammad Pur c08bfd450b Meta: Update the gdb script for the new RefPtr layout 2021-11-12 13:01:59 +00:00
Timothy Flynn be69eae651 LibUnicode: Precompute the compact scale of each number formatting rule
This will be needed for the ComputeExponentForMagnitude AO for compact
formatting, namely step 5b:

  Let exponent be an implementation- and locale-dependent (ILD) integer
  by which to scale a number of the given magnitude in compact notation
  for the current locale.
2021-11-12 09:17:08 +00:00
Timothy Flynn 230b133ee3 LibUnicode: Parse number formats into zero/positive/negative patterns
A number formatting pattern in the CLDR contains one or two entries,
delimited by a semi-colon. Previously, LibUnicode was just storing the
entire pattern as one string. This changes the generator to split the
pattern on that delimiter and generate the 3 unique patterns expected by
ECMA-402.

The rules for generating the 3 patterns are as follows:

* If the pattern contains 1 entry, it is the zero pattern. The positive
  pattern is the zero pattern prepended with {plusSign}. The negative
  pattern is the zero pattern prepended with {minusSign}.

* If the pattern contains 2 entries, the first is the zero pattern, and
  the second is the negative pattern. The positive pattern is the zero
  pattern prepended with {plusSign}.
2021-11-12 09:17:08 +00:00
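
The two rules listed in 230b133ee3 can be sketched in Python as follows. The function name and the dictionary keys are hypothetical, and the sketch applies only the splitting rules: it leaves the pattern text itself untouched, so any further rewriting the real generator performs on each entry is out of scope here.

```python
def split_number_pattern(pattern: str) -> dict:
    """Derive zero/positive/negative patterns from a CLDR number pattern."""
    entries = pattern.split(";")
    if len(entries) not in (1, 2):
        raise ValueError(f"expected one or two entries, got {len(entries)}")

    # The first entry is always the zero pattern.
    zero = entries[0]

    # The positive pattern is always the zero pattern prepended with {plusSign}.
    positive = "{plusSign}" + zero

    # An explicit second entry is the negative pattern; otherwise it is the
    # zero pattern prepended with {minusSign}.
    negative = entries[1] if len(entries) == 2 else "{minusSign}" + zero

    return {"zero": zero, "positive": positive, "negative": negative}


print(split_number_pattern("#,##0.00"))
print(split_number_pattern("¤#,##0.00;(¤#,##0.00)"))
```

The second call uses an accounting-style pattern, where the negative form wraps the number in parentheses instead of using a minus sign.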
Timothy Flynn 1244ebcd4f LibUnicode: Parse and generate standard accounting formatting rules
Also known as "currency-accounting" in some CLDR documentation.
2021-11-12 09:17:08 +00:00
Timothy Flynn 967afc1b84 LibUnicode: Parse and generate standard currency formatting rules 2021-11-12 09:17:08 +00:00
Timothy Flynn bffd73e0d4 LibUnicode: Parse and generate standard decimal formatting rules 2021-11-12 09:17:08 +00:00
Timothy Flynn feb8c22a62 LibUnicode: Parse and generate standard percentage formatting rules 2021-11-12 09:17:08 +00:00
Timothy Flynn 4317a1b552 LibUnicode: Parse and generate compact currency formatting rules 2021-11-12 09:17:08 +00:00
Timothy Flynn 604a596c90 LibUnicode: Parse and generate compact decimal formatting rules 2021-11-12 09:17:08 +00:00
Timothy Flynn 12b468a588 LibUnicode: Begin parsing and generating locale number systems
The number system data in the CLDR contains information on how to format
numbers in a locale-dependent manner. Start parsing this data, beginning
with numeric symbol strings. For example the symbol NaN maps to "NaN" in
the en-US locale, and "非數值" in the zh-Hant locale.
2021-11-12 09:17:08 +00:00
Timothy Flynn d3e83c9934 LibUnicode: Parse alternate default numbering systems
Some locales in the CLDR have alternate default numbering systems listed
under "defaultNumberingSystem-alt-*", e.g.:

    "defaultNumberingSystem": "arab",
    "defaultNumberingSystem-alt-latn": "latn",
    "otherNumberingSystems": {
      "native": "arab"
    },

We were previously only parsing "defaultNumberingSystem" and
"otherNumberingSystems". This odd format appears to be an artifact of
converting from XML.
2021-11-12 09:17:08 +00:00
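
As a rough illustration of d3e83c9934, the Python sketch below collects every numbering system named in a block shaped like the JSON excerpt above, including the "defaultNumberingSystem-alt-*" keys. The function name is hypothetical, and what LibUnicode does with the collected names is not stated in the commit message, so the sketch simply returns them in first-seen order.

```python
import json

ALT_PREFIX = "defaultNumberingSystem-alt-"


def collect_numbering_systems(numbers: dict) -> list:
    """Return the unique numbering system names referenced by a locale."""
    systems = []

    def add(name: str) -> None:
        if name not in systems:
            systems.append(name)

    for key, value in numbers.items():
        # Accept the plain default as well as any "-alt-*" variant of it.
        if key == "defaultNumberingSystem" or key.startswith(ALT_PREFIX):
            add(value)

    for value in numbers.get("otherNumberingSystems", {}).values():
        add(value)

    return systems


excerpt = json.loads("""
{
  "defaultNumberingSystem": "arab",
  "defaultNumberingSystem-alt-latn": "latn",
  "otherNumberingSystems": { "native": "arab" }
}
""")
print(collect_numbering_systems(excerpt))  # ['arab', 'latn']
```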
Timothy Flynn ae66188d43 LibUnicode: Capitalize generated identifiers in lieu of full title case
This isn't particularly important because this generates code that is
quite hidden from outside callers. But when viewing the generated code,
it's a bit nicer to read e.g. enum identifiers such as "MinusSign"
rather than "Minussign".
2021-11-12 09:17:08 +00:00
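
The difference described in ae66188d43 is easy to see in Python, whose built-in title casing lowercases every letter after the first in a word. The sketch assumes the source key is the camelCase string "minusSign" (the commit message only shows the resulting identifiers), and the helper name is hypothetical.

```python
def capitalize_identifier(key: str) -> str:
    """Uppercase only the first character, leaving the rest unchanged."""
    return key[:1].upper() + key[1:]


print("minusSign".title())                 # Minussign
print(capitalize_identifier("minusSign"))  # MinusSign
```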
Ali Mohammad Pur 7d1142e2c8 LibWasm: Implement module validation 2021-11-11 09:20:04 +01:00
Ali Mohammad Pur d0ad7efd9b Meta: Update WebAssembly testsuite branch name
The 'master' branch is no longer updated, they've switched to 'main'.
2021-11-11 09:20:04 +01:00
Andreas Kling 8b1108e485 Everywhere: Pass AK::StringView by value 2021-11-11 01:27:46 +01:00
Sam Atkins e52f987020 LibWeb: Make property_initial_value() return a NonnullRefPtr
The finale! Users can now be sure that the value is valid, which makes
things simpler.
2021-11-10 21:58:14 +01:00
Sam Atkins 4d42915485 LibWeb: Ensure that CSS initial values are always valid :^)
First off, this verifies that an initial value is always provided in
Properties.json for each property.

Second, it verifies that parsing that initial value succeeds.

This means that a call to `property_initial_value()` will always return
a valid StyleValue. :^)
2021-11-10 21:58:14 +01:00
Tim Schumacher d1eb604896 CMake: Build serenity_lib libraries with a custom SONAME
This allows libraries and binaries to explicitly link against
`<library>.so.serenity`, which avoids some confusion if there are other
libraries with the same name, such as OpenSSL's `libcrypto`.
2021-11-10 14:42:49 +01:00
Tim Schumacher 46aa477b8f CMake: Remove unused serenity_shared_lib function 2021-11-10 14:42:49 +01:00
Sam Atkins 901a990b1b LibWeb: Remove concept of CSS pseudo-properties
We don't need them any more, so they're gone. :^)
2021-11-10 14:38:49 +01:00
Timothy Flynn 03c023d7e9 LibUnicode: Upgrade to CLDR version 40.0.0
Release notes:
https://github.com/unicode-org/cldr-json/releases/tag/40.0.0
2021-11-09 20:44:52 +01:00
Timothy Flynn 357c97dfa8 LibUnicode: Parse the CLDR's defaultContent.json locale list
This file contains the list of locales which default to their parent
locale's values. In the core CLDR dataset, these locales have their own
files, but they are empty (except for identity data). For example:

https://github.com/unicode-org/cldr/blob/main/common/main/en_US.xml

In the JSON export, these files are excluded, so we currently are not
recognizing these locales just by iterating the locale files.

This is a prerequisite for upgrading to CLDR version 40. One of these
default-content locales is the popular "en-US" locale, which defaults to
"en" values. We were previously inferring the existence of this locale
from the "en-US-POSIX" locale (many implementations, including ours,
strip variants such as POSIX). However, v40 removes the "en-US-POSIX"
locale entirely, meaning that without this change, we wouldn't know that
"en-US" exists (we would default to "en").

For more detail on this and other v40 changes, see:
https://cldr.unicode.org/index/downloads/cldr-40#h.nssoo2lq3cba
2021-11-09 20:44:52 +01:00
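
The lookup described in 357c97dfa8 can be sketched as below. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, and the sketch assumes a parent locale is found by dropping the last subtag, which the commit message does not state. `locales_with_data` stands for the locales that have their own files in the JSON export, and `default_content` for the list read from defaultContent.json.

```python
from typing import Optional


def resolve_locale(locale: str, locales_with_data: set,
                   default_content: set) -> Optional[str]:
    """Find the locale whose data should be used for `locale`, if any."""
    current = locale
    while current not in locales_with_data:
        # Only default-content locales may borrow their parent's data.
        if current not in default_content or "-" not in current:
            return None
        # Assumption: the parent is the locale minus its last subtag.
        current = current.rsplit("-", 1)[0]
    return current


print(resolve_locale("en-US", {"en"}, {"en-US"}))  # en
print(resolve_locale("xx-YY", {"en"}, {"en-US"}))  # None
```

With this approach "en-US" is recognized through the default-content list itself, without relying on a variant such as "en-US-POSIX" being present.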
Ben Wiederhake 3e420b7590 Meta: Remove useless lint-ipc-ids.sh script
This script was silently broken in commit
62af6cd4f9.
2021-11-05 00:17:01 +03:30
Ben Wiederhake 8f65153b03 Meta: Run IPC magic number linter during CI and pre-commit 2021-11-05 00:17:01 +03:30
Ben Wiederhake 585554a245 Meta: Implement checker for IPC magic number collisions 2021-11-05 00:17:01 +03:30
Ben Wiederhake 93356ee3df IPCCompiler: Remove now-unused ability to hardcode magic number 2021-11-05 00:17:01 +03:30
thislooksfun 03494ed6ba Meta: Add a check to ensure grep -P stays gone
grep -P does not work on macOS, but grep -E does.
2021-11-02 12:23:30 +01:00
thislooksfun a984545a94 Meta: Run find in the current dir
macOS's find requires a leading search scope. Without this change this
lint step fails.
2021-11-02 12:23:30 +01:00
thislooksfun 19bd302f6a Meta: Adhere to latest ShellCheck standards (SC2268) 2021-11-02 12:23:30 +01:00
thislooksfun 3e32acc3e4 Meta: Add special case for macOS
macOS's `find` does not support the '-executable' flag, nor does it
support the '-perm /' syntax, but we can make it work with a special
case.
2021-11-02 12:23:30 +01:00
thislooksfun 170e956c80 Meta: Remove unnecessary -i
Using `xargs -i <cmd> {}` is just doing the default behavior of xargs,
but with extra steps that also don't work on macOS.
2021-11-02 12:23:30 +01:00
thislooksfun c2d44209a8 Meta: Use grep -E/F, not grep -P
grep -E and -F are POSIX standard, and meet all our matching needs.
2021-11-02 12:23:30 +01:00
Ben Wiederhake 686efb6737 ConfigureComponents: Reduce duplicated code 2021-11-02 11:36:23 +01:00
Linus Groh 897471c852 Meta: Don't check for toolchain if serenity.sh target is lagom
This is just silly :^)

    $ serenity run lagom js
    WARNING: unknown toolchain 'js'. Defaulting to GNU.
             Valid values are 'Clang', 'GNU' (default)
2021-11-02 11:09:05 +01:00
Idan Horowitz 19e28d5798 LibWeb: Convert is_named_property_exposed_on_object to ThrowCompletions
This is the last usage of old-style exceptions in the WrapperGenerator.
2021-11-02 10:41:25 +02:00
Ben Wiederhake 55e1edd51b Meta: Check auto-generated manpages for completeness on CI 2021-11-01 21:12:58 +01:00
Ben Wiederhake a4e805756d Meta: Add script to generate and export manpages 2021-11-01 21:12:58 +01:00
Ben Wiederhake 2caad04d23 Base: Add new system-mode that just generates manpages 2021-11-01 21:12:58 +01:00
Timothy Flynn 95e492de59 LibWeb: Convert throw_dom_exception_if_needed() to ThrowCompletionOr
This changes Web::Bindings::throw_dom_exception_if_needed() to return a
JS::ThrowCompletionOr instead of an Optional. This allows callers to
wrap the invocation with a TRY() macro instead of making a follow-up
call to should_return_empty(). Further, this removes all invocations to
vm.exception() in the generated bindings.
2021-10-31 18:51:07 +01:00