Commit graph

11 commits

Author SHA1 Message Date
BenJilks 0ca5675d59 LibTextCodec: Implement iso-2022-jp encoder
Implements the `iso-2022-jp` encoder, as specified by
https://encoding.spec.whatwg.org/#iso-2022-jp-encoder
2024-08-08 17:49:58 +01:00
BenJilks 08a8d67a5b LibTextCodec: Implement shift_jis encoder
Implements the `shift_jis` encoder, as specified by
https://encoding.spec.whatwg.org/#shift_jis-encoder
2024-08-08 17:49:58 +01:00
BenJilks d80575a410 LibTextCodec: Implement gb18030 and gbk encoders
Implements the `gb18030` and `gbk` encoders, as specified by
https://encoding.spec.whatwg.org/#gb18030-encoder
https://encoding.spec.whatwg.org/#gbk-encoder
2024-08-08 17:49:58 +01:00
BenJilks 34c8c559c1 LibTextCodec: Implement big5 encoder
Implements the `big5` encoder, as specified by
https://encoding.spec.whatwg.org/#big5-encoder
2024-08-08 17:49:58 +01:00
BenJilks 826292536c LibTextCodec: Implement euc-kr encoder
Implements the `euc-kr` encoder, as specified by
https://encoding.spec.whatwg.org/#euc-kr-encoder
2024-08-08 17:49:58 +01:00
BenJilks 72d0e3284b LibTextCodec+LibURL: Implement utf-8 and euc-jp encoders
Implements the corresponding encoders, selects the appropriate one when
encoding URL search params. If an encoder for the given encoding could
not be found, fallback to utf-8.
2024-08-08 17:49:58 +01:00
Timothy Flynn 368dad54ef LibTextCodec: Use AK facilities to validate and convert UTF-16 to UTF-8
This allows LibTextCodec to make use of simdutf, and also reduces the
number of places with manual UTF-16 implementations.
2024-07-18 19:43:57 +02:00
Sam Atkins 2db168acc1 LibTextCodec+Everywhere: Port Decoders to new Strings 2023-02-19 17:15:47 +01:00
Nico Weber 3423b54eb9 LibTextCodec: Make utf-16be and utf-16le codecs actually work
There were two problems:

1. They didn't handle surrogates
2. They used signed chars, leading to eg 0x00e4 being treated as 0xffe4

Also add a basic test that catches both issues.
There's some code duplication with Utf16CodePointIterator::operator*(),
but let's get things working first.
2023-01-22 21:30:44 +00:00
sin-ack 3f3f45580a Everywhere: Add sv suffix to strings relying on StringView(char const*)
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).

No functional changes.
2022-07-12 23:11:35 +02:00
Karol Kosek dcb24e943d Tests: Add a basic UTF-8 to UTF-8 LibTextCodec test 2022-03-29 01:01:32 +02:00