Note that the default value of the attribute is true. We were previously
autoplaying videos as soon as they loaded - this will prevent that from
happening until the paused attribute is set to false.
See the lengthy comment added in this commit for details.
With this, the webp lossless decoder is feature complete :^)
(...except for bug fixes and performance improvements, as always.)
...in addition to modifying in-place. This is needed for bitpacking
support for the color indexing transform (and it could also be used
to make the color indexing transform return an indexed bitmap, which
is something we could do if that's the last transform that's applied).
No behavior change.
Reduces the time to run
Build/lagom/image ~/src/libwebp/webp_js/test_webp_wasm.webp -o tmp.png
from 0.5s to 0.25s.
Before, 60% of the time was spent decoding webp and 40% writing png.
Now, 16% of the time was spent decoding webp and 84% writing png.
That means png writing takes 0.2s, and webp decoding time went from
0.3s to 0.05s.
A template expression without explicit return type deduces its return
type as if for a function whose return type is declared auto. That
does deduce return-by-value, while `decltype(auto)` would deduce
return-by-reference. Explictly saying `decltype(auto)` would work
too, but writing out the type is maybe easier to understand.
No behavior change other than being much faster.
The previous attempt was in commit e5e9d3b877, where I thought
max_symbol describes how many code lengths should be read.
But it looks like it instead describes how many code length input
symbols should be read. (The two aren't the same since one code length
input symbol can produce several code lengths.)
I still agree with the commit description of e5e9d3b877 that the spec
isn't very clear on this :)
This time I've found a file that sets max_symbol and with this change
here, that file decodes correctly. (It's Qpalette.webp, which I'm about
to add as a test case.)
This ports MouseEvent, UIEvent, WheelEvent, and Event to new String.
They all had a dependency to T::create() in
WebDriverConnection::fire_an_event() and therefore had to be ported in
the same commit.
Webp lossless can have up to 2328 symbols. This code assumed the deflate
max of 288, leading to crashes for webp lossless files using more than
288 symbols (such as Tests/LibGfx/test-inputs/simple-vp8l.webp).
Nothing writes webp files at this point, so the m_bit_codes and
m_bit_code_lengths arrays aren't ever used in practice with more than
288 entries.
WebP lossless differs from deflate in how it handles 1-element codes.
Deflate consumes one bit from the bitstream to produce the element,
while webp lossless consumes 0 bits. Add a wrapper class to handle
this case.
It's not totally clear to me when all of these states are supposed to be
set. For example, nothing in the HTMLMediaElement spec says to "set the
readyState attribute to HAVE_ENOUGH_DATA". However, this will at least
advance the readyState to HAVE_METADATA, which is needed for other
useful attributes for debugging.
The spec for loading a media element is quite huge. This implements just
enough to parse the attribute, fetch the corresponding media object, and
decode the media object (if it is a video). While doing so, this also
implements most network state tracking and firing DOM events that may be
observed via JavaScript.
Some elements, like HTMLMediaElement, must have a unique task sources
for every instance of that element that is created. Support this with a
simple wrapper around IDAllocator.
This copies the video data from the Matroska document into the Track
structure that outside users have access to. Because Track can actually
represent other media types, this is set up such that the Track can hold
metadata for those other types when they are needed.
This is needed for LibWeb's HTMLMediaElement implementation.
Grid and flex containers have their own rules for abspos items, so we
shouldn't try to be clever and put them in the "current" anonymous
wrapper block. That behavior is primarily for the benefit of block &
inline layout.
The `static` here meant we always kept the alphabet sizes of the
first image we happened to load -- and a single webp lossless image
can store several helper images used during decoding.
Usually, the helper images wouldn't use a color cache but the main
image would, but the main image would then use the first entry from
the helper images due to the `static`, which led us to not decoding
the codes for the color cache symbols.
When calculating the intrinsic width of a block-level box, we now ignore
the preferred width entirely, and not just when the preferred width
should be treated as auto.
The condition for this was both confused and wrong, as it looked at the
available width around the box, but didn't check for a width constraint
on the box itself.
Just because the available width has an intrinsic sizing constraint
doesn't mean that the box is undergoing intrinsic sizing. It could also
be the box's containing block!
No need to store them in `i32`, the JPEG norm specifies that they are
not bigger than 16 bits for extended JPEGs. So, in this patch, we
replace `i32` with `i16`. It almost divides memory usage by two :^)
When creating a copy of the font containing only the glyphs that are in
use, we previously looped over all possible code points, instead of the
range of code points that are actually in use (and allocated) in the
font. This is a problem, since we index into the array of widths to find
out if a given glyph is used. This array is only as long as the number
of glyphs the font was created with, causing an out of bounds read when
that number is less than our maximum.
Loads of changes that are tightly connected... :/
* Change lambdas to static functions
* Add spec docs to those functions
* Keep the current scope around as a parameter
* Add wrapping classes for some Certificate members
* Parse ec and ecdsa data from certificates
The spec is at best misleading here, suggesting that max_symbol should
be set to "num_code_lengths" if it's not explicitly stored.
But num_code_lengths doesn't mean the num_code_lengths mentioned a few
lines further up in the spec, but alphabet_size!
(I had to cheat and look at libwebp instead of the spec for this: See
vp8l_dec.c, ReadHuffmanCode() which passes alphabet_size to
ReadHuffmanCodeLengths() as num_symbols, and ReadHuffmanCodeLengths()
then sets max_symbol to that.)
I haven't yet found a file that uses max_symbol, so this isn't actually
tested. But it's close to what's in libwebp, so maybe it works!
This improves the decompression time of `clang-15.0.7.src.tar.xz` from
41 seconds down to about 5 seconds.
The reason for this very significant improvement is that LZMA, the
underlying compression of XZ, fills its range decompressor one byte at a
time, causing a lot of overhead at the syscall barrier.
Missing:
* Transform support (used by virtually all lossless webp files)
* Meta prefix / entropy image support
Working:
* Decoding of regular image streams
* Color cache
This happens to be enough to be able to decode
Tests/LibGfx/test-inputs/extended-lossless.webp
The canonical prefix code is very similar to deflate's, enough so that
this can use Compress::CanonicalCode (and take advantage of all the
recent performance improvements there).
If we fail to set the response type to an error, calling code will think
the fetch was successful. We also should not default to an error code of
200, which would also indicate success.
deflate_special_code_length_copy has value 16, so it should be
before the two zero-filling branches for codes 17 and 18.
Also, the initial if also refers to deflate_special_code_length_copy
as well, so if it's repeated right in the next else, one has to keep
it on the mental stack for shorter when reading this code.
No behavior change.
Alternatively, we could remove the else after the continue, but
all branches here should be equally prominent, so this seems a bit
nicer.
No behavior change.
Minus a tasteful item height remainder. Ignoring Taskbar is okay now
that the window is a PopUp.
Also expands its width if intersection with the Desktop makes its
ListView scrollable. ComboBox windows no longer intersect horizontally,
remaining firmly "attached" to the editor, similar to other classic UIs.
Originally implemented to handle resizable ComboBox windows, this
"feature" no longer exists, so calculating min size is no longer
necessary. The calculation was also failing to account for dynamic
ListViews properly.
This patch simplifies things by setting ComboBox ListView's minimum size
explicitly and deferring to AbstractScrollableWidget's more flexible
calculated implementation otherwise.
Fixes FontPicker resizing incorrectly due to overly rigid ListViews.
Expand the following types from 32-bit to 64-bit:
- blkcnt_t
- blksize_t
- dev_t
- nlink_t
- suseconds_t
- clock_t
This matches their size on other 64-bit systems.
The default SortingProxyModel does not allow to react to reodering.
As we would like to keep the column width on sorting, we create a
subclass and inject our code into the sorting method.
This fixes multi-layer backgrounds with background positions. This
is a little awkard, so maybe it would be better to refactor the
parsing code to make these lists directly, but right now this is
the simplest fix.
This patch adds support for properly read images with four components,
basically CMYK or YCCK. However, we still lack color spaces
transformations for this type of image. So, it just postpones failure.
As mentioned in F.2.1.5 - Inverse DCT (IDCT), the decoder needs to
perform a level shift by adding 128. This used to be done in
`ycbcr_to_rgb` after the conversion. Now, we do it in `inverse_dct` in
order to ensure that the task is done unconditionally.
Consequences of this are that we are no longer required to explicitly
do it for RGB images and also, the `ycbcr_to_rgb` function is exactly
like the specification.
When reading the stream, interpreted as a normal value 0xF0 means skip
15 values and assign the 16th to 0. On the other hand, the marker ZRL
- which has the value 0xF0, means skip 16 values. For baseline JPEGs,
ZRL doesn't need to be interpreted differently as writing the 16th value
has no consequence. This is no longer the case with refining scans.
That's why this patch implement correctly ZRL.
We used to skip over zero coefficient by modifying the loop counter. It
is unfortunately impossible to perform this with SOF2 images as only
coefficients with a zero-history should be skipped.
This induces no behavior change for the user of the function.
This commit is nonsense for anything else than SOF2 images with spectral
approximation. For this particular case, skips like EOB or ZRL only
apply to coefficients with a zero-history. This commit prepares the code
to handle this behavior.
Rather than the very C-like API we currently have, accepting a void* and
a length, let's take a Bytes object instead. In almost all existing
cases, the compiler figures out the length.
This function has probably been added when we weren't as good with error
propagations as we are now. We can safely remove it and let future
calls to `read` fail if the file is corrupted.
This can be tested with the following bytes (already used in 9191829a):
ffd8ffc000000800080ef701101200ffda00030100
When trying to set the wallpaper from the menu, ImageViewer would
crash because setting the wallpaper requires the program to pledge
to the WindowManager domain. This patch adds that pledge.
This parses the new background-position-x/y longhands and properly
hooks up them up. This requires converting PositionStyleValue to
just contain two EdgeStyleValues so that it can be easily expanded
into the longhands.
This is very similar to the LittleEndianInputBitStream bit buffer change
from 8e834d4bb2.
We currently buffer one byte of data for the underlying stream. And when
we put bits onto that buffer, we do so 1 bit at a time.
This replaces the u8 buffer with a u64. And instead of looping at all,
we perform bitwise operations to write the desired number of bits.
Using the "enwik8" file as a test (100MB uncompressed, commonly used in
benchmarks: https://www.mattmahoney.net/dc/enwik8.zip), compression time
decreases from:
13.62s to 10.9s on Serenity (cold)
13.62s to 9.22s on Serenity (warm)
2.93s to 2.32s on Linux
One caveat is that this requires explicitly flushing any leftover bits
when the caller is done with the stream. The byte buffer implementation
implicitly flushed its data every time the buffer was byte-aligned, as
doing so would always fill the byte. This is no longer the case. But for
now, this should be fine as the one user of this class, DEFLATE, already
has a "flush everything now that we're done" finalizer.
This is something we're supposed to do according to the HTML spec.
Note that image loading is currently completely ad-hoc, and this just
adds a simple DocumentLoadEventDelayer to the existing implementation.
This will allow us to use images in layout tests, which rely on the
document load event firing at a predictable time.
Instead of passing the continuously merged initial forbidden token set
(with the new additional forbidden tokens from each parsed secondary
expression) to the next call of parse_secondary_expression(), keep a
copy of the original set and use it as the base for parsing the next
secondary expression.
This bug prevented us from properly parsing the following expression:
```js
0 ?? 0 ? 0 : 0 || 0
```
...due to LogicalExpression with LogicalOp::NullishCoalescing returning
both DoubleAmpersand and DoublePipe in its forbidden token set.
The following correct AST is now generated:
Program
(Children)
ExpressionStatement
ConditionalExpression
(Test)
LogicalExpression
NumericLiteral 0
??
NumericLiteral 0
(Consequent)
NumericLiteral 0
(Alternate)
LogicalExpression
NumericLiteral 0
||
NumericLiteral 0
An alternate solution I explored was only merging the original forbidden
token set with the one of the last parsed secondary expression which is
then passed to match_secondary_expression(); however that led to an
incorrect AST (note the alternate expression):
Program
(Children)
ExpressionStatement
LogicalExpression
ConditionalExpression
(Test)
LogicalExpression
NumericLiteral 0
??
NumericLiteral 0
(Consequent)
NumericLiteral 0
(Alternate)
NumericLiteral 0
||
NumericLiteral 0
Truth be told, I don't know enough about the inner workings of the
parser to fully explain the difference. AFAICT this patch has no
unintended side effects in its current form though.
Fixes#18087.
This now uses the current font (rather than the painter's default)
and scales it correctly. This is not perfect though as just naviely
doing .draw_text() here does not follow the proper text layout logic
so this is misaligned (by a pixel or two) with the text in the <li>.
At the end of HTML::EventLoop::process(), the loop reschedules itself if
there are more runnable tasks available.
However, the condition was flawed: we would reschedule if there were any
microtasks queued, but those tasks will not be processed if we're
currently within the scope of a microtask checkpoint.
To fix this, we now only reschedule the HTML event loop for microtask
processing *if* we're not already in a microtask checkpoint.
This fixes the 100% CPU churn seen when looking at PRs on GitHub. :^)
This is copy-pasted from the gzip utility, along with its existing TODO.
This is currently only needed by that utility, but this gives us API
symmetry with GzipDecompressor, and helps ensure we won't end up in a
situation where only one utility receives optimizations that should be
received by all interested parties.
This patch adds an additional control to KeyboardSettings allowing
the user to map Caps Lock to Ctrl. Previously, this was only possible
by writing to /sys/kernel/variables/caps_lock_to_ctrl.
Writing to /sys/kernel/variables/caps_lock_to_ctrl requires root
privileges, but KeyboardSettings will not attempt to elevate
the privilege of the user if they are not root. Instead, the
checkbox is rendered as un-editable.
As part of this, the CodeDocument now keeps track of the kind of
difference for each line. Previously, we iterated every hunk every time
the editor was painted, but now we do that once whenever the diff
changes, and then save the type of difference for each line.
HackStudio's Editor has displayed indicators in its gutter for a long
time, but each required manual code to paint them in the right place
and respond to click events. All indicators on a line would be painted
in the same location. If any other applications wanted to have gutter
indicators, they would also need to manually implement the same code.
This commit adds an API to GUI::TextEditor so it deals with these
indicators. It makes sure that multiple indicators on the same line
each have their own area to paint in, and provides a callback for when
one is clicked.
- `register_gutter_indicator()` should be called early on. It returns a
`GutterIndicatorID` that is then used by the other methods.
Indicators on a line are painted from right to left, in the order
they were registered.
- `add_gutter_indicator()` and `remove_gutter_indicator()` add the
indicator to the given line.
- `clear_gutter_indicators()` removes a given indicator from every line.
- The `on_gutter_click` callback is called whenever the user clicks on
the gutter, but *not* on an indicator.
Warn the user about seemingly known malloc() and free() patterns in the
fault address.
This brings back the functionality that was removed recently in the
5416a37fde commit, but this time we detect
these patterns in userspace code and not in kernel code.
Calling umask and open with same permission value will result in a file
with no permissions bits if the program is stopped while it is doing an
IO. This will result in an error with EACCES when we try to run the
benchmark with the same file name. The file needs to be manually
removed before continuing the benchmark.
There is no use in calling umask here, so just remove it so that
interrupting the program while it is doing an IO will not result int the
file with no permissions the next time we run the program.
Instead of reading bytes from the output stream into a buffer, just to
immediately write them back out, we can skip the middle-man and copy the
bytes directly into the output buffer.
This implements Intel's slicing-by-8 algorithm for CRC checksums (only
little endian CPUs for now, as I don't have a way to test big endian).
The original paper for this algorithm seems to have disappeared, but
Intel's source code is still available as a reference:
https://sourceforge.net/projects/slicing-by-8/
As well as other implementations for reference:
https://docs.rs/slice-by-8/latest/src/slice_by_8/algorithm.rs.html
Using the "enwik8" file as a test (100MB uncompressed, commonly used in
benchmarks: https://www.mattmahoney.net/dc/enwik8.zip), decompression
time decreases from:
4.89s to 3.52s on Serenity (cold)
1.72s to 1.32s on Serenity (warm)
1.06s to 0.92s on Linux
This was fine before as the last entry was a null string (which could be
printed), but we no longer use C-style sentinel-terminated arrays for
arguments.
This solves an awkward dependency cycle, where CalculatedStyleValue
needs the definition of Percentage, but including that would also pull
in PercentageOr, which in turn needs CalculatedStyleValue.
Many places that previously included StyleValue.h no longer need to. :^)
There were a mix of users between those who want to know if the Length
changed, and those that just want an absolute Length. So, we now have
two methods: Length::absolutize() returns an empty Optional if nothing
changed, and Length::absolutized() always returns a value.
...and replace it with AngleOrCalculated.
This has the nice bonus effect of actually handling `calc()` for angles
in a transform function. :^) (Previously we just would have asserted.)
This is intended as a replacement for Length and friends each holding a
`RefPtr<CalculatedStyleValue>`. Instead, let's make the types explicit
about whether they are calculated or not. This then means a Length is
always a Length, and won't require including `StyleValue.h`.
As noted, it's probably nicer for LengthOrCalculated to live in
`Length.h`, but we can't do that until Length stops including
`StyleValue.h`.
Due to CSSImportRule::has_import_result() being backwards, we never
actually entered imported style sheets when traversing style rules or
media queries.
With this fixed, we no longer need the "collect style sheets" step in
StyleComputer, as normal for_each_effective_style_rule() will now
actually find all the rules. :^)
The constructor is now only concerned with creating the required
streams, which means that it no longer fails for XZ streams with
invalid headers. Instead, everything is parsed and validated during the
first read, preparing us for files with multiple streams.
This was causing a huge slowdown when loading some pages with weirdly
huge number of style sheets. For example, amazon.com has over 200 style
elements, which meant we had to resort the StyleSheetList 200 times.
(And sorting itself was slow because it has to compare DOM positions.)
Instead of sorting, we now look for the correct insertion point when
adding new style sheets, and we start the search from the end, which is
where style sheets are typically added in the vast majority of cases.
This removes a 600ms time sink when loading Amazon on my machine! :^)
Setting the `data` of a text node already triggers `children changed`
per spec, so there's no need for an explicit call.
This avoids parsing every HTMLStyleElement sheet twice. :^)
We now select between nearest neighbor and bilinear filtering when
scaling images in CRC2D.drawImage().
This patch also adds CRC2D.imageSmoothingQuality but it's ignored for
now as we don't have a bunch of different quality levels to map it to.
Work towards #17993 (Ruffle Flash Player)
We currently decode back-references one byte at a time, while writing
that byte back out to the output buffer. This is only necessary when the
back-reference refers to itself, i.e. when the back-reference distance
is less than its length. In other cases, we can read the entire back-
reference block in one shot.
Using the "enwik8" file as a test (100MB uncompressed, commonly used in
benchmarks: https://www.mattmahoney.net/dc/enwik8.zip), decompression
time decreases from:
5.8s to 4.89s on Serenity (cold)
2.3s to 1.72s on Serenity (warm)
1.6s to 1.06s on Linux
Huffman codes have a useful property in that they are prefix codes. That
is, a set of bits representing a Huffman-coded symbol is never a prefix
of another symbol. This allows us to create a table, where each index in
the table are integers whose prefix is the entry's corresponding Huffman
code.
With Deflate, we can have codes up to 16 bits in length, thus creating a
prefix table with 2^16 entries. So instead of creating a table fit all
possible codes, we use a cutoff of 8-bit codes. Codes larger than 8 bits
fall back to the binary search method.
Using the "enwik8" file as a test (100MB uncompressed, commonly used in
benchmarks: https://www.mattmahoney.net/dc/enwik8.zip), decompression
time decreases from 3.527s to 2.585s on Linux.
We currently mix normal and bit streams during GZIP decompression, where
the latter is a wrapper around the former. This isn't causing issues now
as the underlying bit stream buffer is a byte, so the normal stream can
pick up where the bit stream left off.
In order to increase the size of that buffer though, the normal stream
will not be able to assume it can resume reading after the bit stream.
The buffer can easily contain more bits than it was meant to read, so
when the normal stream resumes, there may be N bits leftover in the bit
stream that the normal stream was meant to read.
To avoid weird behavior when mixing streams, this changes the GZIP
decompressor to always read from a bit stream.
Previously we used a parsing context with no access to the document, so
any URLs in url() functions would become invalid.
Fixes the images on Steam's store carousel, which sets
Element.style.backgroundImage to url() functions.
Some fonts (like the Bootstrap Icons webfont) have bogus glyph offsets
in the `loca` table that point past the end of the `glyf` table.
AFAICT other rasterizers simply ignore these glyphs and treat them as if
they were missing. So let's do the same.
This makes https://changelog.serenityos.org/ actually work! :^)
In situations where we need a width to calculate the intrinsic height of
a flex item, we use the fit-content width as a stand-in. However, we
also need to clamp it to any min-width and max-width properties present.
The spec tells us that for glyph offsets in the "loca" table, if an
offset appears twice in a row, the index of the first one refers to a
glyph without an outline (such as the space character). We didn't check
for this, which would cause us to render a glyph outline where there
should have been nothing.
Some versions of clang, such as Apple clang-1400.0.29.202 error out on
the previous out of line operators. Explicitly defaulting comparison
operators out of line is allowed per P2085R0, but was checked in clang
before version 15 in C++20 mode.
When debugging in Xcode, the waitpid() for the initial forked process
would always return EINTR or ECHILD. Work around this by blocking all
signals until we're ready to wait for the initial child.
In `flex-direction: column` layouts, a flex item's intrinsic height may
depend on its width, but the width is calculated *after* the intrinsic
height is required.
Unfortunately, the specification doesn't tell us exactly what to do here
(missing inputs to intrinsic sizing is a common problem) so we take the
solution that flexbox applies in 9.2.3.C and apply it to all intrinsic
height calculations within FlexFormattingContext: if the used width of
an item is not yet known when its intrinsic height is requested, we
substitute the fit-content width instead.
Note that while this is technically ad-hoc, it's basically extrapolating
the spec's suggestion in one specific case and using it in all cases.
When calculating the intrinsic width of a box, we now make its content
width & height indefinite before entering the intrinsic sizing layout.
This ensures that any geometry assigned to the box by its parent
formatting context is ignored.
For intrinsic heights, we only make the content height indefinite.
This is because used content width is a valid (but optional) input
to intrinsic height calculation.
This fixes an issue found on Linus's hosted WebServer. Now, if WebServer
is hosted at a non-root URL. (eg, `example.com/webserver` instead of
`example.com`) the links will correctly go to
`example.com/webserver/foo` instead of `example.com/foo`.