Using JS's `getComputedStyle()` for every character is too CPU
intensive, so instead I'm experimenting with using a custom font
to take the canvas snapshot. The font is made up of only the unicode
block character, which basically fills the entire space given to a
monospace glyph. This also means that we can fairly reliably work out
the visibility (whether it's obscured or hidden with CSS) of text.
This proves that frames can be generated on Firefox using the canvas and
a Tree Walker to examine text nodes. Already with little optimisation
frames don't ever take longer than 200ms to render.
Chrome has a MediaStream of the viewport, hopefully that will prove
performant as well.
This doesn't have functioning text colour detection or text occlusion
support. But early research suggests this will possible by comparing 2
screenshots: one with and the other without rendered text.