tantek.com

New issue on GitHub project “fragmention”


Fragmention must define what “plain text content” means

on (ttk.me b/54i2) using BBEdit

The Fragmention processing model says to “Search through the plain text content” and hyperlinks the phrase “plain text content” to the HTML definition of a text node. That link is insufficient for precise (interoperable) implementation. Examples of questions that need to be answered explicitly by normative text in the spec:

Does it just mean concatenating all the text nodes, including CDATA sections?

Does that include the text of inline script elements?

Does it instead include the text of noscript elements?

Does that include the text of inline style elements?

As already proposed in issue #4, does that include alt text from images?

The specification should at least answer these questions, preferably with examples demonstrating why the specific answers were chosen, and add tests for them accordingly.
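
To illustrate how much the answers matter, here is a rough sketch (not proposed spec text) of a text extractor in which each open question is a switch. It uses Python's standard html.parser module. The class name PlainTextExtractor, the function plain_text, the include_* options, and the SAMPLE markup are all hypothetical names made up for this example. CDATA sections are left out because the parser's handling of them varies between Python versions.

```python
from html.parser import HTMLParser


class PlainTextExtractor(HTMLParser):
    """Concatenate text nodes; each open question is an explicit option.

    Hypothetical helper for illustration only, not part of any spec.
    """

    def __init__(self, include_script=False, include_style=False,
                 include_noscript=False, include_alt=False):
        super().__init__(convert_charrefs=True)
        # Per-element switches: False means the element's text is skipped.
        self.include = {
            "script": include_script,
            "style": include_style,
            "noscript": include_noscript,
        }
        self.include_alt = include_alt
        self.skip_depth = 0   # > 0 while inside an excluded element
        self.parts = []       # collected text, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.include and not self.include[tag]:
            self.skip_depth += 1
        elif tag == "img" and self.include_alt and self.skip_depth == 0:
            alt = dict(attrs).get("alt")
            if alt:
                self.parts.append(alt)

    def handle_endtag(self, tag):
        if tag in self.include and not self.include[tag] and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def plain_text(html, **options):
    """Return the concatenated text of html under the given options."""
    parser = PlainTextExtractor(**options)
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


# Made-up markup that exercises each question.
SAMPLE = (
    '<p>Hello <img alt="big" src="w.png"> world</p>'
    '<script>var x = 1;</script>'
    '<style>p { color: red }</style>'
    '<noscript>Enable JS</noscript>'
)

print(repr(plain_text(SAMPLE)))
print(repr(plain_text(SAMPLE, include_alt=True)))
```

With every option off, the phrase “big world” does not occur in the extracted text. With include_alt=True it does. Two implementations that answer the alt text question differently would therefore disagree on whether a fragmention for “big world” matches this document at all. The same holds for script, style, and noscript text.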