# Server Rendering
<app-ctc-block variant="snippet" heading="src/pages/users.js">
```js
export async function getBody(compilation, page, request) {
  const timestamp = new Date().getTime();

  return `
    <h1>Hello from the server rendered users page! 👋</h1>
    <table>
      <tr>
        <th>Name</th>
        <th>Image</th>
      </tr>
    </table>
    <h6>Last Updated: ${timestamp}</h6>
  `;
}
```
</app-ctc-block>
While the output from unified is correct and properly escaped:
<h1>Server Rendering</h1><app-ctc-block variant="snippet" heading="src/pages/users.js"><pre><code class="language-js">export async function getBody(compilation, page, request) {
  const timestamp = new Date().getTime();
  return `
    &lt;h1>Hello from the server rendered users page! 👋&lt;/h1>
    &lt;table>
      &lt;tr>
        &lt;th>Name&lt;/th>
        &lt;th>Image&lt;/th>
      &lt;/tr>
    &lt;/table>
    &lt;h6>Last Updated: ${timestamp}&lt;/h6>
  `;
}
</code></pre></app-ctc-block>
The output from WCC / parse5 has all the HTML entities converted back to HTML:
<h1>Server Rendering</h1><app-ctc-block variant="snippet" heading="src/pages/users.js"><pre><code class="language-js">
export async function getBody(compilation, page, request) {
const timestamp = new Date().getTime();
return `
<h1>Hello from the server rendered users page! 👋</h1><table><tbody><tr><th>Name</th><th>Image</th></tr></tbody></table><h6>Last Updated: ${timestamp}</h6>
`;
}
</code></pre></app-ctc-block>
This means that instead of rendering as text, the HTML is rendered literally, breaking the output.
Details
The main issue seems to be in WCC (parse5 specifically): parse5 converts HTML entities automatically when accessing the "raw" value of a node.
This means that when building the HTML back out to do `innerHTML` work in WCC, instead of getting something like `&lt;h1>Some text&lt;/h1>`, we end up with the literal HTML `<h1>Some text</h1>`. This does seem to be the expected behavior as they state, meaning it will be up to application consumers to manage this preservation, unfortunately.
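To make the information loss concrete, here is a minimal sketch using a simplified, hypothetical `decodeBasicEntities` helper as a stand-in for the parser's entity handling (this is not parse5's actual code or API). It only shows that once entities are decoded, escaped text and literal markup become indistinguishable.

```javascript
// Simplified stand-in for a parser's entity decoding (NOT parse5 itself).
// "&amp;" is decoded last so "&amp;lt;" does not get decoded twice.
function decodeBasicEntities(text) {
  return text
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&amp;/g, "&");
}

const escaped = "&lt;h1>Some text&lt;/h1>"; // author wanted this shown as text
const literal = "<h1>Some text</h1>"; // author wanted this rendered as markup

// Both inputs collapse to the same string, so the original intent is lost.
console.log(decodeBasicEntities(escaped) === decodeBasicEntities(literal)); // true
```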
A seemingly simple solution would be to manually escape `<` when parsing `innerHTML` in WCC, though I'm not sure this is the best solution or, more likely, the best place for it.
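A rough sketch of what that manual escaping could look like is below. The `escapeTextForHtml` name is hypothetical and this is not WCC's implementation; it also escapes `&` and `>` as an assumption, so that already-decoded text survives another parse.

```javascript
// Hypothetical helper: re-escape decoded text before writing it back into HTML.
// "&" is escaped first so the entities produced below are not escaped again.
function escapeTextForHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

const decoded = "<h1>Hello from the server rendered users page! 👋</h1>";

console.log(escapeTextForHtml(decoded));
// &lt;h1&gt;Hello from the server rendered users page! 👋&lt;/h1&gt;
```

The catch is that a helper like this has to know which text was meant to be escaped; applied blindly, it would also escape markup that should render.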
The challenge is that, as far as Greenwood is concerned, the input to WCC is correct, so how would we know where to do the substitution on the way out? From reading through similar issues in the parse5 repo, we would have to double parse and convert based on locations, and from what I understand, adding location markers is a pretty significant performance overhead.
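The location-based idea can be sketched as follows. The location object here is written by hand to mimic the `startOffset` / `endOffset` shape that parse5 attaches as `sourceCodeLocation` when `sourceCodeLocationInfo` is enabled. The `sliceRawSource` helper is hypothetical, and nothing in this sketch calls parse5.

```javascript
// Hypothetical helper: recover the original, still-escaped text by slicing the
// source string with offsets, rather than reading the node's decoded value.
function sliceRawSource(source, location) {
  return source.slice(location.startOffset, location.endOffset);
}

const source = "<code>&lt;h1>Some text&lt;/h1></code>";

// Hand-written offsets standing in for a text node's location info.
const textLocation = {
  startOffset: source.indexOf("&lt;"),
  endOffset: source.lastIndexOf("</code>"),
};

console.log(sliceRawSource(source, textLocation));
// &lt;h1>Some text&lt;/h1>
```

Slicing the source keeps the entities exactly as authored, at the cost of tracking offsets for every node, which is the overhead mentioned above.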
thescientist13 changed the title from "escaped HTML entities from markdown content are not being honored when prerendering" to "escaped HTML entities in markdown content are not being honored when prerendering Light DOM HTML" on Jan 7, 2025
Another complication is that parse5 also seems to encode entities even if they aren't part of HTML, which makes this workaround in WCC even more unpredictable :/ (see ProjectEvergreen/wcc#182)
{
value: '\n' +
' <h1>Hello from the server rendered users < page! 👋</h1>\n' +
''
}
{
html: '\n' +
' <x-ctc>\n' +
' <h1>Hello from the server rendered users < page! 👋</h1>\n' +
' </x-ctc>\n' +
''
}
Type of Change
Bug
Summary
It was observed in ProjectEvergreen/www.greenwoodjs.dev#120 (comment) that markdown content like the Server Rendering example at the top of this issue loses its HTML entity escaping when prerendered.