Modifications for Rocket integration #20

Open
1 of 5 tasks
mettke opened this issue Jul 10, 2018 · 10 comments

@mettke
Contributor

mettke commented Jul 10, 2018

I'm adding this issue to track progress on the integration into #684.

My suggestion is:

  • high-performance minifier for HTML for on-the-fly minification of dynamic content
  • high-performance minifier for JSON for on-the-fly minification of dynamic content
  • high-performance minifier for XML for on-the-fly minification of dynamic content
  • implementation of READ
  • minification of files

What do you think? I'm of course happy to help if you point me in the right direction. Maybe some of the stuff from my crate might be helpful as well.

@GuillaumeGomez
Owner

For HTML/XML, I was thinking about using html5ever, but unfortunately not many optimisations can be performed on those formats. For JSON, it'd require a whole new parser (nothing impossible, but it still needs to be done).

@mettke
Contributor Author

mettke commented Jul 10, 2018

I'm currently writing the JSON minifier with READ implementation. Should be done tomorrow.

I'm not sure how html5ever works, but I know how my minifier works. It is possible to create an iterator-like structure which yields 5 or more bytes at a time, which makes it possible to check whether the current byte should be filtered out or not. That way you can iterate over the whole string/read, handling each byte only once and removing those that are not supposed to be there.

I will give you an example tomorrow with the JSON parser. It is a bit more difficult but offers a huge performance bonus, as it does not iterate over the whole data multiple times.
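
As a rough illustration of the single-pass idea described above, here is a minimal Rust sketch. It is not code from either crate: `JsonMinify` and `minify_json` are hypothetical names, and it keeps a small per-byte state machine instead of the multi-byte window mentioned above. It assumes the input is valid JSON and only strips whitespace that sits outside string literals.

```rust
/// Hypothetical iterator adapter: yields the bytes of a JSON document
/// with insignificant whitespace removed, visiting each byte once.
struct JsonMinify<I: Iterator<Item = u8>> {
    inner: I,
    in_string: bool, // currently inside a "..." literal
    escaped: bool,   // previous byte inside the literal was a backslash
}

impl<I: Iterator<Item = u8>> JsonMinify<I> {
    fn new(inner: I) -> Self {
        JsonMinify { inner, in_string: false, escaped: false }
    }
}

impl<I: Iterator<Item = u8>> Iterator for JsonMinify<I> {
    type Item = u8;

    fn next(&mut self) -> Option<u8> {
        for b in self.inner.by_ref() {
            if self.in_string {
                // Inside a string every byte is kept; only track where it ends.
                if self.escaped {
                    self.escaped = false;
                } else if b == b'\\' {
                    self.escaped = true;
                } else if b == b'"' {
                    self.in_string = false;
                }
                return Some(b);
            }
            match b {
                // JSON whitespace outside of strings is dropped.
                b' ' | b'\t' | b'\n' | b'\r' => continue,
                b'"' => {
                    self.in_string = true;
                    return Some(b);
                }
                _ => return Some(b),
            }
        }
        None
    }
}

/// Convenience wrapper for in-memory input (hypothetical helper).
fn minify_json(input: &str) -> String {
    let bytes: Vec<u8> = JsonMinify::new(input.bytes()).collect();
    // Only ASCII whitespace bytes were removed, so the result is still valid UTF-8.
    String::from_utf8(bytes).expect("removing ASCII bytes keeps UTF-8 valid")
}
```

Because the adapter is itself an iterator, it can sit on top of any byte source without buffering the whole document first, which is where the single-pass benefit would come from.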

@GuillaumeGomez
Owner

> I'm currently writing the JSON minifier with READ implementation. Should be done tomorrow.

Oh cool!

> I'm not sure how html5ever works, but I know how my minifier works. It is possible to create an iterator-like structure which yields 5 or more bytes at a time, which makes it possible to check whether the current byte should be filtered out or not. That way you can iterate over the whole string/read, handling each byte only once and removing those that are not supposed to be there.

Not sure yet but since it's used by servo, I'd assume it's quite efficient.

> I will give you an example tomorrow with the JSON parser. It is a bit more difficult but offers a huge performance bonus, as it does not iterate over the whole data multiple times.

In my current implementation, I'm not iterating multiple times over whole data. Or did I miss your point? :)

@mettke
Contributor Author

mettke commented Jul 10, 2018

> In my current implementation, I'm not iterating multiple times over whole data. Or did I miss your point? :)

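// A full regex pass over the input:
re.replace_all(&source, " ").into_owned()
// Another full pass, with a nested replace_all for every match:
re.replace_all(source, |caps: &Captures| {
    type_reg.replace_all(&caps[0], "").into_owned()
}).into_owned()
// And one more full pass per entry in useless_tags:
for useless_tag in &useless_tags {
    res = res.replace(useless_tag, "");
}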

What would you call this, then? :D (no offence)

@GuillaumeGomez
Owner

That's the HTML part. This part is clearly not ready and shouldn't be used. That's why I want to rewrite it using html5ever. I should have been more clear on that point, my bad. :)

@mettke
Contributor Author

mettke commented Jul 10, 2018

Ah, I see. Well, in that case it might make sense. Remember, however, that HTML and XML are not the same. I guess that XML cannot be minified by html5ever. There are, for example, data entries (or whatever those are called) which don't exist in HTML.

@GuillaumeGomez
Owner

XML is simpler, indeed. Don't know if it's really worth it to write an XML minifier though.

This was referenced Jul 11, 2018
@mettke
Contributor Author

mettke commented Aug 5, 2018

@GuillaumeGomez Any updates on this? Would be nice to know whether html5ever turned out to be handy for XML and HTML minification

@GuillaumeGomez
Owner

HTML minification is tricky, because you never really know whether spaces are significant or not. For XML I haven't looked at all. Also, I still need to add the from_read for CSS and JS...
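
For the reader-based side, the plumbing could look roughly like the sketch below. This is only an assumption about the shape of such an API, not the crate's actual `from_read`: `FilterReader` and its `keep` predicate are hypothetical, and a real CSS or JS minifier would need a proper tokenizer rather than a per-byte predicate. It only shows how a filter can wrap any `std::io::Read` and stream the data through once.

```rust
use std::io::{self, Read};

/// Hypothetical adapter: wraps any reader and drops every byte for which
/// `keep` returns false, without loading the whole input into memory.
struct FilterReader<R, F> {
    inner: R,
    keep: F,
}

impl<R: Read, F: FnMut(u8) -> bool> Read for FilterReader<R, F> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if buf.is_empty() {
            return Ok(0);
        }
        loop {
            let n = self.inner.read(buf)?;
            if n == 0 {
                // The underlying reader is exhausted.
                return Ok(0);
            }
            // Compact the bytes we keep to the front of the buffer.
            let mut kept = 0;
            for i in 0..n {
                let b = buf[i];
                if (self.keep)(b) {
                    buf[kept] = b;
                    kept += 1;
                }
            }
            if kept > 0 {
                return Ok(kept);
            }
            // Everything in this chunk was filtered out. Read again, since
            // returning 0 here would wrongly signal end of input.
        }
    }
}
```

Since the wrapper implements `Read` itself, it could be handed to anything that accepts a reader, for example `std::io::copy` into a response body.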

@SuperCuber

Does your "html" checkpoint include javascript (in a script tag or in its own file)? I think that's the most minifiable of them all, and probably with the most existing libs to do it.
