-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve serialization memory footprint #176
Comments
I can't picture how this looks, but this sounds like a solid approach and I'm happy to help once I know what to do (and pending the resolution of the serde question, also mentioned in #172 |
@urschrei The idea is to use the So, for instance, the main #[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]
#[serde(tag = "type")]
pub enum GeoJson {
Geometry(Geometry),
Feature(Feature),
FeatureCollection(FeatureCollection),
} The #[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]
pub struct FeatureCollection {
/// Bounding Box
///
/// [GeoJSON Format Specification § 5](https://tools.ietf.org/html/rfc7946#section-5)
#[serde(skip_serializing_if = "Option::is_none")]
pub bbox: Option<Bbox>,
pub features: Vec<Feature>,
/// Foreign Members
///
/// [GeoJSON Format Specification § 6](https://tools.ietf.org/html/rfc7946#section-6)
#[serde(skip_serializing_if = "Option::is_none", flatten)]
pub foreign_members: Option<JsonObject>,
} These two already fix the memory issues (which is mainly caused by large |
So, error handling of I had hoped it would allow The |
@rmanoka are you sure about that? I interpret https://serde.rs/error-handling.html as a "data format" can define it's own Error. |
The data format can indeed define their own error type; here data format we intend to use is |
@rmanoka I think I understand, so I guess the more "optimal" way would be to implement a more serde native |
I'm inclined to agree – in the case of GeoJSON handling, an order-of-magnitude increase is worth the cost of less informative error messages IMO. |
Currently, we convert
Geojson
to aMap<String, serde_json::Value>
to serialize. This seems to be quite wasteful with memory (and probably also compute-time). For instance, when serializing a large set of geoms, the serialization alone takes up more than 10x the memory required to store the final representation.I suggest we move out of this approach, in favour of a direct serialization via derive macros / custom impl.. From a quick skim, it looks like only
Value
requires a custom impl, and the rest can be done via serde_derive (thanks to the flexibility available therein).Once we move to this, we could also get rid of the
JsonObject
and all relatedTryFrom
/From
impls. Note thatserde_json
already provides{to,from}_value
that can convert any serializable type to/fromJsonValue
. Is there any additional value (no pun intended) in keeping the impls.?One minor concern: the docs suggest that serde is optional, but to the best of my understanding, it is a very crucial part of the crate. Am I missing something?
(Issue summarized from discord discussion.)
The text was updated successfully, but these errors were encountered: