-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Definition::Enum::tag_width field #215
Conversation
Another idea is to introduce a Width type: pub enum Width {
NoWidth,
LE8 = 1,
LE16 = 2,
LE24 = 3,
LE32 = 4,
LE40 = 5,
LE48 = 6,
LE56 = 7,
LE64 = 8,
} which would specify how the tag is encoded. That would leave possibility for extensions in the future. |
Offtopic: /// A fixed-size tuple with the length known at the compile time and the
/// elements of different types.
Tuple {
/// Types of elements of the tuple.
elements: Vec<Declaration>,
},
/// A structure, structurally similar to a tuple.
Struct {
/// Named fields of the structure with their types.
///
/// Tuple-like structures are modelled as a `Tuple`.
fields: Vec<(FieldName, Declaration)>,
}, will be better than Tuple {
/// Types of elements of the tuple.
elements: Vec<(Option<FieldName>, Declaration)>,
}, as the latter allows a mix of named and unnamed fields. |
Please update branch with new changes in trunk as well as (periodically) other prs. |
9d72f84
to
f7d9c47
Compare
The field allows specifying how many bytes the discriminant byte takes. This in turn allows for better support of custom encoding formats with more than 256 variants and untagged unions. Issue: near#181
@mina86 the change to Let's take Enum {
variants: Vec<(VariantName, Declaration)>,
}, This pr adds some info about serialized form: Enum {
tag_width: u8,
variants: Vec<(VariantName, Declaration)>,
}, But, some info was skipped. With the addition of enum discriminants the enum tag in serialized form for Enum {
tag_width: u8,
variants: Vec<(u64, VariantName, Declaration)>,
}, If it were supposed to only convey information about serialized form, than it would look like: Enum {
tag_width: u8,
variants: BTreeMap<u64, Declaration>, // the order of variants, as they were declared in rust enum,
// is irrelevant to serialization
}, Similarly, this change for /// A fixed-size tuple with the length known at the compile time and the
/// elements of different types.
Tuple {
/// Types of elements of the tuple.
elements: Vec<Declaration>,
},
/// A structure, structurally similar to a tuple.
Struct {
/// Named fields of the structure with their types.
///
/// Tuple-like structures are modelled as a `Tuple`.
fields: Vec<(FieldName, Declaration)>,
}, contains some loss of information about originating forms in rust, as some If it were supposed to only convey information about serialized form, than it would look like this both for tuples and structs on rust side: Tuple { elements: Vec<Declaration> }, as the names of the fields are irrelevant. A variant with full info (both serialization & origin) is what it looks like now: Tuple { elements: Vec<Declaration> },
Struct { fields: Fields },
...
pub enum Fields {
NamedFields(Vec<(FieldName, Declaration)>),
UnnamedFields(Vec<Declaration>),
Empty,
} thus, i believe, to keep up the same logic about the // max info
pub enum Definition {
Sequence { // extends prev `Sequence` and replaces `Array`
length_width: u8,
min_length: u64,
max_length: u64,
elements: Declaration,
},
Tuple { // stays the same
elements: Vec<Declaration>,
},
Enum { // extended with `tag_width` and `u64` added to variants tuple
tag_width: u8,
variants: Vec<(u64, VariantName, Declaration)>,
},
Struct { fields: Fields }, // stays the same
}
pub enum Fields {
NamedFields(Vec<(FieldName, Declaration)>),
UnnamedFields(Vec<Declaration>),
Empty,
} or // only serialized form
pub enum Definition {
Sequence { // changes
length_width: u8,
min_length: u64,
max_length: u64,
elements: Declaration,
},
Tuple { // stays the same, replaces `Struct`
elements: Vec<Declaration>,
},
Enum { // changes
tag_width: u8,
variants: BTreeMap<u64, Declaration>, // the order of variants, as they were declared in rust enum,
}
} @frol what do you think? As there's also process of deserialization, which turns serialized bytes back into rust forms, i vote for variant with max info, On the other hand, if |
Note that one more piece of information one has to go on is declaration. Tuples are called Lastly, Enum::tag_width is being added in both situations so can we get this PR merged regardless? ;) |
@dj8yfo I don't have full context about Borsh Schema, and I still believe there are more problems with it than it solves, so anything that kind of works is fine with me. I am fine with iterative approach and this PR only adds |
@mina86 with all of that articulated,
yes Sequence { // extends prev `Sequence` and replaces `Array`
length_width: u8,
min_length: u64,
max_length: u64,
elements: Declaration,
}, in another pr. If you also considered adding info about enum discriminants' values (in this or another pr), that would be consistent with current pr's drive for adding details about data layout: Enum { // extended with `tag_width` and `u128` added to variants tuple
tag_width: u8,
variants: Vec<(u128, VariantName, Declaration)>,
}, The change of this pr might be useful not in the context of having enums defined with more than 256 variants, but having enums defined with reasonable number of variants which have discriminants outside of u8 range.
Due to the clash of names, i think there's no need to introduce any change to |
I can look into it, but I’m essentially doing all this in my spare time so I give no promises that I’ll add discriminants to the definition. |
it might actually be Enum { // extended with `tag_width` and `i64` added to variants tuple
tag_width: u8,
variants: Vec<(i64, VariantName, Declaration)>,
}, as rust allows
no, i just meant it would be a more meaningful investment of time for this repo than churning |
to make `BorshSchema` more congruent with `BorshSerialize, BorshDeserialize`
Difference is I already have Struct/Tuple mostly written. And adding discriminant to Enum seems like a huge can of worms. |
well, then you should be ready it gets rejected if it doesn't look absolutely stunning ) |
adding |
I think I don’t understand GitHub… I’m trying to comment on your changes but my comment don’t seem to appear. Firstly, why is actually dynamically-sized sequence of zero-sized encoding forbidden? Secondly, rather than calling |
668a885
to
2adf4ca
Compare
there must be this button in upper right corner |
The field allows specifying how many bytes the discriminant byte
takes. This in turn allows for better support of custom encoding
formats with more than 256 variants and untagged unions.
relates to: #181