-
Notifications
You must be signed in to change notification settings - Fork 374
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improve ergonomics of encoding and decoding data (using arrow IPC) #8614
Conversation
Web viewer built successfully. If applicable, you should also test it:
Note: This comment is updated whenever you push a commit. |
/// Decode an object from a its wire (protobuf) representation. | ||
pub trait Decode { | ||
fn decode(&self) -> Result<TransportChunk, CodecError>; | ||
} | ||
|
||
impl Decode for DataframePart { | ||
fn decode(&self) -> Result<TransportChunk, CodecError> { | ||
decode(self.encoder_version(), &self.payload) | ||
} | ||
} | ||
|
||
impl Decode for RerunChunk { | ||
fn decode(&self) -> Result<TransportChunk, CodecError> { | ||
decode(self.encoder_version(), &self.payload) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still makes sense as an intermediate step, but DataframePart decoding to TransportChunk still feels weird to me as DataframePart's are not actually valid Chunks, even though TransportChunk can (conveniently) store them.
I think this should decode to an alternative parallel Dataframe-oriented structure that looks very similar but is typed distinctly so nobody is inclined to try to create a Chunk out of it (which may fail due to missing metadata, etc.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense, I was also slightly reluctant to have 2 very distinct data structures decode to TC as they indeed have very different semantics. Do you think we'll have a need for 2 different ones once we have RecordBatch
on the wire?
Edit: thinking about it bit more, it feels that we should have 2 distinct ones.
What
Enable more ergonomic encoding and decoding of protobuf types.
DataframePart
directly #8356