Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve ergonomics of encoding and decoding data (using arrow IPC) #8614

Merged
merged 5 commits into from
Jan 9, 2025

Conversation

zehiko
Copy link
Contributor

@zehiko zehiko commented Jan 8, 2025

What

Enable more ergonomic encoding and decoding of protobuf types.

@zehiko zehiko added exclude from changelog PRs with this won't show up in CHANGELOG.md remote-store remote store gRPC API labels Jan 8, 2025
@zehiko zehiko self-assigned this Jan 8, 2025
Copy link

github-actions bot commented Jan 8, 2025

Web viewer built successfully. If applicable, you should also test it:

  • I have tested the web viewer
Result Commit Link Manifest
da09ae9 https://rerun.io/viewer/pr/8614 +nightly +main

Note: This comment is updated whenever you push a commit.

@zehiko zehiko marked this pull request as ready for review January 8, 2025 10:58
Comment on lines +27 to +42
/// Decode an object from a its wire (protobuf) representation.
pub trait Decode {
fn decode(&self) -> Result<TransportChunk, CodecError>;
}

impl Decode for DataframePart {
fn decode(&self) -> Result<TransportChunk, CodecError> {
decode(self.encoder_version(), &self.payload)
}
}

impl Decode for RerunChunk {
fn decode(&self) -> Result<TransportChunk, CodecError> {
decode(self.encoder_version(), &self.payload)
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still makes sense as an intermediate step, but DataframePart decoding to TransportChunk still feels weird to me as DataframePart's are not actually valid Chunks, even though TransportChunk can (conveniently) store them.

I think this should decode to an alternative parallel Dataframe-oriented structure that looks very similar but is typed distinctly so nobody is inclined to try to create a Chunk out of it (which may fail due to missing metadata, etc.)

Copy link
Contributor Author

@zehiko zehiko Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, I was also slightly reluctant to have 2 very distinct data structures decode to TC as they indeed have very different semantics. Do you think we'll have a need for 2 different ones once we have RecordBatch on the wire?
Edit: thinking about it bit more, it feels that we should have 2 distinct ones.

@zehiko zehiko merged commit 574637d into main Jan 9, 2025
33 checks passed
@zehiko zehiko deleted the zehiko/serializable branch January 9, 2025 07:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
exclude from changelog PRs with this won't show up in CHANGELOG.md remote-store remote store gRPC API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

gRPC: Simplify codec and provide encoding/decoding for DataframePart directly
3 participants