improve ergonomics of encoding and decoding data (using arrow IPC) #8614

zehiko · 2025-01-08T10:32:18Z

What

Enable more ergonomic encoding and decoding of protobuf types.

Closes gRPC: Simplify codec and provide encoding/decoding for DataframePart directly #8356

github-actions · 2025-01-08T10:33:33Z

Web viewer built successfully. If applicable, you should also test it:

I have tested the web viewer

Result	Commit	Link	Manifest
✅	`da09ae9`	https://rerun.io/viewer/pr/8614	`+nightly` `+main`

Note: This comment is updated whenever you push a commit.

jleibs · 2025-01-08T20:35:36Z

crates/store/re_log_encoding/src/codec/wire/decoder.rs

+/// Decode an object from its wire (protobuf) representation.
+pub trait Decode {
+    fn decode(&self) -> Result<TransportChunk, CodecError>;
+}
+
+impl Decode for DataframePart {
+    fn decode(&self) -> Result<TransportChunk, CodecError> {
+        decode(self.encoder_version(), &self.payload)
+    }
+}
+
+impl Decode for RerunChunk {
+    fn decode(&self) -> Result<TransportChunk, CodecError> {
+        decode(self.encoder_version(), &self.payload)
+    }
+}


This still makes sense as an intermediate step, but DataframePart decoding to TransportChunk still feels weird to me, as DataframeParts are not actually valid Chunks, even though TransportChunk can (conveniently) store them.

I think this should decode to an alternative, parallel Dataframe-oriented structure that looks very similar but is typed distinctly, so nobody is inclined to try to create a Chunk out of it (which may fail due to missing metadata, etc.).

zehiko · 2025-01-09

Makes sense. I was also slightly reluctant to have two very distinct data structures decode to TransportChunk, as they do have very different semantics. Do you think we'll need two different ones once we have RecordBatch on the wire?
Edit: thinking about it a bit more, it feels like we should have two distinct ones.
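
For illustration, the "two distinct ones" idea from this thread could be sketched with an associated output type on the `Decode` trait. This is only a sketch under stated assumptions, not what the PR implements: `DataframeBatch`, `decode_payload`, and the simplified struct definitions below are hypothetical stand-ins, and the real `DataframePart`, `RerunChunk`, `TransportChunk`, and `CodecError` types in `re_log_encoding` look different.

```rust
// Hypothetical, simplified stand-ins for the real protobuf and transport types.
#[derive(Debug)]
pub struct CodecError(pub String);

pub struct DataframePart {
    pub payload: Vec<u8>,
}

pub struct RerunChunk {
    pub payload: Vec<u8>,
}

/// Stand-in for the chunk-oriented decoded form.
#[derive(Debug, PartialEq)]
pub struct TransportChunk(pub Vec<u8>);

/// Hypothetical dataframe-oriented decoded form. It is deliberately a
/// separate type, so it cannot be passed where a chunk is expected.
#[derive(Debug, PartialEq)]
pub struct DataframeBatch(pub Vec<u8>);

/// Variation of the `Decode` trait where each wire type names its own output.
pub trait Decode {
    type Output;
    fn decode(&self) -> Result<Self::Output, CodecError>;
}

/// Placeholder for the shared decoding step (assumed; the real code decodes
/// Arrow IPC data). Here it only rejects empty payloads and copies the bytes.
fn decode_payload(payload: &[u8]) -> Result<Vec<u8>, CodecError> {
    if payload.is_empty() {
        return Err(CodecError("empty payload".to_owned()));
    }
    Ok(payload.to_vec())
}

impl Decode for DataframePart {
    type Output = DataframeBatch;

    fn decode(&self) -> Result<Self::Output, CodecError> {
        decode_payload(&self.payload).map(DataframeBatch)
    }
}

impl Decode for RerunChunk {
    type Output = TransportChunk;

    fn decode(&self) -> Result<Self::Output, CodecError> {
        decode_payload(&self.payload).map(TransportChunk)
    }
}
```

With this shape, both wire types can share the same decoding step, but handing a decoded `DataframePart` to code that expects a `TransportChunk` becomes a compile-time error rather than a runtime failure caused by missing metadata.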

zehiko added 3 commits January 8, 2025 10:01

ergonomics: introduce encoding and decoding interfaces

715f298

remove build/

33a269c

rebase

62d9940

zehiko added the `exclude from changelog` (PRs with this won't show up in CHANGELOG.md) and `remote-store` (remote store gRPC API) labels Jan 8, 2025

zehiko self-assigned this Jan 8, 2025

Merge branch 'main' into zehiko/serializable

6d3bb5f

nits

da09ae9

zehiko marked this pull request as ready for review January 8, 2025 10:58

emilk approved these changes Jan 8, 2025

jleibs reviewed Jan 8, 2025

zehiko merged commit 574637d into main Jan 9, 2025
33 checks passed

zehiko deleted the zehiko/serializable branch January 9, 2025 07:23
