arrow-user mailing list archives

From Ivan Popivanov <ivan.popiva...@gmail.com>
Subject Does arrow streaming support splitting big string/binary data?
Date Tue, 25 Jun 2019 01:28:24 GMT
Hello,

Looking at these examples
<https://wesmckinney.com/blog/arrow-streaming-columnar/> and the
documentation <https://arrow.apache.org/docs/format/IPC.html>, it seems
that a record batch cannot span multiple messages. Is my understanding
correct?

Here is the scenario I am considering: two columns, an *int* and a *string*.
Let's assume that we want the maximum message size to be 64K. If there is a
row with a string value of, say, 70K, it has to span multiple batches.
Does the current message format support this?

If it doesn't, then another layer is needed to create the messages when a
single column value exceeds the message size.
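
To make the idea of such an extra layer concrete, here is a minimal sketch in
plain Python. It is not part of Arrow: the names split_value, reassemble and
MAX_CHUNK are hypothetical, and it assumes the 64K limit applies to the value
payload only, ignoring any per-message metadata overhead. Each piece carries
the row id, a sequence number and a last-piece flag so the receiver can
rebuild the original value.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Assumed limit on the value payload per message, in bytes (64K from the scenario).
MAX_CHUNK = 64 * 1024

# (row_id, sequence number, is_last flag, payload piece)
Piece = Tuple[int, int, bool, bytes]


def split_value(row_id: int, value: bytes, max_chunk: int = MAX_CHUNK) -> Iterator[Piece]:
    """Yield pieces of `value`, each at most `max_chunk` bytes long."""
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")
    # Ceiling division; an empty value still produces one (empty) piece.
    total = max(1, -(-len(value) // max_chunk))
    for seq in range(total):
        piece = value[seq * max_chunk:(seq + 1) * max_chunk]
        yield row_id, seq, seq == total - 1, piece


def reassemble(pieces: Iterable[Piece]) -> Dict[int, bytes]:
    """Group pieces by row id and join them in sequence order."""
    parts: Dict[int, List[Tuple[int, bytes]]] = {}
    for row_id, seq, _is_last, piece in pieces:
        parts.setdefault(row_id, []).append((seq, piece))
    return {
        row_id: b"".join(p for _, p in sorted(chunks, key=lambda t: t[0]))
        for row_id, chunks in parts.items()
    }


if __name__ == "__main__":
    big = b"x" * (70 * 1024)  # the 70K string value from the scenario
    pieces = list(split_value(7, big))
    print([len(p[3]) for p in pieces])
    assert reassemble(pieces) == {7: big}
```

With the 70K value above, this yields two pieces of 65536 and 6144 bytes. Each
piece could then be sent as its own row in a separate record batch, at the
cost of the receiver having to reassemble the values itself.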

Thanks
Ivan
