arrow-user mailing list archives

From Andrew Melo <andrew.m...@gmail.com>
Subject (java) Producing an in-memory Arrow buffer from a file
Date Thu, 23 Jan 2020 13:02:32 GMT
Hello all,

I work in particle physics, which has standardized on the ROOT (
http://root.cern) file format to store/process our data. The format itself
is quite complicated, but the relevant part here is that after
parsing/decompression, we end up with value and offset buffers holding our
data.
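
To illustrate what I mean by value and offset buffers, here is a minimal sketch in plain Java. It assumes the Arrow-style convention where a column of N variable-length rows has N+1 offsets and row i spans values[offsets[i]] up to (but not including) values[offsets[i+1]]. The class and method names are made up for illustration, and the actual ROOT layout may differ in its details.

```java
import java.util.ArrayList;
import java.util.List;

public class JaggedDecode {
    // Hypothetical helper: rebuilds per-row lists from a flat values buffer
    // and an offsets buffer holding one more entry than there are rows.
    static List<List<Float>> decode(float[] values, int[] offsets) {
        List<List<Float>> rows = new ArrayList<>();
        for (int i = 0; i + 1 < offsets.length; i++) {
            List<Float> row = new ArrayList<>();
            // Row i covers the half-open range [offsets[i], offsets[i + 1]).
            for (int j = offsets[i]; j < offsets[i + 1]; j++) {
                row.add(values[j]);
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        float[] values = {1.5f, 2.5f, 3.5f};
        int[] offsets = {0, 2, 2, 3}; // three rows; the second one is empty
        System.out.println(decode(values, offsets)); // prints [[1.5, 2.5], [], [3.5]]
    }
}
```

The point is that the two flat arrays already carry everything needed to describe a variable-length column, so in principle no per-row copying should be required.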

What I'd like to do is represent these data in memory in the Arrow format.
I've written a very rough POC where I manually put an Arrow stream into a
ByteBuffer, then replaced the value and offset buffers with the bytes from
my files, and I'm wondering what the "proper" way to do this is. From my
reading of the code, it appears (?) that what I want to do is produce an
org.apache.arrow.vector.types.pojo.Schema object and N ArrowRecordBatch
objects, then use MessageSerializer to write them into a ByteBuffer one
after another.

Is this correct? Or, is there another API I'm missing?

Thanks!
Andrew
