arrow-user mailing list archives

From Sebastien Binet <bi...@cern.ch>
Subject Re: (java) Producing an in-memory Arrow buffer from a file
Date Thu, 23 Jan 2020 13:12:24 GMT
hi Andrew,

slightly related but probably also slightly off-topic:
(for inspiration) you may want to look at how this is done in groot/rarrow
where tools are exported to
- expose a ROOT "schema" as an Arrow Schema
- expose a ROOT Tree as an Arrow Table

groot/rarrow isn't working on zero-copy of ROOT data, though.

hth,
-s

On Thu, Jan 23, 2020 at 2:03 PM Andrew Melo <andrew.melo@gmail.com> wrote:

> Hello all,
>
> I work in particle physics, which has standardized on the ROOT (
> http://root.cern) file format to store/process our data. The format
> itself is quite complicated, but the relevant part here is that after
> parsing/decompression, we end up with value and offset buffers holding our
> data.
>
> What I'd like to do is represent these data in-memory in the Arrow format.
> I've written a very rough POC where I manually put an Arrow stream into a
> ByteBuffer, then replaced the values and offset buffers with the bytes from
> my files, and I'm wondering what the "proper" way to do this is. From my
> reading of the code, it appears (?) that what I want to do is produce an
> org.apache.arrow.vector.types.pojo.Schema object and N ArrowRecordBatch
> objects, then use MessageSerializer to write them into a ByteBuffer one
> after another.
>
> Is this correct? Or, is there another API I'm missing?
>
> Thanks!
> Andrew
>
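
For readers unfamiliar with the "value and offset buffers" mentioned in the quoted message, here is a minimal sketch of that layout using only java.nio. It assumes the Arrow variable-length convention of a little-endian int32 offsets buffer with n + 1 entries next to a contiguous values buffer. The class and method names (OffsetBufferSketch, buildOffsets, entry) and the sample data are hypothetical, and the sketch deliberately does not call the Arrow Java API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class OffsetBufferSketch {

    /** Builds a little-endian int32 offsets buffer with n + 1 entries from per-entry byte lengths. */
    public static ByteBuffer buildOffsets(int[] lengths) {
        ByteBuffer offsets = ByteBuffer.allocate(4 * (lengths.length + 1))
                                       .order(ByteOrder.LITTLE_ENDIAN);
        int running = 0;
        offsets.putInt(running);          // first offset is always 0
        for (int len : lengths) {
            running += len;
            offsets.putInt(running);      // end of this entry, start of the next
        }
        offsets.flip();                   // make the buffer readable from position 0
        return offsets;
    }

    /** Returns the bytes of entry i, i.e. values[offsets[i] .. offsets[i + 1]). */
    public static byte[] entry(ByteBuffer offsets, ByteBuffer values, int i) {
        int start = offsets.getInt(4 * i);       // absolute reads, position is untouched
        int end = offsets.getInt(4 * (i + 1));
        byte[] out = new byte[end - start];
        ByteBuffer view = values.duplicate();    // independent position, shared content
        view.position(start);
        view.get(out);
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: three variable-length entries packed back to back.
        byte[] packed = "muonelectronjet".getBytes(StandardCharsets.US_ASCII);
        ByteBuffer values = ByteBuffer.wrap(packed);
        ByteBuffer offsets = buildOffsets(new int[] {4, 8, 3});

        for (int i = 0; i < 3; i++) {
            System.out.println(new String(entry(offsets, values, i), StandardCharsets.US_ASCII));
        }
    }
}
```

Buffers shaped like this are what a record batch body would carry for a variable-length column; whether they can be handed to Arrow without a copy depends on how the Arrow allocator is used, which this sketch does not address.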
