arrow-user mailing list archives

From Wes McKinney <wesmck...@gmail.com>
Subject Re: [C++] Storing/retrieving a Table in plasma
Date Mon, 20 May 2019 16:56:26 GMT
hi Miki,

That link didn't work for me. Would it not be better to do this work
in Apache Arrow rather than an external project? I would guess the
community would be interested in this.

- Wes

On Mon, May 20, 2019 at 9:48 AM Miki Tebeka <miki@353solutions.com> wrote:
>
> OK, almost working. I get "Write out of bounds" when running the code at https://github.com/353solutions/carrow/blob/plasma/plasma.cc
>
> Any ideas?
>
> Full output:
> batch size = 224
> buf size = 224
> error: write: Write out of bounds
>
> On Mon, May 20, 2019 at 5:21 PM Miki Tebeka <miki@353solutions.com> wrote:
>>
>> Thanks Wes
>>
>> On Mon, May 20, 2019 at 4:24 PM Wes McKinney <wesmckinn@gmail.com> wrote:
>>>
>>> See https://issues.apache.org/jira/browse/ARROW-5377
>>>
>>> On Mon, May 20, 2019 at 8:15 AM Wes McKinney <wesmckinn@gmail.com> wrote:
>>> >
>>> > hi Miki,
>>> >
>>> > Steps
>>> >
>>> > * Convert the Table to a sequence of RecordBatch objects. You can use
>>> > arrow::TableBatchReader to do this [1]
>>> > * Write a stream using MockOutputStream [2]
>>> > * Use the reported size of the total stream to allocate memory in Plasma
>>> > * Write a real stream using arrow::io::FixedSizeBufferWriter
>>> >
>>> > At some point I'd like to reduce the amount of boilerplate
>>> > associated with this process, and also to avoid the multiple metadata
>>> > serialization and record batch disassembly steps. I'll open a JIRA
>>> > issue.
>>> >
>>> > We'd be delighted if you would contribute to the C++ documentation at
>>> > https://github.com/apache/arrow/tree/master/docs/source/cpp
>>> >
>>> > - Wes
>>> >
>>> > [1]: https://github.com/apache/arrow/blob/master/cpp/src/arrow/table.h#L340
>>> > [2]: https://github.com/apache/arrow/blob/7a5562174cffb21b16f990f64d114c1a94a30556/cpp/src/arrow/io/memory.h#L89
>>> >
>>> > On Mon, May 20, 2019 at 7:24 AM Miki Tebeka <miki@353solutions.com> wrote:
>>> > >
>>> > > Hi,
>>> > >
>>> > > I'm looking for an example of how to store/retrieve an arrow::Table in plasma. The examples I see on the documentation site are for basic types.
>>> > >
>>> > > My end goal is to create data (a Table) in C++, store it in plasma, and read it from Python.
>>> > >
>>> > > From reading around, I need to allocate a buffer in plasma, but how can I find the size of the Table in order to allocate the buffer? And how can I serialize it into the created buffer?
>>> > >
>>> > > Thanks,
>>> > > Miki
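
The measure-then-write pattern in Wes's steps can be sketched without Arrow as a dependency. The block below is only an illustration under stated assumptions: `Sink`, `CountingSink`, `FixedSink`, `WriteStream`, and `SerializeExact` are hypothetical names, with `CountingSink` and `FixedSink` standing in for `arrow::io::MockOutputStream` and `arrow::io::FixedSizeBufferWriter`, and the toy header-plus-length-prefixed framing is not the Arrow IPC format. A `std::vector` stands in for the buffer that Plasma would allocate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical minimal output-stream interface (not an Arrow class).
struct Sink {
  virtual ~Sink() = default;
  virtual void Write(const void* data, std::size_t nbytes) = 0;
};

// Counts bytes and stores nothing; stand-in for arrow::io::MockOutputStream.
class CountingSink : public Sink {
 public:
  void Write(const void*, std::size_t nbytes) override { extent_ += nbytes; }
  std::size_t extent() const { return extent_; }

 private:
  std::size_t extent_ = 0;
};

// Writes into caller-owned memory of fixed size; stand-in for
// arrow::io::FixedSizeBufferWriter. Throws if a write would overflow.
class FixedSink : public Sink {
 public:
  FixedSink(std::uint8_t* data, std::size_t size) : data_(data), size_(size) {}

  void Write(const void* data, std::size_t nbytes) override {
    if (nbytes > size_ - pos_) throw std::out_of_range("Write out of bounds");
    if (nbytes > 0) std::memcpy(data_ + pos_, data, nbytes);
    pos_ += nbytes;
  }

 private:
  std::uint8_t* data_;
  std::size_t size_;
  std::size_t pos_ = 0;
};

// Toy stream layout (an assumption, not Arrow IPC): a header playing the role
// of the schema message, then each batch prefixed with its 4-byte length.
void WriteStream(const std::string& header,
                 const std::vector<std::string>& batches, Sink* sink) {
  sink->Write(header.data(), header.size());
  for (const auto& batch : batches) {
    const std::uint32_t len = static_cast<std::uint32_t>(batch.size());
    sink->Write(&len, sizeof(len));
    sink->Write(batch.data(), batch.size());
  }
}

// Pass 1 measures the whole stream; pass 2 writes it into a buffer of exactly
// that size. With Plasma, the vector would be the buffer the store hands back.
std::vector<std::uint8_t> SerializeExact(
    const std::string& header, const std::vector<std::string>& batches) {
  CountingSink counter;
  WriteStream(header, batches, &counter);

  std::vector<std::uint8_t> buffer(counter.extent());
  FixedSink writer(buffer.data(), buffer.size());
  WriteStream(header, batches, &writer);
  return buffer;
}
```

In this toy, the measured size covers the header and the per-batch framing as well as the batch payloads, so a buffer sized from the payload alone overflows on the second pass. Whether that is what produced the "Write out of bounds" error reported in this thread is not confirmed here.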
