arrow-user mailing list archives

From Wes McKinney <wesmck...@gmail.com>
Subject Re: [C++] Storing/retrieving a Table in plasma
Date Mon, 20 May 2019 13:15:23 GMT
hi Miki,

Steps

* Convert the Table to a sequence of RecordBatch objects. You can use
arrow::TableBatchReader to do this [1].
* Write the batches as a stream to a MockOutputStream [2], which only
counts the bytes written.
* Use the reported size of the total stream to allocate memory in Plasma.
* Write the real stream into that memory using
arrow::io::FixedSizeBufferWriter.

At some point I'd like to reduce the amount of boilerplate
associated with this process, and also to avoid the multiple metadata
serialization and record batch disassembly steps. I'll open a JIRA
issue.

We'd be delighted if you would contribute to the C++ documentation at
https://github.com/apache/arrow/tree/master/docs/source/cpp

- Wes

[1]: https://github.com/apache/arrow/blob/master/cpp/src/arrow/table.h#L340
[2]: https://github.com/apache/arrow/blob/7a5562174cffb21b16f990f64d114c1a94a30556/cpp/src/arrow/io/memory.h#L89

On Mon, May 20, 2019 at 7:24 AM Miki Tebeka <miki@353solutions.com> wrote:
>
> Hi,
>
> I'm looking for an example of how to store/retrieve an arrow::Table in plasma. The
> examples I see on the documentation site are for basic types.
>
> My end goal is to create data (Table) in C++, store it in plasma and read it from Python.
>
> From reading around, I need to allocate a buffer in plasma, but how can I find the size
> of the Table to allocate the buffer? And how can I serialize it into the created Buffer?
>
> Thanks,
> Miki
