arrow-user mailing list archives

From Micah Kornfield <emkornfi...@gmail.com>
Subject Re: [C++] [Python] How to serialize and send C++ arrow types to a Python client to deserialize?
Date Mon, 10 Aug 2020 18:00:36 GMT
Hi Barbara,

>  I also need this C++ client to serialize/deserialize the data in the same
> format that our existing Python client does with pyarrow, so that
> serialized data sent from the C++ client can be read from the Python client
> and vice versa.

How are you serializing data in Python? If you are using pyarrow.serialize
[1], then I would suggest serializing to an IPC stream in memory [2]
instead. The docs for pyarrow.serialize mention that it is experimental,
which means it has no guarantees for forward and backward
compatibility.

Moving to this method would allow you to use the RecordBatchReader from C++ [3].

> After a bit of reading and research, I suspect that I should be using the
> arrow::py library, but was hoping to get more guidance on this.


It is not entirely clear without a code sample, but if you are using the
C++ Python libraries (arrow::py), then you need to ensure that the Python
interpreter and the Arrow Python module are initialized [4].

Hope this helps.

-Micah

[1] https://arrow.apache.org/docs/python/generated/pyarrow.serialize.html#pyarrow.serialize
[2] https://arrow.apache.org/docs/python/ipc.html#using-streams
[3] https://arrow.apache.org/docs/cpp/api/table.html#_CPPv4N5arrow17RecordBatchReaderE
[4] https://github.com/apache/arrow/blob/2bd2fc45c45cb0edc8800eb53721231b56a65113/cpp/src/arrow/python/util/test_main.cc#L29

On Mon, Aug 10, 2020 at 10:32 AM Barbara BOYAJIAN <
barbara@elementaryrobotics.com> wrote:

> Hello,
>
> I'm currently looking to use Arrow in the following use case. I am writing
> a C++ client, where I need to send serialized Arrow data to Redis, and
> deserialize Arrow data that is received from Redis. I'm using boost::asio
> to communicate with Redis, and am able to send/receive buffers via unix and
> tcp sockets. I also need this C++ client to serialize/deserialize the data
> in the same format that our existing Python client does with pyarrow, so
> that serialized data sent from the C++ client can be read from the Python
> client and vice versa.
>
> I wish to be able to apply the above use case to send/receive
> arrow::Tensors, arrow::Tables, and arrow::Arrays.
>
> After a bit of reading and research, I suspect that I should be using the
> arrow::py library, but was hoping to get more guidance on this.
>
> So far, I have created a C++ arrow::Table manually, wrapped it using
> arrow::py::wrap_table, and have tried to use arrow::SerializeObject(...) to
> serialize it. However, my approach is not working as the memory address for
> the variable that is meant to hold the serialized object is 0x0.
>
> Thank you very much in advance for your help.
>
>
>
