arrow-user mailing list archives

From Wes McKinney
Subject Re: [c++] Help with serializing and IPC with dictionary arrays
Date Fri, 12 Feb 2021 23:33:31 GMT
Hi Dawson, you need to follow the IPC stream protocol, i.e. what
RecordBatchStreamWriter and RecordBatchStreamReader do internally.
Is there a reason you cannot use these interfaces (particularly
their internal bits, which are also used to implement Flight, where
messages are split across different elements of a gRPC stream)?

I'm not sure that I would advise you to deal with dictionary
disassembly and reconstruction on your own unless it's your only
option. That said, if you look in the unit test suite you should be
able to find examples where DictionaryBatch IPC messages are
reconstructed manually and then used to reconstitute a RecordBatch
from its IPC message using the arrow::ipc::ReadRecordBatch API. We
can try to help you look in the right place; just let us know.


On Fri, Feb 12, 2021 at 2:58 PM Dawson D'Almeida wrote:
> I am trying to create a record batch containing any number of
> dictionary and/or normal Arrow arrays, serialize the record batch
> into bytes (a normal std::string), and send it via gRPC to another
> server process. On that end we receive the Arrow bytes and
> deserialize using the bytes and the schema.
> Is there a standard way to serialize/deserialize these dictionary
> arrays? It seems like all of the info is packaged correctly into
> the record batch.
> I've looked through a lot of the C++ Apache Arrow source and test
> code but I can't find how to approach our use case.
> The current failure is:
> Field with memory address 140283497044320 not found
> from the status returned by arrow::ipc::ReadRecordBatch
> Thanks,
> --
> Dawson d'Almeida
