arrow-dev mailing list archives

From Rares Vernica <rvern...@gmail.com>
Subject Read Arrow 0.9.0 output using newer pyarrow version
Date Mon, 11 Mar 2019 04:49:28 GMT
Hello,

I have a C++ library that uses Arrow 0.9.0 to serialize data. The code looks
like this:

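// Assemble the record batch from the schema and the column arrays.
std::shared_ptr<arrow::RecordBatch> arrowBatch;
arrowBatch = arrow::RecordBatch::Make(_arrowSchema, nCells, _arrowArrays);

// Back the output stream with a buffer allocated from the memory pool.
std::shared_ptr<arrow::PoolBuffer> arrowBuffer(
    new arrow::PoolBuffer(_arrowPool));
arrow::io::BufferOutputStream arrowStream(arrowBuffer);

// Open a stream-format writer on top of the output stream.
std::shared_ptr<arrow::ipc::RecordBatchWriter> arrowWriter;
arrow::ipc::RecordBatchStreamWriter::Open(
    &arrowStream, _arrowSchema, &arrowWriter);

// Serialize the batch into the stream.
arrowWriter->WriteRecordBatch(*arrowBatch);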
...
reinterpret_cast<const char*>(arrowBuffer->data()), arrowBuffer->size())
...

The output bytes are then read in Python using pyarrow:

pyarrow.RecordBatchStreamReader(pyarrow.BufferReader(buf)).read_pandas()

Since the C++ side uses Arrow 0.9.0, I have been using pyarrow==0.9.0. With
Python 3.7, installing pyarrow==0.9.0 is not easy because there are no
pre-compiled .whl packages for it on PyPI.

Could I use a newer pyarrow version to parse the Arrow 0.9.0 output? Is the
stream format compatible across versions?
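
For reference, here is a minimal sketch for inspecting how a serialized stream
is framed, using only the Python standard library. It assumes the framing
described in the Arrow format documentation: a stream written by Arrow 0.9.0
starts with a 4-byte little-endian metadata length, while streams written by
Arrow 0.15.0 and later prefix each message with the 4-byte marker 0xFFFFFFFF.
The function name ipc_framing and its return values are hypothetical, chosen
for illustration only. It does not validate the stream or decode any data.

```python
import struct

# Assumption: marker that prefixes each message in streams written by
# Arrow 0.15.0 and later; older writers start with the metadata length.
CONTINUATION = 0xFFFFFFFF


def ipc_framing(buf: bytes) -> str:
    """Guess the message framing of an Arrow IPC stream (hypothetical helper).

    Only the first four bytes are inspected; nothing else is validated.
    """
    if len(buf) < 4:
        raise ValueError("need at least 4 bytes to inspect the stream prefix")
    # Read the first 32-bit word as an unsigned little-endian integer.
    (first_word,) = struct.unpack_from("<I", buf, 0)
    if first_word == CONTINUATION:
        return "continuation"
    return "legacy"
```

Calling ipc_framing(buf) on the bytes produced by the C++ code above would be
expected to return "legacy" under these assumptions.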

Thanks!
Rares
