arrow-dev mailing list archives

From "Mike Pedersen (JIRA)"
Subject [jira] [Created] (ARROW-4836) "Cannot tell() a compressed stream" when using RecordBatchStreamWriter
Date Tue, 12 Mar 2019 11:37:00 GMT
Mike Pedersen created ARROW-4836:

             Summary: "Cannot tell() a compressed stream" when using RecordBatchStreamWriter
                 Key: ARROW-4836
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.12.1
            Reporter: Mike Pedersen

RecordBatchStreamWriter does not appear to work with compressed streams:

>>> import pyarrow as pa
>>> pa.__version__
>>> stream = pa.output_stream('/tmp/a.gz')
>>> batch = pa.RecordBatch.from_arrays([pa.array([1])], ['a'])
>>> writer = pa.RecordBatchStreamWriter(stream, batch.schema)
>>> writer.write(batch)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyarrow/ipc.pxi", line 181, in pyarrow.lib._RecordBatchWriter.write
  File "pyarrow/ipc.pxi", line 196, in pyarrow.lib._RecordBatchWriter.write_batch
  File "pyarrow/error.pxi", line 89, in pyarrow.lib.check_status
pyarrow.lib.ArrowNotImplementedError: Cannot tell() a compressed stream

As I understand the documentation, this should be possible, right?
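
A minimal sketch of what the error message suggests is happening, using only the standard library. It assumes (this is not confirmed anywhere in this report) that the stream writer asks the sink for its current position, for example to pad messages to an alignment boundary, and that the compressed sink does not track that position. `CountingSink` and `write_aligned` are hypothetical names for illustration, not pyarrow APIs.

```python
import gzip
import io


class CountingSink:
    """Hypothetical wrapper that counts uncompressed bytes so tell() works."""

    def __init__(self, raw):
        self._raw = raw
        self._pos = 0

    def write(self, data):
        self._raw.write(data)
        self._pos += len(data)
        return len(data)

    def tell(self):
        # Position in the uncompressed byte stream, not in the gzip output.
        return self._pos


def write_aligned(sink, payload, alignment=8):
    """Toy writer (assumed behaviour): pad each message using sink.tell()."""
    sink.write(payload)
    padding = -sink.tell() % alignment
    sink.write(b"\x00" * padding)
    return sink.tell()


buffer = io.BytesIO()
with gzip.GzipFile(fileobj=buffer, mode="wb") as gz:
    sink = CountingSink(gz)
    end = write_aligned(sink, b"hello")

# Decompress to check that payload and padding survived the round trip.
roundtrip = gzip.decompress(buffer.getvalue())
```

If that assumption holds, counting the bytes handed to the compressor would be enough to answer tell() without the compressed stream having to be seekable.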

This message was sent by Atlassian JIRA
