avro-dev mailing list archives

From "Ilia Khaustov (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AVRO-2105) Using DataFileWriter in append mode with write-only file IO
Date Wed, 15 Nov 2017 22:25:00 GMT

     [ https://issues.apache.org/jira/browse/AVRO-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ilia Khaustov updated AVRO-2105:
--------------------------------
    Remaining Estimate:     (was: 4h)
     Original Estimate:     (was: 4h)

> Using DataFileWriter in append mode with write-only file IO 
> ------------------------------------------------------------
>
>                 Key: AVRO-2105
>                 URL: https://issues.apache.org/jira/browse/AVRO-2105
>             Project: Avro
>          Issue Type: Improvement
>          Components: python
>         Environment: Python 2/3
>            Reporter: Ilia Khaustov
>            Priority: Minor
>              Labels: python
>
> *Problem*: DataFileWriter supports "create" and "append" modes. "Append" mode is
> triggered by passing None as the schema to the constructor. In this case, the given
> file writer must also allow reading, because the internal logic reads the metadata
> from that same file object. A file opened in "ab+" mode works, but one opened in "ab"
> mode raises IOError.
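The difference between the two modes can be reproduced with the standard library alone. The following is a minimal sketch, not Avro code: the helper name `can_read_back` is hypothetical, and the four bytes written are the Avro container file magic, used here only as sample content.

```python
import os
import tempfile


def can_read_back(mode):
    """Return True if a file opened with `mode` allows seek(0) followed by read()."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # Create a small existing file to append to.
        with open(path, "wb") as f:
            f.write(b"Obj\x01")
        with open(path, mode) as f:
            try:
                f.seek(0)
                f.read(4)
                return True
            except IOError:
                # On Python 3 this is io.UnsupportedOperation, a subclass of IOError/OSError.
                return False
    finally:
        os.remove(path)


print(can_read_back("ab+"))
print(can_read_back("ab"))
```

Only the "ab+" handle can serve both as the source of existing metadata and as the destination for new data, which is what append mode currently requires of its single file argument.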
> *Practical example*: I use Avro serialization in Python with LZMA compression for the
> serialized files. The LZMA library provides a file-like class, LZMAFile, for compressing
> in-memory data as it is written to disk, or for reading a compressed file as a decompressed
> stream. It doesn't support "+" modes: a given file object either compresses or decompresses,
> not both. This looks like a blocker for a straightforward implementation of appending to
> compressed Avro files. However, LZMAFile does support appending, and so does DataFileWriter.
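The LZMAFile behaviour described above can be checked with a short sketch. It assumes the Python 3 standard-library `lzma` module (on Python 2 a backport would be needed); the function name `lzma_append_demo` and the sample payloads are hypothetical.

```python
import io
import lzma
import os
import tempfile


def lzma_append_demo():
    """Append to an .xz file, then probe what an append-mode LZMAFile can do."""
    fd, path = tempfile.mkstemp(suffix=".xz")
    os.close(fd)
    try:
        with lzma.LZMAFile(path, "wb") as f:
            f.write(b"first;")

        # Appending works, but the same object cannot be read from.
        with lzma.LZMAFile(path, "ab") as f:
            f.write(b"second;")
            try:
                f.read(1)
                readable = True
            except io.UnsupportedOperation:
                readable = False

        # "+" modes are rejected outright.
        try:
            lzma.LZMAFile(path, "ab+")
            plus_mode_ok = True
        except ValueError:
            plus_mode_ok = False

        # A separate read-mode object sees both writes as one decompressed stream.
        with lzma.LZMAFile(path, "rb") as f:
            data = f.read()
        return data, readable, plus_mode_ok
    finally:
        os.remove(path)


print(lzma_append_demo())
```

Reading and appending therefore need two separate LZMAFile objects on the same path, which is the situation the issue asks DataFileWriter to accommodate.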
> *Possible solution*: Add a "reader" kwarg to the DataFileWriter constructor that would be
> used instead of "writer" for reading metadata in "append" mode. If not given, "reader"
> defaults to "writer" for backward compatibility.
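One way the proposed fallback could look is sketched below. This is not the Avro implementation: `SketchWriter` is a hypothetical stand-in for DataFileWriter that only checks the container magic bytes instead of parsing real metadata, and its constructor signature is an assumption made for illustration.

```python
import os
import tempfile

MAGIC = b"Obj\x01"  # Avro object container file magic


class SketchWriter(object):
    """Hypothetical stand-in for DataFileWriter; illustrates the `reader` fallback only."""

    def __init__(self, writer, schema=None, reader=None):
        self.writer = writer
        if schema is None:
            # Append mode: read the existing header from `reader`,
            # falling back to `writer` to keep the current behaviour.
            source = reader if reader is not None else writer
            source.seek(0)
            if source.read(len(MAGIC)) != MAGIC:
                raise ValueError("not an Avro container file")
        else:
            # Create mode: start a new file.
            writer.write(MAGIC)

    def append(self, payload):
        self.writer.write(payload)


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as w:
            SketchWriter(w, schema="{}").append(b"a")
        # Write-only handle for data, separate read-only handle for metadata.
        with open(path, "ab") as w, open(path, "rb") as r:
            SketchWriter(w, reader=r).append(b"b")
        # Without `reader`, a write-only handle still fails as it does today.
        with open(path, "ab") as w:
            try:
                SketchWriter(w)
                fallback_ok = True
            except IOError:
                fallback_ok = False
        with open(path, "rb") as f:
            return f.read(), fallback_ok
    finally:
        os.remove(path)


print(demo())
```

Defaulting `reader` to `writer` keeps existing callers that pass an "ab+" handle working unchanged, while callers with write-only objects such as an append-mode LZMAFile can supply a second, read-only object.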



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
