avro-dev mailing list archives

From "Alexander Hasha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-1286) Python script avro cat should be able to read from stdin
Date Fri, 27 Feb 2015 22:45:04 GMT

    [ https://issues.apache.org/jira/browse/AVRO-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340963#comment-14340963 ]

Alexander Hasha commented on AVRO-1286:

Has anyone thought any more about this recently?  I'm looking at this issue for my own purposes.
As far as I can tell so far, the calls to `seek` are not inherently necessary for parsing
the data stream.  There is one seek to determine the file length, but that looks like a convenience
for checking whether the end of the file has been reached.  (You can tell when that happens
on a stream fairly easily.)  You do need to seek backwards by `SYNC_SIZE`, but it seems like
this could be accomplished by buffering a whole number of blocks in memory, not necessarily
the whole file.
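
To make the buffering idea concrete, here is a minimal sketch of how the two uses of `seek` described above might be replaced on a non-seekable stream. Everything in it is an assumption for illustration: `BufferedStreamReader`, `peek`, and `at_eof` are hypothetical names, not part of the avro Python library, and `SYNC_SIZE` is set to 16 locally rather than imported. The end-of-file check becomes a one-byte look-ahead, and the backward seek by `SYNC_SIZE` becomes a peek at bytes that are only consumed if they match.

```python
import io


class BufferedStreamReader:
    """Hypothetical helper: read a non-seekable binary stream with a small
    look-ahead buffer, so neither seek() nor tell() is ever called."""

    def __init__(self, raw):
        self._raw = raw       # any object with read(n) returning bytes
        self._pending = b""   # bytes fetched from raw but not yet consumed

    def _fill(self, n):
        # Pull from the underlying stream until n bytes are pending or it ends.
        while len(self._pending) < n:
            chunk = self._raw.read(n - len(self._pending))
            if not chunk:
                break
            self._pending += chunk

    def peek(self, n):
        """Return up to n bytes without consuming them."""
        self._fill(n)
        return self._pending[:n]

    def read(self, n):
        """Consume and return up to n bytes."""
        self._fill(n)
        data, self._pending = self._pending[:n], self._pending[n:]
        return data

    def at_eof(self):
        """True once no bytes remain; stands in for a file-length check."""
        return self.peek(1) == b""


SYNC_SIZE = 16  # assumed locally for this sketch

# Stand-in for stdin: a short payload followed by a made-up sync marker.
marker = b"\x01" * SYNC_SIZE
stream = BufferedStreamReader(io.BytesIO(b"payload" + marker))

payload = stream.read(7)
# Instead of reading SYNC_SIZE bytes and seeking back on a mismatch,
# look at them first and consume them only if they are the marker.
if stream.peek(SYNC_SIZE) == marker:
    stream.read(SYNC_SIZE)
finished = stream.at_eof()
```

In real use the wrapped object would be something like `sys.stdin.buffer` rather than an `io.BytesIO`. This sketch only ever holds the look-ahead bytes in memory; whether the actual reader needs more than that (for example a whole block) is exactly the detail that would need checking against the library code.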

I'd like to give this a shot, but am worried I'm failing to understand an important detail.

> Python script avro cat should be able to read from stdin
> --------------------------------------------------------
>                 Key: AVRO-1286
>                 URL: https://issues.apache.org/jira/browse/AVRO-1286
>             Project: Avro
>          Issue Type: Bug
>          Components: python
>            Reporter: Uri Laserson
>            Priority: Minor
> Currently, you have to specify a target file on the command line.  But it would be nice
> to be able to stream data through avro cat.

This message was sent by Atlassian JIRA
