hadoop-common-issues mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-8597) FsShell's Text command should be able to read avro data files
Date Mon, 10 Sep 2012 20:20:09 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Doug Cutting updated HADOOP-8597:

    Attachment: HADOOP-8597.patch

Not sure why, but your patch file didn't apply cleanly for me.  Here's a version of the
same patch that does apply cleanly.
> FsShell's Text command should be able to read avro data files
> -------------------------------------------------------------
>                 Key: HADOOP-8597
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8597
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs
>    Affects Versions: 2.0.0-alpha
>            Reporter: Harsh J
>              Labels: newbie
>         Attachments: HADOOP-8597-2.patch, HADOOP-8597.patch, HADOOP-8597.patch
> Avro DataFiles are similar to SequenceFiles. Since they are becoming popular as a data
> format, it would be useful if {{fs -text}} could read them the way it already reads
> SequenceFiles. This should be easy, since Avro is already a dependency and provides the
> required classes.
>
> The open question is what output to emit. Avro DataFiles are not as simple as text, nor do
> they have the single key-value pair structure of SequenceFiles. They usually contain a set
> of fields defined as a record, and the usual text rendering, as produced by avro-tools
> (http://avro.apache.org/docs/current/api/java/org/apache/avro/tool/DataFileReadTool.html),
> is proper JSON.
>
> I think we should use JSON as the output format rather than a delimited form: Avro has
> many complex structures, and JSON is the easiest, lowest-effort way to display them (Avro
> can dump JSON by itself).
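
For readers following the issue, here is a rough sketch of how a text command could tell an
Avro DataFile apart from a SequenceFile before choosing a reader: both formats begin with a
fixed magic prefix ("Obj" followed by the byte 1 for Avro DataFiles, "SEQ" for SequenceFiles).
This is not the code from the attached patch. The class name FormatSniffer, the Format enum
and the sniff method are hypothetical names made up for illustration, and only the JDK is
used, so no Hadoop or Avro classes appear.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper: guesses a file's format from its leading magic bytes. */
public class FormatSniffer {
    enum Format { AVRO_DATA_FILE, SEQUENCE_FILE, GZIP, PLAIN_TEXT }

    // Avro object container files start with 'O', 'b', 'j', 1.
    private static final byte[] AVRO_MAGIC = {'O', 'b', 'j', 1};
    // SequenceFiles start with 'S', 'E', 'Q' (a version byte follows).
    private static final byte[] SEQ_MAGIC = {'S', 'E', 'Q'};
    // gzip streams start with 0x1f 0x8b.
    private static final byte[] GZIP_MAGIC = {(byte) 0x1f, (byte) 0x8b};

    /** Peeks at up to four bytes, then rewinds the stream so a real reader can start over. */
    static Format sniff(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        byte[] head = new byte[4];
        in.mark(head.length);
        int n = 0;
        while (n < head.length) {
            int r = in.read(head, n, head.length - n);
            if (r < 0) {
                break; // fewer than four bytes available
            }
            n += r;
        }
        in.reset();

        if (startsWith(head, n, AVRO_MAGIC)) return Format.AVRO_DATA_FILE;
        if (startsWith(head, n, SEQ_MAGIC)) return Format.SEQUENCE_FILE;
        if (startsWith(head, n, GZIP_MAGIC)) return Format.GZIP;
        return Format.PLAIN_TEXT; // fall back to dumping the bytes as-is
    }

    private static boolean startsWith(byte[] head, int len, byte[] magic) {
        if (len < magic.length) return false;
        for (int i = 0; i < magic.length; i++) {
            if (head[i] != magic[i]) return false;
        }
        return true;
    }
}
```

Under these assumptions, a caller would wrap the file in a BufferedInputStream, call sniff,
and for AVRO_DATA_FILE hand the same stream to Avro's own reader and JSON encoder, as the
description suggests.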

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
