hadoop-common-issues mailing list archives

From "Grant Ingersoll (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-2501) Implement utility-tools for working with SequenceFiles
Date Fri, 02 Sep 2011 14:12:09 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095997#comment-13095997 ]

Grant Ingersoll commented on HADOOP-2501:
-----------------------------------------

Just came across this while searching for how to concatenate seq files.  Note that Mahout
has a number of these things (dump, head, count, etc.) implemented in its utils package under
the integration module.  See the SequenceFileDumper and the various others (the iterators,
for example).

> Implement utility-tools for working with SequenceFiles
> ------------------------------------------------------
>
>                 Key: HADOOP-2501
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2501
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: io
>            Reporter: Arun C Murthy
>            Assignee: Enis Soztutar
>
> It would be nice to implement a bunch of utilities to work with SequenceFiles:
>  * info (print out header information such as key/value types, compression type/codec, etc.)
>  * cat
>  * head/tail
>  * merge multiple seq-files into one
>  * ...
> I'd imagine this would look like:
> {noformat}
> $ bin/hadoop seq -info /user/joe/blah.seq
> $ bin/hadoop seq -head -n 10 /user/joe/blah.seq
> {noformat}
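
For illustration only, the proposed "info" tool could be sketched as a small header parser. The sketch below is Python rather than Hadoop's Java and uses only the standard library. It assumes the version 6 SequenceFile header layout described in the SequenceFile Javadoc: the magic bytes SEQ, a version byte, the key and value class names, two compression flags, an optional codec class name, the metadata pairs, and a 16-byte sync marker. The names read_vint, read_text, and read_header are hypothetical and are not part of any Hadoop or Mahout API.

```python
import struct


def read_vint(stream):
    """Decode one Hadoop variable-length integer (WritableUtils layout)."""
    first = struct.unpack("b", stream.read(1))[0]  # signed first byte
    if first >= -112:
        return first  # small values are stored in a single byte
    negative = first < -120
    # The first byte encodes how many big-endian payload bytes follow.
    extra = (-119 - first if negative else -111 - first) - 1
    value = 0
    for byte in stream.read(extra):
        value = (value << 8) | byte
    return ~value if negative else value


def read_text(stream):
    """Read a string stored as a vint byte length followed by UTF-8 bytes."""
    length = read_vint(stream)
    return stream.read(length).decode("utf-8")


def read_header(stream):
    """Parse a version 6 SequenceFile header into a dict (sketch only)."""
    if stream.read(3) != b"SEQ":
        raise ValueError("not a SequenceFile: missing SEQ magic bytes")
    version = stream.read(1)[0]
    if version != 6:
        # Assumption: older header versions are out of scope for this sketch.
        raise ValueError("unsupported SequenceFile version: %d" % version)
    key_class = read_text(stream)
    value_class = read_text(stream)
    compressed = stream.read(1) != b"\x00"
    block_compressed = stream.read(1) != b"\x00"
    # The codec class name is only present when compression is enabled.
    codec = read_text(stream) if compressed else None
    count = struct.unpack(">i", stream.read(4))[0]
    metadata = {}
    for _ in range(count):
        name = read_text(stream)
        metadata[name] = read_text(stream)
    if not compressed:
        compression = "NONE"
    elif block_compressed:
        compression = "BLOCK"
    else:
        compression = "RECORD"
    return {
        "version": version,
        "key_class": key_class,
        "value_class": value_class,
        "compression": compression,
        "codec": codec,
        "metadata": metadata,
        "sync": stream.read(16),
    }
```

To try it on a local copy of a file, open the file in binary mode and pass the handle to read_header, for example read_header(open("blah.seq", "rb")). Reading directly from HDFS would need a client library, which this sketch does not cover.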


        
