hadoop-common-issues mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6338) Utility to tail the contents of a directory
Date Wed, 28 Oct 2009 00:20:59 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12770719#action_12770719 ]

dhruba borthakur commented on HADOOP-6338:

Such a utility provides a simple one-file abstraction for an application that wants
to consume the contents of a data set created by a map-reduce application. An application
that consumes data in real time via a "tail -f" command can then be easily migrated to work
directly on HDFS files.

> Utility to tail the contents of a directory
> -------------------------------------------
>                 Key: HADOOP-6338
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6338
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
> There is an existing utility "bin/hadoop fs -tail -f <filename>" that prints the
> last few records from the specified file. A map-reduce application uses a directory as a
> data set and creates multiple files in an HDFS directory. I am proposing that we extend
> "bin/hadoop fs -tail -f <directory>" to tail the contents of a directory. The files in the
> directory can be sorted (lexicographically, or by modification time) to arrive at the
> virtual sequence of the set of files inside the directory.
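
The ordering step described above can be illustrated with a minimal sketch. It is not the
proposed implementation and does not use any Hadoop API: FileEntry, virtual_sequence and
concatenated are hypothetical names, and the directory listing is modeled as an in-memory
list. Breaking modification-time ties by name is also an assumption that the issue does not
specify.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FileEntry:
    """Hypothetical stand-in for one file in the directory (not a Hadoop type)."""
    path: str    # file name inside the directory
    mtime: int   # modification time, e.g. milliseconds since the epoch
    data: bytes  # file contents, kept in memory for illustration only


def virtual_sequence(entries: Iterable[FileEntry], order: str = "name") -> List[FileEntry]:
    """Return the files in the order in which a directory tail would read them."""
    if order == "name":
        # Lexicographic order of file names.
        return sorted(entries, key=lambda e: e.path)
    if order == "mtime":
        # Oldest first; ties are broken by name (an assumption of this sketch).
        return sorted(entries, key=lambda e: (e.mtime, e.path))
    raise ValueError("order must be 'name' or 'mtime'")


def concatenated(entries: Iterable[FileEntry], order: str = "name") -> bytes:
    """View the directory as a single file by joining contents in sequence order."""
    return b"".join(e.data for e in virtual_sequence(entries, order))
```

A directory tail would then follow the last file in this sequence and move on to the next
file when a newer one appears. How to detect new files and when to switch is left open in
the issue and is not covered by the sketch.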

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
