hadoop-common-user mailing list archives

From Joey Echeverria <j...@cloudera.com>
Subject Re: Incremental Mappers?
Date Tue, 22 Nov 2011 11:32:32 GMT
You're correct: currently HDFS only supports reading from closed files. You can configure Flume
to write your data in small enough chunks that you can do incremental processing.
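As a rough sketch (assuming the newer properties-file style of Flume configuration; the agent and sink names and the roll thresholds below are made up for illustration), the HDFS sink can be told to roll files on a time or size threshold, so each closed file becomes available to a map task:

  # hypothetical agent/sink names; only the sink portion of the agent config is shown
  agent1.sinks.hdfs-sink1.type = hdfs
  agent1.sinks.hdfs-sink1.hdfs.path = hdfs://namenode/flume/events
  # close the current file every 60 seconds...
  agent1.sinks.hdfs-sink1.hdfs.rollInterval = 60
  # ...or once it reaches roughly 64 MB, whichever comes first
  agent1.sinks.hdfs-sink1.hdfs.rollSize = 67108864
  # disable rolling by event count
  agent1.sinks.hdfs-sink1.hdfs.rollCount = 0

You would then kick off a map job over whichever files have already been closed, rather than waiting for one large file to finish.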


On Nov 22, 2011, at 2:01, Romeo Kienzler <romeo@ormium.de> wrote:

> Hi,
> I'm planning to use Flume in order to stream data from a local client machine into HDFS
> running on a cloud environment.
> Is there a way to start a mapper on an incomplete file? As far as I know, a file in HDFS
> has to be closed first before a mapper can start.
> Is this true?
> Any possible idea for a solution of this problem?
> Or do I have to write smaller chunks of my big input file, creating multiple files in
> HDFS, and start a separate map task on each file once it has been closed?
> Best Regards,
> Romeo
> Romeo Kienzler
> r o m e o @ o r m i u m . d e
