hadoop-user mailing list archives

From: Pankaj Gupta <pan...@brightroll.com>
Subject: Reading part of file using Map Reduce
Date: Wed, 31 Oct 2012 23:20:05 GMT

Is it possible to run a MapReduce job on only part of a file on HDFS? The use case is a single
HDFS file used as a stream that stores all log events of a particular kind. New data would keep
being appended to the end of the file while MapReduce processes the older data. One option, of
course, is to copy the part of the data to be processed into a separate file and give that to
MapReduce, but I was wondering whether that extra copy can be avoided.
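
To make the idea concrete, here is a minimal sketch of one way it might work: record a byte
offset when the job is submitted and generate input splits only up to that offset, so anything
appended later is left for a future run. This is plain Java with no Hadoop dependency; the names
PrefixSplits, Range, maxBytes and splitSize are hypothetical and exist only for this illustration.
In a real job the same arithmetic would presumably go into an overridden getSplits of a custom
FileInputFormat subclass, which is an assumption and not something tested here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: split only the first maxBytes of a file into ranges. */
final class PrefixSplits {

    /** A half-open byte range [start, start + length) within one file. */
    static final class Range {
        final long start;
        final long length;

        Range(long start, long length) {
            this.start = start;
            this.length = length;
        }
    }

    /**
     * Returns ranges of at most splitSize bytes covering [0, min(fileLength, maxBytes)).
     * Bytes at or beyond maxBytes are never assigned to any range.
     */
    static List<Range> compute(long fileLength, long maxBytes, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        // Clamp the cutoff to the actual file length and to zero.
        long end = Math.max(0, Math.min(fileLength, maxBytes));
        List<Range> ranges = new ArrayList<>();
        for (long start = 0; start < end; start += splitSize) {
            // The last range may be shorter than splitSize.
            ranges.add(new Range(start, Math.min(splitSize, end - start)));
        }
        return ranges;
    }
}
```

For example, PrefixSplits.compute(1000, 250, 100) yields three ranges: 0-100, 100-200 and
200-250. Note that the cutoff is a raw byte offset; a record that straddles it would need
separate handling, for instance by choosing a cutoff that falls on a record boundary.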
