hadoop-mapreduce-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Out-of-band writing from mapper
Date Wed, 20 Apr 2011 09:18:10 GMT
Hello Christoph,

On Wed, Apr 20, 2011 at 2:12 PM, Christoph Schmitz
<Christoph.Schmitz@1und1.de> wrote:
> My question is: is there any mechanism to assist me in writing to some designated place
> in the HDFS from the mapper, in a way that is recognized by the framework (i.e. dealing with
> aborted tasks, speculative execution etc.)?
> I was thinking along the lines of what is described in the FAQ here:
> http://wiki.apache.org/hadoop/FAQ#Can_I_write_create.2BAC8-write-to_hdfs_files_directly_from_map.2BAC8-reduce_tasks.3F
> The FAQ explains that for reducers, there is support for special per-task output directories
> that are recognized by the framework, but it seems (I tried it out) that this is not supported
> for mappers.

Perhaps you can consider using the MultipleOutputs class to write
output files from your job, instead of writing your own FS handling.

The attempt directories are created for both Map and Reduce tasks. If
the FAQ makes this ambiguous, it ought to be fixed :)
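
To illustrate why per-attempt directories help with aborted tasks and
speculative execution, here is a minimal sketch of the pattern on the local
filesystem. It is not Hadoop code: the class AttemptDirSketch, its methods,
the attempt IDs and the "_temporary" layout are all assumptions made for the
example. In a real job, the task's work directory would come from
FileOutputFormat.getWorkOutputPath, and the framework, not your code, decides
which attempt gets committed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical local-filesystem simulation of per-attempt output directories.
class AttemptDirSketch {

    // Each attempt writes only into its own scratch directory (assumed layout).
    static Path workDir(Path outputDir, String attemptId) throws IOException {
        return Files.createDirectories(outputDir.resolve("_temporary").resolve(attemptId));
    }

    // Commit: promote the attempt's files into the job output directory.
    static void commit(Path outputDir, String attemptId) throws IOException {
        Path work = outputDir.resolve("_temporary").resolve(attemptId);
        try (Stream<Path> files = Files.list(work)) {
            for (Path f : files.collect(Collectors.toList())) {
                Files.move(f, outputDir.resolve(f.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
            }
        }
        Files.delete(work); // scratch directory is empty now
    }

    // Abort: discard everything the attempt wrote (children before parents).
    static void abort(Path outputDir, String attemptId) throws IOException {
        Path work = outputDir.resolve("_temporary").resolve(attemptId);
        try (Stream<Path> walk = Files.walk(work)) {
            for (Path p : walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList())) {
                Files.delete(p);
            }
        }
    }

    // Two speculative attempts of the same map task write the same side file;
    // one is aborted, the other committed.
    static List<String> demo() {
        try {
            Path out = Files.createTempDirectory("job-output");
            String a0 = "attempt_m_000000_0";
            String a1 = "attempt_m_000000_1";
            Files.write(workDir(out, a0).resolve("side-m-00000"),
                    "from attempt 0".getBytes(StandardCharsets.UTF_8));
            Files.write(workDir(out, a1).resolve("side-m-00000"),
                    "from attempt 1".getBytes(StandardCharsets.UTF_8));
            abort(out, a0);
            commit(out, a1);
            return Files.readAllLines(out.resolve("side-m-00000"), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [from attempt 1]
    }
}
```

Because both attempts write under their own directory, they never clobber
each other, and the final output only ever contains files from the attempt
that was committed.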

Harsh J
