hadoop-mapreduce-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Out-of-band writing from mapper
Date Wed, 20 Apr 2011 09:18:10 GMT
Hello Christoph,

On Wed, Apr 20, 2011 at 2:12 PM, Christoph Schmitz
<Christoph.Schmitz@1und1.de> wrote:
> My question is: is there any mechanism to assist me in writing to some designated place
> in the HDFS from the mapper, in a way that is recognized by the framework (i.e. dealing with
> aborted tasks, speculative execution etc.)?
>
> I was thinking along the lines of what is described in the FAQ here:
>
> http://wiki.apache.org/hadoop/FAQ#Can_I_write_create.2BAC8-write-to_hdfs_files_directly_from_map.2BAC8-reduce_tasks.3F
>
> The FAQ explains that for reducers, there is support for special per-task output directories
> that are recognized by the framework, but it seems (I tried it out) that this is not supported
> for mappers.

[Perhaps you can consider using the MultipleOutputs class to write
output files from your job, instead of writing your own FS handling
code.]

The attempt directories are created for both Map and Reduce tasks. If
the FAQ makes this ambiguous, it ought to be fixed :)
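
To make the attempt-directory idea concrete, here is a rough sketch of the
pattern on the local filesystem, using only the JDK. It is not Hadoop's
code: the class name, the helper names, the attempt IDs and the simplified
"_temporary/<attempt id>" layout are all made up for illustration. In a real
job the framework manages this for you, and the per-attempt directory is
what FileOutputFormat.getWorkOutputPath() hands back.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative only: mimics per-attempt output directories with plain java.nio.
class AttemptDirSketch {

    // Each task attempt writes into its own private directory (assumed layout).
    static Path attemptDir(Path output, String attemptId) throws IOException {
        return Files.createDirectories(output.resolve("_temporary").resolve(attemptId));
    }

    // Commit: promote the successful attempt's files into the job output directory.
    static void commit(Path output, String attemptId) throws IOException {
        Path dir = output.resolve("_temporary").resolve(attemptId);
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path f : files) {
                Files.move(f, output.resolve(f.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
            }
        }
        Files.delete(dir);
    }

    // Abort: discard everything a failed or killed attempt wrote.
    static void abort(Path output, String attemptId) throws IOException {
        Path dir = output.resolve("_temporary").resolve(attemptId);
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order so files are deleted before their parent directory.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    // Two speculative attempts of the same map task write the same side file;
    // only the committed attempt's copy ends up in the output directory.
    static String demo() {
        try {
            Path out = Files.createTempDirectory("job-output");
            Path a0 = attemptDir(out, "attempt_m_000000_0");
            Path a1 = attemptDir(out, "attempt_m_000000_1");
            Files.write(a0.resolve("side-m-00000"),
                    "from attempt 0".getBytes(StandardCharsets.UTF_8));
            Files.write(a1.resolve("side-m-00000"),
                    "from attempt 1".getBytes(StandardCharsets.UTF_8));
            commit(out, "attempt_m_000000_1"); // this attempt finished first
            abort(out, "attempt_m_000000_0");  // the other attempt is killed
            return new String(Files.readAllBytes(out.resolve("side-m-00000")),
                    StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "from attempt 1"
    }
}
```

The point of the pattern is that a mapper which writes its side files under
its own attempt directory never collides with a speculative duplicate of
itself, and an aborted attempt leaves nothing behind in the final output.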

-- 
Harsh J
