hadoop-common-user mailing list archives

From Radim Kolar <...@filez.com>
Subject multipleoutputs does not like speculative execution in map-only job
Date Wed, 12 Sep 2012 22:51:37 GMT
With speculative execution enabled, Hadoop can run several attempts of the
same task on different nodes at once. If the mapper uses MultipleOutputs, the
second attempt (or sometimes even all attempts) fails to create its output
file because that file is already being created by another attempt.

The attempt fails with:
Error: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: 
failed to create file /cznewgen/segments/20120907190053/parse_db/-m-00000

In my code I am using mos.write with 4 arguments. This problem is
discussed in the Javadoc for FileOutputFormat.getWorkOutputPath. Is it
possible to change MultipleOutputs to take advantage of this function?

Or would it be better to change FileOutputFormat.getUniqueFile() to append
the last digit of the attempt ID to the filename, creating unique names such as
/cznewgen/segments/20120907190053/parse_db/-m-00000_0 ?
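
To make the second suggestion concrete, here is a minimal sketch in plain Java with no Hadoop dependency. The class AttemptUniqueName and the helper uniqueFile are hypothetical and are not part of Hadoop. They only mimic the name-tasktype-partition layout seen in the failing path above and add the attempt number as a suffix, under the assumption that the attempt number is available to whoever builds the name.

```java
import java.util.Locale;

public class AttemptUniqueName {

    // Hypothetical helper, NOT a Hadoop API: builds "<name>-<type>-<partition>_<attempt>".
    // The partition is zero-padded to 5 digits, matching names like "-m-00000".
    static String uniqueFile(String name, char taskType, int partition, int attemptNumber) {
        return String.format(Locale.ROOT, "%s-%c-%05d_%d",
                name, taskType, partition, attemptNumber);
    }

    public static void main(String[] args) {
        // Two speculative attempts of map task 0 with an empty base name.
        System.out.println(uniqueFile("", 'm', 0, 0)); // prints "-m-00000_0"
        System.out.println(uniqueFile("", 'm', 0, 1)); // prints "-m-00000_1"
    }
}
```

With names built this way, two attempts of the same task would no longer try to create the same file.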
