hadoop-common-user mailing list archives

From Thibaut_ <tbr...@blue.lu>
Subject Running parallel jobs having the same output directory
Date Mon, 20 Jul 2009 18:51:29 GMT


I'm trying to run several jobs in parallel that share the same input
directory and the same output directory.

I modified the FileInputClass to accept only non-empty files, and also
modified the output class to allow non-empty directories (in my case the
input directory is also the output directory). I made sure that each job's
output file names are unique, so there are no file conflicts there.

Everything runs fine for a while, but then I run into problems with the
temporary directory:

java.io.IOException: The temporary job-output directory
hdfs://internal1:50010/user/root/0/_temporary doesn't exist!

I could dig further and try to make the _temporary directory job-specific.
But before I do that, I would like to know whether there are other
traps/errors I could run into when running parallel jobs that share the same
input/output directory.
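
To make the suspected failure mode concrete, here is a minimal sketch on the
local filesystem. It is not Hadoop code. It assumes (not confirmed in this
thread) that the error appears because the first job to finish deletes the
shared _temporary directory while another job is still writing into it. The
class SharedOutputSketch, its helper methods, and the "_temporary_<jobId>"
naming are all hypothetical and only illustrate the "job-dependent temporary
directory" idea.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

class SharedOutputSketch {

    // Hypothetical naming scheme: jobId == null mimics one shared "_temporary"
    // directory; a non-null jobId gives each job its own "_temporary_<jobId>".
    static Path tempDir(Path outputDir, String jobId) {
        String name = (jobId == null) ? "_temporary" : "_temporary_" + jobId;
        return outputDir.resolve(name);
    }

    // Job start: create the temporary directory (no-op if it already exists).
    static void setupJob(Path outputDir, String jobId) {
        try {
            Files.createDirectories(tempDir(outputDir, jobId));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Job end: recursively delete the temporary directory this job used.
    static void cleanupJob(Path outputDir, String jobId) {
        Path tmp = tempDir(outputDir, jobId);
        if (!Files.exists(tmp)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(tmp)) {
            // Reverse order so children are deleted before their parents.
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static boolean tempExists(Path outputDir, String jobId) {
        return Files.isDirectory(tempDir(outputDir, jobId));
    }

    // Creates a fresh local directory standing in for the shared output directory.
    static Path newOutputDir() {
        try {
            return Files.createTempDirectory("shared-output");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Case 1: both jobs share "_temporary"; job A finishing removes it for job B.
        Path shared = newOutputDir();
        setupJob(shared, null);   // job A starts
        setupJob(shared, null);   // job B starts
        cleanupJob(shared, null); // job A finishes
        System.out.println("shared temp dir still exists for job B: "
                + tempExists(shared, null)); // prints false

        // Case 2: job-specific temporary directories; job B is unaffected.
        Path perJob = newOutputDir();
        setupJob(perJob, "jobA");
        setupJob(perJob, "jobB");
        cleanupJob(perJob, "jobA"); // job A finishes
        System.out.println("job-specific temp dir still exists for job B: "
                + tempExists(perJob, "jobB")); // prints true
    }
}
```

The sketch only covers the temporary directory. It says nothing about other
files that concurrent jobs might create or remove in the shared directory.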

(Btw this is hadoop-0.20.0)


View this message in context: http://www.nabble.com/Running-parallel-jobs-having-the-same-output-directory-tp24575402p24575402.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
