hadoop-mapreduce-user mailing list archives

From "Derrick H. Karimi" <dhkar...@sei.cmu.edu>
Subject problem running multiple native mode map reduce processes concurrently
Date Wed, 20 Mar 2013 02:55:15 GMT

I have written a MapReduce program and have run it successfully on top of a Hadoop cluster.
During development, for quick tests, and when the cluster is not available, I run
it on machines that have no access to a Hadoop cluster.  I do this with a regular command line:

java -cp $MY_HADOOP_JARS:mybuild/app_under_test.jar

This works fine until I attempt to run more than one at a time.  When I launch many at
once, I intermittently get failures.  (Each invocation uses a separate copy of the jars
and has its own working directory and input/output area; they are fully distributable and
do not share anything.  The machines have plenty of disk space too.)  Most commonly I get
two exceptions in my job's stderr output:

org.apache.hadoop.util.DiskChecker$DiskErrorException: "Could not find output/file.out in
any of the configured local directories"

When I see this error, the job appears to continue, but I can tell from the output that several
of my input files were not processed.  Nothing in my job is called "output/file.out".

I do not have the text of the other error handy at the moment, but it appears to be an XML parser
error at job startup on some file in the /tmp directory that is not one of the files mentioned
in my job.  I assume that multiple instances of the native mode implementation of map
reduce are trying to write to the same file at startup and it gets corrupted.  In these cases
the job fails and I do not get any output.  I suspect I could work around this error by sleeping
a few seconds between launches.
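
One way to test the shared-/tmp theory, rather than sleeping between launches, would be to
give every process its own Hadoop scratch directory.  The sketch below is only an
illustration under stated assumptions: the class LocalScratch, its methods, and the
"work/tmp" location are hypothetical names, not part of the job described above.  It relies
on hadoop.tmp.dir defaulting to a per-user directory under /tmp and on mapred.local.dir
defaulting to a path beneath it (the property names used by Hadoop 1.x; later releases may
use different names).  It creates a unique directory per run and returns the property
overrides to apply.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds per-process scratch settings for a local-mode run.
public class LocalScratch {

    // Creates a fresh, uniquely named directory under the given parent.
    static Path newScratchDir(Path parent) throws IOException {
        Files.createDirectories(parent);
        return Files.createTempDirectory(parent, "hadoop-local-");
    }

    // Property overrides that point Hadoop's scratch space at scratchDir.
    // Assumption: these are the Hadoop 1.x property names.
    static Map<String, String> scratchOverrides(Path scratchDir) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("hadoop.tmp.dir", scratchDir.toString());
        props.put("mapred.local.dir",
                  scratchDir.resolve("mapred").resolve("local").toString());
        return props;
    }

    public static void main(String[] args) throws IOException {
        // "work/tmp" is a placeholder; each invocation's own working directory would do.
        Path scratch = newScratchDir(Paths.get("work", "tmp"));
        for (Map.Entry<String, String> e : scratchOverrides(scratch).entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

In the job driver, each entry would be applied with conf.set(key, value) on the
org.apache.hadoop.conf.Configuration object before the job is submitted.  If concurrent
runs then stop failing, that would support the idea that the processes were colliding on
files under the default temporary directory.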

I expected to be able to run more than one of these processes at the same time.  It appears
I cannot.  Does anyone have any suggestions that would help me do this?

--Derrick H. Karimi
--Software Developer, SEI Innovation Center
--Carnegie Mellon University
