hadoop-mapreduce-user mailing list archives

From "Derrick H. Karimi" <dhkar...@sei.cmu.edu>
Subject RE: problem running multiple native mode map reduce processes concurrently
Date Fri, 22 Mar 2013 15:45:58 GMT
Thank you for the response.

Hadoop 0.20.2-cdh3u3

--Derrick H. Karimi
--Software Developer, SEI Innovation Center
--Carnegie Mellon University


-----Original Message-----
From: Harsh J [mailto:harsh@cloudera.com] 
Sent: Friday, March 22, 2013 1:32 AM
To: <user@hadoop.apache.org>
Subject: Re: problem running multiple native mode map reduce processes concurrently

Please post your Hadoop version (command: hadoop version).

On Thu, Mar 21, 2013 at 10:59 PM, Derrick H. Karimi <dhkarimi@sei.cmu.edu> wrote:
> Anybody have any ideas?  How can I safely run two native mode map 
> reduce jobs on one machine at the same time?
>
>
>
> --Derrick H. Karimi
>
> --Software Developer, SEI Innovation Center
>
> --Carnegie Mellon University
>
>
>
> From: Derrick H. Karimi
> Sent: Tuesday, March 19, 2013 10:55 PM
> To: 'user@hadoop.apache.org'
> Subject: problem running multiple native mode map reduce processes 
> concurrently
>
>
>
> Hi,
>
>
>
> I have written a MapReduce program and have run it successfully on a 
> Hadoop cluster.  During development, for quick tests, and when the 
> cluster is not available, I run it on machines that have no access to 
> a Hadoop cluster.  I do this with a regular command-line invocation:
>
>
>
> java -cp $MY_HADOOP_JARS:mybuild/app_under_test.jar
>
>
>
> This works fine until I attempt to run more than one at a time.  When 
> I launch many at once, I intermittently get failures.  (Each 
> invocation uses a separate copy of the jars and has its own working 
> directory and input/output area; they are fully distributable and do 
> not share anything.  The machines have plenty of disk space too.)  
> Most commonly I get two exceptions in my job's stderr output:
>
>
>
> org.apache.hadoop.util.DiskChecker$DiskErrorException: "Could not find 
> output/file.out in any of the configured local directories"
>
>
>
> When I see this error the job appears to continue, but from the 
> output I can tell that several of my input files were not processed.  
> I have nothing called "output/file.out" in my job.
>
>
>
> I do not have the text of the other error handy at the moment, but it 
> appears to be an XML parser error at job startup on some file in the 
> /tmp directory that my job never references.  Here I assume that 
> multiple instances of the native mode implementation of map reduce 
> are trying to write to the same file at startup and it gets 
> corrupted.  In these cases the job fails and I do not get any output.  
> I theorize I can work around this error by sleeping a few seconds 
> between launching my processes.
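
The sleep-between-launches workaround theorized above could be sketched roughly as follows. This is an illustration only: the `launch_staggered` helper and the `delay_seconds` default are hypothetical and not part of Hadoop, and a delay can only narrow a startup race window. It does not remove whatever file is being shared under /tmp.

```python
import subprocess
import time


def launch_staggered(commands, delay_seconds=5.0,
                     popen=subprocess.Popen, sleep=time.sleep):
    """Start each command in turn, pausing between consecutive launches.

    `commands` is a list of argv lists. `popen` and `sleep` are injectable
    so the launch order can be checked without starting real processes.
    """
    procs = []
    for index, argv in enumerate(commands):
        if index > 0:
            # Pause before every launch except the first one.
            sleep(delay_seconds)
        procs.append(popen(argv))
    return procs
```

A caller would pass one argv list per job and then call `wait()` on each returned process object.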
>
>
>
> I expected to be able to run more than one of these processes at the 
> same time.  It appears I cannot.  Does anyone have any suggestions 
> that would help me do this?
>
>
>
> --Derrick H. Karimi
>
> --Software Developer, SEI Innovation Center
>
> --Carnegie Mellon University
>
>
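
One avenue that might be worth trying, offered as a sketch rather than a confirmed fix: in Hadoop 0.20 the default `hadoop.tmp.dir` is `/tmp/hadoop-${user.name}` and the default `mapred.local.dir` is `${hadoop.tmp.dir}/mapred/local`, so local-mode runs started by the same user share that scratch area unless it is overridden. The helper below builds a command line that gives each run its own temporary directory. The name `build_isolated_command`, the `main_class` argument, and the idea of passing the override as a JVM system property are assumptions made for illustration; whether this resolves the errors described in this thread has not been verified.

```python
import tempfile
from pathlib import Path


def build_isolated_command(classpath, main_class, scratch_root=None):
    """Return (argv, scratch_dir) for one local-mode run with a private temp dir.

    `main_class` is a placeholder for the application's entry point.
    """
    # mkdtemp creates a new, uniquely named directory on every call.
    scratch_dir = Path(tempfile.mkdtemp(prefix="hadoop-local-", dir=scratch_root))
    argv = [
        "java",
        # Assumption: Hadoop consults this system property when it expands
        # ${hadoop.tmp.dir} in defaults such as mapred.local.dir.
        "-Dhadoop.tmp.dir=" + str(scratch_dir),
        "-cp", classpath,
        main_class,
    ]
    return argv, scratch_dir
```

Each call yields a distinct directory, so two concurrent runs launched from these command lines would not point at the same scratch location. The caller is responsible for deleting the directory after the job finishes.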



--
Harsh J
