hadoop-mapreduce-user mailing list archives

From bejoy.had...@gmail.com
Subject Re: Submitting MapReduce job from remote server using JobClient
Date Thu, 24 Jan 2013 19:49:00 GMT
Hi Amit,

Apart from the Hadoop jars, do you have the same config files ($HADOOP_HOME/conf) on your
analytics server as on the cluster?

If the analytics server only has the default config files, your MR job will run
locally and not on the cluster.
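For reference, the properties that control this live in the conf files Bejoy mentions. A minimal sketch of what the client-side config would need to contain (classic MR1, as used in this thread; the host names are placeholders, not values from the original messages):

```xml
<!-- core-site.xml: point the client at the cluster's NameNode -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://namenode.example.com:8020</value>
</property>

<!-- mapred-site.xml: point job submission at the JobTracker. The default
     value "local" is what selects LocalJobRunner, which produces job IDs
     like job_local_0001 as seen in the log below. -->
<property>
  <name>mapred.job.tracker</name>
  <value>jobtracker.example.com:8021</value>
</property>
```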

Regards 
Bejoy KS

Sent from remote device, Please excuse typos

-----Original Message-----
From: Amit Sela <amits@infolinks.com>
Date: Thu, 24 Jan 2013 18:15:49 
To: <user@hadoop.apache.org>
Reply-To: user@hadoop.apache.org
Subject: Re: Submitting MapReduce job from remote server using JobClient

Hi Harsh,
I'm using the Job.waitForCompletion() method to run the job but I can't see it
in the webapp and it doesn't seem to finish...
I get:
org.apache.hadoop.mapred.JobClient - Running job: job_local_0001
INFO org.apache.hadoop.util.ProcessTree - setsid exited with exit code 0
2013-01-24 08:10:12.521 [Thread-51] INFO org.apache.hadoop.mapred.Task - Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@7db1be6
2013-01-24 08:10:12.536 [Thread-51] INFO org.apache.hadoop.mapred.MapTask - io.sort.mb = 100
2013-01-24 08:10:12.573 [Thread-51] INFO org.apache.hadoop.mapred.MapTask - data buffer = 79691776/99614720
2013-01-24 08:10:12.573 [Thread-51] INFO org.apache.hadoop.mapred.MapTask - record buffer = 262144/327680
2013-01-24 08:10:12.599 [Thread-51] INFO org.apache.hadoop.mapred.MapTask - Starting flush of map output
2013-01-24 08:10:12.608 [Thread-51] INFO org.apache.hadoop.mapred.Task - Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
2013-01-24 08:10:13.348 [org.springframework.scheduling.quartz.SchedulerFactoryBean#0_Worker-1] INFO org.apache.hadoop.mapred.JobClient - map 0% reduce 0%
2013-01-24 08:10:15.509 [Thread-51] INFO org.apache.hadoop.mapred.LocalJobRunner -
2013-01-24 08:10:15.510 [Thread-51] INFO org.apache.hadoop.mapred.Task - Task 'attempt_local_0001_m_000000_0' done.
2013-01-24 08:10:15.511 [Thread-51] INFO org.apache.hadoop.mapred.Task - Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@6b02b23d
2013-01-24 08:10:15.512 [Thread-51] INFO org.apache.hadoop.mapred.MapTask - io.sort.mb = 100
2013-01-24 08:10:15.549 [Thread-51] INFO org.apache.hadoop.mapred.MapTask - data buffer = 79691776/99614720
2013-01-24 08:10:15.550 [Thread-51] INFO org.apache.hadoop.mapred.MapTask - record buffer = 262144/327680
2013-01-24 08:10:15.557 [Thread-51] INFO org.apache.hadoop.mapred.MapTask - Starting flush of map output
2013-01-24 08:10:15.560 [Thread-51] INFO org.apache.hadoop.mapred.Task - Task:attempt_local_0001_m_000001_0 is done. And is in the process of commiting
2013-01-24 08:10:16.358 [org.springframework.scheduling.quartz.SchedulerFactoryBean#0_Worker-1] INFO org.apache.hadoop.mapred.JobClient - map 100% reduce 0%

And after that, instead of going to the Reduce phase, I keep getting map
attempts like:

INFO org.apache.hadoop.mapred.MapTask - io.sort.mb = 100
2013-01-24 08:10:21.563 [Thread-51] INFO org.apache.hadoop.mapred.MapTask - data buffer = 79691776/99614720
2013-01-24 08:10:21.563 [Thread-51] INFO org.apache.hadoop.mapred.MapTask - record buffer = 262144/327680
2013-01-24 08:10:21.570 [Thread-51] INFO org.apache.hadoop.mapred.MapTask - Starting flush of map output
2013-01-24 08:10:21.573 [Thread-51] INFO org.apache.hadoop.mapred.Task - Task:attempt_local_0001_m_000003_0 is done. And is in the process of commiting
2013-01-24 08:10:24.529 [Thread-51] INFO org.apache.hadoop.mapred.LocalJobRunner -
2013-01-24 08:10:24.529 [Thread-51] INFO org.apache.hadoop.mapred.Task - Task 'attempt_local_0001_m_000003_0' done.
2013-01-24 08:10:24.530 [Thread-51] INFO org.apache.hadoop.mapred.Task - Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@42e87d99
Any clues?
Thanks for the help.

On Thu, Jan 24, 2013 at 5:12 PM, Harsh J <harsh@cloudera.com> wrote:

> The Job class itself has a blocking and non-blocking submitter that is
> similar to JobConf's runJob method you discovered. See
>
> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/Job.html#submit()
> and its following method waitForCompletion(). These seem to be what
> you're looking for.
>
> On Thu, Jan 24, 2013 at 5:43 PM, Amit Sela <amits@infolinks.com> wrote:
> > Hi all,
> >
> > I want to run a MapReduce job using the Hadoop Java api from my analytics
> > server. It is not the master or even a data node but it has the same
> Hadoop
> > installation as all the nodes in the cluster.
> > I tried using JobClient.runJob() but it accepts JobConf as argument and
> when
> > using JobConf it is possible to set only mapred Mapper classes and I use
> > mapreduce...
> > I tried using JobControl and ControlledJob but it seems like it tries to
> run
> > the job locally. the map phase just keeps attempting...
> > Anyone tried it before ?
> > I'm just looking for a way to submit MapReduce jobs from Java code and be
> > able to monitor them.
> >
> > Thanks,
> >
> > Amit.
>
>
>
> --
> Harsh J
>
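Pulling the suggestions in this thread together, here is a minimal, untested sketch of submitting a job to the cluster from a remote client with the `mapreduce` API (classic MR1). The host names, paths, and the `MyMapper`/`MyReducer` classes are placeholders, not values from the original messages:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RemoteSubmit {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Without these (or the cluster's conf/ dir on the classpath),
        // the defaults apply and the job runs in LocalJobRunner --
        // which is exactly the job_local_0001 behavior in the log above.
        conf.set("fs.default.name", "hdfs://namenode.example.com:8020");
        conf.set("mapred.job.tracker", "jobtracker.example.com:8021");

        Job job = new Job(conf, "remote-submit-example");
        job.setJarByClass(RemoteSubmit.class);  // ships the job jar to the cluster
        job.setMapperClass(MyMapper.class);     // placeholder mapper class
        job.setReducerClass(MyReducer.class);   // placeholder reducer class
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path("/input"));
        FileOutputFormat.setOutputPath(job, new Path("/output"));

        // Blocking submit with progress reporting to the client log.
        // job.submit() is the non-blocking variant; after calling it you
        // can poll job.isComplete() / job.mapProgress() for monitoring.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```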
