hadoop-mapreduce-user mailing list archives

From Robert Dyer <psyb...@gmail.com>
Subject Re: Jobs randomly not starting
Date Tue, 17 Jul 2012 20:27:56 GMT
Upon further inspection of that log, it appears the problem is that the
setup task just takes a very long time.

Typically it takes at most 6 seconds, but sometimes (in the cases where I
thought it was hanging) it actually runs and finishes but takes 3-5 minutes.

The cleanup task has the same problem (that is where I thought the reduce
was getting stuck).

I am currently the only user on this cluster and I never have more than 1
job in the queue at a time.
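
For anyone wanting to check the same thing in their own logs, here is a
minimal sketch of how the slow step could be located: keep only the
JobTracker log lines for one job ID (the same effect as grep), then report
large time gaps between consecutive lines. It assumes each line starts with
a log4j-style timestamp such as "2012-07-17 20:00:00,000"; the function
names, the log path and the job ID in the usage note are hypothetical and
not taken from my cluster.

```python
import sys
from datetime import datetime

# Assumption: lines begin with a 23-character log4j ISO8601-style timestamp.
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def lines_for_job(lines, job_id):
    """Keep only the lines that mention job_id (same effect as grep)."""
    return [ln for ln in lines if job_id in ln]


def parse_ts(line):
    """Parse the leading timestamp, or return None if the line has none."""
    try:
        return datetime.strptime(line[:23], TS_FORMAT)
    except ValueError:
        return None


def find_gaps(lines, min_gap_seconds):
    """Return (seconds, before_line, after_line) for each pair of
    consecutive timestamped lines at least min_gap_seconds apart."""
    stamped = [(parse_ts(ln), ln) for ln in lines]
    stamped = [(ts, ln) for ts, ln in stamped if ts is not None]
    gaps = []
    for (t0, l0), (t1, l1) in zip(stamped, stamped[1:]):
        delta = (t1 - t0).total_seconds()
        if delta >= min_gap_seconds:
            gaps.append((delta, l0, l1))
    return gaps


if __name__ == "__main__":
    # Usage: python jobgaps.py <jobtracker-log-file> <job-id>
    log_path, job_id = sys.argv[1], sys.argv[2]
    with open(log_path) as fh:
        job_lines = lines_for_job(fh, job_id)
    for seconds, before, after in find_gaps(job_lines, 60):
        print("%.1fs gap between:\n  %s\n  %s"
              % (seconds, before.rstrip(), after.rstrip()))
```

Run against a JobTracker log with a job ID as the second argument, this
prints every pause of a minute or more for that job, which is usually
enough to tell a slow setup or cleanup task from a job that is truly hung.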


On Fri, Jul 13, 2012 at 1:04 AM, Harsh J <harsh@cloudera.com> wrote:

> Hey Robert,
> Any chance you can pastebin the JT logs, grepped for the bad job ID,
> and send the link across? They shouldn't hang the way you describe.
> On Fri, Jul 13, 2012 at 9:33 AM, Robert Dyer <psybers@gmail.com> wrote:
> > I'm using Hadoop 1.0.3 on a small cluster (1 namenode, 1 jobtracker, 2
> > compute nodes).  My input is a sequence file of around 280 MB.
> >
> > Generally, my jobs run just fine and all finish in 2-5 minutes.  However,
> > quite randomly the jobs refuse to run.  They submit and appear when running
> > 'hadoop job -list' but don't appear on the jobtracker's webpage.  If I
> > manually type in the job ID on the webpage I can see it is trying to run the
> > setup task - the map tasks haven't even started.  I've left them to run and
> > even after several minutes it is still in this state.
> >
> > When I spot this, I kill the job and resubmit it and generally it works.
> >
> > A couple of times I have seen similar problems with reduce tasks that get
> > stuck while 'initializing'.
> >
> > Any ideas?
> >
