hadoop-common-user mailing list archives

From jason hadoop <jason.had...@gmail.com>
Subject Re: Job taking long time to initialize
Date Fri, 03 Jul 2009 02:09:44 GMT
Two things that can make a job take a long time to initialize are
replicating distributed cache items, and unpacking those items and
otherwise preparing the local task directory on the task trackers.
Note that the job jar is itself a distributed cache item.
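
Since the job jar is shipped like any other distributed cache item, a
quick first check is how large it is. This is only a rough sketch: the
helper name cache_item_bytes is made up for illustration, and a large
size only hints at a slow replicate/unpack step, it does not prove it.

```shell
# Hypothetical helper (not part of Hadoop): print the size in bytes of a
# file that would be shipped as a distributed cache item, e.g. the job jar.
cache_item_bytes() {
  # wc -c counts bytes; tr strips the padding some wc implementations add.
  wc -c < "$1" | tr -d ' '
}

# Example usage, assuming the job jar is ./yourjar:
#   cache_item_bytes yourjar
```

If the number is large, trimming bundled dependencies out of the jar is
one thing to try.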

On Thu, Jul 2, 2009 at 5:48 PM, Philip Zeyliger <philip@cloudera.com> wrote:

> You can try to run it via LocalJobRunner ("hadoop jar yourjar -jt
> local" if you're using GenericOptionsParser), and see if it exhibits
> the same behavior there.  It's easy to push that into a debugger
> (HADOOP_OPTS="-Xdebug
> -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8020" and
> point Eclipse at it) to set some breakpoints and see what's going on.
>
> Cheers,
>
> -- Philip
>
> On Thu, Jul 2, 2009 at 5:22 PM, Amandeep Khurana<amansk@gmail.com> wrote:
> > How do I figure out what's going on while a job is trying to initialize?
> > I have a job that's importing data from a DB into HBase and it takes a
> > very long time to initialize. The delay is long enough to time out the
> > mappers and eventually kill the job.
> >
> > Amandeep
> >
> >
> > Amandeep Khurana
> > Computer Science Graduate Student
> > University of California, Santa Cruz
> >
>
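
Philip's debugger suggestion quoted above can be wrapped up roughly like
this. It is a sketch under assumptions: the function name jdwp_opts and
the port argument are mine, the option string is the one from his
message, and the final command assumes hadoop is on the PATH and that
your driver goes through GenericOptionsParser so that -jt local is
honored.

```shell
# Hypothetical helper: build the JDWP debug options from the quoted message.
# The port defaults to 8020 as in the original; pass another one if that
# port is already in use on your machine.
jdwp_opts() {
  port="${1:-8020}"
  printf '%s' "-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=${port}"
}

# Example usage (not run here): start the job in LocalJobRunner, suspended
# until a debugger such as Eclipse attaches to the chosen port.
#   HADOOP_OPTS="$(jdwp_opts 8020)" hadoop jar yourjar -jt local
```

With suspend=y the JVM waits for the debugger to attach before running,
so breakpoints set in the job setup code are hit from the start.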



-- 
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals
