hbase-user mailing list archives

From anil gupta <anilgupt...@gmail.com>
Subject Re: Hbase import Tsv performance (slow import)
Date Wed, 24 Oct 2012 16:30:19 GMT
Hi Nick,

How many hard drives do your slaves have? What RPM are they? How many
mappers run concurrently on a node? Did you turn off speculative execution?
Have a look at disk I/O to see whether it is a bottleneck.

MR is disk I/O bound, so if you only have one disk per slave and you are
running 5 mappers concurrently, the job will slow down.
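
As a rough back-of-the-envelope check of that point, the sketch below uses
only the figures from Nick's message quoted further down (80 map tasks,
about 1h15 with 84 tasks and about 1h20 with 6 tasks). It assumes that the
"84" and "6" totals are the number of map tasks running at once across the
cluster; the function names are hypothetical helpers, not part of Hadoop or
HBase.

```python
import math


def waves(total_tasks: int, concurrent_slots: int) -> int:
    """Number of scheduling waves needed to run every map task."""
    return math.ceil(total_tasks / concurrent_slots)


def implied_task_minutes(job_minutes: float, total_tasks: int,
                         concurrent_slots: int) -> float:
    """Average per-task duration implied by the job's wall-clock time."""
    return job_minutes / waves(total_tasks, concurrent_slots)


def tasks_per_minute(job_minutes: float, total_tasks: int) -> float:
    """Aggregate cluster throughput, independent of the slot count."""
    return total_tasks / job_minutes


# Figures quoted in the thread (assumed to be concurrent map slots).
runs = {
    "84 slots, ~75 min": (75, 80, 84),
    "6 slots, ~80 min": (80, 80, 6),
}

for label, (job_minutes, total_tasks, slots) in runs.items():
    print(
        label,
        "waves:", waves(total_tasks, slots),
        "minutes per task:", round(implied_task_minutes(job_minutes, total_tasks, slots), 1),
        "tasks per minute:", round(tasks_per_minute(job_minutes, total_tasks), 2),
    )
```

If the aggregate tasks-per-minute figure barely changes while the slot
count changes by more than tenfold, the extra mappers are probably waiting
on a shared resource rather than doing more work. That is consistent with a
disk I/O bottleneck, but it does not prove one; checking disk utilization on
the slaves is still needed.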

Thanks,
Anil

On Wed, Oct 24, 2012 at 9:18 AM, Kevin O'dell <kevin.odell@cloudera.com> wrote:

> Nick,
>
>   Which versions of the following are you using?
>
> HDFS
> HBase
> OS
> On Oct 24, 2012 10:36 AM, "Nick maillard" <nicolas.maillard@fifty-five.com> wrote:
>
> > Hello everyone
> >
> > Still looking into the issue.
> > I have tried different tests and the results are surprising.
> > If I set mapred.tasktracker.map.tasks.maximum to 28,
> > I get a total of 84 tasks on my cluster and the process takes about
> > 1h15, each task taking about 1h10. The whole file is split into 80
> > tasks.
> >
> > If I set mapred.tasktracker.map.tasks.maximum to 3,
> > I get a total of 6 tasks on my cluster and the process takes about the
> > same amount of time, 1h20, still splitting the whole file into 80 tasks,
> > but now of course each individual task only takes a couple of minutes.
> >
> > It's as if the overall ImportTsv run must take a bit over an hour, and
> > the duration of the map tasks varies accordingly.
> >
> > There is definitely something I am doing wrong.
> >
> >
> >
> >
>



-- 
Thanks & Regards,
Anil Gupta
