hadoop-mapreduce-user mailing list archives

From Marcos Luis Ortiz Valmaseda <marcosluis2...@gmail.com>
Subject Re: Why my tests shows Yarn is worse than MRv1 for terasort?
Date Fri, 07 Jun 2013 03:05:41 GMT
Why not tune the configurations?
Both frameworks have many areas to tune:
- Combiners, shuffle optimization, block size, etc.
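
As a rough sketch of what such tuning could look like on the Hadoop 2.x side: the helper below (a hypothetical name, `to_d_flags`) turns a set of property overrides into `-D` options that can be passed to a job such as terasort. The property names are standard Hadoop 2.x ones; the values are illustrative placeholders only, not tuned recommendations for this cluster.

```python
# Illustrative overrides for a Hadoop 2.x MapReduce job.
# The values are placeholders (assumptions), not measured recommendations.
EXAMPLE_OVERRIDES = {
    # Number of reduce tasks for the job.
    "mapreduce.job.reduces": "12",
    # Map-side sort buffer size, in MB.
    "mapreduce.task.io.sort.mb": "256",
    # Parallel fetches per reducer during the shuffle.
    "mapreduce.reduce.shuffle.parallelcopies": "10",
    # Compress intermediate map output before the shuffle.
    "mapreduce.map.output.compress": "true",
    # HDFS block size in bytes; applies to files written by the job,
    # so it matters when the input is generated (e.g. by teragen).
    "dfs.blocksize": "268435456",
}


def to_d_flags(overrides):
    """Render property overrides as generic -Dname=value options."""
    return ["-D{}={}".format(name, value) for name, value in overrides.items()]


if __name__ == "__main__":
    # Print the options on one line, ready to paste after the job name.
    print(" ".join(to_d_flags(EXAMPLE_OVERRIDES)))
```

The same properties can also be set in the site configuration files; passing them per job just makes it easier to change one setting at a time and rerun the benchmark.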



2013/6/6 sam liu <samliuhadoop@gmail.com>

> Hi Experts,
>
> We are thinking about whether to use Yarn or not in the near future, and I
> ran teragen/terasort on Yarn and MRv1 for comparison.
>
> My env is a three-node cluster, and each node has similar hardware: 2 CPUs
> (4 cores), 32 mem. Both the Yarn and MRv1 clusters are set up on the same
> env. To be fair, I did not do any performance tuning on their
> configurations, but used the default configuration values.
>
> Before testing, I thought Yarn would be much better than MRv1 if both
> used the default configuration, because Yarn is a better framework than
> MRv1. However, the test results show some differences:
>
> MRv1: Hadoop-1.1.1
> Yarn: Hadoop-2.0.4
>
> (A) Teragen: generate 10 GB data:
> - MRv1: 193 sec
> - Yarn: 69 sec
> *Yarn is 2.8 times better than MRv1*
>
> (B) Terasort: sort 10 GB data:
> - MRv1: 451 sec
> - Yarn: 1136 sec
> *Yarn is 2.5 times worse than MRv1*
>
> After a quick analysis, I think the direct cause might be that Yarn is much
> faster than MRv1 in the Map phase, but much slower in the Reduce phase.
>
> Here I have two questions:
> *- Why do my tests show Yarn is worse than MRv1 for terasort?*
> *- What's the strategy for tuning Yarn performance? Are there any materials?*
>
> Thanks!
>
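
The two ratios in the quoted message follow directly from the reported wall-clock times. A small check, using only the numbers given above (the variable names are just for illustration):

```python
# Wall-clock times in seconds, as reported in the quoted message.
teragen = {"MRv1": 193, "Yarn": 69}
terasort = {"MRv1": 451, "Yarn": 1136}

# Teragen: how many times faster Yarn was than MRv1.
teragen_speedup = teragen["MRv1"] / teragen["Yarn"]

# Terasort: how many times slower Yarn was than MRv1.
terasort_slowdown = terasort["Yarn"] / terasort["MRv1"]

if __name__ == "__main__":
    print("teragen: Yarn is {:.1f}x faster".format(teragen_speedup))
    print("terasort: Yarn is {:.1f}x slower".format(terasort_slowdown))
```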



-- 
Marcos Ortiz Valmaseda
Product Manager at PDVSA
http://about.me/marcosortiz
