hadoop-general mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Performance of EC2
Date Tue, 26 Jan 2010 18:36:40 GMT
How big is your dataset?

J-D

On Tue, Jan 26, 2010 at 8:47 AM, Something Something
<mailinglists19@gmail.com> wrote:
> I have noticed some strange performance numbers on EC2.  If someone can give
> me some hints to improve performance that would be greatly appreciated.
>  Here are the details:
>
> I have a process that runs a series of Jobs under Hadoop 0.20.1 & HBase
> 0.20.2.  I ran the *exact* same process with the following configurations:
>
> 1) 1 Master & 4 Workers (*c1.xlarge* instances) & 1 Zookeeper (*c1.medium*)
> with *8 Reducers* for every Reduce task.  The process completed in *849*
> seconds.
>
> 2) 1 Master, 4 Workers & 1 Zookeeper, *ALL m1.small* instances, with *8
> Reducers* for every Reduce task.  The process completed in *906* seconds.
>
> 3) 1 Master, *11* Workers & *3* Zookeepers, *ALL m1.small* instances, with *20
> Reducers* for every Reduce task.  The process completed in *984* seconds!
>
>
> Two main questions:
>
> 1)  It's totally surprising that with 11 workers and 20 Reducers it runs
> slower than with fewer machines of exactly the same type and fewer
> reducers.
> 2)  As expected it runs faster on c1.xlarge, but the performance improvement
> doesn't justify the high cost difference.  I must not be fully utilizing the
> machines' power, but I don't know how to do that.
>
> Here are some of the performance improvement tricks that I have learnt from
> this mailing list in the past and that I am using:
>
> 1)  conf.set("hbase.client.scanner.caching", "30");   I have this for all
> jobs.
>
> 2)  Using the following code every time I open an HTable:
>        this.table = new HTable(new HBaseConfiguration(), "tablenameXYZ");
>        table.setAutoFlush(false);
>        table.setWriteBufferSize(1024 * 1024 * 12);
>
> 3)  For every Put I do this:
>          Put put = new Put(Bytes.toBytes(out));
>          put.setWriteToWAL(false);
>
> 4)  Change the No. of Reducers as per the No. of Workers.  I believe the
> formula is:  # of workers * 1.75.
>
> Any other hints?  As always, greatly appreciate the help.  Thanks.
>
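
For reference, the reducer rule of thumb quoted in point 4 above (# of workers * 1.75) can be worked through as a small sketch. The class and method names are hypothetical, and rounding down is an assumption, since the quoted formula does not say how to round. The Hadoop MapReduce tutorial states a similar heuristic in terms of nodes multiplied by reduce slots per node, so this one-slot-per-worker version may understate what the cluster can actually run.

```java
public class ReducerEstimate {

    // Hypothetical helper: applies "workers * factor" and rounds down
    // (the rounding direction is an assumption, not part of the quoted rule).
    // Never returns fewer than one reducer.
    public static int suggestedReducers(int workers, double factor) {
        if (workers < 1) {
            throw new IllegalArgumentException("workers must be >= 1");
        }
        return Math.max(1, (int) Math.floor(workers * factor));
    }

    public static void main(String[] args) {
        // Worker counts taken from the three configurations in the quoted message.
        System.out.println("4 workers:  " + suggestedReducers(4, 1.75));
        System.out.println("11 workers: " + suggestedReducers(11, 1.75));
    }
}
```

Taken literally, the rule gives 7 reducers for the 4-worker clusters and 19 for the 11-worker cluster, so the 8 and 20 reducers used in the runs above are already close to it.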
