hbase-dev mailing list archives

From Irfan Mohammed <irfan...@gmail.com>
Subject Re: performance help
Date Mon, 06 Jul 2009 18:24:43 GMT
Input is 1 file. 
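
One input file can still produce ten map tasks, because the number of maps follows the number of input splits rather than the number of files, and the requested map count is only a hint. Below is a minimal sketch of that arithmetic, assuming the split-size formula used by the old org.apache.hadoop.mapred.FileInputFormat (splitSize = max(minSize, min(goalSize, blockSize))). The 640 MB file size and 64 MB block size are made-up figures for illustration, not numbers from this thread, and the class and method names are hypothetical.

```java
public class SplitEstimate {
    // Split-size rule assumed from the old mapred FileInputFormat:
    // splitSize = max(minSize, min(goalSize, blockSize))
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Approximate split count for a single splittable file.
    // Simplification: plain ceiling division, ignoring the small slop
    // the real implementation allows on the last split.
    static long countSplits(long fileSize, int requestedMaps, long minSize, long blockSize) {
        long goalSize = fileSize / Math.max(requestedMaps, 1);
        long splitSize = computeSplitSize(goalSize, minSize, blockSize);
        return (fileSize + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileSize = 640 * mb;   // hypothetical input size
        long blockSize = 64 * mb;   // hypothetical HDFS block size

        // Asking for 5 maps: the goal size (128 MB) is capped by the block size.
        System.out.println("requested 5, min split 1 B:    "
                + countSplits(fileSize, 5, 1, blockSize));
        // Raising the minimum split size to 128 MB is what reduces the count.
        System.out.println("requested 5, min split 128 MB: "
                + countSplits(fileSize, 5, 128 * mb, blockSize));
    }
}
```

Under these assumptions the map-count hint can raise the number of splits but cannot push it below one split per block; raising the minimum split size (mapred.min.split.size in the old API) is the setting that lowers it.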

These are 4 different tables: "txn_m1", "txn_m2", "txn_m3", and "txn_m4". It looks to me like
there is always 1 region per table, and the tables are always on different regionservers.
I have never seen the same table spread across different regionservers. Does that sound right?

----- Original Message -----
From: "stack" <stack@duboce.net>
To: hbase-dev@hadoop.apache.org
Sent: Monday, July 6, 2009 2:14:43 PM GMT -05:00 US/Canada Eastern
Subject: Re: performance help

On Mon, Jul 6, 2009 at 11:06 AM, Irfan Mohammed <irfan.ma@gmail.com> wrote:

> I am working on writing to HDFS files. Will update you by end of day today.
>
> There are always 10 concurrent mappers running. I keep calling
> setNumMaps(5) and also setting the following properties in mapred-site.xml
> to 3, but I still end up running 10 concurrent maps.
>


Is your input ten files?


>
> There are 5 regionservers and the online regions are as follows :
>
> m1 : -ROOT-,,0
> m2 : txn_m1,,1245462904101
> m3 : txn_m4,,1245462942282
> m4 : txn_m2,,1245462890248
> m5 : .META.,,1
>     txn_m3,,1245460727203
>


So, that looks like 4 regions from table txn?

So that's about 1 region per regionserver?


> I have also tried setAutoFlush(false) and writeToWal(false), with the same
> behaviour.
>

If you did the above and it still takes 10 minutes, then that would seem to
rule out hbase (batching should have a big impact on uploads, and setting
writeToWAL to false should then double throughput over whatever you were
seeing previously).

St.Ack
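
To illustrate why batching is expected to matter so much, here is a toy model of a client-side write buffer. It is not HBase code: the class, its methods, and the row-count buffer limit are hypothetical stand-ins (the real client buffers by bytes and may send one request per regionserver on each flush). It only counts how many simulated round trips are needed with and without auto-flush.

```java
import java.util.ArrayList;
import java.util.List;

public class WriteBufferSketch {
    private final boolean autoFlush;
    private final int bufferLimit;            // hypothetical limit, in rows rather than bytes
    private final List<String> buffer = new ArrayList<>();
    private int roundTrips = 0;

    WriteBufferSketch(boolean autoFlush, int bufferLimit) {
        this.autoFlush = autoFlush;
        this.bufferLimit = bufferLimit;
    }

    // With auto-flush on, every put is sent immediately;
    // with it off, puts accumulate until the buffer is full.
    void put(String row) {
        buffer.add(row);
        if (autoFlush || buffer.size() >= bufferLimit) {
            flush();
        }
    }

    // Sends whatever is buffered as one simulated round trip.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        roundTrips++;
        buffer.clear();
    }

    int roundTrips() {
        return roundTrips;
    }

    // Writes the given number of rows, flushes the remainder,
    // and returns the number of simulated round trips.
    static int simulate(boolean autoFlush, int rows, int bufferLimit) {
        WriteBufferSketch table = new WriteBufferSketch(autoFlush, bufferLimit);
        for (int i = 0; i < rows; i++) {
            table.put("row-" + i);
        }
        table.flush();
        return table.roundTrips();
    }

    public static void main(String[] args) {
        System.out.println("autoFlush=true:  " + simulate(true, 10000, 1000));
        System.out.println("autoFlush=false: " + simulate(false, 10000, 1000));
    }
}
```

In this model the buffered run needs a small fraction of the round trips, which is the effect batching is meant to have. If upload time does not move at all after turning auto-flush off, the time is probably being spent somewhere other than the write path.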
