From Irfan Mohammed <irfan...@gmail.com>
Subject Re: performance help
Date Mon, 06 Jul 2009 18:44:34 GMT
I ran the following (note: tables t1, txn_m5, txn_m6, and txn are unused for now):

hbase(main):002:0> status 'detailed'
09/07/06 14:29:32 INFO zookeeper.ZooKeeperWrapper: Quorum servers: app16:2181,app48:2181,app122:2181
version 0.20.0-dev
5 live servers
    app16:60020 1246848846822
        requests=0, regions=2, usedHeap=65, maxHeap=963
            stores=18, storefiles=17, memcacheSize=0, storefileIndexSize=0
            stores=18, storefiles=34, memcacheSize=6, storefileIndexSize=0
    app03:60020 1246848846821
        requests=0, regions=2, usedHeap=36, maxHeap=963
            stores=72, storefiles=60, memcacheSize=0, storefileIndexSize=0
            stores=18, storefiles=33, memcacheSize=2, storefileIndexSize=0
    app48:60020 1246848846825
        requests=0, regions=1, usedHeap=25, maxHeap=963
            stores=1, storefiles=3, memcacheSize=0, storefileIndexSize=0
    app01:60020 1246848846823
        requests=0, regions=3, usedHeap=173, maxHeap=963
            stores=2, storefiles=3, memcacheSize=0, storefileIndexSize=0
            stores=18, storefiles=34, memcacheSize=26, storefileIndexSize=0
            stores=18, storefiles=17, memcacheSize=0, storefileIndexSize=0
    app122:60020 1246848846826
        requests=0, regions=2, usedHeap=148, maxHeap=963
            stores=1, storefiles=1, memcacheSize=0, storefileIndexSize=0
            stores=18, storefiles=31, memcacheSize=17, storefileIndexSize=0
0 dead servers

----- Original Message -----
From: "Irfan Mohammed" <irfan.ma@gmail.com>
To: hbase-dev@hadoop.apache.org
Sent: Monday, July 6, 2009 2:24:43 PM GMT -05:00 US/Canada Eastern
Subject: Re: performance help

Input is 1 file. 

These are 4 different tables: "txn_m1", "txn_m2", "txn_m3", "txn_m4". To me, it looks like
there is always 1 region per table, and these tables are always on different regionservers.
I have never seen the same table on different regionservers. Does that sound right?

----- Original Message -----
From: "stack" <stack@duboce.net>
To: hbase-dev@hadoop.apache.org
Sent: Monday, July 6, 2009 2:14:43 PM GMT -05:00 US/Canada Eastern
Subject: Re: performance help

On Mon, Jul 6, 2009 at 11:06 AM, Irfan Mohammed <irfan.ma@gmail.com> wrote:

> I am working on writing to HDFS files. Will update you by end of day today.
> There are always 10 concurrent mappers running. I keep setting the
> setNumMaps(5) and also the following properties in mapred-site.xml to 3 but
> still end up running 10 concurrent maps.

Is your input ten files?
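
For reference, a minimal sketch of why setNumMaps(5) may not reduce the number of map tasks. In the old mapred API the requested count is only a hint: the split size is max(minSize, min(totalSize / requestedMaps, blockSize)), so a single large file is still cut into roughly one split per HDFS block. The 640 MB file size and 64 MB block size below are assumptions chosen for illustration (the thread does not state them), and the class and method names are hypothetical.

```java
public class SplitEstimate {
    // The last split may be up to 10% larger than splitSize before a new one is cut.
    static final double SPLIT_SLOP = 1.1;

    // Split size: never below minSize, never above the block size.
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Number of splits (and therefore map tasks) for one splittable file.
    static int countSplits(long fileSize, int requestedMaps, long minSize, long blockSize) {
        long goalSize = fileSize / Math.max(requestedMaps, 1);
        long splitSize = computeSplitSize(goalSize, minSize, blockSize);
        int splits = 0;
        long remaining = fileSize;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining != 0) {
            splits++; // the final, possibly shorter or slightly longer, split
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Assumed sizes: one 640 MB input file, 64 MB blocks, min split size 1 byte.
        System.out.println(countSplits(640 * mb, 5, 1, 64 * mb));  // hint of 5 maps
        System.out.println(countSplits(640 * mb, 20, 1, 64 * mb)); // hint of 20 maps
    }
}
```

Under these assumed sizes a hint of 5 still gives 10 splits, while a hint of 20 gives 20: the hint can raise the map count but cannot push it below one split per block unless the minimum split size is raised. How many of those maps run at the same time is a separate matter, governed by the map slots configured on each tasktracker.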

> There are 5 regionservers and the online regions are as follows :
> m1 : -ROOT-,,0
> m2 : txn_m1,,1245462904101
> m3 : txn_m4,,1245462942282
> m4 : txn_m2,,1245462890248
> m5 : .META.,,1
>     txn_m3,,1245460727203

So, that looks like 4 regions from table txn?

So that's about 1 region per regionserver?

> I have setAutoFlush(false) and also writeToWal(false) with the same
> behaviour.

If you did the above and it still takes 10 minutes, then that would seem to rule
out hbase (batching should have a big impact on uploads, and then setting
writeToWAL to false should double throughput over whatever you were seeing).
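
To illustrate why batching is expected to matter, here is a rough model of the client-side write buffer, not actual HBase client code. It assumes that with autoflush on every put is sent immediately, and that with autoflush off puts accumulate until the buffered bytes exceed the write buffer size, at which point they are sent together. The put count, the 1 KB put size, and the 2 MB buffer are assumptions for illustration, and the class and method names are hypothetical.

```java
public class WriteBufferModel {
    // Counts how many times the client would send buffered puts to the cluster.
    static int countFlushes(int numPuts, long putSizeBytes, long writeBufferBytes, boolean autoFlush) {
        int flushes = 0;
        long buffered = 0;
        for (int i = 0; i < numPuts; i++) {
            buffered += putSizeBytes;
            // Autoflush sends every put; otherwise send only once the buffer overflows.
            if (autoFlush || buffered > writeBufferBytes) {
                flushes++;
                buffered = 0;
            }
        }
        if (buffered > 0) {
            flushes++; // final explicit flush of whatever is left in the buffer
        }
        return flushes;
    }

    public static void main(String[] args) {
        long bufferBytes = 2L * 1024L * 1024L; // assumed 2 MB write buffer
        System.out.println(countFlushes(100000, 1024, bufferBytes, true));  // autoflush on
        System.out.println(countFlushes(100000, 1024, bufferBytes, false)); // autoflush off
    }
}
```

With these assumed numbers the model sends 100000 times with autoflush on and 49 times with it off. A real client also groups each flush by regionserver and a put occupies more than its payload size in the buffer, so the counts are only indicative. If disabling autoflush makes no visible difference to the upload time, that supports the reading above that the bottleneck is elsewhere.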


