hbase-user mailing list archives

From Zaharije Pasalic <pasalic.zahar...@gmail.com>
Subject Re: HBase - hiting only one node on insert ...
Date Mon, 18 Jan 2010 17:12:37 GMT
Yes, that node hosts the .META. table. So should I expect this on whichever
node(s) host .META.?


On Mon, Jan 18, 2010 at 5:56 PM, Cosmin Lehene <clehene@adobe.com> wrote:
> I'm not sure why there would be 0 requests for most region servers, but I
> usually see a higher number of requests (even when the cluster is idle) on
> the regionserver that serves .META. My guess is that, on your cluster,
> hadoop-node02 serves .META.
>
> Cosmin
>
>
> On 1/18/10 1:55 PM, "pasaliczaharije" <pasalic.zaharije@gmail.com> wrote:
>
>>
>> Sorry for the garbled text. Here it is in the proper format:
>>
>>
>> Hi
>>
>> We have a small Hadoop cluster with 7 nodes (8 GB RAM / 8 cores per node)
>> + 1 master, and we deployed HBase on the same 7 nodes.
>>
>> Currently we are importing ~50 million records from CSV files into HBase. A
>> CSV file can have about 100 columns, and the row key is a UUID generated with
>> java.util.UUID.
>>
>> We have about 50 files on HDFS, which are imported into HBase by a
>> MapReduce job.
>>
>> At the start everything works fine, but after a few minutes we see a heavy
>> load on the second node. Here is the list from the HBase master.jsp page:
>>
>> hadoop-node01:60030 1263591474251 requests=184, regions=148, usedHeap=1196, maxHeap=1991
>> hadoop-node02:60030 1263591474109 requests=663, regions=148, usedHeap=1489, maxHeap=1991
>> hadoop-node03:60030 1263591474082 requests=161, regions=147, usedHeap=1526, maxHeap=1991
>> hadoop-node04:60030 1263632774794 requests=142, regions=147, usedHeap=1213, maxHeap=1991
>> hadoop-node06:60030 1263596977608 requests=152, regions=147, usedHeap=749, maxHeap=1991
>> hadoop-node07:60030 1263597118777 requests=156, regions=148, usedHeap=1749, maxHeap=1991
>> hadoop-node08:60030 1263597239565 requests=179, regions=148, usedHeap=1681, maxHeap=1991
>>
>> (The second node gets about 5 times more requests than the other nodes.) At
>> some point we see requests=0 for all nodes except node02 (where we see
>> about 600-1800).
>>
>> We chose UUIDs to get a roughly uniform load across all nodes. I'm not sure
>> whether this is a UUID issue (keys not uniform) or something else.
>>
>> Also, we are using the default Hadoop configuration (7 nodes result in 14
>> maps running in parallel). Is this optimal for this kind of job?
>>
>> Any comments?
>>
>> Thanks
>> -Zaharije
>>
>
>
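
On the question of whether UUID row keys are uniform: a minimal sketch of one way to check the key side in isolation, assuming the row keys are the string form of java.util.UUID.randomUUID() (the thread only says java.util.UUID). The class name, method name, and sample size are made up for illustration. It counts how many keys start with each hex digit; roughly equal counts would suggest the keys themselves are not the source of the skew. It says nothing about how HBase assigns regions or where .META. lookups go.

```java
import java.util.UUID;

// Hypothetical helper, not part of HBase or of the import job in this thread.
class UuidPrefixSpread {

    // Generate n random UUIDs and count them by the first hex digit (0-f)
    // of their string form, which is the leading part of the row key.
    public static int[] countByFirstHexDigit(int n) {
        int[] counts = new int[16];
        for (int i = 0; i < n; i++) {
            String key = UUID.randomUUID().toString();
            counts[Character.digit(key.charAt(0), 16)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int n = 160_000; // arbitrary sample size; expect about n/16 per digit
        int[] counts = countByFirstHexDigit(n);
        for (int d = 0; d < 16; d++) {
            System.out.printf("%x: %d%n", d, counts[d]);
        }
    }
}
```

If the counts come out close to n/16 per digit, the imbalance on one region server is more likely explained by something other than the key distribution, such as the .META. traffic mentioned above.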
