hbase-user mailing list archives

From Daniel Einspanjer <deinspan...@mozilla.com>
Subject Re: Need help trying to balance HBase RegionServer load
Date Thu, 17 Jun 2010 15:58:36 GMT
  Here is an example of a region split with both daughters being 
assigned to the same region server.  Is this expected?

2010-06-17 08:34:53,060 INFO 
org.apache.hadoop.hbase.master.ServerManager: Processing 
MSG_REPORT_SPLIT_INCLUDES_DAUGHTERS: 
crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276776160508: 
Daughters; 
crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276788891647, 
crash_reports,21006172b7ec9f5-dcad-4c98-9dc5-969532100617,1276788891647 
from cm-hadoop14.mozilla.org,60020,1276560962019; 1 of 1
2010-06-17 08:34:54,316 INFO 
org.apache.hadoop.hbase.master.RegionManager: Assigning region 
crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276788891647 
to cm-hadoop15.mozilla.org,60020,1276778868841
2010-06-17 08:34:54,316 INFO 
org.apache.hadoop.hbase.master.RegionManager: Assigning region 
crash_reports,21006172b7ec9f5-dcad-4c98-9dc5-969532100617,1276788891647 
to cm-hadoop15.mozilla.org,60020,1276778868841
2010-06-17 08:34:55,432 INFO 
org.apache.hadoop.hbase.master.ServerManager: Processing 
MSG_REPORT_OPEN: 
crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276788891647 
from cm-hadoop15.mozilla.org,60020,1276778868841;
1 of 1
2010-06-17 08:34:55,432 INFO 
org.apache.hadoop.hbase.master.RegionServerOperation: 
crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276788891647 
open on 10.2.72.74:60020
2010-06-17 08:34:55,436 INFO 
org.apache.hadoop.hbase.master.RegionServerOperation: Updated row 
crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276788891647 
in region .META.,,1 with startcode=1276778868841, 
server=10.2.72.74:60020
2010-06-17 08:34:56,044 INFO 
org.apache.hadoop.hbase.master.ServerManager: Processing 
MSG_REPORT_OPEN: 
crash_reports,21006172b7ec9f5-dcad-4c98-9dc5-969532100617,1276788891647 
from cm-hadoop15.mozilla.org,60020,1276778868841;
1 of 1
2010-06-17 08:34:56,044 INFO 
org.apache.hadoop.hbase.master.RegionServerOperation: 
crash_reports,21006172b7ec9f5-dcad-4c98-9dc5-969532100617,1276788891647 
open on 10.2.72.74:60020
2010-06-17 08:34:56,048 INFO 
org.apache.hadoop.hbase.master.RegionServerOperation: Updated row 
crash_reports,21006172b7ec9f5-dcad-4c98-9dc5-969532100617,1276788891647 
in region .META.,,1 with startcode=1276778868841, 
server=10.2.72.74:60020
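
To check how often this happens across a whole master log, the "Assigning region" entries can be grouped by target server. The following is only a rough sketch: the function name, the regular expression, and the embedded sample are illustrative and not part of HBase, and it assumes the log wording matches the excerpt above.

```python
import re
from collections import defaultdict

# Matches "Assigning region <region> to <host>,<port>,<startcode>".
# Assumption: region names contain no whitespace, as in the excerpt above.
ASSIGN_RE = re.compile(
    r"Assigning region\s+(?P<region>\S+)\s+to\s+"
    r"(?P<host>[^,\s]+),(?P<port>\d+),(?P<startcode>\d+)"
)

def regions_by_server(log_text):
    """Return {host: [region, ...]} for every assignment found in log_text."""
    # Log entries are wrapped across lines in the mail, so collapse all
    # whitespace (including newlines) to single spaces before matching.
    flat = " ".join(log_text.split())
    assigned = defaultdict(list)
    for match in ASSIGN_RE.finditer(flat):
        assigned[match.group("host")].append(match.group("region"))
    return dict(assigned)

# The two assignment entries from the excerpt above, still line-wrapped.
SAMPLE = """
2010-06-17 08:34:54,316 INFO
org.apache.hadoop.hbase.master.RegionManager: Assigning region
crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276788891647
to cm-hadoop15.mozilla.org,60020,1276778868841
2010-06-17 08:34:54,316 INFO
org.apache.hadoop.hbase.master.RegionManager: Assigning region
crash_reports,21006172b7ec9f5-dcad-4c98-9dc5-969532100617,1276788891647
to cm-hadoop15.mozilla.org,60020,1276778868841
"""

if __name__ == "__main__":
    for host, regions in regions_by_server(SAMPLE).items():
        print(host, len(regions))
```

Any server that shows up with both daughters of one split is a case like the one above.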


On 6/17/10 11:42 AM, Daniel Einspanjer wrote:
>  Currently, in our production cluster, almost all of the traffic for a 
> day ends up assigned to a single RS and that causes the load on that 
> machine to be too high.
>
> With our last release, we salted our rowkeys so that rather than 
> starting with the date:
> 100617<guid>
> they now start with the first letter of the guid followed by the date:
> e100617<guid_that_starts_with_e>
>
> When I look at the region assignments though, I see a single server 
> assigned the following regions:
> 0100617...
> 1100617...
> 2100617...
> 3100617...
> 4100617...
> ...
> d100617...
> e100617...
> f100617...
>
> Is there anything we can do to try to get the cluster to shuffle this 
> up some more?
> We are getting compaction times in the minutes (one I saw was over 12 
> minutes) and this causes our clients to time out and shut down which 
> causes production outages.
>
> -Daniel
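
For reference, the salting scheme described in the quoted message can be sketched as below. This is a minimal illustration, not the code running in production: the function names are hypothetical, and the yymmdd date format is an assumption inferred from the "100617" prefix and the region names in the log above.

```python
from datetime import date

def unsalted_rowkey(guid: str, day: date) -> str:
    """Old layout: the date (assumed yymmdd) followed by the GUID."""
    return day.strftime("%y%m%d") + guid

def salted_rowkey(guid: str, day: date) -> str:
    """New layout: first character of the GUID, then the date, then the GUID."""
    if not guid:
        raise ValueError("guid must be non-empty")
    return guid[0] + unsalted_rowkey(guid, day)

if __name__ == "__main__":
    # GUID taken from the first daughter region name in the log above.
    print(salted_rowkey("2700f355-1d02-485a-90d9-0e8182100617", date(2010, 6, 17)))
```

With hex GUIDs this gives 16 possible leading characters, so one day's writes land in 16 separate key ranges rather than one. It does not by itself decide which region server hosts each range, which is the question raised here.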
