hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ted Yu <yuzhih...@gmail.com>
Subject Re: Load balancer repeatedly close and open region in the same regionserver.
Date Tue, 17 Jul 2012 15:04:45 GMT
Howard:
Before filing JIRA, can you verify with 0.94.1 RC that Lars sent out
yesterday ?
I guess you have noticed the following toward the end of log snippet:

2012-07-16 00:17:50,774 DEBUG
org.apache.hadoop.hbase.
master.handler.OpenedRegionHandler: Handling OPENED
event for
trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956.
from 192.168.1.2,60020,1342017399608; deleting unassigned node

As Ram pointed out, there might be two region server processes running on
192.168.1.2

Please verify whether that was the case.

Cheers

On Tue, Jul 17, 2012 at 7:30 AM, Ramkrishna.S.Vasudevan <
ramkrishna.vasudevan@huawei.com> wrote:

> From the logs I can see that though the server's are same their start code
> is different.
> Need to analyse the previous logs also.  Pls file a JIRA, if possible
> attach
> the logs to that.
>
> Thanks Howard.
>
> Regards
> Ram
>
> > -----Original Message-----
> > From: Howard [mailto:rj03hou@gmail.com]
> > Sent: Tuesday, July 17, 2012 7:32 PM
> > To: user@hbase.apache.org
> > Subject: Re: Load balancer repeatedly close and open region in the same
> > regionserver.
> >
> > *hi,Ted Yu,thanks for your reply:*
> > the hbase and hadoop version is
> > HBase Version0.94.0, r1332822Hadoop Version0.20.2-cdh3u1,
> > rbdafb1dbffd0d5f2fbc6ee022e1c8df6500fd638
> > the following is a detail log about the same region
> > trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7
> > e956,
> > and it repeats again and again.:
> > 2012-07-16 00:12:49,843 INFO org.apache.hadoop.hbase.master.HMaster:
> > balance
> > hri=trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160
> > e1b7e956.,
> > src=192.168.1.2,60020,1342017399608,
> > dest=192.168.1.2,60020,1342002082592
> > 2012-07-16 00:12:49,843 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment
> > of
> > region
> > trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7
> > e956.
> > (offlining)
> > 2012-07-16 00:12:49,843 DEBUG
> > org.apache.hadoop.hbase.zookeeper.ZKAssign:
> > master:60000-0x4384d0a47f40068 Creating unassigned node for
> > 93caf5147d40f5dd4625e160e1b7e956 in a CLOSING state
> > 2012-07-16 00:12:49,845 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to
> > 192.168.1.2,60020,1342017399608 for region
> > trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7
> > e956.
> > 2012-07-16 00:12:50,555 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Handling
> > transition=RS_ZK_REGION_CLOSED, server=192.168.1.2,60020,1342017399608,
> > region=93caf5147d40f5dd4625e160e1b7e956
> > 2012-07-16 00:12:50,555 DEBUG
> > org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling
> > CLOSED
> > event for 93caf5147d40f5dd4625e160e1b7e956
> > 2012-07-16 00:12:50,555 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE;
> > was=trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160
> > e1b7e956.
> > state=CLOSED, ts=1342368770556, server=192.168.1.2,60020,1342017399608
> > 2012-07-16 00:12:50,555 DEBUG
> > org.apache.hadoop.hbase.zookeeper.ZKAssign:
> > master:60000-0x4384d0a47f40068 Creating (or updating) unassigned node
> > for
> > 93caf5147d40f5dd4625e160e1b7e956 with OFFLINE state
> > 2012-07-16 00:12:50,558 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Handling
> > transition=M_ZK_REGION_OFFLINE, server=10.75.18.34,60000,1342017369575,
> > region=93caf5147d40f5dd4625e160e1b7e956
> > 2012-07-16 00:12:50,558 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Found an existing
> > plan
> > for
> > trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7
> > e956.
> > destination server is 192.168.1.2,60020,1342002082592
> > 2012-07-16 00:12:50,558 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing
> > plan
> > for region
> > trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7
> > e956.;
> > plan=hri=trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd462
> > 5e160e1b7e956.,
> > src=192.168.1.2,60020,1342017399608,
> > dest=192.168.1.2,60020,1342002082592
> > 2012-07-16 00:12:50,558 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Assigning region
> > trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7
> > e956.
> > to 192.168.1.2,60020,1342002082592
> > 2012-07-16 00:12:50,574 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Handling
> > transition=RS_ZK_REGION_OPENING,
> > server=192.168.1.2,60020,1342017399608,
> > region=93caf5147d40f5dd4625e160e1b7e956
> > 2012-07-16 00:12:50,635 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Handling
> > transition=RS_ZK_REGION_OPENING,
> > server=192.168.1.2,60020,1342017399608,
> > region=93caf5147d40f5dd4625e160e1b7e956
> > 2012-07-16 00:12:50,639 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Handling
> > transition=RS_ZK_REGION_OPENED, server=192.168.1.2,60020,1342017399608,
> > region=93caf5147d40f5dd4625e160e1b7e956
> > 2012-07-16 00:12:50,639 DEBUG
> > org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling
> > OPENED
> > event for
> > trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7
> > e956.
> > from 192.168.1.2,60020,1342017399608; deleting unassigned node
> > 2012-07-16 00:12:50,640 DEBUG
> > org.apache.hadoop.hbase.zookeeper.ZKAssign:
> > master:60000-0x4384d0a47f40068 Deleting existing unassigned node for
> > 93caf5147d40f5dd4625e160e1b7e956 that is in expected state
> > RS_ZK_REGION_OPENED
> > 2012-07-16 00:12:50,641 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: The znode of region
> > trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7
> > e956.
> > has been deleted.
> > 2012-07-16 00:12:50,641 DEBUG
> > org.apache.hadoop.hbase.zookeeper.ZKAssign:
> > master:60000-0x4384d0a47f40068 Successfully deleted unassigned node for
> > region 93caf5147d40f5dd4625e160e1b7e956 in expected state
> > RS_ZK_REGION_OPENED
> > 2012-07-16 00:12:50,641 INFO
> > org.apache.hadoop.hbase.master.AssignmentManager: The master has opened
> > the
> > region
> > trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7
> > e956.
> > that was online on 192.168.1.2,60020,1342017399608
> > 2012-07-16 00:17:49,870 INFO org.apache.hadoop.hbase.master.HMaster:
> > balance
> > hri=trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160
> > e1b7e956.,
> > src=192.168.1.2,60020,1342017399608,
> > dest=192.168.1.2,60020,1342002082592
> > 2012-07-16 00:17:49,870 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment
> > of
> > region
> > trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7
> > e956.
> > (offlining)
> > 2012-07-16 00:17:49,870 DEBUG
> > org.apache.hadoop.hbase.zookeeper.ZKAssign:
> > master:60000-0x4384d0a47f40068 Creating unassigned node for
> > 93caf5147d40f5dd4625e160e1b7e956 in a CLOSING state
> > 2012-07-16 00:17:49,872 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to
> > 192.168.1.2,60020,1342017399608 for region
> > trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7
> > e956.
> > 2012-07-16 00:17:50,464 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Handling
> > transition=RS_ZK_REGION_CLOSED, server=192.168.1.2,60020,1342017399608,
> > region=93caf5147d40f5dd4625e160e1b7e956
> > 2012-07-16 00:17:50,464 DEBUG
> > org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling
> > CLOSED
> > event for 93caf5147d40f5dd4625e160e1b7e956
> > 2012-07-16 00:17:50,464 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE;
> > was=trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160
> > e1b7e956.
> > state=CLOSED, ts=1342369070465, server=192.168.1.2,60020,1342017399608
> > 2012-07-16 00:17:50,464 DEBUG
> > org.apache.hadoop.hbase.zookeeper.ZKAssign:
> > master:60000-0x4384d0a47f40068 Creating (or updating) unassigned node
> > for
> > 93caf5147d40f5dd4625e160e1b7e956 with OFFLINE state
> > 2012-07-16 00:17:50,467 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Handling
> > transition=M_ZK_REGION_OFFLINE, server=10.75.18.34,60000,1342017369575,
> > region=93caf5147d40f5dd4625e160e1b7e956
> > 2012-07-16 00:17:50,467 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Found an existing
> > plan
> > for
> > trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7
> > e956.
> > destination server is 192.168.1.2,60020,1342002082592
> > 2012-07-16 00:17:50,467 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing
> > plan
> > for region
> > trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7
> > e956.;
> > plan=hri=trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd462
> > 5e160e1b7e956.,
> > src=192.168.1.2,60020,1342017399608,
> > dest=192.168.1.2,60020,1342002082592
> > 2012-07-16 00:17:50,467 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Assigning region
> > trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7
> > e956.
> > to 192.168.1.2,60020,1342002082592
> > 2012-07-16 00:17:50,509 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Handling
> > transition=RS_ZK_REGION_OPENING,
> > server=192.168.1.2,60020,1342017399608,
> > region=93caf5147d40f5dd4625e160e1b7e956
> > 2012-07-16 00:17:50,761 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Handling
> > transition=RS_ZK_REGION_OPENING,
> > server=192.168.1.2,60020,1342017399608,
> > region=93caf5147d40f5dd4625e160e1b7e956
> > 2012-07-16 00:17:50,774 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: Handling
> > transition=RS_ZK_REGION_OPENED, server=192.168.1.2,60020,1342017399608,
> > region=93caf5147d40f5dd4625e160e1b7e956
> > 2012-07-16 00:17:50,774 DEBUG
> > org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling
> > OPENED
> > event for
> > trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7
> > e956.
> > from 192.168.1.2,60020,1342017399608; deleting unassigned node
> > 2012-07-16 00:17:50,774 DEBUG
> > org.apache.hadoop.hbase.zookeeper.ZKAssign:
> > master:60000-0x4384d0a47f40068 Deleting existing unassigned node for
> > 93caf5147d40f5dd4625e160e1b7e956 that is in expected state
> > RS_ZK_REGION_OPENED
> > 2012-07-16 00:17:50,775 DEBUG
> > org.apache.hadoop.hbase.master.AssignmentManager: The znode of region
> > trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7
> > e956.
> > has been deleted.
> > 2012-07-16 00:17:50,775 DEBUG
> > org.apache.hadoop.hbase.zookeeper.ZKAssign:
> > master:60000-0x4384d0a47f40068 Successfully deleted unassigned node for
> > region 93caf5147d40f5dd4625e160e1b7e956 in expected state
> > RS_ZK_REGION_OPENED
> > 2012-07-16 00:17:50,775 INFO
> > org.apache.hadoop.hbase.master.AssignmentManager: The master has opened
> > the
> > region
> > trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7
> > e956.
> > that was online on 192.168.1.2,60020,1342017399608
> > 2012-07-16 00:22:49,916 INFO org.apache.hadoop.hbase.master.HMaster:
> > balance
> > hri=trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160
> > e1b7e956.,
> > src=192.168.1.2,60020,1342017399608,
> > dest=192.168.1.2,60020,1342002082592
> > ....
> >
> > 2012/7/17 Ted Yu <yuzhihong@gmail.com>
> >
> > > Which HBase version were you using ?
> > >
> > > The log messages below were in very short time period. Can you
> > provide more
> > > logs ?
> > >
> > > Thanks
> > >
> > > On Tue, Jul 17, 2012 at 5:09 AM, Howard <rj03hou@gmail.com> wrote:
> > >
> > > > Through the master and regionserver log,I find load balancer
> > repeatedly
> > > > close and open region in the same regionserver(period in
> > > > hbase.balancer.period ).
> > > > Does this is a bug in load balancer and how can I dig into or avoid
> > this?
> > > > The log like this:
> > > > 2012-07-16 00:12:50,055 INFO
> > > > org.apache.hadoop.hbase.regionserver.HRegionServer: Received
> > request to
> > > > open region:
> > > >
> > > >
> > >
> > trackurl_status_list,zOwMwdi,1342294447440.7608037379153932a30f4bdd6d7e
> > df5d.
> > > > 2012-07-16 00:12:50,069 INFO
> > > > org.apache.hadoop.hbase.regionserver.HRegionServer: Received
> > request to
> > > > open region:
> > > >
> > > >
> > >
> > trackurl_comment_list,zO7vqv1,1342192347676.0c9036e183055ff2df93eca348a
> > eeada.
> > > > 2012-07-16 00:12:50,206 INFO
> > > > org.apache.hadoop.hbase.regionserver.HRegionServer: Received
> > request to
> > > > open region:
> > > >
> > > >
> > >
> > trackurl_status_list,zOzYDL9,1342294447440.47eb162f6aecffcddfa897fdbb70
> > 472b.
> > > > 2012-07-16 00:12:50,301 INFO
> > > > org.apache.hadoop.hbase.regionserver.HRegionServer: Received
> > request to
> > > > open region:
> > > >
> > > >
> > >
> > trackurl_status_list,zOM1HZP,1342293292587.9af5de3db0bad0470d9b8fe6b9ea
> > df14.
> > > > 2012-07-16 00:12:50,428 INFO
> > > > org.apache.hadoop.hbase.regionserver.HRegionServer: Received
> > request to
> > > > open region:
> > > >
> > > >
> > >
> > trackurl_status_list,zOPKLDi,1342293292587.f89dbfc1dad366c991eba3c24822
> > 8637.
> > > > 2012-07-16 00:12:50,562 INFO
> > > > org.apache.hadoop.hbase.regionserver.HRegionServer: Received
> > request to
> > > > open region:
> > > >
> > > >
> > >
> > trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7
> > e956.
> > > > 2012-07-16 00:12:50,685 INFO
> > > > org.apache.hadoop.hbase.regionserver.HRegionServer: Received
> > request to
> > > > open region:
> > > >
> > > >
> > >
> > trackurl_status_list,zOIR8Tw,1342283152067.6284e9f69042266cde88a7b19071
> > d701.
> > > > 2012-07-16 00:12:50,791 INFO
> > > > org.apache.hadoop.hbase.regionserver.HRegionServer: Received
> > request to
> > > > open region:
> > > >
> > > >
> > >
> > trackurl_comment_list,zOqzaNw,1342195266309.bd8e19c1063f4f2984f9ca63365
> > c5573.
> > > >
> > >
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message