Reply-To: user@hbase.apache.org
MIME-Version: 1.0
Date: Wed, 27 Apr 2011 14:26:39 +0400
Message-ID: <BANLkTikqBU9rrcXf11FPQ0Ar5dcGhMaCog@mail.gmail.com>
Subject: [CDH3U0] Cluster not processing region server failover
From: Alex Romanovsky <marginal.summer@gmail.com>
To: user@hbase.apache.org
Content-Type: text/plain; charset=ISO-8859-1

Hi,

I am trying failover cases on a small 3-node fully-distributed cluster
of the following topology:
- master node - NameNode, JobTracker, QuorumPeerMain, HMaster;
- slave nodes - DataNode, TaskTracker, QuorumPeerMain, HRegionServer.

ROOT and META are initially served by two different nodes.

I create table 'incr' with a single column family 'value', then run
put 'incr', '00000000', 'value:main', '00000000' to get an 8-byte
counter cell whose content is still human-readable, then start calling

$ incr 'incr', '00000000', 'value:main', 1

about once every second or two. Then I kill -9 one of my region
servers, the one that serves 'incr'.
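
To illustrate why the cell stays readable, here is a minimal sketch that assumes the counter is stored as an 8-byte big-endian signed long, which is how HBase counters are encoded. The helper name increment_counter is hypothetical and only models the arithmetic; it does not call HBase.

```python
import struct

def increment_counter(cell: bytes, amount: int = 1) -> bytes:
    """Treat an 8-byte cell as a big-endian signed long and add `amount`."""
    if len(cell) != 8:
        raise ValueError("counter cells must be exactly 8 bytes")
    (value,) = struct.unpack(">q", cell)      # decode the stored long
    return struct.pack(">q", value + amount)  # re-encode after the increment

cell = b"00000000"              # the value written by the initial put
cell = increment_counter(cell)  # models: incr 'incr', '00000000', 'value:main', 1
print(cell)                     # b'00000001'
```

Only the lowest byte changes for small increments, so the first nine increments still look like decimal digits; the tenth produces ':' (0x3A) in the last position.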

The subsequent shell incr times out. I terminate it with Ctrl-C,
launch hbase-shell again and repeat the command, getting the following
message repeated several times:

11/04/27 13:57:43 INFO ipc.HbaseRPC: Server at
regionserver1/10.50.3.68:60020 could not be reached after 1 tries,
giving up.

Tailing the master log yields the following diagnostics:

2011-04-27 14:08:32,982 INFO
org.apache.hadoop.hbase.master.LoadBalancer: Calculated a load balance
in 0ms. Moving 1 regions off of 1 overloaded servers onto 1 less
loaded servers
2011-04-27 14:08:32,982 INFO org.apache.hadoop.hbase.master.HMaster:
balance hri=incr,,1303892996561.cf314a59d3a5c79a77153f82b40015d7.,
src=regionserver1,60020,1303895356068,
dest=regionserver2,60020,1303898049443
2011-04-27 14:08:32,982 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Starting
unassignment of region
incr,,1303892996561.cf314a59d3a5c79a77153f82b40015d7. (offlining)
2011-04-27 14:08:32,982 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Attempted to
unassign region incr,,1303892996561.cf314a59d3a5c79a77153f82b40015d7.
but it is not currently assigned anywhere
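
For reading the src= and dest= values above, here is a small sketch that splits a server name into host, port and start code. It assumes the usual host,port,startcode layout, where the start code is the region server start time in milliseconds since the epoch. The names ServerName, parse_server_name and started_at are hypothetical helpers, not HBase APIs.

```python
from datetime import datetime, timezone
from typing import NamedTuple

class ServerName(NamedTuple):
    host: str
    port: int
    start_code: int  # region server start time, ms since the epoch

def parse_server_name(text: str) -> ServerName:
    """Split 'host,port,startcode' as printed in the master log."""
    host, port, start_code = text.strip().split(",")
    return ServerName(host, int(port), int(start_code))

def started_at(server: ServerName) -> datetime:
    """Convert the start code to a UTC timestamp, dropping the milliseconds."""
    return datetime.fromtimestamp(server.start_code // 1000, tz=timezone.utc)

for raw in ("regionserver1,60020,1303895356068",
            "regionserver2,60020,1303898049443"):
    server = parse_server_name(raw)
    print(server.host, server.port, started_at(server).isoformat())
```

Under that assumption, the two start codes in the log correspond to 09:09:16 and 09:54:09 UTC on 27 April 2011, which helps tell which region server instance the master believes it is talking to.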

hbase hbck finds 2 inconsistencies (regionserver1 down, region not
served). hbase hbck -fix reports 2 initial and 1 eventual
inconsistency, migrating the region to a live region server. However,
when I repeat the test with regionserver2 and regionserver1 swapped
(i.e. kill -9 the region server process on regionserver2, the initial
evacuation target), hbase hbck -fix throws

org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed
setting up proxy interface
org.apache.hadoop.hbase.ipc.HRegionInterface to
regionserver2/10.50.3.68:60020 after attempts=1
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1008)
        at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:172)
        at org.apache.hadoop.hbase.util.HBaseFsck.getMetaEntries(HBaseFsck.java:746)
        at org.apache.hadoop.hbase.util.HBaseFsck.doWork(HBaseFsck.java:133)
        at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:989)

zookeeper.session.timeout is set to 1000 ms (i.e. 1 second), and the
configuration is consistent across the cluster, so neither a long
session timeout nor a configuration mismatch should be the cause.
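
One caveat that may be worth checking: the ZooKeeper server does not have to grant the timeout the client requests. By default it clamps the request to between 2 and 20 times its tickTime. The sketch below models that clamping. The function name negotiated_session_timeout is hypothetical, and the tickTime of 2000 ms is an assumption (the value in ZooKeeper's sample configuration), not something read from this cluster.

```python
def negotiated_session_timeout(requested_ms, tick_time_ms,
                               min_ms=None, max_ms=None):
    """Model ZooKeeper's clamping of a requested session timeout.

    If minSessionTimeout / maxSessionTimeout are not set explicitly,
    the server defaults are 2 * tickTime and 20 * tickTime.
    """
    lower = 2 * tick_time_ms if min_ms is None else min_ms
    upper = 20 * tick_time_ms if max_ms is None else max_ms
    return max(lower, min(requested_ms, upper))

# Assumed tickTime of 2000 ms; the real value is in this cluster's zoo.cfg.
print(negotiated_session_timeout(1000, tick_time_ms=2000))
```

With these assumed values the effective timeout would be 4000 ms rather than 1000 ms, which is still short and so would not by itself explain a region that stays unassigned for minutes.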

Manual region reassignment also helps the first time, but only the
first time. Subsequent attempts leave the 'incr' regions unassigned,
and I cannot even query the table's regions from the client because
HTable instances fail to connect.

As soon as I restart the killed region server, cluster operation resumes.
However, as far as I understand the HBase book, this is not the
intended behavior. The cluster should automatically evacuate regions
from dead region servers to known alive ones.

I run the cluster on RH 5, Sun JDK 1.6.0_24.
JAVA_HOME=/usr/java/jdk1.6.0_24 is set in hadoop-env.sh (I wonder
whether I should duplicate the assignment in hbase-env.sh).
Is this one of the issues known to be fixed in 0.90.2 or later
releases? I searched Jira and found no matching issues; the failover
scenarios mentioned there are far more complex.
What other logs or config files should I check and/or post here?

Regards,
Alex Romanovsky