hbase-user mailing list archives

From Vamshi Krishna <vamshi2...@gmail.com>
Subject Region server is getting disconnected and becomes unreachable.
Date Wed, 21 Aug 2013 10:59:50 GMT
I set up an HBase cluster on two machines. One machine runs the master as
well as a regionserver (RS), and the other runs only an RS. After running
./start-hbase.sh, all daemons start without errors. But the second machine,
which runs only the RS, gets disconnected after some time, and whatever data
I insert into an HBase table resides only on the master machine.
I see the following error in the regionserver log.

2013-08-21 16:16:17,243 INFO org.apache.zookeeper.ZooKeeper: Initiating
client connection, connectString=vamshi_RS:2181 sessionTimeout=180000
watcher=regionserver:60020
2013-08-21 16:16:17,253 INFO
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: The identifier of
this process is 31047@vamshi
2013-08-21 16:16:17,258 INFO org.apache.zookeeper.ClientCnxn: Opening
socket connection to server vamshi_RS/192.168.1.57:2181. Will not attempt
to authenticate using SASL (Unable to locate a login configuration)
2013-08-21 16:17:20,347 WARN org.apache.zookeeper.ClientCnxn: Session 0x0
for server null, unexpected error, closing socket connection and attempting
reconnect
java.net.ConnectException: Connection timed out
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
    at
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2013-08-21 16:17:20,463 WARN
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient
ZooKeeper exception:
org.apache.zookeeper.KeeperException$ConnectionLossException:
KeeperErrorCode = ConnectionLoss for /hbase/master
2013-08-21 16:17:20,463 INFO org.apache.hadoop.hbase.util.RetryCounter:
Sleeping 2000ms before retry #1...
2013-08-21 16:17:21,458 INFO org.apache.zookeeper.ClientCnxn: Opening
socket connection to server vamshi_RS/192.168.1.57:2181. Will not attempt
to authenticate using SASL (Unable to locate a login configuration)
2013-08-21 16:18:24,601 WARN org.apache.zookeeper.ClientCnxn: Session 0x0
for server null, unexpected error, closing socket connection and attempting
reconnect
java.net.ConnectException: Connection timed out
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
    at
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2013-08-21 16:18:24,702 WARN
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient
ZooKeeper exception:
org.apache.zookeeper.KeeperException$ConnectionLossException:
KeeperErrorCode = ConnectionLoss for /hbase/master
2013-08-21 16:18:24,702 INFO org.apache.hadoop.hbase.util.RetryCounter:
Sleeping 4000ms before retry #2...
2013-08-21 16:18:25,702 INFO org.apache.zookeeper.ClientCnxn: Opening
socket connection to server vamshi_RS/192.168.1.57:2181. Will not attempt
to authenticate using SASL (Unable to locate a login configuration)
2013-08-21 16:19:28,857 WARN org.apache.zookeeper.ClientCnxn: Session 0x0
for server null, unexpected error, closing socket connection and attempting
reconnect
java.net.ConnectException: Connection timed out
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
    at
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
.
.
..
2013-08-21 16:20:33,217 FATAL
org.apache.hadoop.hbase.regionserver.HRegionServer: RegionServer abort:
loaded coprocessors are: []
2013-08-21 16:20:33,217 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Unexpected
exception during initialization, aborting
2013-08-21 16:20:34,214 INFO org.apache.zookeeper.ClientCnxn: Opening
socket connection to server vamshi_RS/192.168.1.57:2181. Will not attempt
to authenticate using SASL (Unable to locate a login configuration)
2013-08-21 16:20:36,220 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
server on 60020
2013-08-21 16:20:36,221 FATAL
org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server
vamshi,60020,1377081977160: Initialization of RS failed.  Hence aborting RS.
java.io.IOException: Received the shutdown message while waiting.
    at
org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:680)
    at
org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:649)
    at
org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:609)
    at
org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:735)
    at java.lang.Thread.run(Thread.java:662)
2013-08-21 16:20:36,222 FATAL
org.apache.hadoop.hbase.regionserver.HRegionServer: RegionServer abort:
loaded coprocessors are: []
2013-08-21 16:20:36,222 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Initialization
of RS failed.  Hence aborting RS.
2013-08-21 16:20:36,224 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Registered RegionServer
MXBean
2013-08-21 16:20:36,225 FATAL
org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server
vamshi,60020,1377081977160: Unhandled exception: null
java.lang.NullPointerException
    at
org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:756)
    at java.lang.Thread.run(Thread.java:662)
2013-08-21 16:20:36,225 FATAL
org.apache.hadoop.hbase.regionserver.HRegionServer: RegionServer abort:
loaded coprocessors are: []
2013-08-21 16:20:36,225 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Unhandled
exception: null
2013-08-21 16:20:36,226 INFO
org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook starting;
hbase.shutdown.hook=true; fsShutdownHook=Thread[Thread-5,5,main]
2013-08-21 16:20:36,226 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Shutdown hook
2013-08-21 16:20:36,227 INFO
org.apache.hadoop.hbase.regionserver.ShutdownHook: Starting fs shutdown
hook thread.
2013-08-21 16:20:36,228 INFO
org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook finished.
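
Every "Connection timed out" above happens while opening a socket to vamshi_RS/192.168.1.57:2181, so one thing that can be checked outside HBase is whether that host and port accept a plain TCP connection from the second machine. The following is only a minimal sketch of such a check; the function name can_connect and the 5-second timeout are my own assumptions and are not part of HBase or ZooKeeper.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration; it only tests raw TCP reachability
    and does not speak the ZooKeeper protocol.
    """
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers name-resolution failures, refused connections, and timeouts.
        return False
```

Run on the second machine, can_connect("vamshi_RS", 2181) uses the host and port taken from the connectString in the log. A False result would point at name resolution, routing, or a firewall rather than at HBase itself, but it does not say which of those is the cause.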

I don't know what is wrong, and I am struggling to resolve this.
Please, somebody help.

Below is the content of conf/hbase-site.xml, which is the same on both machines.
    <property>
        <name>hbase.rootdir</name>
        <value>/home/biginfolabs/BILSftwrs/hbase-0.94.10/hbstmp/</value>
    </property>

    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <property>
        <name>hbase.master</name>
        <value>vamshi_RS</value>
    </property>
    <property>
        <name>hbase.zookeeper.property.clientPort</name>
        <value>2181</value>
    </property>

    <property>
        <name>hbase.hregion.max.filesize</name>
        <value>50</value>
    </property>

    <property>
        <name>hbase.balancer.period</name>
        <value>60000</value>
    </property>

    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>vamshi_RS</value>
    </property>
    <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/home/biginfolabs/BILSftwrs/hbase-0.94.10/zkptmp</value>
    </property>
    <property>
        <name>hbase.client.scanner.caching</name>
        <value>1000</value>
        <description>Number of rows that will be fetched when calling next
        </description>
    </property>
    <property>
        <name>hbase.zookeeper.property.maxClientCnxns</name>
        <value>1024</value>
    </property>

    <property>
        <name>hbase.coprocessor.user.region.classes</name>
        <value>com.bil.coproc.ColumnAggregationEndpoint</value>
    </property>
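
The connectString=vamshi_RS:2181 in the log matches hbase.zookeeper.quorum and hbase.zookeeper.property.clientPort from this file. As a rough sketch of how those two properties combine, the snippet below reads them from a site file and joins them. The function name zk_connect_string is hypothetical, the <configuration> wrapper is assumed (the fragment above omits it), and the fallback values "localhost" and "2181" are assumed defaults. This is not HBase's own code.

```python
import xml.etree.ElementTree as ET


def zk_connect_string(site_xml: str) -> str:
    """Join each quorum host with the client port, e.g. 'host1:2181,host2:2181'.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(site_xml)
    # Collect every <property> element as a name -> value mapping.
    props = {
        p.findtext("name", "").strip(): p.findtext("value", "").strip()
        for p in root.iter("property")
    }
    # Fallbacks are assumed defaults, used only when a property is absent.
    port = props.get("hbase.zookeeper.property.clientPort", "2181")
    quorum = props.get("hbase.zookeeper.quorum", "localhost")
    hosts = [h.strip() for h in quorum.split(",")]
    return ",".join(f"{h}:{port}" for h in hosts if h)


# Two properties copied from the file above, wrapped in an assumed root element.
SAMPLE = """
<configuration>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>vamshi_RS</value>
  </property>
</configuration>
"""

print(zk_connect_string(SAMPLE))  # prints vamshi_RS:2181
```

With the values from this post the result is the same host and port that the regionserver keeps trying to reach in the log.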

-- 
Regards,
Vamshi