hbase-user mailing list archives

From Jay Vyas <jayunit...@gmail.com>
Subject Re: Hbase region server is not communicating with zookeeper and stopping after some time it was started
Date Fri, 23 Aug 2013 11:25:27 GMT
Yup. Firewalls, iptables rules, and hostname resolution are the usual culprits in HBase
setup problems. I believe this is because of how strictly HBase relies on DNS, iirc.
http://sujee.net/tech/articles/hadoop/hadoop-dns/
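
For anyone hitting the same symptoms, the two checks discussed in this thread (a hostname
mapped to a loopback address in /etc/hosts, and a firewall blocking the ZooKeeper client
port) can be sketched in plain Java. This is only an illustration and not part of HBase:
the class name ClusterNetCheck and both of its methods are hypothetical, and the host and
port are whatever you pass on the command line.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not an HBase or ZooKeeper API.
class ClusterNetCheck {

    /** Returns the /etc/hosts-style lines that map hostname to a loopback (127.x.x.x) address. */
    static List<String> loopbackMappings(List<String> hostsLines, String hostname) {
        List<String> hits = new ArrayList<>();
        for (String raw : hostsLines) {
            // Drop any trailing comment, then split on whitespace: <ip> <name> [aliases...]
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] fields = line.split("\\s+");
            if (!fields[0].startsWith("127.")) {
                continue;
            }
            for (int i = 1; i < fields.length; i++) {
                if (fields[i].equalsIgnoreCase(hostname)) {
                    hits.add(line);
                    break;
                }
            }
        }
        return hits;
    }

    /** Attempts a plain TCP connect and reports whether it succeeded within the timeout. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Print the cause so a timeout can be told apart from a refused connection.
            System.err.println("connect to " + host + ":" + port + " failed: " + e);
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: ClusterNetCheck <zookeeper-host> <client-port>");
            return;
        }
        String host = args[0];
        int port = Integer.parseInt(args[1]);
        List<String> hosts = Files.readAllLines(Paths.get("/etc/hosts"));
        for (String line : loopbackMappings(hosts, host)) {
            System.out.println("loopback mapping in /etc/hosts: " + line);
        }
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 5000));
    }
}
```

Running it on the slave machine with something like `java ClusterNetCheck.java vamshi_RS 2181`
(single-file launch needs Java 11 or later) prints any loopback mapping for that name and
whether the port accepts a connection. A timeout rather than a refusal would be consistent
with the "Connection timed out" in the log below, which usually suggests a firewall silently
dropping packets.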


On Aug 23, 2013, at 6:37 AM, Vamshi Krishna <vamshi2105@gmail.com> wrote:

> Hey, I got my problem solved.
> The region server couldn't connect because of the firewall. It was enabled
> on my machine and disabled on the other one. I made sure the firewall is
> disabled on both machines, and it's working perfectly now.
> 
> Thank you for the responses.
> 
> @pavan, you have to restart your hbase after modifying /etc/hosts.
> Also, make sure that your /etc/hosts file contains only IP address to
> hostname mapping lines; comment out everything else.
> Example:
> <ip_address>  <hostname>
> <ip_address>  localhost
> .
> .
> Also, a well-known fix worth noting: replace 127.0.1.1 with 127.0.0.1.
> Hope this solves your problem.
> 
> 
> On Thu, Aug 22, 2013 at 8:06 PM, Pavan Sudheendra <pavan0591@gmail.com> wrote:
> 
>> And just to be clear, sorry if this is a dumb question.. after updating the
>> /etc/hosts file are we supposed to restart hbase?
>> 
>> 
>> On Thu, Aug 22, 2013 at 8:03 PM, Pavan Sudheendra <pavan0591@gmail.com> wrote:
>> 
>>> Isn't hbase.zookeeper.quorum supposed to contain only the address of the
>>> HBase master instead of all the region servers?
>>> 
>>> 
>>> 
>>> On Thu, Aug 22, 2013 at 8:01 PM, Pavan Sudheendra <pavan0591@gmail.com> wrote:
>>> 
>>>> Vamshi and Jay .. Can you both share your /etc/hosts file?
>>>> 
>>>> I have the exact same problem. All the nodes in my namenode cluster just
>>>> log this "connection refused" when they should log something useful for
>>>> debugging. But for me, the HBase region server tries to connect to
>>>> localhost when I want it to connect to its master.
>>>> 
>>>> 
>>>> On Thu, Aug 22, 2013 at 7:24 PM, Jay Vyas <jayunit100@gmail.com> wrote:
>>>> 
>>>>> Yes this sounds like a zookeeper DNS error.
>>>>> 
>>>>> I ran into this type of issue a few months ago and wrote up my
>>>>> solutions to the 3 main hbase communication/setup errors I got.
>>>>> 
>>>>> See if this helps:
>>>>> http://jayunit100.blogspot.com/2013/05/debugging-hbase-installation.html
>>>>> 
>>>>> Also make sure iptables is off, etc.
>>>>> 
>>>>> On Aug 22, 2013, at 6:02 AM, Vamshi Krishna <vamshi2105@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> Hi I setup a hbase cluster of 2 machines.
>>>>>> 
>>>>>> Master Machine (vamshi_RS) running both master & Regionserver
>>>>>> slave machine  - Running only Region server.
>>>>>> 
>>>>>> After I ran start-hbase.sh, all the daemons start perfectly, but after
>>>>>> some time the Regionserver on the slave machine stops.
>>>>>> 
>>>>>> I analysed the region server log, and the log content is below.
>>>>>> Somehow the region server machine is not able to communicate with
>>>>>> ZooKeeper (I guess). Is that the reason?
>>>>>> 
>>>>>> Please look at my hbase-site.xml below (after the log content), which is
>>>>>> the same on both machines, and kindly let me know the solution for this
>>>>>> issue.
>>>>>> 
>>>>>> 
>>>>>> 2013-08-22 14:03:25,023 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=vamshi_RS:2181 sessionTimeout=180000 watcher=regionserver:60020
>>>>>> 2013-08-22 14:03:25,033 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: The identifier of this process is 7426@vamshi
>>>>>> 2013-08-22 14:03:25,038 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server vamshi_RS/192.168.1.57:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
>>>>>> 2013-08-22 14:04:28,171 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
>>>>>> java.net.ConnectException: Connection timed out
>>>>>>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
>>>>>>   at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
>>>>>>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
>>>>>> 2013-08-22 14:04:28,287 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
>>>>>> 2013-08-22 14:04:28,287 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 2000ms before retry #1...
>>>>>> 2013-08-22 14:04:29,282 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server vamshi_RS/192.168.1.57:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
>>>>>> 2013-08-22 14:05:32,425 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
>>>>>> java.net.ConnectException: Connection timed out
>>>>>>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
>>>>>>   at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
>>>>>>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
>>>>>> 2013-08-22 14:05:32,526 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
>>>>>> 2013-08-22 14:05:32,526 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 4000ms before retry #2...
>>>>>> 2013-08-22 14:05:33,526 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server vamshi_RS/192.168.1.57:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
>>>>>> 2013-08-22 14:06:36,617 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
>>>>>> java.net.ConnectException: Connection timed out
>>>>>>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
>>>>>>   at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
>>>>>>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
>>>>>> .
>>>>>> .
>>>>>> .
>>>>>> 
>>>>>> 
>>>>>> hbase-site.xml:
>>>>>> 
>>>>>> <property>
>>>>>>       <name>hbase.rootdir</name>
>>>>>>   <!--value>hdfs://vamshi:54310/home/biginfolabs/BILSftwrs/hbase-0.94.10/data/</value-->
>>>>>>   <value>/home/biginfolabs/BILSftwrs/hbase-0.94.10/hbstmp/</value>
>>>>>>   </property>
>>>>>> 
>>>>>>   <property>
>>>>>>       <name>hbase.cluster.distributed</name>
>>>>>>       <value>true</value>
>>>>>>   </property>
>>>>>>   <property>
>>>>>>       <name>hbase.master</name>
>>>>>>       <value>vamshi_RS</value>
>>>>>>   </property>
>>>>>>   <property>
>>>>>>       <name>hbase.zookeeper.property.clientPort</name>
>>>>>>       <value>2181</value>
>>>>>>   </property>
>>>>>> 
>>>>>>  <property>
>>>>>>       <name>hbase.hregion.max.filesize</name>
>>>>>>       <value>50</value>
>>>>>>   </property>
>>>>>> 
>>>>>>  <property>
>>>>>>       <name>hbase.balancer.period</name>
>>>>>>       <value>60000</value>
>>>>>>   </property>
>>>>>> 
>>>>>>   <property>
>>>>>>       <name>hbase.zookeeper.quorum</name>
>>>>>>       <value>vamshi_RS</value>
>>>>>>   </property>
>>>>>>   <property>
>>>>>>       <name>hbase.zookeeper.property.dataDir</name>
>>>>>>       <value>/home/biginfolabs/BILSftwrs/hbase-0.94.10/zkptmp</value>
>>>>>>   </property>
>>>>>> <property>
>>>>>>   <name>hbase.client.scanner.caching</name>
>>>>>>   <value>1000</value>
>>>>>>   <description>Number of rows that will be fetched when calling next
>>>>>>   </description>
>>>>>> </property>
>>>>>> <property>
>>>>>>   <name>hbase.zookeeper.property.maxClientCnxns</name>
>>>>>>   <value>1024</value>
>>>>>> </property>
>>>>>> 
>>>>>> <property>
>>>>>>   <name>hbase.coprocessor.user.region.classes</name>
>>>>>>   <value>com.bil.coproc.ColumnAggregationEndpoint</value>
>>>>>> </property>
>>>>>> 
>>>>>> --
>>>>>> *Regards*
>>>>>> *
>>>>>> Vamshi Krishna
>>>>>> *
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Regards-
>>>> Pavan
>>> 
>>> 
>>> 
>>> --
>>> Regards-
>>> Pavan
>> 
>> 
>> 
>> --
>> Regards-
>> Pavan
> 
> 
> 
> -- 
> *Regards*
> *
> Vamshi Krishna
> *
