hbase-user mailing list archives

From Jay Wilson <registrat...@circle-cross-jn.com>
Subject Re: HBASE -- Regionserver and QuorumPeer ?
Date Mon, 02 Jul 2012 22:55:21 GMT
First, thank you.

I moved my HRegionServers, not my HQuorumPeers.

I have checked the network and every node can reach every other node.  I
can even talk to my HQuorumPeers via "nc" from the nodes that should be
running my HMaster and my HRegionServers.

[hadoop@devrackA-00 ~]$ zookeeper-check
devrackA-03
imok
This ZooKeeper instance is not currently serving requests
This ZooKeeper instance is not currently serving requests



devrackA-04
imok
Zookeeper version: 3.3.5-cdh3u4--1, built on 05/07/2012 20:10 GMT
Clients:
 /172.18.0.1:41582[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0/0
Received: 5
Sent: 4
Outstanding: 0
Zxid: 0x0
Mode: follower
Node count: 4
 /172.18.0.1:41583[0](queued=0,recved=1,sent=0)




devrackA-05
imok
Zookeeper version: 3.3.5-cdh3u4--1, built on 05/07/2012 20:10 GMT
Clients:
 /172.18.0.1:35517[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0/0
Received: 5
Sent: 4
Outstanding: 0
Zxid: 0x0
Mode: follower
Node count: 4
 /172.18.0.1:35518[0](queued=0,recved=1,sent=0)


~~~~~~~~~~~~~~~~~~~~


[hadoop@devrackA-06 ~]$ jps
21276 Jps
20641 DataNode
[hadoop@devrackA-06 ~]$ echo ruok | nc devrackA-04 2181
imok[hadoop@devrackA-06 ~]$ echo stat | nc devrackA-04 2181
Zookeeper version: 3.3.5-cdh3u4--1, built on 05/07/2012 20:10 GMT
Clients:
 /172.18.0.7:37950[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0/0
Received: 8
Sent: 7
Outstanding: 0
Zxid: 0x0
Mode: follower
Node count: 4


~~~~~~~~~~~~~~~~~~~


[hadoop@devrackB-07 ~]$ echo ruok | nc devrackA-04 2181
imok[hadoop@devrackB-07 ~]$ echo stat | nc devrackA-03 2181
This ZooKeeper instance is not currently serving requests
[hadoop@devrackB-07 ~]$ echo stat | nc devrackA-05 2181
Zookeeper version: 3.3.5-cdh3u4--1, built on 05/07/2012 20:10 GMT
Clients:
 /172.18.0.72:40784[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0/0
Received: 7
Sent: 6
Outstanding: 0
Zxid: 0x0
Mode: follower
Node count: 4
[hadoop@devrackB-07 ~]$ echo stat | nc devrackA-04 2181
Zookeeper version: 3.3.5-cdh3u4--1, built on 05/07/2012 20:10 GMT
Clients:
 /172.18.0.72:60795[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0/0
Received: 10
Sent: 9
Outstanding: 0
Zxid: 0x0
Mode: follower
Node count: 4
[hadoop@devrackB-07 ~]$
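
The per-host checks above could be collected into a single pass over the
quorum. This is only a minimal sketch, assuming a POSIX shell with awk and
nc on the node; zk_mode and check_quorum are hypothetical helper names, not
part of ZooKeeper or HBase. The host names and port 2181 are the ones from
the output above.

```shell
#!/bin/sh
# Sketch only: summarize what each ZooKeeper peer answers to "stat".
# zk_mode and check_quorum are hypothetical helpers (assumptions).

# Read "stat" output on stdin and print one word:
#   the value of the "Mode:" line (e.g. follower, leader),
#   "not-serving" if the peer answered but is not serving requests,
#   "unknown" if neither line was seen (e.g. connection refused).
zk_mode() {
  awk '
    /^Mode:/ { print $2; found = 1; exit }
    /not currently serving requests/ { print "not-serving"; found = 1; exit }
    END { if (!found) print "unknown" }
  '
}

# Send "stat" to every host given as an argument and print "host: mode".
check_quorum() {
  for zk_host in "$@"; do
    printf '%s: %s\n' "$zk_host" "$(echo stat | nc "$zk_host" 2181 | zk_mode)"
  done
}
```

It would be called as "check_quorum devrackA-03 devrackA-04 devrackA-05".
Printing one line per peer makes it easier to see at a glance whether any
peer reports itself as leader.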

~~~~~~~~~~~

I know it says connection refused in the error, but are there files
associated with an HRegionServer that I need to clean up?  I did NOT move
the HMaster or HQuorumPeers.  I only moved the HRegionServers.

Thank you for the help.

---
Jay Wilson





On 7/2/2012 2:43 PM, Suraj Varma wrote:
> The error you are getting is:
> 
>> 2012-07-02 12:39:02,205 INFO org.apache.zookeeper.ClientCnxn: Opening
>> socket connection to server devrackA-05/172.18.0.6:2181
>> 2012-07-02 12:39:02,211 WARN org.apache.zookeeper.ClientCnxn: Session
>> 0x0 for server null, unexpected error, closing socket connection and
>> attempting reconnect
>> java.net.ConnectException: Connection refused
> 
> 
> This means this server is not able to reach the zookeeper. Did you
> change your hbase-site.xml as well with the new zookeeper quorum?
> Do basic connectivity testing to ensure that your hosts and DNS are all
> in place after your relocations. Check out
> http://hbase.apache.org/book.html#d1952e311 and see if the DNS checker
> tool might help.
> --S
> 
> 
> 
> On Mon, Jul 2, 2012 at 1:12 PM, Jay Wilson
> <registration@circle-cross-jn.com> wrote:
>> First, Yep I am a newbie to Hadoop/Hbase. I have read both of the
>> O'Reilly books (Hadoop and Hbase), so my knowledge level at this point
>> is pure book learning and understanding the log messages is very vexing.
>>
>> Second, based on the recommendations of this mail-list I decided to move
>> my HRegionservers to nodes other than where my HQuorumpeers are.
>> I updated my regionservers file on every node in the cluster. I ran
>> stop-hbase.sh, stop-all.sh, and cleaned up my zookeeper files.  Then I
>> ran start-all.sh, waited, and then ran start-hbase.sh.  Now my HMaster
>> and HRegionservers terminate within seconds.  Before I had them at least
>> running for 30 minutes.  The message is:
>>
>> 2012-07-02 12:39:02,193 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:java.io.tmpdir=/tmp
>> 2012-07-02 12:39:02,193 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:java.compiler=<NA>
>> 2012-07-02 12:39:02,193 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:os.name=Linux
>> 2012-07-02 12:39:02,193 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:os.arch=amd64
>> 2012-07-02 12:39:02,193 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:os.version=2.6.18-194.el5
>> 2012-07-02 12:39:02,193 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:user.name=hadoop
>> 2012-07-02 12:39:02,193 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:user.home=/home/hadoop
>> 2012-07-02 12:39:02,193 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:user.dir=/home/hadoop/jscripts
>> 2012-07-02 12:39:02,194 INFO org.apache.zookeeper.ZooKeeper: Initiating
>> client connection,
>> connectString=devrackA-03:2181,devrackA-05:2181,devrackA-04:2181
>> sessionTimeout=180000 watcher=master:60000
>> 2012-07-02 12:39:02,205 INFO org.apache.zookeeper.ClientCnxn: Opening
>> socket connection to server devrackA-05/172.18.0.6:2181
>> 2012-07-02 12:39:02,211 WARN org.apache.zookeeper.ClientCnxn: Session
>> 0x0 for server null, unexpected error, closing socket connection and
>> attempting reconnect
>> java.net.ConnectException: Connection refused
>>
>> I tried the same sequence again (stop-hbase.sh, stop-all.sh, and cleaned
>> up zookeeper), but I get the same result (Connection refused).  Is there
>> something else I need to do when I move a regionserver?
>>
>> My zookeeper working directory is /home/hbase/zookeeper.  Would there be
>> other places that I need to clean up?
>>
>>
>>
>> Thank You
>> --
>> Jay
>>
>>
>>
>> On 7/2/2012 11:25 AM, Amandeep Khurana wrote:
>>> As someone who has been developing/running/using the software for a
>>> longer period of time than the person who is asking the question, you
>>> can best serve the poser by making them aware of the trade-offs and why
>>> it's a good/bad idea to do things a certain way. At the end of the day,
>>> it's their choice to make based on their requirements and constraints.
>>>
>>> Having said that, it'll be really nice to stop this thread from becoming
>>> more about how to answer questions rather than answering the question
>>> itself.
>>>
>>> Bringing the thread back on track:
>>>
>>> Jay, you can certainly run zookeepers with the Datanodes and Region
>>> Server processes. The issue there (as highlighted by Andy earlier) is
>>> that you will likely load up the machine (primarily due to I/O) which
>>> will cause ZK some grief. It is generally recommended to collocate in
>>> the following groups:
>>>
>>> Datanode + Region Servers on the same physical nodes
>>> Zookeeper and HBase Master on the same physical nodes (make sure to give
>>> ZK a dedicated spindle)
>>> Namenode on an independent node
>>> Secondary Namenode on an independent node
>>>
>>> These are the general recommendations and different environments might
>>> warrant different decisions. For instance, if it's just a PoC or Dev
>>> cluster where you don't really want to fret about SLAs and want to keep
>>> costs low, it might even be okay to collocate the Namenode, Zookeeper
>>> and HBase master on the same physical host.
>>>
>>> Hope that helps
>>>
>>> -Amandeep
>>>
>>>
>>> On Monday, July 2, 2012 at 4:40 AM, Michael Segel wrote:
>>>
>>>> I am not finding fault with what Andy was saying. The problem is that
>>>> we tend not to use stronger language when discussing these topics. And
>>>> my point wasn't just on this topic but other posts where we say 'not a
>>>> good idea' yet someone still pursues the idea until there's a chorus
>>>> of saying not to do something. I'm not faulting the poster because he
>>>> wasn't and isn't the only one who does this... We see it all the time
>>>> where someone goes down the wrong path, and is looking for a quick
>>>> solution, rather than following the recommendation.
>>>
>>>
>>
> 
> 

