Subject: Re: Hbase region server is not communicating with zookeeper and stopping after some time it was started
From: Pavan Sudheendra
To: user@hbase.apache.org
Date: Thu, 22 Aug 2013 20:06:03 +0530

And just to be clear, and sorry if this is a dumb question: after updating
the /etc/hosts file, are we supposed to restart HBase?

On Thu, Aug 22, 2013 at 8:03 PM, Pavan Sudheendra wrote:

> Isn't hbase.zookeeper.quorum supposed to contain only the address of the
> HBase master instead of all the region servers?
>
> On Thu, Aug 22, 2013 at 8:01 PM, Pavan Sudheendra wrote:
>
>> Vamshi and Jay, can you both share your /etc/hosts files?
>>
>> I have the exact same problem. All the nodes in my namenode cluster just
>> log this "connection refused" error when they should be logging something
>> useful for debugging. In my case, though, the HBase region server tries to
>> connect to localhost when I want it to connect to its master.
>>
>> On Thu, Aug 22, 2013 at 7:24 PM, Jay Vyas wrote:
>>
>>> Yes, this sounds like a ZooKeeper DNS error.
>>>
>>> I ran into this type of issue a few months ago and wrote up my
>>> solutions to the 3 main HBase communication/setup errors I got.
>>>
>>> See if this helps:
>>> http://jayunit100.blogspot.com/2013/05/debugging-hbase-installation.html
>>>
>>> Also make sure iptables is off, etc.
>>>
>>> On Aug 22, 2013, at 6:02 AM, Vamshi Krishna wrote:
>>>
>>> > Hi, I set up an HBase cluster of 2 machines.
>>> >
>>> > Master machine (vamshi_RS): running both Master and RegionServer
>>> > Slave machine: running only RegionServer
>>> >
>>> > After I ran start-hbase.sh, all the daemons started fine, but after
>>> > some time the RegionServer on the slave machine stops.
>>> >
>>> > I analysed the region server log; its content is below.
>>> > Somehow the region server machine is not able to communicate with
>>> > ZooKeeper (I guess). Is that the reason?
>>> >
>>> > Please look at my hbase-site.xml below (after the log content), which is
>>> > the same on both machines, and kindly let me know the solution for this
>>> > issue.
>>> >
>>> > 2013-08-22 14:03:25,023 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=vamshi_RS:2181 sessionTimeout=180000 watcher=regionserver:60020
>>> > 2013-08-22 14:03:25,033 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: The identifier of this process is 7426@vamshi
>>> > 2013-08-22 14:03:25,038 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server vamshi_RS/192.168.1.57:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
>>> > 2013-08-22 14:04:28,171 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
>>> > java.net.ConnectException: Connection timed out
>>> >     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>> >     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
>>> >     at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
>>> >     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
>>> > 2013-08-22 14:04:28,287 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
>>> > 2013-08-22 14:04:28,287 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 2000ms before retry #1...
>>> > 2013-08-22 14:04:29,282 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server vamshi_RS/192.168.1.57:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
>>> > 2013-08-22 14:05:32,425 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
>>> > java.net.ConnectException: Connection timed out
>>> >     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>> >     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
>>> >     at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
>>> >     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
>>> > 2013-08-22 14:05:32,526 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
>>> > 2013-08-22 14:05:32,526 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 4000ms before retry #2...
>>> > 2013-08-22 14:05:33,526 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server vamshi_RS/192.168.1.57:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
>>> > 2013-08-22 14:06:36,617 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
>>> > java.net.ConnectException: Connection timed out
>>> >     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>> >     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
>>> >     at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
>>> >     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
>>> > .
>>> > .
>>> > .
>>> >
>>> > hbase-site.xml:
>>> >
>>> > hbase.rootdir
>>> > /home/biginfolabs/BILSftwrs/hbase-0.94.10/hbstmp/
>>> >
>>> > hbase.cluster.distributed
>>> > true
>>> >
>>> > hbase.master
>>> > vamshi_RS
>>> >
>>> > hbase.zookeeper.property.clientPort
>>> > 2181
>>> >
>>> > hbase.hregion.max.filesize
>>> > 50
>>> >
>>> > hbase.balancer.period
>>> > 60000
>>> >
>>> > hbase.zookeeper.quorum
>>> > vamshi_RS
>>> >
>>> > hbase.zookeeper.property.dataDir
>>> > /home/biginfolabs/BILSftwrs/hbase-0.94.10/zkptmp
>>> >
>>> > hbase.client.scanner.caching
>>> > 1000
>>> > Number of rows that will be fetched when calling next
>>> >
>>> > hbase.zookeeper.property.maxClientCnxns
>>> > 1024
>>> >
>>> > hbase.coprocessor.user.region.classes
>>> > com.bil.coproc.ColumnAggregationEndpoint
>>> >
>>> > --
>>> > Regards
>>> > Vamshi Krishna
>>
>> --
>> Regards-
>> Pavan
>
> --
> Regards-
> Pavan

--
Regards-
Pavan
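
For anyone following the thread, here is a minimal sketch of the two checks discussed above: whether the ZooKeeper quorum host (vamshi_RS in the hbase-site.xml quoted here) is mapped to a loopback address in /etc/hosts, and whether the client port (2181) accepts a TCP connection from the region server machine. The function names and the sample hosts text in the usage note are hypothetical, nobody in the thread has posted a real /etc/hosts file, and this is only a diagnostic aid, not a confirmed fix.

```python
import socket

# Address prefixes treated as loopback for this check (an assumption that
# covers the common 127.x.x.x range and the IPv6 loopback address).
LOOPBACK_PREFIXES = ("127.", "::1")


def parse_hosts(text):
    """Map every hostname or alias in /etc/hosts-style text to its addresses."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        for name in names:
            mapping.setdefault(name, []).append(address)
    return mapping


def loopback_entries(mapping, hostname):
    """Return the loopback addresses that hostname is mapped to, if any."""
    return [a for a in mapping.get(hostname, [])
            if a.startswith(LOOPBACK_PREFIXES)]


def can_connect(host, port, timeout=5.0):
    """Attempt a plain TCP connect; False on refusal, timeout, or DNS failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

To use it, read /etc/hosts on each machine, pass the text to `parse_hosts`, and call `loopback_entries(mapping, "vamshi_RS")`; a non-empty result would be consistent with the "tries to connect to localhost" symptom. Running `can_connect("vamshi_RS", 2181)` on the slave machine returns False for a timeout like the one in the log, which may point to a firewall (iptables) or an address mismatch rather than to HBase itself.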