Subject: Re: Starting HBase in fully distributed mode...
From: Vaibhav Puranik <vpuranik@gmail.com>
To: hbase-user@hadoop.apache.org
Date: Mon, 7 Dec 2009 13:43:10 -0800

Changing the connection method to "Custom" might be OK; I don't remember it
exactly.

Unfortunately, there is no way to add an instance to a security group once
it's booted. You have to specify the security group at launch (check this
FAQ from Amazon:
http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1145#13).
You will have to shut down your cluster and reboot the instances with the
'hbase' security group.

Regards,
Vaibhav

On Mon, Dec 7, 2009 at 12:05 PM, Something Something
<mailinglists19@gmail.com> wrote:

> After doing that, "Connection Method" was changed automatically to
> "Custom...". Is that correct?
>
> Next step is...
> "Add all the machines on which hbase is running - master and slaves - to
> hbase group"
>
> Does that mean -
>
> Connection Method: SSH
> Protocol: TCP
> From & To Ports: ?
> Source: Master IP
>
> Connection Method: SSH
> Protocol: TCP
> From & To Ports: ?
> Source: Slave 1 IP
>
> Connection Method: SSH
> Protocol: TCP
> From & To Ports: ?
> Source: Slave 2 IP
>
> Is that what you mean? Please let me know. Thanks again for your help.
>
> On Mon, Dec 7, 2009 at 11:34 AM, Vaibhav Puranik <vpuranik@gmail.com>
> wrote:
>
> > Select the following fields:
> >
> > Connection Method: All
> > Leave Protocol, From Port and To Port empty (or default).
> > Type 'hbase' (or the same group name) in the Source field. Notice that
> > the Source field says "IP or Group" - you can type any group name there.
> >
> > Regards,
> > Vaibhav Puranik
> > Gumgum
> >
> > On Mon, Dec 7, 2009 at 11:30 AM, Something Something
> > <mailinglists19@gmail.com> wrote:
> >
> > > Hmm, not sure what you mean by "Add hbase into hbase".
> > >
> > > I added the security group 'hbase' using the AWS Console. The screen
> > > has the following columns at the bottom:
> > >
> > > Connection Method, Protocol, From Port, To Port, Source, Actions
> > >
> > > Please let me know. Thanks.
> > >
> > > On Mon, Dec 7, 2009 at 11:12 AM, Vaibhav Puranik
> > > <vpuranik@gmail.com> wrote:
> > >
> > > > Here is what I suggest:
> > > >
> > > > Make a security group - say, hbase.
> > > > Add hbase into hbase.
> > > >
> > > > Add all the machines on which hbase is running - master and
> > > > slaves - to the hbase group.
> > > >
> > > > And use the private names that start with domU-XXXXXXXXXXXX in the
> > > > configuration files.
> > > >
> > > > This should work.
> > > > Regards,
> > > > Vaibhav
> > > >
> > > > On Sun, Dec 6, 2009 at 9:04 PM, Something Something
> > > > <mailinglists19@gmail.com> wrote:
> > > >
> > > > > After using internal IPs on EC2, Hadoop started cleanly, with no
> > > > > errors in any of the 4 logs (on Master) & 2 logs (on each Slave).
> > > > >
> > > > > But when I start HBase, I get this...
> > > > >
> > > > > java.net.ConnectException: Connection refused
> > > > >     at sun.nio.ch.Net.connect(Native Method)
> > > > >     at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:507)
> > > > >     at java.nio.channels.SocketChannel.open(SocketChannel.java:146)
> > > > >     at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:323)
> > > > >     at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:356)
> > > > >     at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:603)
> > > > >     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:488)
> > > > > 2009-12-07 04:24:56,006 INFO org.apache.zookeeper.server.quorum.FastLeaderElection: Notification time out: 400
> > > > > 2009-12-07 04:24:56,428 WARN org.apache.zookeeper.server.quorum.QuorumCnxManager: Cannot open channel to 2 at election address domU-12-31-38-00-44-99/10.252.75.133:3888
> > > > > java.net.ConnectException: Connection refused
> > > > >     at sun.nio.ch.Net.connect(Native Method)
> > > > >     at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:507)
> > > > >     at java.nio.channels.SocketChannel.open(SocketChannel.java:146)
> > > > >     at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:323)
> > > > >     at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:356)
> > > > >     at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:603)
> > > > >     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:488)
> > > > > 2009-12-07 04:24:56,434 WARN org.apache.zookeeper.server.quorum.QuorumCnxManager: Cannot open channel to 1 at election address domU-12-31-38-00-91-99/10.252.150.226:3888
> > > > >
> > > > > At first I thought this was because port 3888 is not open, so I
> > > > > added '3888' to my "default group" under "Security Groups" of the
> > > > > EC2 Console with the source set to 0.0.0.0/0. To the best of my
> > > > > knowledge that's the way to open a port under EC2 (correct?)
> > > > >
> > > > > Later I created 3 new EC2 instances from scratch, but still got
> > > > > these messages. Seems like ZooKeeper is not starting automatically
> > > > > on this port on the Slaves. Any reason why? Please help. Thanks.
> > > > >
> > > > > On Fri, Dec 4, 2009 at 3:03 PM, Patrick Hunt wrote:
> > > > >
> > > > > > That is weird because it works for me. I just tried your example
> > > > > > (eth0 vs ath0) and I was able to "echo stat | nc 2181" as well
> > > > > > as connect a ZK client successfully using either IP address.
> > > > > >
> > > > > > netstat -a shows this:
> > > > > > tcp6       0      0 [::]:2181      [::]:*      LISTEN
> > > > > >
> > > > > > What do you see for netstat?
> > > > > >
> > > > > > I'm on ipv4, are you running ipv6?
> > > > > >
> > > > > > Patrick
> > > > > >
> > > > > > Jean-Daniel Cryans wrote:
> > > > > >
> > > > > >> It seems not...
> > > > > >> For example, on my dev machine I have an interface for the
> > > > > >> wired network and another one for wireless. When I start ZK it
> > > > > >> binds on only one interface, so if I connect to the other IP it
> > > > > >> doesn't work.
> > > > > >>
> > > > > >> J-D
> > > > > >>
> > > > > >> On Fri, Dec 4, 2009 at 2:35 PM, Patrick Hunt wrote:
> > > > > >>
> > > > > >>> Sorry, but I'm still not able to grok this issue. Perhaps you
> > > > > >>> can shed more light. Here's the exact code from our server to
> > > > > >>> bind to the client port:
> > > > > >>>
> > > > > >>>     ss.socket().bind(new InetSocketAddress(port));
> > > > > >>>
> > > > > >>> My understanding from the java docs is this:
> > > > > >>>
> > > > > >>>     public InetSocketAddress(int port)
> > > > > >>>     "Creates a socket address where the IP address is the
> > > > > >>>     wildcard address and the port number a specified value."
> > > > > >>>
> > > > > >>> AFAIK this binds the socket onto the specified port for any IP
> > > > > >>> on any interface of the host. Where am I going wrong?
> > > > > >>>
> > > > > >>> Patrick
> > > > > >>>
> > > > > >>> Jean-Daniel Cryans wrote:
> > > > > >>>
> > > > > >>>> The first two definitions here are what I'm talking about:
> > > > > >>>>
> > > > > >>>> http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1346
> > > > > >>>>
> > > > > >>>> So by default it usually doesn't listen on the interface
> > > > > >>>> associated with the hostname ec2-IP-compute-1.amazonaws.com
> > > > > >>>> but on the other one (IIRC it starts with dom-).
> > > > > >>>>
> > > > > >>>> J-D
> > > > > >>>>
> > > > > >>>> On Fri, Dec 4, 2009 at 12:41 PM, Patrick Hunt
> > > > > >>>> <phunt@apache.org> wrote:
> > > > > >>>>
> > > > > >>>>> I'm not familiar with EC2. When you say "listen on private
> > > > > >>>>> hostname", what does that mean? Do you mean "by default
> > > > > >>>>> listen on an interface with a non-routable (local-only) IP"?
> > > > > >>>>> Or something else? Is there an AWS page you can point me to?
> > > > > >>>>>
> > > > > >>>>> Patrick
> > > > > >>>>>
> > > > > >>>>> Jean-Daniel Cryans wrote:
> > > > > >>>>>
> > > > > >>>>>> When you saw:
> > > > > >>>>>>
> > > > > >>>>>> org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /ebs1/mapred/system,/ebs2/mapred/system. Name node is in safe mode.
> > > > > >>>>>> The ratio of reported blocks 0.0000 has not reached the threshold 0.9990. *Safe mode will be turned off automatically*.
> > > > > >>>>>>
> > > > > >>>>>> it means that HDFS is blocking everything (aka safe mode)
> > > > > >>>>>> until all datanodes have reported for duty (and then it
> > > > > >>>>>> waits 30 seconds to make sure).
> > > > > >>>>>>
> > > > > >>>>>> When you saw:
> > > > > >>>>>>
> > > > > >>>>>> Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = *NoNode for /hbase/master*
> > > > > >>>>>>
> > > > > >>>>>> it means that the Master node didn't write its znode in
> > > > > >>>>>> ZooKeeper, because... when you saw:
> > > > > >>>>>>
> > > > > >>>>>> 2009-12-04 07:07:37,149 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x0 to sun.nio.ch.SelectionKeyImpl@10e35d5
> > > > > >>>>>> java.net.ConnectException: Connection refused
> > > > > >>>>>>
> > > > > >>>>>> it really means that the connection was refused. It then
> > > > > >>>>>> says it attempted to connect to
> > > > > >>>>>> ec2-174-129-127-141.compute-1.amazonaws.com but wasn't
> > > > > >>>>>> able to.
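[A "Connection refused" like the ones quoted here can be checked outside ZooKeeper with a plain socket connect, much like the `nc` checks discussed above. This is a sketch, not thread content: the class name is made up, and the hostnames/ports in `main` are the ones from the logs - substitute your own.]

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Minimal TCP reachability probe: true if host:port accepts a connection
// within timeoutMs, false on refusal, timeout, or unresolvable host.
public class PortProbe {
    public static boolean isOpen(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hostnames taken from the logs in this thread; replace with yours.
        String[] hosts = {"domU-12-31-38-00-44-99", "domU-12-31-38-00-91-99"};
        // 2181 = ZK client port, 2888 = quorum port, 3888 = election port.
        for (String h : hosts) {
            for (int port : new int[] {2181, 2888, 3888}) {
                System.out.println(h + ":" + port + " open=" + isOpen(h, port, 2000));
            }
        }
    }
}
```

If the election port (3888 here) reports closed from a peer, leader election fails exactly as in the QuorumCnxManager warnings above.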
> > > > > >>>>>> AFAIK in EC2 the java processes tend to listen on their
> > > > > >>>>>> private hostname, not the public one (which would be bad
> > > > > >>>>>> anyways).
> > > > > >>>>>>
> > > > > >>>>>> Bottom line, make sure stuff listens where it is expected
> > > > > >>>>>> and it should then work well.
> > > > > >>>>>>
> > > > > >>>>>> J-D
> > > > > >>>>>>
> > > > > >>>>>> On Fri, Dec 4, 2009 at 11:23 AM, Something Something wrote:
> > > > > >>>>>>
> > > > > >>>>>>> Hadoop: 0.20.1
> > > > > >>>>>>> HBase: 0.20.2
> > > > > >>>>>>> Zookeeper: The one which gets started by default by HBase.
> > > > > >>>>>>>
> > > > > >>>>>>> HBase logs:
> > > > > >>>>>>>
> > > > > >>>>>>> 1) Master log shows this WARN message, but then it says
> > > > > >>>>>>> 'connection successful':
> > > > > >>>>>>>
> > > > > >>>>>>> 2009-12-04 07:07:37,149 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x0 to sun.nio.ch.SelectionKeyImpl@10e35d5
> > > > > >>>>>>> java.net.ConnectException: Connection refused
> > > > > >>>>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> > > > > >>>>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
> > > > > >>>>>>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:933)
> > > > > >>>>>>> 2009-12-04 07:07:37,150 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown input
> > > > > >>>>>>> java.nio.channels.ClosedChannelException
> > > > > >>>>>>>     at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
> > > > > >>>>>>>     at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
> > > > > >>>>>>>     at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:999)
> > > > > >>>>>>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
> > > > > >>>>>>> 2009-12-04 07:07:37,150 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown output
> > > > > >>>>>>> java.nio.channels.ClosedChannelException
> > > > > >>>>>>>     at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
> > > > > >>>>>>>     at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
> > > > > >>>>>>>     at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1004)
> > > > > >>>>>>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
> > > > > >>>>>>> 2009-12-04 07:07:37,199 INFO org.apache.hadoop.hbase.master.RegionManager: -ROOT- region unset (but not set to be reassigned)
> > > > > >>>>>>> 2009-12-04 07:07:37,200 INFO org.apache.hadoop.hbase.master.RegionManager: ROOT inserted into regionsInTransition
> > > > > >>>>>>> 2009-12-04 07:07:37,667 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server ec2-174-129-127-141.compute-1.amazonaws.com/10.252.146.65:2181
> > > > > >>>>>>> 2009-12-04 07:07:37,668 INFO org.apache.zookeeper.ClientCnxn: Priming connection to java.nio.channels.SocketChannel[connected local=/10.252.162.19:46195 remote=ec2-174-129-127-141.compute-1.amazonaws.com/10.252.146.65:2181]
> > > > > >>>>>>> 2009-12-04 07:07:37,670 INFO org.apache.zookeeper.ClientCnxn: Server connection successful
> > > > > >>>>>>>
> > > > > >>>>>>> 2) Regionserver log shows this... but later seems to have
> > > > > >>>>>>> recovered:
> > > > > >>>>>>>
> > > > > >>>>>>> 2009-12-04 07:07:36,576 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x0 to sun.nio.ch.SelectionKeyImpl@4ee70b
> > > > > >>>>>>> java.net.ConnectException: Connection refused
> > > > > >>>>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> > > > > >>>>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
> > > > > >>>>>>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:933)
> > > > > >>>>>>> 2009-12-04 07:07:36,611 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown input
> > > > > >>>>>>> java.nio.channels.ClosedChannelException
> > > > > >>>>>>>     at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
> > > > > >>>>>>>     at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
> > > > > >>>>>>>     at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:999)
> > > > > >>>>>>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
> > > > > >>>>>>> 2009-12-04 07:07:36,611 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown output
> > > > > >>>>>>> java.nio.channels.ClosedChannelException
> > > > > >>>>>>>     at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
> > > > > >>>>>>>     at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
> > > > > >>>>>>>     at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1004)
> > > > > >>>>>>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
> > > > > >>>>>>> 2009-12-04 07:07:36,742 WARN org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Failed to set watcher on ZNode /hbase/master
> > > > > >>>>>>> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
> > > > > >>>>>>>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
> > > > > >>>>>>>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
> > > > > >>>>>>>     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:780)
> > > > > >>>>>>>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.watchMasterAddress(ZooKeeperWrapper.java:304)
> > > > > >>>>>>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.watchMasterAddress(HRegionServer.java:385)
> > > > > >>>>>>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.reinitializeZooKeeper(HRegionServer.java:315)
> > > > > >>>>>>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.reinitialize(HRegionServer.java:306)
> > > > > >>>>>>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:276)
> > > > > >>>>>>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> > > > > >>>>>>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> > > > > >>>>>>>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> > > > > >>>>>>>     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> > > > > >>>>>>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.doMain(HRegionServer.java:2474)
> > > > > >>>>>>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2542)
> > > > > >>>>>>> 2009-12-04 07:07:36,743 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to set watcher on ZooKeeper master address. Retrying.
> > > > > >>>>>>>
> > > > > >>>>>>> 3) ZooKeeper log: Nothing much in there... just a starting
> > > > > >>>>>>> message line, followed by:
> > > > > >>>>>>>
> > > > > >>>>>>> ulimit -n 1024
> > > > > >>>>>>>
> > > > > >>>>>>> I looked at the archives. There was one mail that talked
> > > > > >>>>>>> about 'ulimit'. Wonder if that has something to do with it.
> > > > > >>>>>>>
> > > > > >>>>>>> Thanks for your help.
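[Patrick's point earlier in the thread - that `new InetSocketAddress(port)` yields a wildcard bind listening on every interface - can be verified in a few lines. This demo is not thread content: it binds port 0 so the OS picks a free port instead of ZooKeeper's real ones, then connects over loopback even though 127.0.0.1 was never named in the bind.]

```java
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Shows that a port-only InetSocketAddress binds the wildcard address,
// so the listener is reachable on any local interface (here, loopback).
public class WildcardBindDemo {
    public static void main(String[] args) throws Exception {
        ServerSocket ss = new ServerSocket();
        ss.bind(new InetSocketAddress(0)); // wildcard address, OS-chosen port
        System.out.println("listening on " + ss.getLocalSocketAddress());
        try (Socket c = new Socket("127.0.0.1", ss.getLocalPort())) {
            System.out.println("loopback connect succeeded: " + c.isConnected());
        }
        ss.close();
    }
}
```

If a daemon is unreachable on one interface despite a wildcard bind like this, the blocker is usually outside the JVM - e.g. an EC2 security group rule, as discussed above.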
> > > > > >>>>>>> On Fri, Dec 4, 2009 at 8:18 AM, Mark Vigeant wrote:
> > > > > >>>>>>>
> > > > > >>>>>>>> When I first started my hbase cluster, it too gave me the
> > > > > >>>>>>>> NoNode for /hbase/master several times before it started
> > > > > >>>>>>>> working, and I believe this is a common beginner's error
> > > > > >>>>>>>> (I've seen it in a few emails in the past 2 weeks).
> > > > > >>>>>>>>
> > > > > >>>>>>>> What versions of HBase, Hadoop and ZooKeeper are you
> > > > > >>>>>>>> using?
> > > > > >>>>>>>>
> > > > > >>>>>>>> Also, take a look in your HBASE_HOME/logs folder. That
> > > > > >>>>>>>> would be a good place to start looking for some answers.
> > > > > >>>>>>>>
> > > > > >>>>>>>> -Mark
> > > > > >>>>>>>>
> > > > > >>>>>>>> -----Original Message-----
> > > > > >>>>>>>> From: Something Something [mailto:mailinglists19@gmail.com]
> > > > > >>>>>>>> Sent: Friday, December 04, 2009 2:28 AM
> > > > > >>>>>>>> To: hbase-user@hadoop.apache.org
> > > > > >>>>>>>> Subject: Starting HBase in fully distributed mode...
> > > > > >>>>>>>>
> > > > > >>>>>>>> Hello,
> > > > > >>>>>>>>
> > > > > >>>>>>>> I am trying to get Hadoop/HBase up and running in fully
> > > > > >>>>>>>> distributed mode. For now, I have only *1 Master & 2
> > > > > >>>>>>>> Slaves*.
> > > > > >>>>>>>>
> > > > > >>>>>>>> Hadoop starts correctly... I think. The only exception I
> > > > > >>>>>>>> see in the various log files is this one:
> > > > > >>>>>>>>
> > > > > >>>>>>>> org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /ebs1/mapred/system,/ebs2/mapred/system. Name node is in safe mode.
> > > > > >>>>>>>> The ratio of reported blocks 0.0000 has not reached the threshold 0.9990. *Safe mode will be turned off automatically*.
> > > > > >>>>>>>>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:1696)
> > > > > >>>>>>>>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:1676)
> > > > > >>>>>>>>     at org.apache.hadoop.hdfs.server.namenode.NameNode.delete(NameNode.java:517)
> > > > > >>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > > > >>>>>>>>
> > > > > >>>>>>>> Somehow this doesn't sound critical, so I assumed
> > > > > >>>>>>>> everything was good to go with Hadoop.
> > > > > >>>>>>>>
> > > > > >>>>>>>> So then I started HBase and opened a shell (hbase shell).
> > > > > >>>>>>>> So far everything looks good. Now when I try to run a
> > > > > >>>>>>>> 'list' command, I keep getting this message:
> > > > > >>>>>>>>
> > > > > >>>>>>>> Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = *NoNode for /hbase/master*
> > > > > >>>>>>>>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
> > > > > >>>>>>>>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
> > > > > >>>>>>>>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:892)
> > > > > >>>>>>>>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.readAddressOrThrow(ZooKeeperWrapper.java:328)
> > > > > >>>>>>>>
> > > > > >>>>>>>> Here's what I have in my *Master hbase-site.xml*:
> > > > > >>>>>>>>
> > > > > >>>>>>>> <property>
> > > > > >>>>>>>>   <name>hbase.rootdir</name>
> > > > > >>>>>>>>   <value>hdfs://master:54310/hbase</value>
> > > > > >>>>>>>> </property>
> > > > > >>>>>>>> <property>
> > > > > >>>>>>>>   <name>hbase.cluster.distributed</name>
> > > > > >>>>>>>>   <value>true</value>
> > > > > >>>>>>>> </property>
> > > > > >>>>>>>> <property>
> > > > > >>>>>>>>   <name>hbase.zookeeper.property.clientPort</name>
> > > > > >>>>>>>>   <value>2181</value>
> > > > > >>>>>>>> </property>
> > > > > >>>>>>>> <property>
> > > > > >>>>>>>>   <name>hbase.zookeeper.quorum</name>
> > > > > >>>>>>>>   <value>master,slave1,slave2</value>
> > > > > >>>>>>>> </property>
> > > > > >>>>>>>>
> > > > > >>>>>>>> The *Slave* hbase-site.xml files are set as follows:
> > > > > >>>>>>>>
> > > > > >>>>>>>> <property>
> > > > > >>>>>>>>   <name>hbase.rootdir</name>
> > > > > >>>>>>>>   <value>hdfs://master:54310/hbase</value>
> > > > > >>>>>>>> </property>
> > > > > >>>>>>>> <property>
> > > > > >>>>>>>>   <name>hbase.cluster.distributed</name>
> > > > > >>>>>>>>   <value>false</value>
> > > > > >>>>>>>> </property>
> > > > > >>>>>>>> <property>
> > > > > >>>>>>>>   <name>hbase.zookeeper.property.clientPort</name>
> > > > > >>>>>>>>   <value>2181</value>
> > > > > >>>>>>>> </property>
> > > > > >>>>>>>>
> > > > > >>>>>>>> In the hbase-env.sh file on ALL 3 machines I have set
> > > > > >>>>>>>> JAVA_HOME and set the HBase classpath as follows:
> > > > > >>>>>>>>
> > > > > >>>>>>>> export HBASE_CLASSPATH=$HBASE_CLASSPATH:/ebs1/hadoop-0.20.1/conf
> > > > > >>>>>>>>
> > > > > >>>>>>>> On *Master* I have added the Master & Slave hostnames to
> > > > > >>>>>>>> the *regionservers* file. On *slaves*, the regionservers
> > > > > >>>>>>>> file is empty.
> > > > > >>>>>>>>
> > > > > >>>>>>>> I have run hadoop namenode -format multiple times, but
> > > > > >>>>>>>> still keep getting "NoNode for /hbase/master". What step
> > > > > >>>>>>>> did I miss? Thanks for your help.