Date: Fri, 4 Dec 2009 12:53:22 -0800
Message-ID: <31a243e70912041253w1cad6177y8a3052c07dfe0412@mail.gmail.com>
Subject: Re: Starting HBase in fully distributed mode...
From: Jean-Daniel Cryans
To: hbase-user@hadoop.apache.org

The first two definitions here are what I'm talking about:
http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1346

So by default it usually doesn't listen on the interface associated
with the hostname ec2-IP-compute-1.amazonaws.com but on the other one
(IIRC starts with dom-).

J-D

On Fri, Dec 4, 2009 at 12:41 PM, Patrick Hunt wrote:
> I'm not familiar with ec2, when you say "listen on private hostname" what
> does that mean? Do you mean "by default listen on an interface with a
> non-routable (localonly) ip"? Or something else. Is there an aws page you
> can point me to?
>
> Patrick
>
> Jean-Daniel Cryans wrote:
>>
>> When you saw:
>>
>> org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /ebs1/mapred/system,/ebs2/mapred/system. Name node is in safe mode.
>> The ratio of reported blocks 0.0000 has not reached the threshold 0.9990. *Safe mode will be turned off automatically*.
>>
>> It means that HDFS is blocking everything (aka safe mode) until all
>> datanodes reported for duty (and then it waits for 30 seconds to make
>> sure).
>>
>> When you saw:
>>
>> Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = *NoNode for /hbase/master*
>>
>> It means that the Master node didn't write his znode in Zookeeper
>> because... when you saw:
>>
>> 2009-12-04 07:07:37,149 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x0 to sun.nio.ch.SelectionKeyImpl@10e35d5
>> java.net.ConnectException: Connection refused
>>
>> It really means that the connection was refused. It then says it
>> attempted to connect to ec2-174-129-127-141.compute-1.amazonaws.com
>> but wasn't able to. AFAIK in EC2 the java processes tend to listen on
>> their private hostname not the public one (which would be bad
>> anyways).
>>
>> Bottom line, make sure stuff listens where it is expected and it
>> should then work well.
>>
>> J-D
>>
>> On Fri, Dec 4, 2009 at 11:23 AM, Something Something
>> wrote:
>>>
>>> Hadoop: 0.20.1
>>>
>>> HBase: 0.20.2
>>>
>>> Zookeeper: The one which gets started by default by HBase.
>>>
>>>
>>> HBase logs:
>>>
>>> 1)  Master log shows this WARN message, but then it says 'connection
>>> successful'
>>>
>>>
>>> 2009-12-04 07:07:37,149 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x0 to sun.nio.ch.SelectionKeyImpl@10e35d5
>>> java.net.ConnectException: Connection refused
>>>       at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>       at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>       at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:933)
>>> 2009-12-04 07:07:37,150 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown input
>>> java.nio.channels.ClosedChannelException
>>>       at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
>>>       at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>>>       at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:999)
>>>       at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
>>> 2009-12-04 07:07:37,150 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown output
>>> java.nio.channels.ClosedChannelException
>>>       at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
>>>       at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>>>       at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1004)
>>>       at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
>>> 2009-12-04 07:07:37,199 INFO org.apache.hadoop.hbase.master.RegionManager: -ROOT- region unset (but not set to be reassigned)
>>> 2009-12-04 07:07:37,200 INFO org.apache.hadoop.hbase.master.RegionManager: ROOT inserted into regionsInTransition
>>> 2009-12-04 07:07:37,667 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server ec2-174-129-127-141.compute-1.amazonaws.com/10.252.146.65:2181
>>> 2009-12-04 07:07:37,668 INFO org.apache.zookeeper.ClientCnxn: Priming connection to java.nio.channels.SocketChannel[connected local=/10.252.162.19:46195 remote=ec2-174-129-127-141.compute-1.amazonaws.com/10.252.146.65:2181]
>>> 2009-12-04 07:07:37,670 INFO org.apache.zookeeper.ClientCnxn: Server connection successful
>>>
>>>
>>>
>>> 2)  Regionserver log shows this... but later seems to have recovered:
>>>
>>> 2009-12-04 07:07:36,576 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x0 to sun.nio.ch.SelectionKeyImpl@4ee70b
>>> java.net.ConnectException: Connection refused
>>>       at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>       at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>       at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:933)
>>> 2009-12-04 07:07:36,611 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown input
>>> java.nio.channels.ClosedChannelException
>>>       at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
>>>       at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>>>       at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:999)
>>>       at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
>>> 2009-12-04 07:07:36,611 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown output
>>> java.nio.channels.ClosedChannelException
>>>       at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
>>>       at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>>>       at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1004)
>>>       at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
>>> 2009-12-04 07:07:36,742 WARN org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Failed to set watcher on ZNode /hbase/master
>>> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
>>>       at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>>>       at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>>>       at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:780)
>>>       at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.watchMasterAddress(ZooKeeperWrapper.java:304)
>>>       at org.apache.hadoop.hbase.regionserver.HRegionServer.watchMasterAddress(HRegionServer.java:385)
>>>       at org.apache.hadoop.hbase.regionserver.HRegionServer.reinitializeZooKeeper(HRegionServer.java:315)
>>>       at org.apache.hadoop.hbase.regionserver.HRegionServer.reinitialize(HRegionServer.java:306)
>>>       at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:276)
>>>       at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>>       at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>>>       at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>>>       at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>>>       at org.apache.hadoop.hbase.regionserver.HRegionServer.doMain(HRegionServer.java:2474)
>>>       at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2542)
>>> 2009-12-04 07:07:36,743 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to set watcher on ZooKeeper master address. Retrying.
>>>
>>>
>>>
>>> 3)  Zookeeper log:  Nothing much in there... just a starting message
>>> line.. followed by
>>>
>>> ulimit -n 1024
>>>
>>> I looked at archives.  There was one mail that talked about 'ulimit'.
>>> Wonder if that has something to do with it.
>>>
>>> Thanks for your help.
>>>
>>>
>>>
>>> On Fri, Dec 4, 2009 at 8:18 AM, Mark Vigeant
>>> wrote:
>>>
>>>> When I first started my hbase cluster, it too gave me the nonode for
>>>> /hbase/master several times before it started working, and I believe
>>>> this is a common beginner's error (I've seen it in a few emails in the
>>>> past 2 weeks).
>>>>
>>>> What versions of HBase, Hadoop and ZooKeeper are you using?
>>>>
>>>> Also, take a look in your HBASE_HOME/logs folder. That would be a good
>>>> place to start looking for some answers.
>>>>
>>>> -Mark
>>>>
>>>> -----Original Message-----
>>>> From: Something Something [mailto:mailinglists19@gmail.com]
>>>> Sent: Friday, December 04, 2009 2:28 AM
>>>> To: hbase-user@hadoop.apache.org
>>>> Subject: Starting HBase in fully distributed mode...
>>>>
>>>> Hello,
>>>>
>>>> I am trying to get Hadoop/HBase up and running in a fully distributed
>>>> mode.  For now, I have only *1 Master & 2 Slaves*.
>>>>
>>>> The Hadoop starts correctly.. I think.  The only exception I see in
>>>> various log files is this one...
>>>>
>>>>
>>>> org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /ebs1/mapred/system,/ebs2/mapred/system. Name node is in safe mode.
>>>> The ratio of reported blocks 0.0000 has not reached the threshold 0.9990. *Safe mode will be turned off automatically*.
>>>>       at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:1696)
>>>>       at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:1676)
>>>>       at org.apache.hadoop.hdfs.server.namenode.NameNode.delete(NameNode.java:517)
>>>>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>
>>>>
>>>> Somehow this doesn't sound critical, so I assumed everything was good to
>>>> go with Hadoop.
>>>>
>>>>
>>>> So then I started HBase and opened a shell (hbase shell).  So far
>>>> everything looks good.  Now when I try to run a 'list' command, I keep
>>>> getting this message:
>>>>
>>>> Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = *NoNode for /hbase/master*
>>>> at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
>>>> at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>>>> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:892)
>>>> at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.readAddressOrThrow(ZooKeeperWrapper.java:328)
>>>>
>>>>
>>>> Here's what I have in my *Master hbase-site.xml*
>>>>
>>>>
>>>>   <property>
>>>>   <name>hbase.rootdir</name>
>>>>   <value>hdfs://master:54310/hbase</value>
>>>>   </property>
>>>>   <property>
>>>>   <name>hbase.cluster.distributed</name>
>>>>   <value>true</value>
>>>>   </property>
>>>>   <property>
>>>>   <name>hbase.zookeeper.property.clientPort</name>
>>>>   <value>2181</value>
>>>>   </property>
>>>>   <property>
>>>>   <name>hbase.zookeeper.quorum</name>
>>>>   <value>master,slave1,slave2</value>
>>>>   </property>
>>>>
>>>>
>>>>
>>>>
>>>> The *Slave* hbase-site.xml are set as follows:
>>>>
>>>>   <property>
>>>>   <name>hbase.rootdir</name>
>>>>   <value>hdfs://master:54310/hbase</value>
>>>>   </property>
>>>>   <property>
>>>>   <name>hbase.cluster.distributed</name>
>>>>   <value>false</value>
>>>>   </property>
>>>>   <property>
>>>>   <name>hbase.zookeeper.property.clientPort</name>
>>>>   <value>2181</value>
>>>>   </property>
>>>>
>>>>
>>>> In the hbase-env.sh file on ALL 3 machines I have set the JAVA_HOME and
>>>> set the HBase classpath as follows:
>>>>
>>>> export HBASE_CLASSPATH=$HBASE_CLASSPATH:/ebs1/hadoop-0.20.1/conf
>>>>
>>>>
>>>> On *Master* I have added Master & Slaves IP hostnames to *regionservers*
>>>> file.  On *slaves*, the regionservers file is empty.
>>>>
>>>>
>>>> I have run hadoop namenode -format multiple times, but still keep
>>>> getting.. "NoNode for /hbase/master".  What step did I miss?  Thanks for
>>>> your help.
>>>>
>>>> This email message and any attachments are for the sole use of the
>>>> intended recipients and may contain proprietary and/or confidential
>>>> information which may be privileged or otherwise protected from
>>>> disclosure. Any unauthorized review, use, disclosure or distribution is
>>>> prohibited. If you are not an intended recipient, please contact the
>>>> sender by reply email and destroy the original message and any copies of
>>>> the message as well as any attachments to the original message.
>>>>
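
For anyone hitting the same "Connection refused" on the ZooKeeper port, the "make sure stuff listens where it is expected" advice can be checked from each node with something like the sketch below. It is only an illustration, not part of HBase or ZooKeeper: the function name check_listener is made up here, and the host names and port 2181 are simply the ones from the hbase-site.xml quoted in this thread. It resolves a host name, then tries a plain TCP connect to each resolved address.

```python
import socket


def check_listener(host, port=2181, timeout=3.0):
    """Resolve `host` and try a TCP connect to `port` on each address.

    Returns (resolved_ips, reachable). This only tests that something
    accepts connections there; it does not speak the ZooKeeper protocol.
    The name and defaults are illustrative assumptions, not an HBase API.
    """
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # Host name does not resolve from this machine.
        return [], False

    # Each getaddrinfo entry ends with a sockaddr tuple; element 0 is the IP.
    ips = sorted({info[4][0] for info in infos})
    for ip in ips:
        try:
            with socket.create_connection((ip, port), timeout=timeout):
                return ips, True
        except OSError:
            # Refused, timed out, or unroutable: try the next address.
            continue
    return ips, False


def report(hosts, port=2181):
    """Print one line per host with what it resolved to and the result."""
    for host in hosts:
        ips, ok = check_listener(host, port)
        status = "listening" if ok else "refused or unreachable"
        print(f"{host}:{port} -> {ips} {status}")
```

Running report(["master", "slave1", "slave2"]) on the master and on each slave shows which address every quorum name resolves to from that machine and whether anything accepts connections there. If a name resolves to an address the ZooKeeper process is not bound to, that would match the behaviour J-D describes.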