hbase-user mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: Starting HBase in fully distributed mode...
Date Fri, 04 Dec 2009 22:35:40 GMT
Sorry, but I'm still not able to grok this issue. Perhaps you can shed 
more light. Here's the exact code from our server that binds to the client 
port:

     ss.socket().bind(new InetSocketAddress(port));

My understanding from the Java docs is this:

     public InetSocketAddress(int port)
         "Creates a socket address where the IP address is the wildcard 
address and the port number a specified value."


AFAIK this binds the socket to the specified port for any IP on any 
interface of the host. Where am I going wrong?
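
To make that concrete, a minimal sketch of the two bind variants (the class 
name and the use of port 0, i.e. an OS-chosen free port, are illustrative 
only and not taken from the ZooKeeper source):

     import java.net.InetAddress;
     import java.net.InetSocketAddress;
     import java.nio.channels.ServerSocketChannel;

     public class BindSketch {
         public static void main(String[] args) throws Exception {
             // Wildcard bind, as in the server code above: connections arriving
             // on ANY local interface are accepted. A real server would pass its
             // configured client port instead of 0.
             ServerSocketChannel any = ServerSocketChannel.open();
             any.socket().bind(new InetSocketAddress(0));
             System.out.println("wildcard bind: " + any.socket().getLocalSocketAddress());

             // Bind to one specific address: only connections addressed to the IP
             // that this host's own name resolves to are accepted.
             ServerSocketChannel one = ServerSocketChannel.open();
             one.socket().bind(new InetSocketAddress(InetAddress.getLocalHost(), 0));
             System.out.println("specific bind: " + one.socket().getLocalSocketAddress());
         }
     }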

Patrick

Jean-Daniel Cryans wrote:
> The first two definitions here are what I'm talking about:
> http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1346
> 
> So by default it usually doesn't listen on the interface associated
> with the public hostname ec2-IP-compute-1.amazonaws.com but on the other
> one (IIRC it starts with dom-).
> 
> J-D
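
A quick way to see that from the JVM's point of view is the minimal sketch 
below (plain java.net, nothing EC2-specific; the class name is made up). It 
prints which hostname and addresses the local JVM resolves for itself; on an 
EC2 instance this is normally the internal identity rather than the public 
ec2-*.amazonaws.com name:

     import java.net.InetAddress;
     import java.net.NetworkInterface;
     import java.util.Collections;

     public class WhoAmI {
         public static void main(String[] args) throws Exception {
             // The name/address this JVM resolves for "the local host"; on EC2
             // this is usually the instance's internal hostname and private IP.
             InetAddress self = InetAddress.getLocalHost();
             System.out.println("local hostname: " + self.getHostName());
             System.out.println("local address : " + self.getHostAddress());

             // For comparison, every address actually configured on this host.
             for (NetworkInterface nic : Collections.list(NetworkInterface.getNetworkInterfaces())) {
                 for (InetAddress a : Collections.list(nic.getInetAddresses())) {
                     System.out.println(nic.getName() + " -> " + a.getHostAddress());
                 }
             }
         }
     }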
> 
> On Fri, Dec 4, 2009 at 12:41 PM, Patrick Hunt <phunt@apache.org> wrote:
>> I'm not familiar with EC2; when you say "listen on private hostname", what
>> does that mean? Do you mean "by default, listen on an interface with a
>> non-routable (local-only) IP"? Or something else? Is there an AWS page you
>> can point me to?
>>
>> Patrick
>>
>> Jean-Daniel Cryans wrote:
>>> When you saw:
>>>
>>> org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete
>>> /ebs1/mapred/system,/ebs2/mapred/system. Name node is in safe mode.
>>> The ratio of reported blocks 0.0000 has not reached the threshold 0.9990.
>>> *Safe
>>> mode will be turned off automatically*.
>>>
>>> It means that HDFS is blocking everything (aka safe mode) until all
>>> datanodes have reported for duty (and then it waits for 30 seconds to
>>> make sure).
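
(A quick way to confirm that state is "hadoop dfsadmin -safemode get" on the 
NameNode. The sketch below does the same check programmatically; it assumes 
the Hadoop 0.20 DistributedFileSystem API and a classpath that includes the 
cluster's configuration files, and the class name is made up.)

     import org.apache.hadoop.conf.Configuration;
     import org.apache.hadoop.fs.FileSystem;
     import org.apache.hadoop.hdfs.DistributedFileSystem;
     import org.apache.hadoop.hdfs.protocol.FSConstants;

     public class SafeModeCheck {
         public static void main(String[] args) throws Exception {
             // Picks up fs.default.name from the core-site.xml/hdfs-site.xml
             // found on the classpath.
             Configuration conf = new Configuration();
             FileSystem fs = FileSystem.get(conf);
             if (fs instanceof DistributedFileSystem) {
                 // SAFEMODE_GET only queries the state, it does not change it.
                 boolean safe = ((DistributedFileSystem) fs)
                     .setSafeMode(FSConstants.SafeModeAction.SAFEMODE_GET);
                 System.out.println("NameNode in safe mode: " + safe);
             } else {
                 System.out.println("Not an HDFS filesystem: " + fs.getUri());
             }
         }
     }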
>>>
>>> When you saw:
>>>
>>> Caused by: org.apache.zookeeper.KeeperException$NoNodeException:
>>> KeeperErrorCode = *NoNode for /hbase/master*
>>>
>>> It means that the Master node didn't write its znode in ZooKeeper
>>> because... when you saw:
>>>
>>> 2009-12-04 07:07:37,149 WARN org.apache.zookeeper.ClientCnxn: Exception
>>> closing session 0x0 to sun.nio.ch.SelectionKeyImpl@10e35d5
>>> java.net.ConnectException: Connection refused
>>>
>>> It really means that the connection was refused. It then says it
>>> attempted to connect to ec2-174-129-127-141.compute-1.amazonaws.com
>>> but wasn't able to. AFAIK in EC2, Java processes tend to listen on
>>> their private hostname, not the public one (which would be bad
>>> anyway).
>>>
>>> Bottom line: make sure everything listens where it is expected to, and it
>>> should then work well.
>>>
>>> J-D
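
One way to check both points at once is to point a bare ZooKeeper client at 
the quorum host from the logs and ask for the master znode. A minimal sketch, 
assuming the ZooKeeper 3.x Java client that ships with HBase 0.20 (the class 
name and the 30-second session timeout are illustrative; the host:port is 
simply the one from the logs below):

     import org.apache.zookeeper.WatchedEvent;
     import org.apache.zookeeper.Watcher;
     import org.apache.zookeeper.ZooKeeper;
     import org.apache.zookeeper.data.Stat;

     public class CheckMaster {
         public static void main(String[] args) throws Exception {
             String connect = "ec2-174-129-127-141.compute-1.amazonaws.com:2181";
             ZooKeeper zk = new ZooKeeper(connect, 30000, new Watcher() {
                 public void process(WatchedEvent event) {
                     System.out.println("zk event: " + event);
                 }
             });
             try {
                 // ConnectionLoss here means nothing answered on that host:port
                 // (the "Connection refused" case); a null Stat means ZooKeeper
                 // is reachable but the HBase master never registered its znode.
                 Stat stat = zk.exists("/hbase/master", false);
                 System.out.println("/hbase/master "
                     + (stat == null ? "does not exist" : "exists"));
             } finally {
                 zk.close();
             }
         }
     }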
>>>
>>> On Fri, Dec 4, 2009 at 11:23 AM, Something Something
>>> <mailinglists19@gmail.com> wrote:
>>>> Hadoop: 0.20.1
>>>>
>>>> HBase: 0.20.2
>>>>
>>>> Zookeeper: The one which gets started by default by HBase.
>>>>
>>>>
>>>> HBase logs:
>>>>
>>>> 1)  Master log shows this WARN message, but then it says 'connection
>>>> successful'
>>>>
>>>>
>>>> 2009-12-04 07:07:37,149 WARN org.apache.zookeeper.ClientCnxn: Exception
>>>> closing session 0x0 to sun.nio.ch.SelectionKeyImpl@10e35d5
>>>> java.net.ConnectException: Connection refused
>>>>       at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>       at
>>>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>       at
>>>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:933)
>>>> 2009-12-04 07:07:37,150 WARN org.apache.zookeeper.ClientCnxn: Ignoring
>>>> exception during shutdown input
>>>> java.nio.channels.ClosedChannelException
>>>>       at
>>>> sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
>>>>       at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>>>>       at
>>>> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:999)
>>>>       at
>>>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
>>>> 2009-12-04 07:07:37,150 WARN org.apache.zookeeper.ClientCnxn: Ignoring
>>>> exception during shutdown output
>>>> java.nio.channels.ClosedChannelException
>>>>       at
>>>> sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
>>>>       at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>>>>       at
>>>> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1004)
>>>>       at
>>>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
>>>> 2009-12-04 07:07:37,199 INFO
>>>> org.apache.hadoop.hbase.master.RegionManager:
>>>> -ROOT- region unset (but not set to be reassigned)
>>>> 2009-12-04 07:07:37,200 INFO
>>>> org.apache.hadoop.hbase.master.RegionManager:
>>>> ROOT inserted into regionsInTransition
>>>> 2009-12-04 07:07:37,667 INFO org.apache.zookeeper.ClientCnxn: Attempting
>>>> connection to server
>>>> ec2-174-129-127-141.compute-1.amazonaws.com/10.252.146.65:2181
>>>> 2009-12-04 07:07:37,668 INFO org.apache.zookeeper.ClientCnxn: Priming
>>>> connection to java.nio.channels.SocketChannel[connected local=/
>>>> 10.252.162.19:46195 remote=
>>>> ec2-174-129-127-141.compute-1.amazonaws.com/10.252.146.65:2181]
>>>> 2009-12-04 07:07:37,670 INFO org.apache.zookeeper.ClientCnxn: Server
>>>> connection successful
>>>>
>>>>
>>>>
>>>> 2)  Regionserver log shows this... but later seems to have recovered:
>>>>
>>>> 2009-12-04 07:07:36,576 WARN org.apache.zookeeper.ClientCnxn: Exception
>>>> closing session 0x0 to sun.nio.ch.SelectionKeyImpl@4ee70b
>>>> java.net.ConnectException: Connection refused
>>>>       at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>       at
>>>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>       at
>>>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:933)
>>>> 2009-12-04 07:07:36,611 WARN org.apache.zookeeper.ClientCnxn: Ignoring
>>>> exception during shutdown input
>>>> java.nio.channels.ClosedChannelException
>>>>       at
>>>> sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
>>>>       at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>>>>       at
>>>> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:999)
>>>>       at
>>>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
>>>> 2009-12-04 07:07:36,611 WARN org.apache.zookeeper.ClientCnxn: Ignoring
>>>> exception during shutdown output
>>>> java.nio.channels.ClosedChannelException
>>>>       at
>>>> sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
>>>>       at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>>>>       at
>>>> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1004)
>>>>       at
>>>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
>>>> 2009-12-04 07:07:36,742 WARN
>>>> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Failed to set watcher
>>>> on
>>>> ZNode /hbase/master
>>>> org.apache.zookeeper.KeeperException$ConnectionLossException:
>>>> KeeperErrorCode = ConnectionLoss for /hbase/master
>>>>       at
>>>> org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>>>>       at
>>>> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>>>>       at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:780)
>>>>       at
>>>>
>>>> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.watchMasterAddress(ZooKeeperWrapper.java:304)
>>>>       at
>>>>
>>>> org.apache.hadoop.hbase.regionserver.HRegionServer.watchMasterAddress(HRegionServer.java:385)
>>>>       at
>>>>
>>>> org.apache.hadoop.hbase.regionserver.HRegionServer.reinitializeZooKeeper(HRegionServer.java:315)
>>>>       at
>>>>
>>>> org.apache.hadoop.hbase.regionserver.HRegionServer.reinitialize(HRegionServer.java:306)
>>>>       at
>>>>
>>>> org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:276)
>>>>       at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>>>> Method)
>>>>       at
>>>>
>>>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>>>>       at
>>>>
>>>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>>>>       at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>>>>       at
>>>>
>>>> org.apache.hadoop.hbase.regionserver.HRegionServer.doMain(HRegionServer.java:2474)
>>>>       at
>>>>
>>>> org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2542)
>>>> 2009-12-04 07:07:36,743 WARN
>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to set watcher
>>>> on
>>>> ZooKeeper master address. Retrying.
>>>>
>>>>
>>>>
>>>> 3)  ZooKeeper log:  Nothing much in there... just a starting message
>>>> line, followed by
>>>>
>>>> ulimit -n 1024
>>>>
>>>> I looked at the archives.  There was one mail that talked about 'ulimit'.
>>>> I wonder if that has something to do with it.
>>>>
>>>> Thanks for your help.
>>>>
>>>>
>>>>
>>>> On Fri, Dec 4, 2009 at 8:18 AM, Mark Vigeant
>>>> <mark.vigeant@riskmetrics.com>wrote:
>>>>
>>>>> When I first started my HBase cluster, it too gave me the NoNode for
>>>>> /hbase/master several times before it started working, and I believe this
>>>>> is a common beginner's error (I've seen it in a few emails in the past 2
>>>>> weeks).
>>>>>
>>>>> What versions of HBase, Hadoop and ZooKeeper are you using?
>>>>>
>>>>> Also, take a look in your HBASE_HOME/logs folder. That would be a good
>>>>> place to start looking for some answers.
>>>>>
>>>>> -Mark
>>>>>
>>>>> -----Original Message-----
>>>>> From: Something Something [mailto:mailinglists19@gmail.com]
>>>>> Sent: Friday, December 04, 2009 2:28 AM
>>>>> To: hbase-user@hadoop.apache.org
>>>>> Subject: Starting HBase in fully distributed mode...
>>>>>
>>>>> Hello,
>>>>>
>>>>> I am trying to get Hadoop/HBase up and running in a fully distributed
>>>>> mode.
>>>>>  For now, I have only *1 Master & 2 Slaves*.
>>>>>
>>>>> Hadoop starts correctly... I think.  The only exception I see in various
>>>>> log files is this one...
>>>>>
>>>>>
>>>>> org.apache.hadoop.ipc.RemoteException:
>>>>> org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete
>>>>> /ebs1/mapred/system,/ebs2/mapred/system. Name node is in safe mode.
>>>>> The ratio of reported blocks 0.0000 has not reached the threshold
>>>>> 0.9990.
>>>>> *Safe
>>>>> mode will be turned off automatically*.
>>>>>       at
>>>>>
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:1696)
>>>>>       at
>>>>>
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:1676)
>>>>>       at
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.delete(NameNode.java:517)
>>>>>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>
>>>>>
>>>>> Somehow this doesn't sound critical, so I assumed everything was good to
>>>>> go with Hadoop.
>>>>>
>>>>>
>>>>> So then I started HBase and opened a shell (hbase shell).  So far
>>>>> everything
>>>>> looks good.  Now when I try to run a 'list' command, I keep getting this
>>>>> message:
>>>>>
>>>>> Caused by: org.apache.zookeeper.KeeperException$NoNodeException:
>>>>> KeeperErrorCode = *NoNode for /hbase/master*
>>>>> at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
>>>>> at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>>>>> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:892)
>>>>> at
>>>>>
>>>>>
>>>>> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.readAddressOrThrow(ZooKeeperWrapper.java:328)
>>>>>
>>>>>
>>>>> Here's what I have in my *Master hbase-site.xml*
>>>>>
>>>>> <configuration>
>>>>>  <property>
>>>>>   <name>hbase.rootdir</name>
>>>>>   <value>hdfs://master:54310/hbase</value>
>>>>>  </property>
>>>>>  <property>
>>>>>   <name>hbase.cluster.distributed</name>
>>>>>   <value>true</value>
>>>>>  </property>
>>>>>  <property>
>>>>>   <name>hbase.zookeeper.property.clientPort</name>
>>>>>   <value>2181</value>
>>>>>  </property>
>>>>>  <property>
>>>>>   <name>hbase.zookeeper.quorum</name>
>>>>>   <value>master,slave1,slave2</value>
>>>>>  </property>
>>>>> <property>
>>>>>
>>>>>
>>>>>
>>>>> The *slave* hbase-site.xml files are set as follows:
>>>>>
>>>>>  <property>
>>>>>   <name>hbase.rootdir</name>
>>>>>   <value>hdfs://master:54310/hbase</value>
>>>>>  </property>
>>>>>  <property>
>>>>>   <name>hbase.cluster.distributed</name>
>>>>>   <value>false</value>
>>>>>  </property>
>>>>>  <property>
>>>>>   <name>hbase.zookeeper.property.clientPort</name>
>>>>>   <value>2181</value>
>>>>>  </property>
>>>>>
>>>>>
>>>>> In the hbase-env.sh file on ALL 3 machines I have set the JAVA_HOME and
>>>>> set
>>>>> the HBase classpath as follows:
>>>>>
>>>>> export HBASE_CLASSPATH=$HBASE_CLASSPATH:/ebs1/hadoop-0.20.1/conf
>>>>>
>>>>>
>>>>> On the *master* I have added the master and slave hostnames to the
>>>>> *regionservers* file.  On the *slaves*, the regionservers file is empty.
>>>>>
>>>>>
>>>>> I have run hadoop namenode -format multiple times, but still keep getting
>>>>> "NoNode for /hbase/master".  What step did I miss?  Thanks for your help.
>>>>>
