hbase-user mailing list archives

From Lars George <l...@worldlingo.com>
Subject Re: HBase 0.20.1 Distributed Install Problems
Date Wed, 11 Nov 2009 08:15:19 GMT
Chris,

What do you mean there are no region server logs? On M2-M5 you have no
logs at all? Is the Java process for the RS running? If so, could you
jstack it to see where it hangs?

Maybe you have an access/owner issue with the log dirs on the RS machines?
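
That permissions theory is easy to test directly. A minimal sketch in plain Python (not an HBase tool; the log path shown in the comment is hypothetical -- use whatever hbase-env.sh points at, and run it as the user that starts the region server):

```python
import os
import tempfile

def log_dir_writable(path):
    """True if `path` exists and the current user can create files in it."""
    if not os.path.isdir(path):
        return False
    try:
        # Creating (and immediately discarding) a probe file is a more
        # reliable check than os.access(), which ACLs or NFS can fool.
        with tempfile.TemporaryFile(dir=path):
            return True
    except OSError:
        return False

# Example, as the HBase user on each RS box:
# print(log_dir_writable("/opt/hbase/logs"))
```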

The master log looks OK.

Lars

Chris Bates wrote:
> Again, I really appreciate the help.  I removed the master from the region
> server list and made sure the rest of the machines had an updated list.
> Still no region servers:
> hbase(main):001:0> zk_dump
>
> HBase tree in ZooKeeper is rooted at /hbase
>   Cluster up? true
>   In safe mode? true
>   Master address: 172.16.1.46:60000
>   Region server holding ROOT: 172.16.1.46:60020
>   Region servers:
>
> hbase(main):002:0> status 'simple'
> 0 live servers
> 0 dead servers
>
> I checked the /etc/hosts file on all machines. They all have 127.0.0.1
> localhost.localdomain localhost, then their mappings for other domains;
> the mapping of the box name to 127.0.0.1 was removed.
>
> There are no regionserver logs.  But the master log is this:
> 2009-11-11 03:02:34,798 INFO org.apache.hadoop.hbase.master.RegionManager:
> -ROOT- region unset (but not set to be reassigned)
> 2009-11-11 03:02:34,799 INFO org.apache.hadoop.hbase.master.RegionManager:
> ROOT inserted into regionsInTransition
> 2009-11-11 03:02:35,078 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server chanel2/172.16.1.46:2181
> 2009-11-11 03:02:35,078 INFO org.apache.zookeeper.ClientCnxn: Priming
> connection to java.nio.channels.SocketChannel[connected local=/
> 172.16.1.46:53335 remote=chanel2/172.16.1.46:2181]
> 2009-11-11 03:02:35,078 INFO org.apache.zookeeper.ClientCnxn: Server
> connection successful
> 2009-11-11 03:02:35,179 INFO org.apache.hadoop.hbase.master.HMaster: HMaster
> initialized on 172.16.1.46:60000
> 2009-11-11 03:02:35,197 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> Initializing JVM Metrics with processName=Master, sessionId=HMaster
> 2009-11-11 03:02:35,198 INFO
> org.apache.hadoop.hbase.master.metrics.MasterMetrics: Initialized
> 2009-11-11 03:02:35,373 INFO org.apache.hadoop.http.HttpServer: Port
> returned by webServer.getConnectors()[0].getLocalPort() before open() is -1.
> Opening the listener on 60010
> 2009-11-11 03:02:35,374 INFO org.apache.hadoop.http.HttpServer:
> listener.getLocalPort() returned 60010
> webServer.getConnectors()[0].getLocalPort() returned 60010
> 2009-11-11 03:02:35,374 INFO org.apache.hadoop.http.HttpServer: Jetty bound
> to port 60010
> 2009-11-11 03:02:52,692 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder: starting
> 2009-11-11 03:02:52,693 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> listener on 60000: starting
> 2009-11-11 03:02:52,695 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 0 on 60000: starting
> 2009-11-11 03:02:52,695 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 1 on 60000: starting
> 2009-11-11 03:02:52,696 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 2 on 60000: starting
> 2009-11-11 03:02:52,696 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 3 on 60000: starting
> 2009-11-11 03:02:52,696 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 4 on 60000: starting
> 2009-11-11 03:02:52,697 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 5 on 60000: starting
> 2009-11-11 03:02:52,697 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 6 on 60000: starting
> 2009-11-11 03:02:52,697 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 7 on 60000: starting
> 2009-11-11 03:02:52,698 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 8 on 60000: starting
> 2009-11-11 03:02:52,698 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 9 on 60000: starting
> 2009-11-11 03:03:34,719 INFO org.apache.hadoop.hbase.master.ServerManager: 0
> region servers, 0 dead, average load NaN
> 2009-11-11 03:03:35,200 INFO org.apache.hadoop.hbase.master.BaseScanner: All
> 0 .META. region(s) scanned
>
>
>
> On Wed, Nov 11, 2009 at 2:39 AM, Jeff Zhang <zjffdu@gmail.com> wrote:
>
>   
>> Hi Jean,
>>
>> Thank you; after I removed the mapping from the sha-cs-03 hosts to
>> localhost, it works.
>>
>> But I installed Hadoop successfully on these machines before; is HBase
>> different from Hadoop regarding the IP mapping?
>>
>>
>> Jeff Zhang
>>
>>
>>
>> On Wed, Nov 11, 2009 at 1:29 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>>
>>> Check your OS networking configuration; make sure hostnames don't resolve
>>> to localhost, 127.0.0.1, or 127.0.1.1.
>>>
>>> Also, you said you can't run "list"; what does it do then?
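
That loopback check can be scripted. A minimal sketch (plain Python, not an HBase tool; run it on each box, passing the hostname that appears in your regionservers file):

```python
import socket

def resolves_to_loopback(hostname):
    """True if `hostname` resolves to a loopback address (127.x.x.x).
    A region server on such a box registers itself as localhost."""
    try:
        addr = socket.gethostbyname(hostname)
    except socket.gaierror:
        return False  # unresolvable -- a different problem
    return addr.startswith("127.")

# Run on each box, e.g.:
# print(resolves_to_loopback(socket.gethostname()))
```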
>>>
>>> J-D
>>>
>>> On Tue, Nov 10, 2009 at 9:23 PM, Jeff Zhang <zjffdu@gmail.com> wrote:
>>>       
>>>> *I configure the region servers in the file regionservers as follows:*
>>>> sha-cs-01
>>>> sha-cs-02
>>>> sha-cs-03
>>>> sha-cs-05
>>>> sha-cs-06
>>>>
>>>> *And I also configure ZooKeeper in the file hbase-site.xml as follows:*
>>>> <configuration>
>>>>  <property>
>>>>    <name>hbase.cluster.distributed</name>
>>>>    <value>true</value>
>>>>    <description>The mode the cluster will be in. Possible values are
>>>>      false: standalone and pseudo-distributed setups with managed Zookeeper
>>>>      true: fully-distributed with unmanaged Zookeeper Quorum (see
>>>> hbase-env.sh)
>>>>    </description>
>>>>  </property>
>>>>  <property>
>>>>      <name>hbase.zookeeper.property.clientPort</name>
>>>>      <value>2222</value>
>>>>      <description>Property from ZooKeeper's config zoo.cfg.
>>>>      The port at which the clients will connect.
>>>>      </description>
>>>>    </property>
>>>>  <property>
>>>>      <name>hbase.zookeeper.quorum</name>
>>>>      <value>*sha-cs-01,sha-cs-02,sha-cs-03,sha-cs-04,sha-cs-06*</value>
>>>>      <description>Comma separated list of servers in the ZooKeeper Quorum.
>>>>      For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
>>>>      By default this is set to localhost for local and pseudo-distributed
>>>>      modes of operation. For a fully-distributed setup, this should be set
>>>>      to a full list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set
>>>>      in hbase-env.sh this is the list of servers which we will start/stop
>>>>      ZooKeeper on.
>>>>      </description>
>>>>  </property>
>>>>  <property>
>>>>    <name>hbase.rootdir</name>
>>>>    <value>hdfs://sha-cs-04:9000/hbase</value>
>>>>    <description>The directory shared by region servers.
>>>>    </description>
>>>>  </property>
>>>>
>>>> </configuration>
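
Mismatches in a quorum list like the one above are easy to miss by eye (note sha-cs-04 vs. sha-cs-05 between the two configs quoted in this thread). A throwaway sketch, not HBase code, that pulls hbase.zookeeper.quorum out of an hbase-site.xml so the members can be eyeballed or checked; the SAMPLE string mirrors the snippet quoted above:

```python
import xml.etree.ElementTree as ET

def quorum_members(hbase_site_xml):
    """Return the hbase.zookeeper.quorum hosts listed in an hbase-site.xml string."""
    root = ET.fromstring(hbase_site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == "hbase.zookeeper.quorum":
            return [h.strip() for h in prop.findtext("value").split(",") if h.strip()]
    return []

SAMPLE = """<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>sha-cs-01,sha-cs-02,sha-cs-03,sha-cs-04,sha-cs-06</value>
  </property>
</configuration>"""

print(quorum_members(SAMPLE))
```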
>>>>
>>>>
>>>> I still do not understand what's wrong with my configuration.
>>>>
>>>>
>>>> Jeff Zhang
>>>>
>>>>
>>>>
>>>> On Wed, Nov 11, 2009 at 12:56 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>>>>
>>>>> Please read my answer to Chris (written about 10-15 minutes ago); you
>>>>> also seem to confuse region servers and zookeeper quorum members.
>>>>>
>>>>> In this case it also seems some region servers registered themselves
>>>>> as localhost and then again with the correct address the master probably
>>>>> gave them. Please check your OS network configuration and make sure each
>>>>> hostname points at the right place.
>>>>>
>>>>> J-D
>>>>>
>>>>> On Tue, Nov 10, 2009 at 8:47 PM, Jeff Zhang <zjffdu@gmail.com> wrote:
>>>>>           
>>>>>> Hi Jean,
>>>>>>
>>>>>> I tried HBase 0.20.2. Looking at the logs, it seems the master and the
>>>>>> region servers work.
>>>>>>
>>>>>> But I cannot run the list command in the hbase shell. When I invoke the
>>>>>> command status 'simple' in the hbase shell, it shows the following:
>>>>>> 09/11/11 12:42:55 DEBUG client.HConnectionManager$ClientZKWatcher: Got
>>>>>> ZooKeeper event, state: SyncConnected, type: None, path: null
>>>>>> 09/11/11 12:42:55 DEBUG zookeeper.ZooKeeperWrapper: Read ZNode /hbase/master
>>>>>> got 10.148.224.13:60000
>>>>>> 8 servers, 0 dead, 0.1250 average load
>>>>>> hbase(main):002:0> status 'simple'
>>>>>> 8 live servers
>>>>>>    localhost:60020 1257914319445
>>>>>>        requests=0, regions=0, usedHeap=0, maxHeap=0
>>>>>>    sha-cs-03:60020 1257914321331
>>>>>>        requests=0, regions=0, usedHeap=33, maxHeap=991
>>>>>>    localhost:60020 1257914320265
>>>>>>        requests=0, regions=0, usedHeap=0, maxHeap=0
>>>>>>    sha-cs-01:60020 1257914320551
>>>>>>        requests=0, regions=1, usedHeap=34, maxHeap=991
>>>>>>    sha-cs-05:60020 1257914322656
>>>>>>        requests=0, regions=0, usedHeap=33, maxHeap=991
>>>>>>    sha-cs-06:60020 1257914321467
>>>>>>        requests=0, regions=0, usedHeap=34, maxHeap=991
>>>>>>    localhost:60020 1257914320202
>>>>>>        requests=0, regions=0, usedHeap=0, maxHeap=0
>>>>>>    localhost:60020 1257914321532
>>>>>>        requests=0, regions=0, usedHeap=0, maxHeap=0
>>>>>>
>>>>>>
>>>>>> It's weird that several servers show up as localhost here; I actually
>>>>>> set 5 machines in hbase.zookeeper.quorum.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Jeff Zhang
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Nov 11, 2009 at 9:47 AM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>>>>>>
>>>>>>> This particular problem is fixed in the current 0.20 branch and we
>>>>>>> just released a candidate for 0.20.2, you can get it here
>>>>>>> http://people.apache.org/~jdcryans/hbase-0.20.2-candidate-1/
>>>>>>>
>>>>>>> J-D
>>>>>>>
>>>>>>> On Tue, Nov 10, 2009 at 5:43 PM, Jeff Zhang <zjffdu@gmail.com> wrote:
>>>>>>>> The following is the region server's log:
>>>>>>>>
>>>>>>>> 2009-11-10 18:09:08,062 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60020: starting
>>>>>>>> 2009-11-10 18:09:08,063 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 4 on 60020: starting
>>>>>>>> 2009-11-10 18:09:08,063 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60020: starting
>>>>>>>> 2009-11-10 18:09:08,063 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60020: starting
>>>>>>>> 2009-11-10 18:09:08,063 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60020: starting
>>>>>>>> 2009-11-10 18:09:08,063 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 8 on 60020: starting
>>>>>>>> 2009-11-10 18:09:08,063 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: HRegionServer started at: 10.148.224.11:60020
>>>>>>>> 2009-11-10 18:09:08,064 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60020: starting
>>>>>>>> 2009-11-10 18:09:08,070 INFO org.apache.hadoop.hbase.regionserver.StoreFile: Allocating LruBlockCache with maximum size 198.3m
>>>>>>>> 2009-11-10 18:09:08,095 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_CALL_SERVER_STARTUP
>>>>>>>> 2009-11-10 18:09:08,229 INFO org.apache.hadoop.hbase.regionserver.HLog: HLog configuration: blocksize=67108864, rollsize=63753420, enabled=true, flushlogentries=100, optionallogflushinternal=10000ms
>>>>>>>> 2009-11-10 18:09:08,253 INFO org.apache.hadoop.hbase.regionserver.HLog: New hlog /hbase/.logs/10.148.224.11,60020,1257847748205/hlog.dat.1257847748229
>>>>>>>> 2009-11-10 18:09:08,255 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 10.148.224.13:60000 that we are up
>>>>>>>> 2009-11-10 18:09:08,302 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: Unhandled exception. Aborting...
>>>>>>>> java.lang.NullPointerException
>>>>>>>>        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:459)
>>>>>>>>        at java.lang.Thread.run(Thread.java:619)
>>>>>>>> 2009-11-10 18:09:08,304 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=0.0, regions=0, stores=0, storefiles=0, storefileIndexSize=0, memstoreSize=0, usedHeap=31, maxHeap=991, blockCacheSize=1707288, blockCacheFree=206264664, blockCacheCount=0, blockCacheHitRatio=0
>>>>>>>> 2009-11-10 18:09:08,304 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 60020
>>>>>>>> 2009-11-10 18:09:08,304 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 60020: exiting
>>>>>>>> 2009-11-10 18:09:08,304 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server listener on 60020
>>>>>>>> 2009-11-10 18:09:08,304 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 60020: exiting
>>>>>>>> 2009-11-10 18:09:08,304 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60020: exiting
>>>>>>>> 2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60020: exiting
>>>>>>>> 2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 4 on 60020: exiting
>>>>>>>> 2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60020: exiting
>>>>>>>> 2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60020: exiting
>>>>>>>> 2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60020: exiting
>>>>>>>> 2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 8 on 60020: exiting
>>>>>>>> 2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60020: exiting
>>>>>>>> 2009-11-10 18:09:08,306 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
>>>>>>>> 2009-11-10 18:09:08,307 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server Responder
>>>>>>>> 2009-11-10 18:09:08,412 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: regionserver/127.0.0.1:60020.cacheFlusher exiting
>>>>>>>> 2009-11-10 18:09:08,412 INFO org.apache.hadoop.hbase.regionserver.LogFlusher: regionserver/127.0.0.1:60020.logFlusher exiting
>>>>>>>> 2009-11-10 18:09:08,412 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: regionserver/127.0.0.1:60020.compactor exiting
>>>>>>>> 2009-11-10 18:09:08,412 INFO org.apache.hadoop.hbase.regionserver.LogRoller: LogRoller exiting.
>>>>>>>> 2009-11-10 18:09:08,413 INFO org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker: regionserver/127.0.0.1:60020.majorCompactionChecker exiting
>>>>>>>> 2009-11-10 18:09:08,427 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: On abort, closed hlog
>>>>>>>> 2009-11-10 18:09:08,428 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: aborting server at: 10.148.224.11:60020
>>>>>>>> 2009-11-10 18:09:17,489 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: worker thread exiting
>>>>>>>> 2009-11-10 18:09:17,489 INFO org.apache.zookeeper.ZooKeeper: Closing session: 0x324dcceb05c0003
>>>>>>>> 2009-11-10 18:09:17,490 INFO org.apache.zookeeper.ClientCnxn: Closing ClientCnxn for session: 0x324dcceb05c0003
>>>>>>>> 2009-11-10 18:09:17,495 INFO org.apache.hadoop.hbase.Leases: regionserver/127.0.0.1:60020.leaseChecker closing leases
>>>>>>>> 2009-11-10 18:09:17,495 INFO org.apache.hadoop.hbase.Leases: regionserver/127.0.0.1:60020.leaseChecker closed leases
>>>>>>>> 2009-11-10 18:09:17,500 INFO org.apache.zookeeper.ClientCnxn: Exception while closing send thread for session 0x324dcceb05c0003 : Read error rc = -1 java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
>>>>>>>> 2009-11-10 18:09:17,604 INFO org.apache.zookeeper.ClientCnxn: Disconnecting ClientCnxn for session: 0x324dcceb05c0003
>>>>>>>> 2009-11-10 18:09:17,604 INFO org.apache.zookeeper.ZooKeeper: Session: 0x324dcceb05c0003 closed
>>>>>>>> 2009-11-10 18:09:17,605 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver/127.0.0.1:60020 exiting
>>>>>>>> 2009-11-10 18:09:17,605 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
>>>>>>>> 2009-11-10 18:09:17,606 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Starting shutdown thread.
>>>>>>>> 2009-11-10 18:09:17,606 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Shutdown thread complete
>>>>>>>>
>>>>>>>> On Tue, Nov 10, 2009 at 10:55 PM, Andrew Purtell <apurtell@apache.org> wrote:
>>>>>>>>
>>>>>>>>> When you try to start the region servers, what do you see in the log?
>>>>>>>>> If you don't change the client port (hbase.zookeeper.property.clientPort),
>>>>>>>>> does it work?
>>>>>>>>>
>>>>>>>>>     - Andy
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ________________________________
>>>>>>>>> From: Jeff Zhang <zjffdu@gmail.com>
>>>>>>>>> To: hbase-user@hadoop.apache.org
>>>>>>>>> Sent: Tue, November 10, 2009 2:40:28 PM
>>>>>>>>> Subject: Re: HBase 0.20.1 Distributed Install Problems
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I am hitting the same problem: I cannot start the region servers.
>>>>>>>>>
>>>>>>>>> When I invoke zk_dump
>>>>>>>>>
>>>>>>>>> it shows:
>>>>>>>>>
>>>>>>>>> HBase tree in ZooKeeper is rooted at /hbase
>>>>>>>>>  Cluster up? true
>>>>>>>>>  In safe mode? true
>>>>>>>>>  Master address: 10.148.224.13:60000
>>>>>>>>>  Region server holding ROOT: null
>>>>>>>>>  Region servers:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The following is my hbase-site.xml
>>>>>>>>>
>>>>>>>>> <configuration>
>>>>>>>>>  <property>
>>>>>>>>>    <name>hbase.cluster.distributed</name>
>>>>>>>>>    <value>true</value>
>>>>>>>>>    <description>The mode the cluster will be in. Possible values are
>>>>>>>>>      false: standalone and pseudo-distributed setups with managed Zookeeper
>>>>>>>>>      true: fully-distributed with unmanaged Zookeeper Quorum (see
>>>>>>>>> hbase-env.sh)
>>>>>>>>>    </description>
>>>>>>>>>  </property>
>>>>>>>>>  <property>
>>>>>>>>>    <name>hbase.rootdir</name>
>>>>>>>>>    <value>hdfs://sha-cs-04:9000/hbase</value>
>>>>>>>>>    <description>The directory shared by region servers.
>>>>>>>>>    </description>
>>>>>>>>>  </property>
>>>>>>>>>  <property>
>>>>>>>>>      <name>hbase.zookeeper.property.clientPort</name>
>>>>>>>>>      <value>2222</value>
>>>>>>>>>      <description>Property from ZooKeeper's config zoo.cfg.
>>>>>>>>>      The port at which the clients will connect.
>>>>>>>>>      </description>
>>>>>>>>>   </property>
>>>>>>>>>   <property>
>>>>>>>>>      <name>hbase.zookeeper.quorum</name>
>>>>>>>>>      <value>sha-cs-01,sha-cs-02,sha-cs-03,sha-cs-05,sha-cs-06</value>
>>>>>>>>>      <description>Comma separated list of servers in the ZooKeeper Quorum.
>>>>>>>>>      For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
>>>>>>>>>      By default this is set to localhost for local and pseudo-distributed
>>>>>>>>>      modes of operation. For a fully-distributed setup, this should be set
>>>>>>>>>      to a full list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set
>>>>>>>>>      in hbase-env.sh this is the list of servers which we will start/stop
>>>>>>>>>      ZooKeeper on.
>>>>>>>>>      </description>
>>>>>>>>>    </property>
>>>>>>>>>
>>>>>>>>> </configuration>
>>>>>>>>>
>>>>>>>>> What's wrong with my configuration?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thank you in advance.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Jeff Zhang
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Nov 10, 2009 at 12:47 PM, Tatsuya Kawano
>>>>>>>>> <tatsuyaml@snowcocoa.info> wrote:
>>>>>>>>>
>>>>>>>>>                   
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> It looks like the master and the region servers cannot locate each
>>>>>>>>>> other. HBase 0.20.x uses ZooKeeper (zk) to locate other cluster
>>>>>>>>>> members, so maybe your zk has wrong information.
>>>>>>>>>>
>>>>>>>>>> Can you type zk_dump from the hbase shell and let us know the result?
>>>>>>>>>>
>>>>>>>>>> If the cluster is properly configured, you'll get something like this:
>>>>>>>>>> =====================================
>>>>>>>>>> hbase(main):007:0> zk_dump
>>>>>>>>>>
>>>>>>>>>> HBase tree in ZooKeeper is rooted at /hbase
>>>>>>>>>>  Cluster up? true
>>>>>>>>>>  In safe mode? false
>>>>>>>>>>  Master address: 172.16.80.26:60000
>>>>>>>>>>  Region server holding ROOT: 172.16.80.27:60020
>>>>>>>>>>  Region servers:
>>>>>>>>>>   - 172.16.80.27:60020
>>>>>>>>>>   - 172.16.80.29:60020
>>>>>>>>>>   - 172.16.80.28:60020
>>>>>>>>>> =====================================
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                     
>>>>>>>>>>> one of my co-workers apparently can log into his box and submit jobs,
>>>>>>>>>>> but no one else is able to log in.
>>>>>>>>>>
>>>>>>>>>> Maybe you're a bit confused; your co-worker seems to be able to use
>>>>>>>>>> Hadoop Map/Reduce, not HBase.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                     
>>>>>>>>>>> Does Hbase allow concurrent connections?
>>>>>>>>>>>                       
>>>>>>>>>> Yes.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                     
>>>>>>>>>>>> I think it also says the master is on port 60000
>>>>>>>>>>>> when the install directions say it's supposed to be 60010?
>>>>>>>>>>
>>>>>>>>>> Port 60000 is correct. The master uses port 60000 to accept connections
>>>>>>>>>> from the hbase shell and region servers. Port 60010 is for the web-based
>>>>>>>>>> HBase console.
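
A quick, generic way to confirm which of those ports is actually listening on the master box (plain Python, not an HBase tool; the host and port numbers in the comments are whatever your master uses):

```python
import socket

def port_open(host, port, timeout=2.0):
    """Best-effort TCP check: True if something accepts connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. on the master box:
# print(port_open("127.0.0.1", 60000))  # RPC port for shell / region servers
# print(port_open("127.0.0.1", 60010))  # web console
```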
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                     
>>>>>>>>>>> We tried applying this fix (to explicitly set the master):
>>>>>>>>>>> http://osdir.com/ml/hbase-user-hadoop-apache/2009-05/msg00321.html
>>>>>>>>>>
>>>>>>>>>> No, this is an old way to configure a cluster. You shouldn't use this
>>>>>>>>>> with HBase 0.20.x.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Tatsuya Kawano (Mr.)
>>>>>>>>>> Tokyo, Japan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Nov 10, 2009 at 1:10 PM, Chris Bates
>>>>>>>>>> <christopher.andrew.bates@gmail.com> wrote:
>>>>>>>>>>                     
>>>>>>>>>>> Another interesting data point.  We tried applying this fix (to
>>>>>>>>>>> explicitly set the master):
>>>>>>>>>>> http://osdir.com/ml/hbase-user-hadoop-apache/2009-05/msg00321.html
>>>>>>>>>>> But when I log in to the master node, it takes really long to submit
>>>>>>>>>>> a query and I get this in response:
>>>>>>>>>>> hbase(main):001:0> list
>>>>>>>>>>> NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException:
>>>>>>>>>>> Trying to contact region server null for region , row '', but failed
>>>>>>>>>>> after 5 attempts.
>>>>>>>>>>> Exceptions:
>>>>>>>>>>> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>>>>>>>>>>> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>>>>>>>>>>> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>>>>>>>>>>> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>>>>>>>>>>> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>>>>>>>>>>>
>>>>>>>>>>> from org/apache/hadoop/hbase/client/HConnectionManager.java:1001:in `getRegionServerWithRetries'
>>>>>>>>>>> from org/apache/hadoop/hbase/client/MetaScanner.java:55:in `metaScan'
>>>>>>>>>>> from org/apache/hadoop/hbase/client/MetaScanner.java:28:in `metaScan'
>>>>>>>>>>> from org/apache/hadoop/hbase/client/HConnectionManager.java:432:in `listTables'
>>>>>>>>>>> from org/apache/hadoop/hbase/client/HBaseAdmin.java:127:in `listTables'
>>>>>>>>>>> from sun/reflect/NativeMethodAccessorImpl.java:-2:in `invoke0'
>>>>>>>>>>> from sun/reflect/NativeMethodAccessorImpl.java:39:in `invoke'
>>>>>>>>>>> from sun/reflect/DelegatingMethodAccessorImpl.java:25:in `invoke'
>>>>>>>>>>> from java/lang/reflect/Method.java:597:in `invoke'
>>>>>>>>>>> from org/jruby/javasupport/JavaMethod.java:298:in `invokeWithExceptionHandling'
>>>>>>>>>>> from org/jruby/javasupport/JavaMethod.java:259:in `invoke'
>>>>>>>>>>> from org/jruby/java/invokers/InstanceMethodInvoker.java:36:in `call'
>>>>>>>>>>> from org/jruby/runtime/callsite/CachingCallSite.java:253:in `cacheAndCall'
>>>>>>>>>>> from org/jruby/runtime/callsite/CachingCallSite.java:72:in `call'
>>>>>>>>>>> from org/jruby/ast/CallNoArgNode.java:61:in `interpret'
>>>>>>>>>>> from org/jruby/ast/ForNode.java:104:in `interpret'
>>>>>>>>>>> ... 116 levels...
>>>>>>>>>>> from opt/hadoop/hbase_minus_0_dot_20_dot_1/bin/$_dot_dot_/bin/hirb#start:-1:in `call'
>>>>>>>>>>> from org/jruby/internal/runtime/methods/DynamicMethod.java:226:in `call'
>>>>>>>>>>> from org/jruby/internal/runtime/methods/CompiledMethod.java:211:in `call'
>>>>>>>>>>> from org/jruby/internal/runtime/methods/CompiledMethod.java:71:in `call'
>>>>>>>>>>> from org/jruby/runtime/callsite/CachingCallSite.java:253:in `cacheAndCall'
>>>>>>>>>>> from org/jruby/runtime/callsite/CachingCallSite.java:72:in `call'
>>>>>>>>>>> from opt/hadoop/hbase_minus_0_dot_20_dot_1/bin/$_dot_dot_/bin/hirb.rb:497:in `__file__'
>>>>>>>>>>> from opt/hadoop/hbase_minus_0_dot_20_dot_1/bin/$_dot_dot_/bin/hirb.rb:-1:in `load'
>>>>>>>>>>> from org/jruby/Ruby.java:577:in `runScript'
>>>>>>>>>>> from org/jruby/Ruby.java:480:in `runNormally'
>>>>>>>>>>> from org/jruby/Ruby.java:354:in `runFromMain'
>>>>>>>>>>> from org/jruby/Main.java:229:in `run'
>>>>>>>>>>> from org/jruby/Main.java:110:in `run'
>>>>>>>>>>> from org/jruby/Main.java:94:in `main'
>>>>>>>>>>> from /opt/hadoop/hbase-0.20.1/bin/../bin/hirb.rb:338:in `list'
>>>>>>>>>>> from (hbase):2
>>>>>>>>>>> hbase(main):002:0>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Nov 9, 2009 at 10:52 PM, Chris Bates <
>>>>>>>>>>> christopher.andrew.bates@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks for your response, Sujee. These boxes are all on an
>>>>>>>>>>>> internal DNS and they all resolve.
>>>>>>>>>>>>
>>>>>>>>>>>> One of my co-workers apparently can log into his box and submit
>>>>>>>>>>>> jobs, but neither I nor anyone else can log in. Does HBase allow
>>>>>>>>>>>> concurrent connections? In Hive I remember having to configure
>>>>>>>>>>>> the metastore to be in server mode if multiple people were
>>>>>>>>>>>> using it.
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Nov 9, 2009 at 10:13 PM, Sujee Maniyam <sujee@sujee.net> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>> [hadoop@crunch hbase-0.20.1]$ bin/start-hbase.sh
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> crunch2: Warning: Permanently added 'crunch2' (RSA) to the
>>>>>>>>>>>>>> list of known hosts.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is your SSH set up correctly? From the master, you need to be
>>>>>>>>>>>>> able to log in to all slaves/regionservers without a password.
>>>>>>>>>>>>>
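For reference, the passwordless-SSH setup Sujee describes can be sketched as below. This is a generic OpenSSH recipe, not something from the thread itself; the `hadoop` user and `crunch2` host are just the example names used above, so substitute your own. The remote steps are left as comments since they depend on your hosts.

```shell
#!/bin/sh
# Sketch: enable passwordless SSH from the HBase master to a regionserver.
# Assumes OpenSSH; user/host names (hadoop, crunch2) are illustrative.

KEY="$HOME/.ssh/id_rsa"
mkdir -p "$HOME/.ssh"

# Generate an RSA key pair with an empty passphrase, unless one exists.
[ -f "$KEY" ] || ssh-keygen -t rsa -N "" -f "$KEY" -q

# Install the public key on each regionserver (repeat per host):
#   ssh-copy-id -i "$KEY.pub" hadoop@crunch2

# Verify before running bin/start-hbase.sh -- this must not prompt:
#   ssh hadoop@crunch2 true && echo "passwordless OK"
```

The same public key has to end up in `~/.ssh/authorized_keys` for the HBase user on every host listed in `conf/regionservers`, since `start-hbase.sh` SSHes to each of them.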
>>>>>>>>>>>>> And I see you are using short hostnames (crunch2, crunch3); do
>>>>>>>>>>>>> they all resolve correctly? Otherwise you need to update
>>>>>>>>>>>>> /etc/hosts to map these names to an IP address on all machines.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Sujee Maniyam
>>>>>>>>>>>>> --
>>>>>>>>>>>>> http://sujee.net
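A quick way to check the name-resolution point on each box is sketched below. The hostnames are the examples from this thread and the second `/etc/hosts` address is hypothetical; the commands assume a Linux host with glibc's `getent`.

```shell
#!/bin/sh
# Every short hostname in conf/regionservers must resolve to the host's
# LAN IP on every node. A name that maps to 127.0.0.1 makes a
# regionserver register itself under loopback, so the master cannot
# reach it. Example /etc/hosts entries (first IP is from this thread,
# second is made up for illustration):
#
#   172.16.1.46  chanel2
#   172.16.1.47  crunch2
#
# See what a name resolves to on this box (no output = no mapping):
getent hosts crunch2 || echo "crunch2 does not resolve here"

# The local hostname should map to a real interface, not loopback:
getent hosts "$(hostname)" || echo "local hostname does not resolve; fix /etc/hosts"
```

Running this on every node catches the common failure mode where the box's own name sits on the `127.0.0.1 localhost` line.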
