hbase-user mailing list archives

From Jeff Zhang <zjf...@gmail.com>
Subject Re: HBase 0.20.1 Distributed Install Problems
Date Wed, 11 Nov 2009 01:43:52 GMT
The following is the region server's log:


2009-11-10 18:09:08,062 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 3 on 60020: starting
2009-11-10 18:09:08,063 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 4 on 60020: starting
2009-11-10 18:09:08,063 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 5 on 60020: starting
2009-11-10 18:09:08,063 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 6 on 60020: starting
2009-11-10 18:09:08,063 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 7 on 60020: starting
2009-11-10 18:09:08,063 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 8 on 60020: starting
2009-11-10 18:09:08,063 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: HRegionServer started
at: 10.148.224.11:60020
2009-11-10 18:09:08,064 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 9 on 60020: starting
2009-11-10 18:09:08,070 INFO org.apache.hadoop.hbase.regionserver.StoreFile:
Allocating LruBlockCache with maximum size 198.3m
2009-11-10 18:09:08,095 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_CALL_SERVER_STARTUP
2009-11-10 18:09:08,229 INFO org.apache.hadoop.hbase.regionserver.HLog: HLog
configuration: blocksize=67108864, rollsize=63753420, enabled=true,
flushlogentries=100, optionallogflushinternal=10000ms
2009-11-10 18:09:08,253 INFO org.apache.hadoop.hbase.regionserver.HLog: New
hlog /hbase/.logs/10.148.224.11,60020,1257847748205/hlog.dat.1257847748229
2009-11-10 18:09:08,255 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at
10.148.224.13:60000 that we are up
2009-11-10 18:09:08,302 FATAL
org.apache.hadoop.hbase.regionserver.HRegionServer: Unhandled exception.
Aborting...
java.lang.NullPointerException
        at
org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:459)
        at java.lang.Thread.run(Thread.java:619)
2009-11-10 18:09:08,304 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
request=0.0, regions=0, stores=0, storefiles=0, storefileIndexSize=0,
memstoreSize=0, usedHeap=31, maxHeap=991, blockCacheSize=1707288,
blockCacheFree=206264664, blockCacheCount=0, blockCacheHitRatio=0
2009-11-10 18:09:08,304 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
server on 60020
2009-11-10 18:09:08,304 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 0 on 60020: exiting
2009-11-10 18:09:08,304 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC
Server listener on 60020
2009-11-10 18:09:08,304 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 1 on 60020: exiting
2009-11-10 18:09:08,304 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 2 on 60020: exiting
2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 3 on 60020: exiting
2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 4 on 60020: exiting
2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 5 on 60020: exiting
2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 6 on 60020: exiting
2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 7 on 60020: exiting
2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 8 on 60020: exiting
2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 9 on 60020: exiting
2009-11-10 18:09:08,306 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
2009-11-10 18:09:08,307 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC
Server Responder
2009-11-10 18:09:08,412 INFO
org.apache.hadoop.hbase.regionserver.MemStoreFlusher:
regionserver/127.0.0.1:60020.cacheFlusher exiting
2009-11-10 18:09:08,412 INFO
org.apache.hadoop.hbase.regionserver.LogFlusher:
regionserver/127.0.0.1:60020.logFlusher exiting
2009-11-10 18:09:08,412 INFO
org.apache.hadoop.hbase.regionserver.CompactSplitThread:
regionserver/127.0.0.1:60020.compactor exiting
2009-11-10 18:09:08,412 INFO org.apache.hadoop.hbase.regionserver.LogRoller:
LogRoller exiting.
2009-11-10 18:09:08,413 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker:
regionserver/127.0.0.1:60020.majorCompactionChecker exiting
2009-11-10 18:09:08,427 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: On abort, closed hlog
2009-11-10 18:09:08,428 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: aborting server at:
10.148.224.11:60020
2009-11-10 18:09:17,489 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: worker thread exiting
2009-11-10 18:09:17,489 INFO org.apache.zookeeper.ZooKeeper: Closing
session: 0x324dcceb05c0003
2009-11-10 18:09:17,490 INFO org.apache.zookeeper.ClientCnxn: Closing
ClientCnxn for session: 0x324dcceb05c0003
2009-11-10 18:09:17,495 INFO org.apache.hadoop.hbase.Leases:
regionserver/127.0.0.1:60020.leaseChecker closing leases
2009-11-10 18:09:17,495 INFO org.apache.hadoop.hbase.Leases:
regionserver/127.0.0.1:60020.leaseChecker closed leases
2009-11-10 18:09:17,500 INFO org.apache.zookeeper.ClientCnxn: Exception
while closing send thread for session 0x324dcceb05c0003 : Read error rc = -1
java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
2009-11-10 18:09:17,604 INFO org.apache.zookeeper.ClientCnxn: Disconnecting
ClientCnxn for session: 0x324dcceb05c0003
2009-11-10 18:09:17,604 INFO org.apache.zookeeper.ZooKeeper: Session:
0x324dcceb05c0003 closed
2009-11-10 18:09:17,605 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver/
127.0.0.1:60020 exiting
2009-11-10 18:09:17,605 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
2009-11-10 18:09:17,606 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Starting shutdown
thread.
2009-11-10 18:09:17,606 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Shutdown thread complete
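
One detail in the log above: the server reports itself at 10.148.224.11:60020, but its worker threads are named regionserver/127.0.0.1:60020. That may indicate that the machine's own hostname resolves to a loopback address, which ties in with the /etc/hosts question further down the thread. Whether it has anything to do with the NullPointerException is not established here. A minimal Python sketch for checking what the local hostname resolves to (the function names are made up for illustration):

```python
import ipaddress
import socket


def is_loopback(address: str) -> bool:
    """Return True if the IP address string is a loopback address."""
    return ipaddress.ip_address(address).is_loopback


def resolve_local_hostname():
    """Return (hostname, address) as this machine's resolver sees them."""
    hostname = socket.gethostname()
    return hostname, socket.gethostbyname(hostname)


if __name__ == "__main__":
    try:
        name, addr = resolve_local_hostname()
    except OSError as exc:
        # Resolution itself failed; that is worth fixing first.
        print(f"could not resolve local hostname: {exc}")
    else:
        note = "loopback, check /etc/hosts" if is_loopback(addr) else "not loopback"
        print(f"{name} -> {addr} ({note})")
```

Running this on each region server node shows whether the hostname maps to a routable address or to 127.x.x.x.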

On Tue, Nov 10, 2009 at 10:55 PM, Andrew Purtell <apurtell@apache.org> wrote:

> When you try to start the region servers, what do you see in the log?
>
> If you don't change the client port (hbase.zookeeper.property.clientPort),
> does it work?
>
>     - Andy
>
>
>
>
>
> ________________________________
> From: Jeff Zhang <zjffdu@gmail.com>
> To: hbase-user@hadoop.apache.org
> Sent: Tue, November 10, 2009 2:40:28 PM
> Subject: Re: HBase 0.20.1 Distributed Install Problems
>
> Hi,
>
> I am seeing the same problem: I cannot start the region server.
>
> When I invoke zk_dump
>
> it shows:
>
> HBase tree in ZooKeeper is rooted at /hbase
>  Cluster up? true
>  In safe mode? true
>  Master address: 10.148.224.13:60000
>  Region server holding ROOT: null
>  Region servers:
>
>
> The following is my hbase-site.xml
>
> <configuration>
>  <property>
>    <name>hbase.cluster.distributed</name>
>    <value>true</value>
>    <description>The mode the cluster will be in. Possible values are
>      false: standalone and pseudo-distributed setups with managed Zookeeper
>      true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
>    </description>
>  </property>
>  <property>
>    <name>hbase.rootdir</name>
>    <value>hdfs://sha-cs-04:9000/hbase</value>
>    <description>The directory shared by region servers.
>    </description>
>  </property>
>  <property>
>      <name>hbase.zookeeper.property.clientPort</name>
>      <value>2222</value>
>      <description>Property from ZooKeeper's config zoo.cfg.
>      The port at which the clients will connect.
>      </description>
>   </property>
>   <property>
>      <name>hbase.zookeeper.quorum</name>
>      <value>sha-cs-01,sha-cs-02,sha-cs-03,sha-cs-05,sha-cs-06</value>
>      <description>Comma separated list of servers in the ZooKeeper Quorum.
>      For example,
>      "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
>      By default this is set to localhost for local and pseudo-distributed
>      modes of operation. For a fully-distributed setup, this should be set
>      to a full list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set
>      in hbase-env.sh this is the list of servers which we will start/stop
>      ZooKeeper on.
>      </description>
>    </property>
>
> </configuration>
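
Since the question above is whether the non-default client port matters, it can help to see exactly which host:port pairs this configuration implies and compare them across nodes. A minimal Python sketch, assuming the file has the usual `<configuration>/<property>` layout shown above; the function names are hypothetical, and 2181 is ZooKeeper's default client port:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the two ZooKeeper-related properties from the config above.
SAMPLE = """<configuration>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2222</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>sha-cs-01,sha-cs-02,sha-cs-03,sha-cs-05,sha-cs-06</value>
  </property>
</configuration>"""


def read_properties(xml_text):
    """Return a {name: value} dict from hbase-site.xml style text."""
    props = {}
    for prop in ET.fromstring(xml_text).findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def zk_connect_string(props, default_port="2181"):
    """Build 'host:port,host:port,...' from the quorum and client port."""
    port = props.get("hbase.zookeeper.property.clientPort", default_port)
    quorum = props.get("hbase.zookeeper.quorum", "localhost")
    hosts = [h.strip() for h in quorum.split(",") if h.strip()]
    return ",".join(f"{host}:{port}" for host in hosts)


if __name__ == "__main__":
    print(zk_connect_string(read_properties(SAMPLE)))
```

If the string differs between the master and a region server, the two are not looking at the same ZooKeeper ensemble.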
>
> What's wrong with my configuration?
>
>
> Thank you in advance.
>
>
> Jeff Zhang
>
>
>
> On Tue, Nov 10, 2009 at 12:47 PM, Tatsuya Kawano
> <tatsuyaml@snowcocoa.info> wrote:
>
> > Hello,
> >
> > It looks like the master and the region servers cannot locate each
> > other. HBase 0.20.x uses ZooKeeper (zk) to locate other cluster
> > members, so maybe your zk has the wrong information.
> >
> > Can you run zk_dump from the hbase shell and let us know the result?
> >
> > If the cluster is properly configured, you'll get something like this:
> > =====================================
> > hbase(main):007:0> zk_dump
> >
> > HBase tree in ZooKeeper is rooted at /hbase
> >  Cluster up? true
> >  In safe mode? false
> >  Master address: 172.16.80.26:60000
> >  Region server holding ROOT: 172.16.80.27:60020
> >  Region servers:
> >   - 172.16.80.27:60020
> >   - 172.16.80.29:60020
> >   - 172.16.80.28:60020
> > =====================================
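
The difference between a healthy dump like the one above and the failing one posted earlier in the thread (ROOT holder "null", empty region server list) can be checked mechanically. A rough Python sketch, assuming the zk_dump text has exactly the layout shown in this thread; the parser and its field names are made up for illustration:

```python
HEALTHY = """HBase tree in ZooKeeper is rooted at /hbase
 Cluster up? true
 In safe mode? false
 Master address: 172.16.80.26:60000
 Region server holding ROOT: 172.16.80.27:60020
 Region servers:
  - 172.16.80.27:60020
  - 172.16.80.29:60020
  - 172.16.80.28:60020
"""

BROKEN = """HBase tree in ZooKeeper is rooted at /hbase
 Cluster up? true
 In safe mode? true
 Master address: 10.148.224.13:60000
 Region server holding ROOT: null
 Region servers:
"""


def parse_zk_dump(text):
    """Extract master, ROOT holder and region servers from zk_dump output."""
    info = {"master": None, "root": None, "region_servers": []}
    in_servers = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Master address:"):
            info["master"] = line.split(":", 1)[1].strip()
        elif line.startswith("Region server holding ROOT:"):
            value = line.split(":", 1)[1].strip()
            info["root"] = None if value == "null" else value
        elif line.startswith("Region servers:"):
            in_servers = True
        elif in_servers and line.startswith("- "):
            info["region_servers"].append(line[2:].strip())
    return info


def looks_healthy(info):
    """True when a ROOT holder and at least one region server are registered."""
    return info["root"] is not None and len(info["region_servers"]) > 0


if __name__ == "__main__":
    for label, dump in (("healthy", HEALTHY), ("broken", BROKEN)):
        print(label, looks_healthy(parse_zk_dump(dump)))
```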
> >
> >
> > > one of my co-workers apparently can log into his box and submit jobs,
> > > but me or anyone else is still unable to log in.
> >
> > Maybe you're a bit confused; your co-worker seems to be able to use
> > Hadoop Map/Reduce, not HBase.
> >
> >
> > > Does Hbase allow concurrent connections?
> >
> > Yes.
> >
> >
> > >> I think it also says the master is on port 60000
> > >> when the install directions say it's supposed to be 60010?
> >
> > Port 60000 is correct. The master uses port 60000 to accept connections
> > from the hbase shell and region servers. Port 60010 is for the web-based
> > HBase console.
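
To confirm from a region server node that both master ports are actually reachable, a plain TCP connect is enough. A small Python sketch; the function name is hypothetical, and the host passed on the command line is whatever address zk_dump reports for the master:

```python
import socket
import sys


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Usage: python check_ports.py <master-host>
    if len(sys.argv) > 1:
        master = sys.argv[1]
        for label, port in (("master RPC", 60000), ("master web console", 60010)):
            state = "open" if port_open(master, port) else "unreachable"
            print(f"{label} {master}:{port} is {state}")
```

A successful connect only shows that something is listening on the port; it does not prove the master is healthy.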
> >
> >
> > > We tried applying this fix (to explicitly set the master):
> > > http://osdir.com/ml/hbase-user-hadoop-apache/2009-05/msg00321.html
> >
> > No, this is an old way to configure a cluster. You shouldn't use it
> > with HBase 0.20.x.
> >
> >
> > Thanks,
> >
> > --
> > Tatsuya Kawano (Mr.)
> > Tokyo, Japan
> >
> >
> >
> > On Tue, Nov 10, 2009 at 1:10 PM, Chris Bates
> > <christopher.andrew.bates@gmail.com> wrote:
> > > Another interesting data point.  We tried applying this fix (to
> > > explicitly set the master):
> > > http://osdir.com/ml/hbase-user-hadoop-apache/2009-05/msg00321.html
> > >
> > > But when I log in to the master node, it takes a really long time to
> > > submit a query and I get this in response:
> > > hbase(main):001:0> list
> > > NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException:
> > > Trying to contact region server null for region , row '', but failed
> > > after 5 attempts.
> > > Exceptions:
> > > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out
> > > trying to locate root region
> > > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out
> > > trying to locate root region
> > > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out
> > > trying to locate root region
> > > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out
> > > trying to locate root region
> > > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out
> > > trying to locate root region
> > >
> > > from org/apache/hadoop/hbase/client/HConnectionManager.java:1001:in `getRegionServerWithRetries'
> > > from org/apache/hadoop/hbase/client/MetaScanner.java:55:in `metaScan'
> > > from org/apache/hadoop/hbase/client/MetaScanner.java:28:in `metaScan'
> > > from org/apache/hadoop/hbase/client/HConnectionManager.java:432:in `listTables'
> > > from org/apache/hadoop/hbase/client/HBaseAdmin.java:127:in `listTables'
> > > from sun/reflect/NativeMethodAccessorImpl.java:-2:in `invoke0'
> > > from sun/reflect/NativeMethodAccessorImpl.java:39:in `invoke'
> > > from sun/reflect/DelegatingMethodAccessorImpl.java:25:in `invoke'
> > > from java/lang/reflect/Method.java:597:in `invoke'
> > > from org/jruby/javasupport/JavaMethod.java:298:in `invokeWithExceptionHandling'
> > > from org/jruby/javasupport/JavaMethod.java:259:in `invoke'
> > > from org/jruby/java/invokers/InstanceMethodInvoker.java:36:in `call'
> > > from org/jruby/runtime/callsite/CachingCallSite.java:253:in `cacheAndCall'
> > > from org/jruby/runtime/callsite/CachingCallSite.java:72:in `call'
> > > from org/jruby/ast/CallNoArgNode.java:61:in `interpret'
> > > from org/jruby/ast/ForNode.java:104:in `interpret'
> > > ... 116 levels...
> > > from opt/hadoop/hbase_minus_0_dot_20_dot_1/bin/$_dot_dot_/bin/hirb#start:-1:in `call'
> > > from org/jruby/internal/runtime/methods/DynamicMethod.java:226:in `call'
> > > from org/jruby/internal/runtime/methods/CompiledMethod.java:211:in `call'
> > > from org/jruby/internal/runtime/methods/CompiledMethod.java:71:in `call'
> > > from org/jruby/runtime/callsite/CachingCallSite.java:253:in `cacheAndCall'
> > > from org/jruby/runtime/callsite/CachingCallSite.java:72:in `call'
> > > from opt/hadoop/hbase_minus_0_dot_20_dot_1/bin/$_dot_dot_/bin/hirb.rb:497:in `__file__'
> > > from opt/hadoop/hbase_minus_0_dot_20_dot_1/bin/$_dot_dot_/bin/hirb.rb:-1:in `load'
> > > from org/jruby/Ruby.java:577:in `runScript'
> > > from org/jruby/Ruby.java:480:in `runNormally'
> > > from org/jruby/Ruby.java:354:in `runFromMain'
> > > from org/jruby/Main.java:229:in `run'
> > > from org/jruby/Main.java:110:in `run'
> > > from org/jruby/Main.java:94:in `main'
> > > from /opt/hadoop/hbase-0.20.1/bin/../bin/hirb.rb:338:in `list'
> > > from (hbase):2
> > > hbase(main):002:0>
> > >
> > >
> > > On Mon, Nov 9, 2009 at 10:52 PM, Chris Bates <
> > > christopher.andrew.bates@gmail.com> wrote:
> > >
> > >> Thanks for your response, Sujee.  These boxes are all on an internal
> > >> DNS and they all resolve.
> > >>
> > >> One of my co-workers apparently can log into his box and submit jobs,
> > >> but me or anyone else is still unable to log in.  Does HBase allow
> > >> concurrent connections?  In Hive I remember having to configure the
> > >> metastore to be in server mode if multiple people were using it.
> > >>
> > >>
> > >> On Mon, Nov 9, 2009 at 10:13 PM, Sujee Maniyam <sujee@sujee.net> wrote:
> > >>
> > >>> > [hadoop@crunch hbase-0.20.1]$ bin/start-hbase.sh
> > >>> >
> > >>> > crunch2: Warning: Permanently added 'crunch2' (RSA) to the list of
> > >>> > known hosts.
> > >>>
> > >>>
> > >>> Is your SSH set up correctly?  From the master, you need to be able to
> > >>> log in to all slaves/regionservers without a password.
> > >>>
> > >>> And I see you are using short hostnames (crunch2, crunch3). Do they
> > >>> all resolve correctly?  If not, you need to update /etc/hosts to resolve
> > >>> these to an IP address on all machines.
> > >>>
> > >>> regards
> > >>> Sujee Maniyam
> > >>> --
> > >>> http://sujee.net
> > >>>
> > >>
> > >>
> > >
> >
>
>
>
>
>
