hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Why did HBase die after a regionserver stopped?
Date Wed, 31 Mar 2010 16:50:26 GMT
I would recommend upgrading to 0.20.3, but I don't think it will fix
your problem... since I can't really figure it out from such small log snippets.

So you said you ran this command: hbase/bin/regionservers.sh
hbase/bin/hbase-daemon.sh stop regionserver.

Which BTW is equivalent to running: hbase/bin/hbase-daemons.sh stop regionserver

And you tried a count in the shell? The stop command really just stops
all the region servers, so you should not expect to be able to do
anything ;)
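
To spell it out, here is a minimal sketch of the full stop/start cycle (assuming the stock 0.20 scripts and that conf/regionservers lists all of your region server hosts):

  # Stop every region server listed in conf/regionservers; the two forms are equivalent
  hbase/bin/regionservers.sh hbase/bin/hbase-daemon.sh stop regionserver
  hbase/bin/hbase-daemons.sh stop regionserver

  # Bring them back up before issuing anything in the shell
  hbase/bin/hbase-daemons.sh start regionserver

Counts and scans will only work again once the region servers have checked back in with the master and the regions have been reassigned.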

FYI, the NotServingRegionException means that a region moved somewhere
else; it's "normal" and it's logged at the INFO level.
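
If you just want to see where a region currently lives, one quick way (a sketch, assuming the 0.20 shell; the .META. table keeps the current assignment in its info:server column) is:

  hbase/bin/hbase shell
  hbase> scan '.META.', {COLUMNS => ['info:server']}

The rows whose keys start with your table name show which region server is holding each region right now.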

I still don't really understand your problem. Can you explain exactly
what you were trying to do, what you expected, and what the result
was?

Also, I recommend using pastebin.com (or an equivalent) to post the logs,
and don't be afraid to post thousands of lines; having the full context
really helps!
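
For example, something along these lines usually captures enough context (a sketch; adjust the path to your install, the file names follow the hbase-<user>-regionserver-<hostname>.log pattern):

  # Grab the tail of the region server log on the affected host, then paste the file
  tail -n 5000 hbase/logs/hbase-*-regionserver-*.log > rs.log

The master log from around the same time is usually worth pasting as well.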

J-D

2010/3/30 无名氏 <sitong1978@gmail.com>:
> Thanks!
>
> The Hadoop version is 0.20.1 and the HBase version is 0.20.2.
>
> The Hadoop configuration:
> hdfs-site.xml
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> <!-- Put site-specific property overrides in this file. -->
> <configuration>
> <property>
> <name>dfs.data.dir</name>
> <value>dfs/data</value>
> </property>
> <property>
> <name>dfs.name.dir</name>
> <value>dfs/name</value>
> </property>
> <property>
> <name>dfs.datanode.max.xcievers</name>
> <values>4096</values>
> </property>
> </configuration>
>
> mapred-site.xml
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> <!-- Put site-specific property overrides in this file. -->
> <configuration>
> <property>
>  <name>mapred.job.tracker</name>
>  <value>search9b.cm3:9001</value>
>  <description>The host and port that the MapReduce job tracker runs
>  at.  If "local", then jobs are run in-process as a single map
>  and reduce task.
>  </description>
> </property>
> </configuration>
>
>
> The HBase configuration is:
> hbase-site.xml
> <configuration>
>  <property>
>    <name>hbase.rootdir</name>
>    <value>hdfs://search9b.cm3:9000/hbase</value>
>    <description>The directory shared by region servers.
>    Should be fully-qualified to include the filesystem to use.
>    E.g: hdfs://NAMENODE_SERVER:PORT/HBASE_ROOTDIR
>    </description>
>  </property>
>  <property>
>    <name>hbase.cluster.distributed</name>
>    <value>true</value>
>    <description>The mode the cluster will be in. Possible values are
>      false: standalone and pseudo-distributed setups with managed Zookeeper
>      true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
>    </description>
>  </property>
>  <property>
>    <name>hbase.zookeeper.quorum</name>
>    <value>search58c.cm3,build13.cm3,build14.cm3</value>
>    <description>Comma separated list of servers in the ZooKeeper Quorum.
>    For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
>    By default this is set to localhost for local and pseudo-distributed modes
>    of operation. For a fully-distributed setup, this should be set to a full
>    list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
>    this is the list of servers which we will start/stop ZooKeeper on.
>    </description>
>  </property>
> </configuration>
>
> hdfs-site.xml
> <configuration>
> <property>
> <name>dfs.data.dir</name>
> <value>dfs/data</value>
> </property>
> <property>
> <name>dfs.name.dir</name>
> <value>dfs/name</value>
> </property>
> </configuration>
>
> linux version: x86_64 x86_64 x86_64 GNU/Linux
>
>
> I restarted all the HBase region servers with the command
> hbase/bin/regionservers.sh hbase/bin/hbase-daemon.sh stop regionserver.
> Then I ran the hbase shell and the command "count 'web_info'" (web_info is the
> table name), and it threw these exceptions:
> NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server 172.23.51.55:60020 for region web_info,,1267870002080, row '', but failed after 5 attempts.
> Exceptions:
> org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: web_info,,1267870002080
>        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2309)
>        at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1896)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
>        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>
> I logged in to the region server search55b.cm3 (172.23.51.55).
> In the Hadoop log, I found a SocketTimeoutException.
>
> java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/172.23.51.55:50010 remote=/172.23.51.55:47568]
>    at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>    at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
>    at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
>    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:313)
>    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:400)
>    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:180)
>    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
>    at java.lang.Thread.run(Thread.java:619)
>
> 2010-03-30 00:58:59,672 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(172.23.51.55:50010, storageID=DS-225596341-172.23.51.55-50010-1261706639224, infoPort=50075, ipcPort=50020):DataXceiver
> java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/172.23.51.55:50010 remote=/172.23.51.55:47568]
>    at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>    at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
>    at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
>    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:313)
>    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:400)
>    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:180)
>    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
>    at java.lang.Thread.run(Thread.java:619)
>
> In the HBase log, I found:
> org.apache.hadoop.hbase.NotServingRegionException: web_info,,1267870002080
>    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2309)
>    at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1896)
>    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>    at java.lang.reflect.Method.invoke(Method.java:597)
>    at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
>    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2010-03-31 14:16:39,076 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 60020, call openScanner([B@6defe475, startRow=, stopRow=, maxVersions=1, caching=10, cacheBlocks=false, timeRange=[0,9223372036854775807), families=ALL) from 172.23.52.58:42223: error: org.apache.hadoop.hbase.NotServingRegionException: web_info,,1267870002080
>
>
> 2010/3/31 Jean-Daniel Cryans <jdcryans@apache.org>
>
>> Please provide us with the usual details: Hadoop/HBase version,
>> configurations for both, hardware, OS, etc.
>>
>> Also, did you take a look at search38d.cm3's region server log? Are there
>> any obvious exceptions, and if you google them, can you find the
>> solution?
>>
>> Thx
>>
>> J-D
>>
>> On Tue, Mar 30, 2010 at 7:50 PM, 无名氏 <sitong1978@gmail.com> wrote:
>> > I set up an HBase cluster, and the region server list is:
>> > search10a.cm3
>> > search10b.cm3
>> > search162a.cm3
>> > search166a.cm3
>> > search168a.cm3
>> > search16a.cm3
>> > search178a.cm3
>> > search180a.cm3
>> > search182a.cm3
>> > search184a.cm3
>> > search188a.cm3
>> > search189a.cm3
>> > search18b.cm3
>> > search190a.cm3
>> > search192a.cm3
>> > search200t.cm3
>> > search33d.cm3
>> > search34c.cm3
>> > search34d.cm3
>> > search35c.cm3
>> > search35d.cm3
>> > search38d.cm3
>> > search3a.cm3
>> > search49a.cm3
>> > search4a.cm3
>> > search50a.cm3
>> > search51a.cm3
>> > search54b.cm3
>> > search55b.cm3
>> > search55d.cm3
>> > search56b.cm3
>> > search5a.cm3
>> > search60a.cm3
>> > search61a.cm3
>> > search62a.cm3
>> > build2.cme
>> >
>> > The region server search38d.cm3 stopped yesterday.
>> >
>> > Now I run the hbase shell and execute the list command, and it throws this exception:
>> >
>> > NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server null for region , row '', but failed after 5 attempts.
>> > Exceptions:
>> > org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: -ROOT-,,0
>> >        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2309)
>> >        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionInfo(HRegionServer.java:1761)
>> >        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>> >        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >        at java.lang.reflect.Method.invoke(Method.java:597)
>> >        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
>> >        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> >
>> > org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: -ROOT-,,0
>> >        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2309)
>> >        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionInfo(HRegionServer.java:1761)
>> >        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>> >        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >        at java.lang.reflect.Method.invoke(Method.java:597)
>> >        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
>> >        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> >
>> > org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: -ROOT-,,0
>> >        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2309)
>> >        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionInfo(HRegionServer.java:1761)
>> >        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>> >        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >        at java.lang.reflect.Method.invoke(Method.java:597)
>> >        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
>> >        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> >
>> > org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: -ROOT-,,0
>> >        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2309)
>> >        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionInfo(HRegionServer.java:1761)
>> >        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>> >        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >        at java.lang.reflect.Method.invoke(Method.java:597)
>> >        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
>> >        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> >
>> > org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: -ROOT-,,0
>> >        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2309)
>> >        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionInfo(HRegionServer.java:1761)
>> >        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>> >        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >        at java.lang.reflect.Method.invoke(Method.java:597)
>> >        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
>> >        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> >
>> >
>> >        from org/apache/hadoop/hbase/client/HConnectionManager.java:1002:in `getRegionServerWithRetries'
>> >        from org/apache/hadoop/hbase/client/MetaScanner.java:55:in `metaScan'
>> >        from org/apache/hadoop/hbase/client/MetaScanner.java:28:in `metaScan'
>> >        from org/apache/hadoop/hbase/client/HConnectionManager.java:433:in `listTables'
>> >        from org/apache/hadoop/hbase/client/HBaseAdmin.java:127:in `listTables'
>> >        from sun/reflect/NativeMethodAccessorImpl.java:-2:in `invoke0'
>> >        from sun/reflect/NativeMethodAccessorImpl.java:39:in `invoke'
>> >        from sun/reflect/DelegatingMethodAccessorImpl.java:25:in `invoke'
>> >        from java/lang/reflect/Method.java:597:in `invoke'
>> >        from org/jruby/javasupport/JavaMethod.java:298:in `invokeWithExceptionHandling'
>> >        from org/jruby/javasupport/JavaMethod.java:259:in `invoke'
>> >        from org/jruby/java/invokers/InstanceMethodInvoker.java:36:in `call'
>> >        from org/jruby/runtime/callsite/CachingCallSite.java:70:in `call'
>> >        from org/jruby/ast/CallNoArgNode.java:61:in `interpret'
>> >        from org/jruby/ast/ForNode.java:104:in `interpret'
>> >        from org/jruby/ast/NewlineNode.java:104:in `interpret'
>> > ... 110 levels...
>> >        from home/admin/shrek/hbase/bin/hirb#start:-1:in `call'
>> >        from org/jruby/internal/runtime/methods/DynamicMethod.java:226:in `call'
>> >        from org/jruby/internal/runtime/methods/CompiledMethod.java:211:in `call'
>> >        from org/jruby/internal/runtime/methods/CompiledMethod.java:71:in `call'
>> >        from org/jruby/runtime/callsite/CachingCallSite.java:253:in `cacheAndCall'
>> >        from org/jruby/runtime/callsite/CachingCallSite.java:72:in `call'
>> >        from home/admin/shrek/hbase/bin/hirb.rb:497:in `__file__'
>> >        from home/admin/shrek/hbase/bin/hirb.rb:-1:in `load'
>> >        from org/jruby/Ruby.java:577:in `runScript'
>> >        from org/jruby/Ruby.java:480:in `runNormally'
>> >        from org/jruby/Ruby.java:354:in `runFromMain'
>> >        from org/jruby/Main.java:229:in `run'
>> >        from org/jruby/Main.java:110:in `run'
>> >        from org/jruby/Main.java:94:in `main'
>> >        from /home/admin/shrek/hbase/bin/hirb.rb:338:in `list'
>> >        from (hbase):19hbase(main):
>> >
>> >
>> > How should I proceed from here?
>> > Thanks.
>> >
>>
>
