hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: NotReplicatedYetException, LeaseExpiredException and one RegionServer Down
Date Thu, 19 Mar 2009 13:04:38 GMT
I've seen HDFS get overwhelmed on very small clusters like yours, in just the
way you describe. The region server probably shut itself down to avoid making
things worse for the data.
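
If HDFS is the bottleneck, the usual first step on clusters this small is to
raise the datanodes' concurrency limits (and the file descriptor ulimit of the
user running HDFS/HBase). A minimal hadoop-site.xml sketch (property names from
the Hadoop 0.19 line; the values are only illustrative starting points, not
tuned for your boxes):

  <!-- cap on concurrent block readers/writers per datanode;
       the stock default is low for HBase-style loads -->
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>2048</value>
  </property>
  <!-- more IPC handler threads on each datanode -->
  <property>
    <name>dfs.datanode.handler.count</name>
    <value>10</value>
  </property>

The datanodes need a restart to pick these up.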

J-D

On Thu, Mar 19, 2009 at 6:43 AM, schubert zhang <zsongbo@gmail.com> wrote:
> Reporting four issues I have met:
>
> Testbed: 1 master + 3 slaves.
>
> 1. After running for a long time (batch-inserting data via MapReduce), I sometimes
> see the following WARN. Is it caused by a network issue or something else?
>
> 2009-03-19 13:33:43,630 INFO org.apache.hadoop.hbase.regionserver.HLog: removing old log file /hbase/log_10.24.1.14_1237428361729_60020/hlog.dat.1237440285652 whose highest sequence/edit id is 301219308
> 2009-03-19 13:33:52,688 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([B@29a925de, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@1ff52730) from 10.24.1.14:40269: output error
> 2009-03-19 13:33:52,746 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 11 on 60020 caught: java.nio.channels.ClosedChannelException
>        at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
>        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>        at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(Unknown Source)
>        at org.apache.hadoop.hbase.ipc.HBaseServer.access$2000(Unknown Source)
>        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(Unknown Source)
>        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(Unknown Source)
>        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(Unknown Source)
>
> 2. Maybe the following is caused by a DFS issue? Since my cluster is so small, I
> configured the replication factor = 2.
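> That is, in hadoop-site.xml (the standard dfs.replication setting, shown here for reference):
>
>   <property>
>     <name>dfs.replication</name>
>     <value>2</value>  <!-- default is 3; lowered to 2 for this 3-slave cluster -->
>   </property>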
>
> 2009-03-19 13:37:43,232 INFO org.apache.hadoop.hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.NotReplicatedYetException: Not replicated yet:/hbase/log_10.24.1.14_1237428361729_60020/hlog.dat.1237441061021
>        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(Unknown Source)
>        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(Unknown Source)
>        at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.ipc.RPC$Server.call(Unknown Source)
>        at org.apache.hadoop.ipc.Server$Handler.run(Unknown Source)
>
>        at org.apache.hadoop.ipc.Client.call(Unknown Source)
>        at org.apache.hadoop.ipc.RPC$Invoker.invoke(Unknown Source)
>        at $Proxy1.addBlock(Unknown Source)
>        at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(Unknown Source)
>        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Unknown Source)
>        at $Proxy1.addBlock(Unknown Source)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(Unknown Source)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(Unknown Source)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(Unknown Source)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(Unknown Source)
>
> 2009-03-19 13:37:43,232 WARN org.apache.hadoop.hdfs.DFSClient: NotReplicatedYetException sleeping /hbase/log_10.24.1.14_1237428361729_60020/hlog.dat.1237441061021 retries left 4
> 2009-03-19 13:37:43,637 INFO org.apache.hadoop.hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.NotReplicatedYetException: Not replicated yet:/hbase/log_10.24.1.14_1237428361729_60020/hlog.dat.1237441061021
>        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(Unknown Source)
>        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(Unknown Source)
>        at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.ipc.RPC$Server.call(Unknown Source)
>        at org.apache.hadoop.ipc.Server$Handler.run(Unknown Source)
>
>        at org.apache.hadoop.ipc.Client.call(Unknown Source)
>        at org.apache.hadoop.ipc.RPC$Invoker.invoke(Unknown Source)
>        at $Proxy1.addBlock(Unknown Source)
>        at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(Unknown Source)
>        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Unknown Source)
>        at $Proxy1.addBlock(Unknown Source)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(Unknown Source)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(Unknown Source)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(Unknown Source)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(Unknown Source)
>
>
> 3. 2009-03-19 16:33:55,254 WARN org.apache.hadoop.hdfs.DFSClient: DFS Read: java.io.IOException: Cannot open filename /hbase/CDR/1546147041/wap/mapfiles/6210665186830995164/data
>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(Unknown Source)
>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(Unknown Source)
>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(Unknown Source)
>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(Unknown Source)
>        at java.io.DataInputStream.readFully(DataInputStream.java:178)
>        at org.apache.hadoop.hbase.io.DataOutputBuffer$Buffer.write(Unknown Source)
>        at org.apache.hadoop.hbase.io.DataOutputBuffer.write(Unknown Source)
>        at org.apache.hadoop.hbase.io.SequenceFile$Reader.readBuffer(Unknown Source)
>        at org.apache.hadoop.hbase.io.SequenceFile$Reader.seekToCurrentValue(Unknown Source)
>        at org.apache.hadoop.hbase.io.SequenceFile$Reader.getCurrentValue(Unknown Source)
>        at org.apache.hadoop.hbase.io.SequenceFile$Reader.next(Unknown Source)
>        at org.apache.hadoop.hbase.io.MapFile$Reader.next(Unknown Source)
>        at org.apache.hadoop.hbase.regionserver.HStore.compact(Unknown Source)
>        at org.apache.hadoop.hbase.regionserver.HStore.compact(Unknown Source)
>        at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(Unknown Source)
>        at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(Unknown Source)
>        at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(Unknown Source)
>
> 2009-03-19 16:33:55,800 ERROR org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction/Split failed for region CDR,13776342680@2009-03-19 10:24:56.416,1237438713117
> java.io.IOException: java.io.IOException: Could not complete write to file /hbase/CDR/compaction.dir/1546147041/wap/mapfiles/4755271422260171357/data by DFSClient_-577718137
>        at org.apache.hadoop.hdfs.server.namenode.NameNode.complete(Unknown Source)
>        at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.ipc.RPC$Server.call(Unknown Source)
>        at org.apache.hadoop.ipc.Server$Handler.run(Unknown Source)
>
>        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>        at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(Unknown Source)
>        at org.apache.hadoop.hbase.RemoteExceptionHandler.checkThrowable(Unknown Source)
>        at org.apache.hadoop.hbase.RemoteExceptionHandler.checkIOException(Unknown Source)
>        at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(Unknown Source)
> 2009-03-19 16:33:55,801 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: regionserver/0:0:0:0:0:0:0:0:60020.compactor exiting
>
>
> 4. And then, after many exceptions like the NotReplicatedYetException in issue 2, the following:
>
> 2009-03-19 16:34:30,104 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
> 2009-03-19 16:34:30,105 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/hbase/CDR/compaction.dir/574480317/wap/mapfiles/6533441212405796349/index" - Aborting...
> 2009-03-19 16:34:30,105 ERROR org.apache.hadoop.hdfs.DFSClient: Exception closing file /hbase/CDR/compaction.dir/574480317/wap/mapfiles/6533441212405796349/index : org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /hbase/CDR/compaction.dir/574480317/wap/mapfiles/6533441212405796349/index File does not exist. Holder DFSClient_-577718137 does not have any open files.
>        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(Unknown Source)
>        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(Unknown Source)
>        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(Unknown Source)
>        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(Unknown Source)
>        at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.ipc.RPC$Server.call(Unknown Source)
>        at org.apache.hadoop.ipc.Server$Handler.run(Unknown Source)
>
> org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /hbase/CDR/compaction.dir/574480317/wap/mapfiles/6533441212405796349/index File does not exist. Holder DFSClient_-577718137 does not have any open files.
>        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(Unknown Source)
>        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(Unknown Source)
>        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(Unknown Source)
>        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(Unknown Source)
>        at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.ipc.RPC$Server.call(Unknown Source)
>        at org.apache.hadoop.ipc.Server$Handler.run(Unknown Source)
>
>        at org.apache.hadoop.ipc.Client.call(Unknown Source)
>        at org.apache.hadoop.ipc.RPC$Invoker.invoke(Unknown Source)
>        at $Proxy1.addBlock(Unknown Source)
>        at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(Unknown Source)
>        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Unknown Source)
>        at $Proxy1.addBlock(Unknown Source)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(Unknown Source)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(Unknown Source)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(Unknown Source)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(Unknown Source)
> 2009-03-19 16:34:31,061 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver/0:0:0:0:0:0:0:0:60020 exiting
> 2009-03-19 16:34:32,527 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Starting shutdown thread.
> 2009-03-19 16:34:32,528 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Shutdown thread complete
>
> The region server is down.
>
