hbase-user mailing list archives

From charan kumar <charan.ku...@gmail.com>
Subject Re: Region Server shutdown (Replay HLOg required)
Date Wed, 12 Jan 2011 23:16:38 GMT
The only explanation I found for the exact exception I am seeing was
posted at the URL below.


http://sudhirvn.blogspot.com/2010/07/hadoop-hdfs-error-javaioioexception.html

   "In the process of creating the merged compressed file, it creates a
temp file with the merged content, verifies the file, and then compresses it
to a final file, after which it deletes the temp file. Errors were being
thrown when the temp file was getting deleted. At times the delete was
happening (metadata on the namenode) before the file was replicated to all
nodes, and that was throwing this error. Currently the HDFS API does not
provide a synchronous method to wait until the file is replicated."

 Does this seem related to HBase?
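For context on where the exception surfaces: the "Could not complete write to file" message comes back from the namenode's complete() RPC, which the DFS client calls when closing a file and retries until the namenode agrees the file is complete. A minimal sketch of that poll-until-complete pattern is below; the Namenode interface and all names here are simplified stand-ins, not the real Hadoop API, and the retry count/sleep are illustrative.

```java
// Simplified sketch of the retry loop a DFS client runs around the
// namenode's complete() call when closing a file. All names are
// hypothetical stand-ins for illustration, not the actual Hadoop classes.
public class CompleteRetry {
    // Stand-in for the namenode RPC: returns true once the file's blocks
    // have been reported and the file can be finalized.
    interface Namenode { boolean complete(String path); }

    static void closeFile(Namenode nn, String path, int maxRetries, long sleepMs)
            throws java.io.IOException, InterruptedException {
        for (int i = 0; i < maxRetries; i++) {
            if (nn.complete(path)) return;  // namenode accepted the close
            Thread.sleep(sleepMs);          // blocks not yet fully reported; retry
        }
        // This mirrors the error text seen in the region server log.
        throw new java.io.IOException("Could not complete write to file " + path);
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated namenode that only succeeds on the third attempt.
        Namenode nn = p -> ++calls[0] >= 3;
        closeFile(nn, "/hbase/demo", 5, 1);
        System.out.println("completed after " + calls[0] + " attempts");
    }
}
```

If the namenode never reaches a consistent view of the file (as in the blog post's race between delete and replication), the retries are exhausted and the IOException propagates up, which in HBase aborts the flush as a DroppedSnapshotException.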

On Wed, Jan 12, 2011 at 12:17 PM, charan kumar <charan.kumar@gmail.com> wrote:

> Hello,
>
>   Region servers are dying printing the following exception, under heavy
> write load. Let me know, if you need any more details. Your help is greatly
> appreciated.
>
>  Environment:
>    HBase (0.20.6) setup. (30 nodes/region servers)   LZO Compression
> enabled in Hbase
>
> Region Server log entry:
>
> 2011-01-11 16:00:27,489 FATAL org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Replay of hlog required. Forcing server shutdown
> org.apache.hadoop.hbase.DroppedSnapshotException: region: webtable,XXXXXXXX1294790230320
>         at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1041)
>         at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:896)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:258)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:231)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:154)
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: Could not complete write to file /hbase/webtable/1138778035/c/4254248379246402147 by DFSClient_535678138
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:449)
>         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
>
>         at org.apache.hadoop.ipc.Client.call(Client.java:740)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>         at $Proxy1.complete(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>         at $Proxy1.complete(Unknown Source)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3264)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3188)
>         at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
>         at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
>         at org.apache.hadoop.hbase.io.hfile.HFile$Writer.close(HFile.java:620)
>         at org.apache.hadoop.hbase.regionserver.Store.internalFlushCache(Store.java:554)
>         at org.apache.hadoop.hbase.regionserver.Store.flushCache(Store.java:516)
>         at org.apache.hadoop.hbase.regionserver.Store.access$100(Store.java:88)
>         at org.apache.hadoop.hbase.regionserver.Store$StoreFlusherImpl.flushCache(Store.java:1597)
>         at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1000)
>         ... 4 more
>
>
> Namenode log exception around the same time:
>
>
>
> 11/01/12 07:01:36 WARN hdfs.StateChange: DIR* NameSystem.completeFile:
> failed to complete /hbase/webtable/689554504/c/3522097247441132047 because
> dir.getFileBlocks() is null  and pendingFile is null
> 11/01/12 07:01:36 INFO ipc.Server: IPC Server handler 30 on 8020, call
> complete(/hbase/webtable/689554504/c/3522097247441132047,
> DFSClient_1190023263) from XXXXXXX:52636: error: java.io.IOException: Could
> not complete write to file /
> hbase/webtable/689554504/c/3522097247441132047 by DFSClient_1190023263
> java.io.IOException: Could not complete write to file
> /hbase/webtable/689554504/c/3522097247441132047 by DFSClient_1190023263
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:449)
>         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
>
>
> Thanks,
>
> Charan
>
>
>
