hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: A question about LeaseExpiredException
Date Wed, 08 Jun 2011 18:07:12 GMT
Grep for the missing file in the namenode log and see if you can figure
out, from the mentions there, what happened to this file.  Had the master
taken it from you because it was processing a server crash?

St.Ack
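
The grep suggested above can be sketched in Python for anyone who wants the matching lines with their line numbers. This is only a minimal sketch: the file name is copied from the LeaseExpiredException in the quoted message, while the `find_mentions` helper and the namenode log path passed on the command line are hypothetical and will differ per installation.

```python
import sys
from typing import Iterable, List, Tuple

# WAL file named in the LeaseExpiredException in the quoted message.
MISSING_FILE = "c3s6.site%3A60020.1307014000772"


def find_mentions(lines: Iterable[str], needle: str = MISSING_FILE) -> List[Tuple[int, str]]:
    """Return (line_number, line) for every log line that mentions needle."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if needle in line
    ]


if __name__ == "__main__" and len(sys.argv) == 2:
    # Hypothetical usage: python find_mentions.py /path/to/namenode.log
    with open(sys.argv[1], errors="replace") as log:
        for number, line in find_mentions(log):
            print(f"{number}: {line}")
```

Reading the matches in timestamp order should show when the file was created and whether something other than the regionserver later touched it, which is the question being asked here.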

2011/6/8 Gaojinchao <gaojinchao@huawei.com>:
> Two regionservers crashed (my cluster has 7 regionserver/datanode nodes), saying that a file
> did not exist and that a lease had expired (log detail below). I searched this mailing list,
> but my case seems different:
>
> Hbase version: 0.90.3
> HDFS version: cloudera 0.20.2+320
>
> OS: swappiness: 0 and ulimit: 600000
> HDFS: dfs.datanode.max.xcievers: 2047
>
> I didn't see any Xciever count exceeded message.
>
> The cluster ran normally before I modified some parameters.
> Parameters:
> Heap size 8G -> 10G
> HFile block size 64k -> 640k (our cluster uses gz)
>
> Should I set the timeout to 0 or to something bigger?
>
> 2011-06-02 19:27:11,666 WARN org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region ufdr,11050,1306570494360.1caa8cf34787ccf12495bf7828e0e11c. has too many store files; delaying flush up to 90000ms
> 2011-06-02 19:27:11,996 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 153.29 MB of total=1.27 GB
> 2011-06-02 19:27:12,000 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=153.77 MB, total=1.12 GB, single=714.14 MB, multi=576.17 MB, memory=0 KB
> 2011-06-02 19:27:13,940 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 153.37 MB of total=1.27 GB
> 2011-06-02 19:27:13,943 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=153.71 MB, total=1.12 GB, single=712.98 MB, multi=576.79 MB, memory=0 KB
> 2011-06-02 19:27:15,937 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 153.52 MB of total=1.27 GB
> 2011-06-02 19:27:15,940 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=153.9 MB, total=1.12 GB, single=714.37 MB, multi=576.17 MB, memory=0 KB
> 2011-06-02 19:27:18,870 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 153.48 MB of total=1.27 GB
> 2011-06-02 19:27:18,873 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=153.76 MB, total=1.12 GB, single=716.21 MB, multi=574.92 MB, memory=0 KB
> 2011-06-02 19:27:20,087 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on ufdr,11006,1306570494359.8d605fcdef79e342a8626062bf046a14.
> 2011-06-02 19:27:20,087 WARN org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region ufdr,11006,1306570494359.8d605fcdef79e342a8626062bf046a14. has too many store files; delaying flush up to 90000ms
> 2011-06-02 19:27:20,619 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 153.58 MB of total=1.27 GB
> 2011-06-02 19:27:20,621 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=153.84 MB, total=1.12 GB, single=711.31 MB, multi=578.67 MB, memory=0 KB
> 2011-06-02 19:27:22,152 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 153.6 MB of total=1.27 GB
> 2011-06-02 19:27:22,155 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=153.86 MB, total=1.12 GB, single=714.87 MB, multi=575.12 MB, memory=0 KB
> 2011-06-02 19:27:23,021 INFO org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Using syncFs -- HDFS-200
> 2011-06-02 19:27:23,089 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=c3s6.site,60020,1306570384166, load=(requests=192741, regions=434, usedHeap=4924, maxHeap=10213): IOE in log roller
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /hbase/.logs/c3s6.site,60020,1306570384166/c3s6.site%3A60020.1307014000772 File does not exist. [Lease.  Holder: DFSClient_hb_rs_c3s6.site,60020,1306570384166_1306570388616, pendingcreates: 2]
>       at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1378)
>       at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1369)
>       at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:1424)
>       at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:1412)
>       at org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:491)
>       at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:512)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:968)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:964)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:962)
>
>       at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>       at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>       at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>       at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>       at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:96)
>       at org.apache.hadoop.hbase.RemoteExceptionHandler.checkThrowable(RemoteExceptionHandler.java:48)
>       at org.apache.hadoop.hbase.RemoteExceptionHandler.checkIOException(RemoteExceptionHandler.java:66)
>       at org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:104)
> 2011-06-02 19:27:23,090 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: requests=63742, regions=434, stores=434, storefiles=1226, storefileIndexSize=135, memstoreSize=1077, compactionQueueSize=241, flushQueueSize=4, usedHeap=4937, maxHeap=10213, blockCacheSize=1232047432, blockCacheFree=374318648, blockCacheCount=1867, blockCacheHitCount=95411305, blockCacheMissCount=12524075, blockCacheEvictedCount=6657895, blockCacheHitRatio=88, blockCacheHitCachingRatio=93
> 2011-06-02 19:27:23,090 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: IOE in log roller
> 2011-06-02 19:27:2
>
