hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: HRegionServer: endless FileNotFoundException in log file
Date Fri, 13 Mar 2009 14:31:16 GMT
schubert,

Yeah at 70 000 000 rows with only 5 region servers you might expect
stuff like that... But this is not good. Can you take a look in your
datanodes logs to see if there are any obvious exceptions around the
time of the "error closing and deleting..." exception?

I'd like to know more about your hardware.

Also I strongly suggest that you try out the 0.19.1 RC1, it has really
nice fixes.

Thx for reporting all that stuff,

J-D

On Fri, Mar 13, 2009 at 5:41 AM, schubert zhang <zsongbo@gmail.com> wrote:
> I have a 5+1 HBase/Hadoop cluster.  (5 region server and 1 master)A table
> TESTTAB with only one column family with 36 qualifiers.
> A process in the master node use batchUpdate (autoFlush=false) to insert
> random rows into this table.
> After about 70,000,000 rows inserted, failure. And on the web GUI of HBase,
> I can only find 4 regionservers.  Then, I ssh into the missed node to check
> the log.
>
> There is a very big log file (12GB), and when I "tail -f .." this log file,
> endless FileNotFoundException is printing, like following:
>
> 2009-03-13 14:32:21,617 WARN
> org.apache.hadoop.hbase.regionserver.HRegionServer: error getting store file
> index size for 310680591/cdr: java.io.FileNotFoundException: File does not
> exist:
> hdfs://nd0-rack0-cloud:9000/hbase/TESTTAB/310680591/cdr/mapfiles/1049684492857034443/index
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:415)
> at
> org.apache.hadoop.hbase.regionserver.HStoreFile.indexLength(HStoreFile.java:488)
> at
> org.apache.hadoop.hbase.regionserver.HStore.getStorefilesIndexSize(HStore.java:2179)
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.doMetrics(HRegionServer.java:941)
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:334)
> at java.lang.Thread.run(Thread.java:619) 2009-03-13 14:32:21,627 WARN
> org.apache.hadoop.hbase.regionserver.HRegionServer: Processing message
> (Retry: 0) java.io.FileNotFoundException: File does not exist:
> hdfs://nd0-rack0-cloud:9000/hbase/TESTTAB/310680591/cdr/mapfiles/1049684492857034443/index
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:415)
> at
> org.apache.hadoop.hbase.regionserver.HStoreFile.indexLength(HStoreFile.java:488)
> at
> org.apache.hadoop.hbase.regionserver.HStore.getStorefilesIndexSize(HStore.java:2179)
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.createRegionLoad(HRegionServer.java:625)
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:341)
> at java.lang.Thread.run(Thread.java:619) 2009-03-13 14:32:21,628 WARN
> org.apache.hadoop.hbase.regionserver.HRegionServer: error getting store file
> index size for 310680591/cdr: java.io.FileNotFoundException: File does not
> exist:
> hdfs://nd0-rack0-cloud:9000/hbase/TESTTAB/310680591/cdr/mapfiles/1049684492857034443/index
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:415)
> at
> org.apache.hadoop.hbase.regionserver.HStoreFile.indexLength(HStoreFile.java:488)
> at
> org.apache.hadoop.hbase.regionserver.HStore.getStorefilesIndexSize(HStore.java:2179)
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.doMetrics(HRegionServer.java:941)
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:334)
> at java.lang.Thread.run(Thread.java:619) 2009-03-13 14:32:21,629 WARN
> org.apache.hadoop.hbase.regionserver.HRegionServer: Processing message
> (Retry: 1) java.io.FileNotFoundException: File does not exist:
> hdfs://nd0-rack0-cloud:9000/hbase/TESTTAB/310680591/cdr/mapfiles/1049684492857034443/index
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:415)
> at
> org.apache.hadoop.hbase.regionserver.HStoreFile.indexLength(HStoreFile.java:488)
> at
> org.apache.hadoop.hbase.regionserver.HStore.getStorefilesIndexSize(HStore.java:2179)
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.createRegionLoad(HRegionServer.java:625)
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:341)
> at java.lang.Thread.run(Thread.java:619)
>
> .......
>
> 2009-03-13 14:32:21,641 WARN
> org.apache.hadoop.hbase.regionserver.HRegionServer: error getting store file
> index size for 310680591/cdr: java.io.FileNotFoundException: File does not
> exist:
> hdfs://nd0-rack0-cloud:9000/hbase/TESTTAB/310680591/cdr/mapfiles/1049684492857034443/index
>        at
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:415)
>        at
> org.apache.hadoop.hbase.regionserver.HStoreFile.indexLength(HStoreFile.java:488)
>        at
> org.apache.hadoop.hbase.regionserver.HStore.getStorefilesIndexSize(HStore.java:2179)
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.doMetrics(HRegionServer.java:941)
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:334)
>        at java.lang.Thread.run(Thread.java:619)
>
> 2009-03-13 14:32:21,641 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer: Exceeded max retries: 10
> java.io.FileNotFoundException: File does not exist:
> hdfs://nd0-rack0-cloud:9000/hbase/TESTTAB/310680591/cdr/mapfiles/1049684492857034443/index
>        at
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:415)
>        at
> org.apache.hadoop.hbase.regionserver.HStoreFile.indexLength(HStoreFile.java:488)
>        at
> org.apache.hadoop.hbase.regionserver.HStore.getStorefilesIndexSize(HStore.java:2179)
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.createRegionLoad(HRegionServer.java:625)
>        at
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:341)
>        at java.lang.Thread.run(Thread.java:619)
>
> But it will not stop, still output  this exception with "Exceeded max
> retries: 10".
> ....endless...
>
> ======================================
>
> Before this exception, there is no other Exception on this node.
> But on other nodes: there are some following:
>
> 2009-03-13 09:22:19,389 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer: error closing and
> deleting HLog org.apache.hadoop.ipc.RemoteException: java.io.IOException:
> Could not complete write to file
> /hbase/log_10.24.1.16_1236857785825_60020/hlog.dat.1236906952707 by
> DFSClient_-1142813574 at
> org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:378)
> at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source) at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597) at
> org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481) at
> org.apache.hadoop.ipc.Server$Handler.run(Server.java:894) at
> org.apache.hadoop.ipc.Client.call(Client.java:697) at
> org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216) at
> $Proxy1.complete(Unknown Source) at
> sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source) at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597) at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
> at $Proxy1.complete(Unknown Source) at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3130)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3054)
> at
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
> at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
> at org.apache.hadoop.io.SequenceFile$Writer.close(SequenceFile.java:959) at
> org.apache.hadoop.hbase.regionserver.HLog.close(HLog.java:421) at
> org.apache.hadoop.hbase.regionserver.HLog.closeAndDelete(HLog.java:404) at
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:373)
> at java.lang.Thread.run(Thread.java:619)
>
>
> and...
>
> 2009-03-13 10:03:40,149 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner
> org.apache.hadoop.hbase.NotServingRegionException: TESTTAB,,1236871823378 at
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2065)
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1699)
> at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597) at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632) at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> 2009-03-13 10:03:40,150 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner
> org.apache.hadoop.hbase.NotServingRegionException: TESTTAB,,1236871823378 at
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2065)
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1699)
> at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597) at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632) at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> 2009-03-13 10:03:40,150 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner
> org.apache.hadoop.hbase.NotServingRegionException: TESTTAB,,1236871823378 at
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2065)
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1699)
> at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597) at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632) at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
>
>
> and ...
>
> 2009-03-13 13:01:58,220 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner
> org.apache.hadoop.hbase.NotServingRegionException:
> CDRWAP,13576300780@2009-01-31 10:44:29.720,1236888415970 at
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2065)
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1699)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597) at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632) at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> 2009-03-13 13:01:58,348 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 1 on 60020, call openScanner([B@7a5fe108, [[B@2caf12fc, [B@1a07754f,
> 9223372036854775807,
> org.apache.hadoop.hbase.filter.WhileMatchRowFilter@5fa6a2e2) from
> 10.24.1.10:44605: error: org.apache.hadoop.hbase.NotServingRegionException:
> CDRWAP,13576300780@2009-01-31 10:44:29.720,1236888415970
> org.apache.hadoop.hbase.NotServingRegionException:
> CDRWAP,13576300780@2009-01-31 10:44:29.720,1236888415970 at
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2065)
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1699)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597) at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632) at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> 2009-03-13 13:02:00,712 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner
> org.apache.hadoop.hbase.NotServingRegionException:
> CDRWAP,13576300780@2009-01-31 10:44:29.720,1236888415970 at
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2065)
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1699)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597) at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632) at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> 2009-03-13 13:02:00,730 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 8 on 60020, call openScanner([B@1eb57
>

Mime
View raw message