hbase-user mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: FileNotFoundException in bulk load
Date Sun, 06 Jul 2014 17:17:58 GMT
If we continue the discussion there, we should reopen the JIRA.
That is fine if the exception is identical; otherwise, open a new one.

At first blush this looks a bit like a temporary unavailability of HDFS.


-- Lars



________________________________
 From: Ted Yu <yuzhihong@gmail.com>
To: "user@hbase.apache.org" <user@hbase.apache.org> 
Sent: Sunday, July 6, 2014 8:01 AM
Subject: Re: FileNotFoundException in bulk load
 

The IOExceptions likely came from the store.assertBulkLoadHFileOk() call.

HBASE-4030 seems to be a better place for future discussion, since you can
attach regionserver log(s) there.

Cheers



On Sun, Jul 6, 2014 at 5:23 AM, Amit Sela <amits@infolinks.com> wrote:

> The audit log shows the same regionserver opening one of the regions,
> renaming it (moving it from the MR output dir into the HBase region
> directory) and then trying to open it again from the MR output dir
> (repeated 10 times).
> Open-Rename-10xOpen appears in that order in the audit log, with a msec
> difference, all on the same regionserver.
>
>
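A minimal sketch for pulling the Open-Rename-Open pattern described above out of a NameNode audit log. It assumes the audit lines start with a `YYYY-MM-DD HH:MM:SS,mmm` timestamp and carry `key=value` fields named `ip`, `cmd`, `src` and `dst`; check this against your own log before relying on it. The function names are hypothetical and not part of HBase or Hadoop.

```python
import re
from collections import defaultdict

# key=value pairs as they are assumed to appear in an audit line,
# e.g. "cmd=open" or "src=/some/path".
FIELD = re.compile(r"(\w+)=(\S+)")


def parse_audit_line(line):
    """Return (timestamp, fields) for one audit line, or None if it
    does not carry both a cmd= and a src= field."""
    fields = dict(FIELD.findall(line))
    if "cmd" not in fields or "src" not in fields:
        return None
    timestamp = line[:23]  # assumed "YYYY-MM-DD HH:MM:SS,mmm" prefix
    return timestamp, fields


def opens_after_rename(lines):
    """Map each renamed source path to the open calls that were logged
    against that same (old) path after the rename."""
    renamed = set()
    late_opens = defaultdict(list)
    for line in lines:
        parsed = parse_audit_line(line)
        if parsed is None:
            continue
        timestamp, fields = parsed
        if fields["cmd"] == "rename":
            renamed.add(fields["src"])
        elif fields["cmd"] == "open" and fields["src"] in renamed:
            late_opens[fields["src"]].append((timestamp, fields.get("ip")))
    return dict(late_opens)
```

Passing an open file handle for the audit log to `opens_after_rename` returns, for each path that was moved, the timestamps and client addresses of the opens that came after the move. Those late opens are the ones that would fail with "File does not exist".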
> On Sun, Jul 6, 2014 at 2:38 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > Have you checked the audit log from the NameNode to see which client
> > deleted the files?
> >
> > Thanks
> >

> > On Jul 6, 2014, at 4:19 AM, Amit Sela <amits@infolinks.com> wrote:
> >
> > > I have had a bulk load job running daily for months, and suddenly I got
> > > a FileNotFoundException.
> > >
> > > Googling it, I found HBASE-4030 and noticed someone reporting that it
> > > started to re-appear in 0.94.8.
> > >
> > > I'm running Hadoop 1.0.4 and HBase 0.94.12.
> > >
> > > Has anyone else encountered this problem lately?
> > >
> > > Should we re-open the JIRA?
> > >
> > > Thanks,
> > >
> > > Amit.
> > >
> > > *On the client side this is the exception:*
> > >
> > > java.net.SocketTimeoutException: Call to node.xxx.com/xxx.xxx.xxx.xxx:PORT
> > > failed on socket timeout exception: java.net.SocketTimeoutException: 60000
> > > millis timeout while waiting for channel to be ready for read. ch :
> > > java.nio.channels.SocketChannel[connected
> > > local=/xxx.xxx.xxx.xxx:PORT remote=node.xxx.com/xxx.xxx.xxx.xxx:PORT]
> > > org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$3@29f2a6e3,
> > > org.apache.hadoop.ipc.RemoteException:
> > > org.apache.hadoop.io.MultipleIOException: 6 exceptions
> > > [java.io.FileNotFoundException: File does not exist: /data/output_jobs/output_websites/HFiles_20140705/metadata/88fd743853cf4f8a862fb19646027a48,
> > > java.io.FileNotFoundException: File does not exist: /data/output_jobs/output_websites/HFiles_20140705/gen/31c4c5cea9b348dbb6bb94115a483877,
> > > java.io.FileNotFoundException: File does not exist: /data/output_jobs/output_websites/HFiles_20140705/gen/5762c45aaf4f408ba748a989f7be9647,
> > > java.io.FileNotFoundException: File does not exist: /data/output_jobs/output_websites/HFiles_20140705/gen1/2ee02a005b654704a092d16c5c713373,
> > > java.io.FileNotFoundException: File does not exist: /data/output_jobs/output_websites/HFiles_20140705/gen1/618251330a1842a797de4b304d341a02,
> > > java.io.FileNotFoundException: File does not exist: /data/output_jobs/output_websites/HFiles_20140705/metadata/3955039392ce4f49aee5f58218a61be1]
> > > at org.apache.hadoop.io.MultipleIOException.createIOException(MultipleIOException.java:47)
> > > at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:3673)
> > > at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:3622)
> > > at org.apache.hadoop.hbase.regionserver.HRegionServer.bulkLoadHFiles(HRegionServer.java:2930)
> > > at sun.reflect.GeneratedMethodAccessor70.invoke(Unknown Source)
> > > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > at java.lang.reflect.Method.invoke(Method.java:601)
> > > at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
> > > at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
> > >
> > > *On the regionserver:*
> > >
> > > ERROR org.apache.hadoop.hbase.regionserver.HRegion: There were one or more
> > > IO errors when checking if the bulk load is ok.
> > > org.apache.hadoop.io.MultipleIOException: 6 exceptions
> > > [java.io.FileNotFoundException: File does not exist: /data/output_jobs/output_websites/HFiles_20140705/metadata/88fd743853cf4f8a862fb19646027a48,
> > > java.io.FileNotFoundException: File does not exist: /data/output_jobs/output_websites/HFiles_20140705/gen/31c4c5cea9b348dbb6bb94115a483877,
> > > java.io.FileNotFoundException: File does not exist: /data/output_jobs/output_websites/HFiles_20140705/gen/5762c45aaf4f408ba748a989f7be9647,
> > > java.io.FileNotFoundException: File does not exist: /data/output_jobs/output_websites/HFiles_20140705/gen1/2ee02a005b654704a092d16c5c713373,
> > > java.io.FileNotFoundException: File does not exist: /data/output_jobs/output_websites/HFiles_20140705/gen1/618251330a1842a797de4b304d341a02,
> > > java.io.FileNotFoundException: File does not exist: /data/output_jobs/output_websites/HFiles_20140705/metadata/3955039392ce4f49aee5f58218a61be1]
> > >        at org.apache.hadoop.io.MultipleIOException.createIOException(MultipleIOException.java:47)
> > >        at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:3673)
> > >        at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:3622)
> > >        at org.apache.hadoop.hbase.regionserver.HRegionServer.bulkLoadHFiles(HRegionServer.java:2930)
> > >        at sun.reflect.GeneratedMethodAccessor70.invoke(Unknown Source)
> > >        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >        at java.lang.reflect.Method.invoke(Method.java:601)
> > >        at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
> > >        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
> > >
> > > followed by:
> > >
> > > ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
> > > org.apache.hadoop.hbase.ipc.CallerDisconnectedException: Aborting call
> > > next(4522610431482097770, 250), rpc version=1, client version=29,
> > > methodsFingerPrint=-1368823753 from x <http://82.80.29.145:51311>xx.xxx.xxx.xxx
> > > after 12507 ms, since caller disconnected
> > >        at org.apache.hadoop.hbase.ipc.HBaseServer$Call.throwExceptionIfCallerDisconnected(HBaseServer.java:436)
> > >        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3980)
> > >        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3890)
> > >        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3880)
> > >        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2648)
> > >        at sun.reflect.GeneratedMethodAccessor60.invoke(Unknown Source)
> > >        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >        at java.lang.reflect.Method.invoke(Method.java:601)
> > >        at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
> > >        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
> > > 2014-07-06 03:52:14,278 [IPC Server handler 28 on 8041] ERROR
> > > org.apache.hadoop.hbase.regionserver.HRegionServer:
> > > org.apache.hadoop.hbase.ipc.CallerDisconnectedException: Aborting call
> > > next(7354511084312054096, 250), rpc version=1, client version=29,
> > > methodsFingerPrint=-1368823753 from x <http://82.80.29.145:51311/>xx.xxx.xxx.xxx
> > > after 9476 ms, since caller disconnected
> > >        at org.apache.hadoop.hbase.ipc.HBaseServer$Call.throwExceptionIfCallerDisconnected(HBaseServer.java:436)
> > >        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3980)
> > >        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3890)
> > >        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3880)
> > >        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2648)
> > >        at sun.reflect.GeneratedMethodAccessor60.invoke(Unknown Source)
> > >        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >        at java.lang.reflect.Method.invoke(Method.java:601)
> > >        at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
> > >        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
> >
>