hadoop-common-user mailing list archives

From stack <st...@duboce.net>
Subject Re: failed to create file ... because current leaseholder is trying to recreate file
Date Tue, 27 Nov 2007 22:26:33 GMT
Reading down through the dfs code, the AlreadyBeingCreatedException at 
least should be getting retried.  Why, then, am I seeing it in my code 
(and what should I do about the IOE below, which has no matching 
complaint in the namenode log)?

Thanks,
St.Ack

stack wrote:
> Here's more:
>
> 2007-11-27 05:00:26,331 DEBUG fs.DFSClient - Failed to connect to /38.99.76.30:50010:
> java.io.IOException: Got error in response to OP_READ_BLOCK
>     at org.apache.hadoop.dfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:753)
>     at org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:979)
>     at org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1075)
>     at java.io.DataInputStream.readFully(DataInputStream.java:178)
>     at java.io.DataInputStream.readFully(DataInputStream.java:152)
>     at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1383)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1360)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1349)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1344)
>     at org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:263)
>     at org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:242)
>     at org.apache.hadoop.hbase.HStoreFile$BloomFilterMapFile$Reader.<init>(HStoreFile.java:820)
>     at org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:932)
>     at org.apache.hadoop.hbase.HStore.compactHStoreFiles(HStore.java:1084)
>     at org.apache.hadoop.hbase.HStore.compact(HStore.java:1017)
>     at org.apache.hadoop.hbase.HRegion.compactStores(HRegion.java:745)
>     at org.apache.hadoop.hbase.HRegion.compactIfNeeded(HRegion.java:704)
>     at org.apache.hadoop.hbase.HRegionServer$Compactor.run(HRegionServer.java:378)
>
> Am I supposed to retry?
> Will that even make a difference? Every ten minutes or so, the client 
> retries getting a reader on either the same file or another and gets 
> the same error.
>
> Thanks for any insight,
> St.Ack
>
>
> stack wrote:
>> Anyone have insight on the following message from a near-TRUNK 
>> namenode log?
>>
>> 2007-11-26 01:16:23,282 WARN  dfs.StateChange - DIR* 
>> NameSystem.startFile: failed to create file 
>> /hbase/hregion_-1194436719/oldlogfile.log for DFSClient_610028837 on 
>> client 38.99.77.80 because current leaseholder is trying to recreate 
>> file.
>>
>> It starts for no apparent reason.  There is no exception or warning 
>> preceding it in the log, and creation of files in dfs by the pertinent 
>> code had been working fine for a good while before this.  After it 
>> starts, it repeats every minute until the client side goes away.  
>> While the client is up, it retries creating the file every 5 minutes 
>> or so.
>>
>> From the client-side, the exception looks like this:
>>
>> 2007-11-26 01:21:23,999 WARN hbase.HMaster - Processing pending 
>> operations: ProcessServerShutdown of XX.XX.XX.XX:60020
>> org.apache.hadoop.dfs.AlreadyBeingCreatedException: 
>> org.apache.hadoop.dfs.AlreadyBeingCreatedException: failed to create 
>> file /hbase/hregion_-1194436719/oldlogfile.log for DFSClient_610028837 
>> on client XX.XX.XX.XX because current leaseholder is trying to 
>> recreate file.
>>     at org.apache.hadoop.dfs.FSNamesystem.startFileInternal(FSNamesystem.java:848)
>>     at org.apache.hadoop.dfs.FSNamesystem.startFile(FSNamesystem.java:804)
>>     at org.apache.hadoop.dfs.NameNode.create(NameNode.java:276)
>>     at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
>>
>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>>     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>>     at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:82)
>>     at org.apache.hadoop.hbase.HMaster.run(HMaster.java:1094)
>>
>> The code is not exotic:
>>
>>     SequenceFile.Writer w = SequenceFile.createWriter(fs, conf,
>>         logfile, HLogKey.class, HLogEdit.class);
>>
>> (See HLog from hbase around #180 if you want to see more context).
>>
>> I see the comments in the incomplete HADOOP-2050.  There is some 
>> implication that an attempt at a create should be wrapped in a 
>> try/catch that retries any AlreadyBeingCreatedException.  Should I be 
>> doing this (it looks like it's done in DFSClient)?
>>
>> Thanks for any insight,
>> St.Ack
>>
>> P.S. HADOOP-2283 is the issue I've opened to cover this particular topic.
>>
>>
>>
>

