hadoop-mapreduce-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Recover HDFS lease after crash
Date Mon, 16 Jun 2014 17:18:26 GMT
You are likely hitting this: https://issues.apache.org/jira/browse/HDFS-3848
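
For the recover-then-append approach described in the quoted messages below, here is a minimal sketch of a bounded retry loop around recoverLease. LeaseRecoverer is a hypothetical interface introduced only so the example runs without a cluster; it mirrors the boolean recoverLease(String src) signature quoted further down. The attempt count and sleep interval are illustrative assumptions, not recommended values.

```java
import java.io.IOException;

public class LeaseRecoveryRetry {

    /**
     * Hypothetical stand-in for a real lease-recovery call. It mirrors the
     * "boolean recoverLease(String src) throws IOException" signature quoted
     * in this thread; it is not a Hadoop interface.
     */
    @FunctionalInterface
    public interface LeaseRecoverer {
        boolean recoverLease(String src) throws IOException;
    }

    /**
     * Calls recoverLease until it reports that the file is closed or the
     * attempts run out. Returns true if the file was reported closed.
     */
    public static boolean recoverWithRetry(LeaseRecoverer recoverer, String src,
                                           int maxAttempts, long sleepMillis)
            throws IOException, InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (recoverer.recoverLease(src)) {
                return true; // file is closed; an append can be attempted
            }
            if (attempt < maxAttempts) {
                Thread.sleep(sleepMillis); // give recovery time to finish
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        // Fake recoverer that reports "closed" on the third call.
        int[] calls = {0};
        LeaseRecoverer fake = src -> ++calls[0] >= 3;

        boolean closed = recoverWithRetry(fake, "/lease_fix", 5, 10L);
        System.out.println("closed=" + closed + " after " + calls[0] + " calls");
        // prints: closed=true after 3 calls
    }
}
```

Against a real cluster, the lambda would presumably delegate to a client call such as DistributedFileSystem#recoverLease(Path), and fs.append would be attempted only after the loop returns true.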

On Mon, Jun 16, 2014 at 10:17 PM, Bogdan Raducanu <lrdbgy@gmail.com> wrote:
> Thanks. I tried to call recoverLease before doing fs.append. Now I'm getting
> only the AlreadyBeingCreatedException
> ("org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException):
> failed to create file /lease_fix for DFSClient_NONMAPREDUCE_394503315_1 for
> client 10.0.0.1 because current leaseholder is trying to recreate file.")
> once and then it seems to work.
>
> But it's curious why I'm getting that exception now. I traced it to this
> code, in FSNamesystem.java:
>
>       //
>       // We found the lease for this file. And surprisingly the original
>       // holder is trying to recreate this file. This should never occur.
>       //
>       if (!force && lease != null) {
>         Lease leaseFile = leaseManager.getLeaseByPath(src);
>         if ((leaseFile != null && leaseFile.equals(lease)) ||
>             lease.getHolder().equals(holder)) {
>           throw new AlreadyBeingCreatedException(
>             "failed to create file " + src + " for " + holder +
>             " for client " + clientMachine +
>             " because current leaseholder is trying to recreate file.");
>         }
>       }
>
> It seems to me that that exception will always be thrown because
> lease.getHolder().equals(holder) is always true. It should've been
> leaseFile.getHolder().equals(holder) perhaps.
>
>
> On Mon, Jun 16, 2014 at 5:47 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>>
>> Please take a look at the following method in DFSClient:
>>
>>   /**
>>    * Recover a file's lease
>>    * @param src a file's path
>>    * @return true if the file is already closed
>>    * @throws IOException
>>    */
>>   boolean recoverLease(String src) throws IOException {
>>
>> Cheers
>>
>>
>>
>> On Mon, Jun 16, 2014 at 8:26 AM, Anonymous <lrdbgy+hdfs@gmail.com> wrote:
>>>
>>> Hello,
>>>
>>> I have a long-running application that opens a file and periodically
>>> appends to it. If this application is killed and then restarted, it cannot
>>> open the same file again for some time (~1 minute). First, it gets the
>>> AlreadyBeingCreatedException (which I guess means the namenode doesn't yet
>>> know the program crashed) and then the RecoveryInProgressException (which I
>>> guess means the namenode proceeded to close and release the file after
>>> inactivity). After about 1 minute it starts to work again.
>>>
>>> What is the correct way to recover from this? Is there an API for recovering
>>> the lease and resuming appending faster? DFSClient sets a randomized client
>>> name. If it were to send the same client name as before the crash, would it
>>> receive a lease on the file faster?
>>>
>>> Thanks
>>
>>
>
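
To illustrate the observation quoted above about the FSNamesystem.java check, here is a toy model, not Hadoop code. It assumes that lease was obtained by looking up holder (the quoted snippet does not show where lease comes from), and it leaves out the force flag. The Lease and LeaseManager classes are simplified stand-ins, and the null guard in proposedCheck is an addition needed to make the suggested variant safe.

```java
import java.util.HashMap;
import java.util.Map;

public class LeaseCheckModel {

    /** Simplified stand-in for a lease: only the holder name is tracked. */
    static final class Lease {
        private final String holder;
        Lease(String holder) { this.holder = holder; }
        String getHolder() { return holder; }
    }

    /** Simplified stand-in for a lease manager with two lookup paths. */
    static final class LeaseManager {
        private final Map<String, Lease> byHolder = new HashMap<>();
        private final Map<String, Lease> byPath = new HashMap<>();

        void addLease(String holder, String path) {
            Lease lease = byHolder.computeIfAbsent(holder, Lease::new);
            byPath.put(path, lease);
        }
        Lease getLease(String holder) { return byHolder.get(holder); }
        Lease getLeaseByPath(String path) { return byPath.get(path); }
    }

    /** The condition as quoted; true means the exception would be thrown. */
    static boolean quotedCheck(LeaseManager lm, String src, String holder) {
        Lease lease = lm.getLease(holder); // ASSUMPTION: lease is looked up by holder
        if (lease == null) {
            return false;
        }
        Lease leaseFile = lm.getLeaseByPath(src);
        return (leaseFile != null && leaseFile.equals(lease))
                || lease.getHolder().equals(holder); // trivially true under the assumption
    }

    /** The variant suggested in the thread, with an added null guard. */
    static boolean proposedCheck(LeaseManager lm, String src, String holder) {
        Lease lease = lm.getLease(holder);
        if (lease == null) {
            return false;
        }
        Lease leaseFile = lm.getLeaseByPath(src);
        return leaseFile != null
                && (leaseFile.equals(lease) || leaseFile.getHolder().equals(holder));
    }

    public static void main(String[] args) {
        LeaseManager lm = new LeaseManager();
        lm.addLease("clientA", "/other");     // clientA holds a lease on another file
        lm.addLease("clientB", "/lease_fix"); // clientB holds the lease on /lease_fix

        System.out.println("quoted=" + quotedCheck(lm, "/lease_fix", "clientA")
                + " proposed=" + proposedCheck(lm, "/lease_fix", "clientA"));
        // prints: quoted=true proposed=false
    }
}
```

In this model the quoted condition fires whenever the caller holds any lease at all, while the variant fires only when the lease on the file itself belongs to the caller.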



-- 
Harsh J
