hadoop-hdfs-issues mailing list archives

From "Yongjun Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7342) Lease Recovery doesn't happen some times
Date Mon, 24 Nov 2014 08:09:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222769#comment-14222769
] 

Yongjun Zhang commented on HDFS-7342:
-------------------------------------

Hi [~vinayrpet],

Thanks for the good catch!

Yes, I checked the version of the software that produced my case, and it doesn't include
HDFS-5558. The same is true of Ravi's case.

Assuming HDFS-5558 prevents the case where the penultimate block is COMMITTED and the last
block is COMPLETE, I have the following thoughts.

With the HDFS-5558 fix, I assume it is still possible for both the penultimate and the last
block to be COMMITTED, in which case the code pasted in https://issues.apache.org/jira/browse/HDFS-4882?focusedCommentId=14213992&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14213992
would be executed.

Even though the lease can be removed in this case with the current trunk code, in scenario#1
described in that comment, if both blocks have the minimal replication number of replicas, an
exception would still be thrown, because the method {{finalizeINodeFileUnderConstruction}}
ends up calling:
{code}
      Preconditions.checkState(blocks[i].isComplete(), "Failed to finalize"
          + " %s %s since blocks[%s] is non-complete, where blocks=%s.",
          getClass().getSimpleName(), this, i, Arrays.asList(blocks));
{code}
Thus the file won't be closed.
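
To make the failure mode concrete, here is a minimal, self-contained sketch of it. It is not the HDFS code: the class {{FinalizeSketch}}, the enum {{BlockState}}, and the helper {{tryFinalize}} are hypothetical names made up for illustration, and a plain {{IllegalStateException}} stands in for what the quoted precondition check would throw.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model for illustration only; not the HDFS classes.
class FinalizeSketch {
    enum BlockState { UNDER_CONSTRUCTION, COMMITTED, COMPLETE }

    // Mirrors the intent of the quoted check: every block must be COMPLETE,
    // otherwise finalization is rejected with an IllegalStateException.
    static void assertAllBlocksComplete(List<BlockState> blocks) {
        for (int i = 0; i < blocks.size(); i++) {
            if (blocks.get(i) != BlockState.COMPLETE) {
                throw new IllegalStateException("Failed to finalize since blocks["
                    + i + "] is non-complete, where blocks=" + blocks);
            }
        }
    }

    // Returns true if the file could be closed, false if the check rejected it.
    static boolean tryFinalize(List<BlockState> blocks) {
        try {
            assertAllBlocksComplete(blocks);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Both the penultimate and the last block are COMMITTED but not COMPLETE.
        List<BlockState> blocks =
            Arrays.asList(BlockState.COMPLETE, BlockState.COMMITTED, BlockState.COMMITTED);
        System.out.println("closed = " + tryFinalize(blocks));
    }
}
```

In this model a COMMITTED block fails the check no matter how many replicas it has, which is why the file stays open in the scenario above.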

The solution I proposed here

https://issues.apache.org/jira/browse/HDFS-7342?focusedCommentId=14218085&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14218085
(and the comment that follows it)

would avoid this by forcing the penultimate and last blocks to complete if they already have
the minimal replication number of replicas, so that the file can be closed successfully.
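
The idea behind the proposal can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patch itself: {{ForceCompleteSketch}}, {{Block}}, {{liveReplicas}}, {{forceCompleteLastTwo}} and {{canClose}} are hypothetical names, and the real NameNode data structures are considerably more involved.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of the proposal; not the HDFS implementation.
class ForceCompleteSketch {
    enum State { UNDER_CONSTRUCTION, COMMITTED, COMPLETE }

    static final class Block {
        State state;
        final int liveReplicas; // assumed count of reported replicas

        Block(State state, int liveReplicas) {
            this.state = state;
            this.liveReplicas = liveReplicas;
        }
    }

    // Promote the penultimate and last blocks from COMMITTED to COMPLETE,
    // but only if they already have at least minReplication replicas.
    static void forceCompleteLastTwo(List<Block> blocks, int minReplication) {
        int start = Math.max(0, blocks.size() - 2);
        for (int i = start; i < blocks.size(); i++) {
            Block b = blocks.get(i);
            if (b.state == State.COMMITTED && b.liveReplicas >= minReplication) {
                b.state = State.COMPLETE;
            }
        }
    }

    // The file can be closed only when every block is COMPLETE.
    static boolean canClose(List<Block> blocks) {
        for (Block b : blocks) {
            if (b.state != State.COMPLETE) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Block> blocks = Arrays.asList(
            new Block(State.COMPLETE, 3),
            new Block(State.COMMITTED, 1),
            new Block(State.COMMITTED, 1));
        forceCompleteLastTwo(blocks, 1);
        System.out.println("can close = " + canClose(blocks));
    }
}
```

Note that in this sketch a COMMITTED block with fewer than the minimal number of replicas is left untouched, so the file still cannot be closed in that case.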

Any comments/thoughts on this proposed solution?

Hi Ravi, to help further the discussion about the fix here, would you please help consolidate
your testcase with the solution I suggested above?

Thanks.






> Lease Recovery doesn't happen some times
> ----------------------------------------
>
>                 Key: HDFS-7342
>                 URL: https://issues.apache.org/jira/browse/HDFS-7342
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.0.0-alpha
>            Reporter: Ravi Prakash
>            Assignee: Ravi Prakash
>         Attachments: HDFS-7342.1.patch, HDFS-7342.2.patch
>
>
> In some cases, LeaseManager tries to recover a lease, but is not able to. HDFS-4882 describes
> a possibility of that. We should fix this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
