hadoop-hdfs-issues mailing list archives

From "Kihwal Lee (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3032) Lease renewer tries forever even if renewal is not possible
Date Mon, 05 Mar 2012 19:21:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222530#comment-13222530 ]

Kihwal Lee commented on HDFS-3032:
----------------------------------

Thanks for the review, Nicholas. In that case, we can add the retry limit in LeaseRenewer, where
IOException is currently caught and retried forever. After all, it doesn't make sense to keep
trying to renew after HdfsConstants.LEASE_SOFTLIMIT_PERIOD has passed.

I will upload a new patch soon. I am adding a test case for the limited retry right now. 
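
For illustration only, the limited retry described above might look like the minimal sketch below. It is not the attached patch. The Renewer and Clock interfaces, the FakeClock helper, and the method name renewWithLimit are hypothetical. The 60-second value is an assumed stand-in for HdfsConstants.LEASE_SOFTLIMIT_PERIOD, and the 500 ms sleep mirrors the retry interval mentioned in the issue description.

```java
import java.io.IOException;

public class BoundedLeaseRenewal {
    // Assumed stand-in for HdfsConstants.LEASE_SOFTLIMIT_PERIOD (milliseconds).
    static final long LEASE_SOFTLIMIT_PERIOD_MS = 60_000L;
    // Sleep between attempts, as described in the issue (500 ms).
    static final long RETRY_SLEEP_MS = 500L;

    /** Hypothetical hook for the renewal call made to the name node. */
    interface Renewer {
        void renew() throws IOException;
    }

    /** Hypothetical clock abstraction so the loop can be tested without real sleeping. */
    interface Clock {
        long nowMs();
        void sleepMs(long ms);
    }

    /** Test clock: sleeping simply advances the current time. */
    static class FakeClock implements Clock {
        private long now = 0L;
        public long nowMs() { return now; }
        public void sleepMs(long ms) { now += ms; }
    }

    /**
     * Retries renewal until it succeeds or the soft limit has elapsed since
     * the last successful renewal. Returns true on success, false when the
     * caller should give up (for example, by aborting its clients).
     */
    static boolean renewWithLimit(Renewer renewer, Clock clock, long lastRenewedMs) {
        while (clock.nowMs() - lastRenewedMs < LEASE_SOFTLIMIT_PERIOD_MS) {
            try {
                renewer.renew();
                return true;
            } catch (IOException e) {
                // Renewal failed; wait before the next attempt.
                clock.sleepMs(RETRY_SLEEP_MS);
            }
        }
        // Soft limit passed: further renewal attempts no longer make sense.
        return false;
    }
}
```

Bounding the loop by elapsed time rather than by an attempt count ties the cutoff directly to the soft limit, which is the point at which renewal stops being useful.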
                
> Lease renewer tries forever even if renewal is not possible
> -----------------------------------------------------------
>
>                 Key: HDFS-3032
>                 URL: https://issues.apache.org/jira/browse/HDFS-3032
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 0.23.0, 0.24.0, 0.23.1
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>             Fix For: 0.24.0, 0.23.2, 0.23.3
>
>         Attachments: hdfs-3032.patch.txt
>
>
> When LeaseRenewer gets an IOException while attempting to renew for a client, it retries
after sleeping 500ms. If the exception is caused by a condition that will never change, it
keeps talking to the name node until the DFSClient object is closed or aborted. With the
FileSystem cache, a DFSClient can stay alive for a very long time. We have seen cases in which
node managers and long-lived jobs flood the name node with this type of call.
> The current proposal is to abort the client when a RemoteException is caught during renewal.
LeaseRenewer already aborts all clients when it sees a SocketTimeoutException.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
