hadoop-hdfs-issues mailing list archives

From "Vinayakumar B (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8270) create() always retried with hardcoded timeout when file already exists
Date Tue, 02 Jun 2015 06:36:18 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568635#comment-14568635 ]

Vinayakumar B commented on HDFS-8270:
-------------------------------------

Hi [~andreina], 
In the patch, instead of removing all retries, only the method-specific retry on this exception needs to be removed.
The default retries should stay, so that calls are still retried on connect exceptions and NameNode restarts.

So the following changes in NameNodeProxies.java would be enough.
{code}@@ -442,22 +440,9 @@ private static ClientProtocol createNNProxyWithClientProtocol(
 
     if (withRetries) { // create the proxy with retries
 
-      RetryPolicy createPolicy = RetryPolicies
-          .retryUpToMaximumCountWithFixedSleep(5,
-              HdfsServerConstants.LEASE_SOFTLIMIT_PERIOD, TimeUnit.MILLISECONDS);
-    
-      Map<Class<? extends Exception>, RetryPolicy> remoteExceptionToPolicyMap
-                 = new HashMap<Class<? extends Exception>, RetryPolicy>();
-      remoteExceptionToPolicyMap.put(AlreadyBeingCreatedException.class,
-          createPolicy);
-
-      RetryPolicy methodPolicy = RetryPolicies.retryByRemoteException(
-          defaultPolicy, remoteExceptionToPolicyMap);
       Map<String, RetryPolicy> methodNameToPolicyMap 
                  = new HashMap<String, RetryPolicy>();
     
-      methodNameToPolicyMap.put("create", methodPolicy);
-
       ClientProtocol translatorProxy =
         new ClientNamenodeProtocolTranslatorPB(proxy);
       return (ClientProtocol) RetryProxy.create({code}

The changes in the other files are fine. I think you can also remove the configuration key from
DFSConfigKeys.java, as it is no longer used anywhere after the patch.
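
To make the intended behaviour concrete, here is a minimal, self-contained sketch of the idea, not the actual Hadoop code. The names RetrySketch, Policy, invoke and the retry counts are hypothetical stand-ins for RetryPolicy, RetryProxy and the real defaults, and the sleep between attempts is left out. It only shows that dropping the per-method entry for "create" makes the "already being created" case fail fast, while connect failures are still retried by the default policy.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;

class RetrySketch {
    /** Hypothetical stand-in for HDFS's AlreadyBeingCreatedException. */
    static class AlreadyBeingCreatedException extends IOException {
        AlreadyBeingCreatedException(String msg) { super(msg); }
    }

    /** Hypothetical policy: may the call be retried after e, given retries so far? */
    interface Policy {
        boolean shouldRetry(Exception e, int retries);
    }

    /** Default policy (assumed shape): retry connect failures only, up to maxRetries. */
    static Policy defaultPolicy(int maxRetries) {
        return (e, retries) -> e instanceof ConnectException && retries < maxRetries;
    }

    /** Pre-patch style policy: retry AlreadyBeingCreatedException, else defer to fallback. */
    static Policy retryOnAlreadyBeingCreated(Policy fallback, int maxRetries) {
        return (e, retries) -> e instanceof AlreadyBeingCreatedException
                ? retries < maxRetries
                : fallback.shouldRetry(e, retries);
    }

    /** Runs the call under the per-method policy if one is registered, else the fallback. */
    static <T> T invoke(String method, Map<String, Policy> perMethod,
                        Policy fallback, Callable<T> call) throws Exception {
        Policy policy = perMethod.getOrDefault(method, fallback);
        int retries = 0;
        while (true) {
            try {
                return call.call();
            } catch (Exception e) {
                if (!policy.shouldRetry(e, retries)) {
                    throw e;
                }
                retries++; // a real policy would also sleep here
            }
        }
    }

    /** Counts how often a "create" call that always fails with failure is attempted. */
    static int countAttempts(Map<String, Policy> perMethod, Policy fallback, Exception failure) {
        int[] attempts = {0};
        try {
            invoke("create", perMethod, fallback, () -> { attempts[0]++; throw failure; });
        } catch (Exception expected) {
            // The call never succeeds; only the attempt count matters here.
        }
        return attempts[0];
    }

    /** Before the patch: "create" has its own policy that retries 5 times. */
    static int attemptsBeforePatch() {
        Policy fallback = defaultPolicy(3);
        Map<String, Policy> perMethod = new HashMap<>();
        perMethod.put("create", retryOnAlreadyBeingCreated(fallback, 5));
        return countAttempts(perMethod, fallback, new AlreadyBeingCreatedException("exists"));
    }

    /** After the patch: no per-method entry, so the same failure is not retried. */
    static int attemptsAfterPatch() {
        return countAttempts(new HashMap<>(), defaultPolicy(3),
                new AlreadyBeingCreatedException("exists"));
    }

    /** After the patch: connect failures still go through the default retries. */
    static int attemptsOnConnectFailure() {
        return countAttempts(new HashMap<>(), defaultPolicy(3),
                new ConnectException("namenode down"));
    }

    public static void main(String[] args) {
        System.out.println("before patch:    " + attemptsBeforePatch());      // 6
        System.out.println("after patch:     " + attemptsAfterPatch());       // 1
        System.out.println("connect failure: " + attemptsOnConnectFailure()); // 4
    }
}
```

With the per-method entry in place the failing create is attempted six times (one call plus five retries); without it the call fails on the first attempt, and a connect failure is still attempted four times under the assumed default of three retries.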

> create() always retried with hardcoded timeout when file already exists
> -----------------------------------------------------------------------
>
>                 Key: HDFS-8270
>                 URL: https://issues.apache.org/jira/browse/HDFS-8270
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.6.0
>            Reporter: Andrey Stepachev
>            Assignee: J.Andreina
>         Attachments: HDFS-8270.1.patch
>
>
> In HBase we stumbled on unexpected behaviour that could
> break things.
> HDFS-6478 fixed a wrong exception
> translation, but that apparently led to unexpected behaviour:
> clients trying to create a file without overwrite=true are forced
> to retry for a hardcoded amount of time (60 seconds).
> That could break or slow down systems that use the filesystem
> for locks (as hbase fsck did, which broke it; see HBASE-13574).
> We should make this behaviour configurable: does the client really need
> to wait for the lease timeout to be sure that the file doesn't exist, or should it
> be enough to fail fast?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
