hadoop-yarn-issues mailing list archives

From "Tsuyoshi OZAWA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1778) TestFSRMStateStore fails on trunk
Date Thu, 05 Feb 2015 08:52:34 GMT

    [ https://issues.apache.org/jira/browse/YARN-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306866#comment-14306866 ]

Tsuyoshi OZAWA commented on YARN-1778:
--------------------------------------

[~zxu] cc: [~jlowe] Thank you for the investigation. DFSOutputStream#completeFile already includes
retry logic, but it's hard-coded for now:

{code}
          if (retries == 0) {
            throw new IOException("Unable to close file because the last block"
                + " does not have enough number of replicas.");
          }
          retries--;
          Thread.sleep(localTimeout);
          localTimeout *= 2;
          if (Time.now() - localstart > 5000) {
            DFSClient.LOG.info("Could not complete " + src + " retrying...");
          }
{code}
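
To make the doubling behaviour of the quoted loop concrete, here is a minimal sketch that lists the sleep durations it would go through before giving up. The class and method names are hypothetical, and the starting values (400 ms, 5 retries) are example inputs chosen for illustration, not values taken from this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: mirrors the sleep/double steps
// of the quoted loop so the total wait time can be inspected.
class BackoffSchedule {

  // One sleep per allowed retry; the timeout doubles after each sleep,
  // matching "Thread.sleep(localTimeout); localTimeout *= 2;".
  static List<Long> sleeps(long initialTimeoutMs, int retries) {
    List<Long> schedule = new ArrayList<>();
    long timeout = initialTimeoutMs;
    for (int i = 0; i < retries; i++) {
      schedule.add(timeout);
      timeout *= 2;
    }
    return schedule;
  }

  // Total time spent sleeping before the IOException would be thrown.
  static long totalMs(List<Long> schedule) {
    long sum = 0;
    for (long s : schedule) {
      sum += s;
    }
    return sum;
  }

  public static void main(String[] args) {
    List<Long> schedule = sleeps(400, 5); // example inputs (assumed)
    // prints: [400, 800, 1600, 3200, 6400] total=12400 ms
    System.out.println(schedule + " total=" + totalMs(schedule) + " ms");
  }
}
```

With these example inputs the client waits roughly 12.4 seconds in total, so whether a test passes depends heavily on the starting timeout and the retry count.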

How about making the timeout and the number of retries configurable, and setting them via fs.state-store.num-retries
and fs.state-store.retry-interval-ms? It's a simpler way to deal with this problem.
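
A rough sketch of what that could look like follows. It is only an illustration under stated assumptions: java.util.Properties stands in for the Hadoop configuration object, the ConfigurableRetry class is hypothetical, and the default values are invented for the example. Only the two property names come from the proposal above.

```java
import java.io.IOException;
import java.util.Properties;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Hadoop or YARN: reads the two proposed
// settings and retries an operation a bounded number of times.
class ConfigurableRetry {
  // Property names as proposed in the comment; defaults are assumptions.
  static final String NUM_RETRIES_KEY = "fs.state-store.num-retries";
  static final String RETRY_INTERVAL_KEY = "fs.state-store.retry-interval-ms";
  static final int DEFAULT_NUM_RETRIES = 3;            // assumed default
  static final long DEFAULT_RETRY_INTERVAL_MS = 1000L; // assumed default

  final int numRetries;
  final long retryIntervalMs;

  ConfigurableRetry(Properties conf) {
    numRetries = Integer.parseInt(
        conf.getProperty(NUM_RETRIES_KEY, String.valueOf(DEFAULT_NUM_RETRIES)));
    retryIntervalMs = Long.parseLong(
        conf.getProperty(RETRY_INTERVAL_KEY, String.valueOf(DEFAULT_RETRY_INTERVAL_MS)));
  }

  // Runs op, retrying on IOException up to numRetries times with a fixed
  // pause between attempts. The last IOException is rethrown on exhaustion.
  <T> T run(Callable<T> op) throws Exception {
    int retries = numRetries;
    while (true) {
      try {
        return op.call();
      } catch (IOException e) {
        if (retries == 0) {
          throw e;
        }
        retries--;
        Thread.sleep(retryIntervalMs);
      }
    }
  }
}
```

A test could then set a small interval and a larger retry count in its configuration, while production deployments keep whatever defaults are chosen.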

> TestFSRMStateStore fails on trunk
> ---------------------------------
>
>                 Key: YARN-1778
>                 URL: https://issues.apache.org/jira/browse/YARN-1778
>             Project: Hadoop YARN
>          Issue Type: Test
>            Reporter: Xuan Gong
>            Assignee: zhihai xu
>         Attachments: YARN-1778.000.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
