hadoop-yarn-issues mailing list archives

From "Zhijie Shen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2673) Add retry for timeline client put APIs
Date Sat, 18 Oct 2014 02:51:34 GMT

    [ https://issues.apache.org/jira/browse/YARN-2673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175814#comment-14175814 ]

Zhijie Shen commented on YARN-2673:
-----------------------------------

1. Remove "RESTful" from this comment, to stay neutral in case the channel is changed?
{code}
+  /** Timeline client RESTful call, max retries (-1 means no limit) */
{code}

2. Add the new configs to yarn-default.xml as well.

3. The code below is no longer necessary, because the variables are set only once in the
constructor. In testCheckRetryCount, you can compose a config to set the values you want to use.
{code}
    @Private
    @VisibleForTesting
    public synchronized void changeRetrySettings(int maxRetries, long interval) {
      this.maxRetries = maxRetries;
      this.retryInterval = interval;
    }
{code}
{code}
      // synchronously get a snapshot of current retry settings
      int leftRetries = 0;
      long sleepMs;
      retried = false;
      synchronized (this) {
        leftRetries = maxRetries;
        sleepMs = retryInterval;
      }
{code}

4. In testCheckRetryCount, response is not used.
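
To illustrate point 3, here is a minimal sketch (not the patch itself) of retry settings that are read once in the constructor and kept in final fields, so that neither a setter nor a synchronized snapshot is needed. The class name, the config keys, the default values, and the use of a plain Map in place of a Hadoop Configuration are all assumptions made for illustration only.

```java
import java.util.Map;
import java.util.concurrent.Callable;

// Hypothetical stand-in for the retry logic; this is not the real TimelineClient code.
final class TimelineRetrySketch {
  // Hypothetical config keys and defaults, invented for this sketch.
  static final String MAX_RETRIES_KEY = "example.timeline.client.max-retries";
  static final String RETRY_INTERVAL_KEY = "example.timeline.client.retry-interval-ms";
  static final int DEFAULT_MAX_RETRIES = 30;
  static final long DEFAULT_RETRY_INTERVAL_MS = 1000L;

  // Set once in the constructor, so no locking is needed to read them.
  private final int maxRetries;       // -1 means no limit
  private final long retryIntervalMs;

  TimelineRetrySketch(Map<String, String> conf) {
    String retries = conf.get(MAX_RETRIES_KEY);
    String interval = conf.get(RETRY_INTERVAL_KEY);
    this.maxRetries = retries == null ? DEFAULT_MAX_RETRIES : Integer.parseInt(retries);
    this.retryIntervalMs = interval == null ? DEFAULT_RETRY_INTERVAL_MS : Long.parseLong(interval);
  }

  int getMaxRetries() {
    return maxRetries;
  }

  long getRetryIntervalMs() {
    return retryIntervalMs;
  }

  // Runs op once, then retries up to maxRetries more times when it throws.
  <T> T runWithRetries(Callable<T> op) throws Exception {
    int retriesDone = 0;
    while (true) {
      try {
        return op.call();
      } catch (Exception e) {
        if (maxRetries >= 0 && retriesDone >= maxRetries) {
          throw e; // retries exhausted, surface the last failure
        }
        retriesDone++;
        Thread.sleep(retryIntervalMs);
      }
    }
  }
}
```

With this shape, a test such as testCheckRetryCount can build a config with a small retry count and interval, pass it to the constructor, and count how many times the operation was attempted.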

BTW, YARN-2676 is going to change the TimelineClient code, so this patch will likely need to be rebased.

> Add retry for timeline client put APIs
> --------------------------------------
>
>                 Key: YARN-2673
>                 URL: https://issues.apache.org/jira/browse/YARN-2673
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Li Lu
>            Assignee: Li Lu
>         Attachments: YARN-2673-101414-1.patch, YARN-2673-101414-2.patch, YARN-2673-101414.patch, YARN-2673-101714.patch
>
>
> The timeline client currently does not handle a server outage gracefully. Jobs from distributed shell may fail due to an ATS restart. We may need to add a retry mechanism to the client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
