hadoop-yarn-issues mailing list archives

From "Tsuyoshi Ozawa (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-4348) ZKRMStateStore.syncInternal should wait for zkResyncWaitTime instead of zkSessionTimeout
Date Tue, 01 Dec 2015 03:19:11 GMT

    [ https://issues.apache.org/jira/browse/YARN-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15032845#comment-15032845 ]

Tsuyoshi Ozawa edited comment on YARN-4348 at 12/1/15 3:18 AM:
---------------------------------------------------------------

[~jianhe] good catch. I added the missing {{continue}} statement after the call to {{syncInternal}}
in the following block in the v4 patch.


was (Author: ozawa):
Adding missing {{continue}} statement after calling {{syncInternal}} in the following block:

{code}
          if (shouldRetryWithNewConnection(ke.code()) && retry < numRetries) {
            LOG.info("Retrying operation on ZK with new Connection. " +
                "Retry no. " + retry);
            Thread.sleep(zkRetryInterval);
            createConnection();
            syncInternal(ke.getPath());
            continue;
          }
{code}
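
To illustrate why the {{continue}} matters, here is a minimal, self-contained sketch of a retry loop of this shape. It is not the ZKRMStateStore code: {{RetryWithContinueSketch}}, {{ConnectionLostException}} and {{demo}} are hypothetical names, and the reconnect and sync steps are reduced to a comment. The assumption is that the retry branch sits inside a catch block that ends by rethrowing, so without {{continue}} the exception would propagate even after a successful reconnect.

```java
import java.util.concurrent.Callable;

class RetryWithContinueSketch {
    /** Hypothetical stand-in for a ZooKeeper error that warrants a new connection. */
    static class ConnectionLostException extends Exception {}

    /** Runs op, retrying up to numRetries times when the connection is lost. */
    static <T> T runWithRetries(Callable<T> op, int numRetries) throws Exception {
        int retry = 0;
        while (true) {
            try {
                return op.call();
            } catch (ConnectionLostException e) {
                if (retry < numRetries) {
                    retry++;
                    // Placeholder for createConnection() and syncInternal(path).
                    continue; // re-run op; without this, control falls through to the throw
                }
                throw e; // retries exhausted
            }
        }
    }

    /** Simulates an operation that fails `failures` times before succeeding. */
    static String demo(int failures, int numRetries) {
        int[] calls = {0};
        try {
            return runWithRetries(() -> {
                if (calls[0]++ < failures) {
                    throw new ConnectionLostException();
                }
                return "ok";
            }, numRetries);
        } catch (Exception e) {
            return "failed";
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(2, 3)); // prints "ok": two failures, three retries allowed
        System.out.println(demo(5, 3)); // prints "failed": retries run out first
    }
}
```

If the {{continue}} line is removed from the sketch, {{demo(2, 3)}} returns "failed" as well, because the first exception is rethrown right after the reconnect step.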

> ZKRMStateStore.syncInternal should wait for zkResyncWaitTime instead of zkSessionTimeout
> ----------------------------------------------------------------------------------------
>
>                 Key: YARN-4348
>                 URL: https://issues.apache.org/jira/browse/YARN-4348
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.7.2, 2.6.2
>            Reporter: Tsuyoshi Ozawa
>            Assignee: Tsuyoshi Ozawa
>            Priority: Blocker
>         Attachments: YARN-4348-branch-2.7.002.patch, YARN-4348-branch-2.7.003.patch, YARN-4348-branch-2.7.004.patch, YARN-4348.001.patch, YARN-4348.001.patch, log.txt
>
>
> Jian mentioned that the current internal ZK configuration of ZKRMStateStore can cause the following situation:
> 1. syncInternal times out,
> 2. but the sync succeeds later on.
> We should use zkResyncWaitTime as the timeout value.
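
The situation described above can be reproduced in isolation with a rough sketch that uses only java.util.concurrent and no ZooKeeper classes. It assumes the sync completes asynchronously on another thread and the caller waits on a latch for a bounded time; {{SyncWaitSketch}} and {{syncCompletedWithin}} are hypothetical names, and the millisecond values are arbitrary rather than the real zkResyncWaitTime or zkSessionTimeout settings.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class SyncWaitSketch {
    /**
     * Simulates an asynchronous sync that completes after completionMs and
     * reports whether a caller waiting at most waitMs observed the completion.
     */
    static boolean syncCompletedWithin(long completionMs, long waitMs) {
        CountDownLatch done = new CountDownLatch(1);
        Thread callback = new Thread(() -> {
            try {
                Thread.sleep(completionMs); // the sync is still in flight
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            done.countDown(); // the sync succeeds, possibly after the caller gave up
        });
        callback.setDaemon(true);
        callback.start();
        try {
            // The bound passed here decides whether the caller sees a timeout.
            return done.await(waitMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(syncCompletedWithin(50, 2000)); // wait is long enough
        System.out.println(syncCompletedWithin(2000, 50)); // caller times out first
    }
}
```

In the second call the caller reports a timeout even though the simulated sync goes on to finish, which is the mismatch the choice of wait time is meant to avoid.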



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
