hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jian He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2716) Refactor ZKRMStateStore retry code with Apache Curator
Date Fri, 05 Jun 2015 01:25:38 GMT

    [ https://issues.apache.org/jira/browse/YARN-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573911#comment-14573911

Jian He commented on YARN-2716:

Thanks Karthik for working this ! This simplifies things a lot. 
Mostly good, few comments and questions:
- these two booleans not used, maybe removed.
{{private boolean create = false, delete = false;
- is this going to be done in this jira?
{code} // TODO: Check deleting appIdRemovePath works recursively
- will the safeDelete throw noNodeExist exception if deleting a non-existing zone?
- {{new RetryNTimes(numRetries, zkSessionTimeout / numRetries));}},  I think the second parameter
should be zkRetryInterval; Also, I have a question why in HA case, zkRetryInterval is calculated
as below
if (HAUtil.isHAEnabled(conf)) {
  zkRetryInterval = zkSessionTimeout / numRetries;

- I found this [thread|http://mail-archives.apache.org/mod_mbox/curator-user/201410.mbox/%3CD076BC8E.9EF1%25sreichling@chegg.com%3E]
saying that blockUntilConnect is not needed to call; Suppose it’s needed, I think the zkSessionTimeout
value is too small, it would be numRetries*numRetryInterval, otherwise RM will exit soon after
retry 10s by default.
if (!curatorFramework.blockUntilConnected(
        zkSessionTimeout, TimeUnit.MILLISECONDS)) {
      LOG.fatal("Couldn't establish connection to ZK server");
      throw new YarnRuntimeException("Couldn't connect to ZK server");
- remove this ?
//      @Override
//      public ZooKeeper getNewZooKeeper() throws IOException {
//        return client;
//      }
-  I think testZKSessionTimeout may be removed too ? it looks like a test for curator 

> Refactor ZKRMStateStore retry code with Apache Curator
> ------------------------------------------------------
>                 Key: YARN-2716
>                 URL: https://issues.apache.org/jira/browse/YARN-2716
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Jian He
>            Assignee: Karthik Kambatla
>         Attachments: yarn-2716-1.patch, yarn-2716-prelim.patch, yarn-2716-prelim.patch,
> Per suggestion by [~kasha] in YARN-2131,  it's nice to use curator to simplify the retry
logic in ZKRMStateStore.

This message was sent by Atlassian JIRA

View raw message