hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hitesh Shah (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-353) Add Zookeeper-based store implementation for RMStateStore
Date Tue, 20 Aug 2013 22:09:52 GMT

    [ https://issues.apache.org/jira/browse/YARN-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13745516#comment-13745516
] 

Hitesh Shah commented on YARN-353:
----------------------------------

Sorry for the delay in the review. Been sidetracked by other work. 

Comments:

FS State store uses "fs.rm-state-store" where as ZK uses "zk.state-store" - this is inconsistent.

Also, the variable names seem a bit inconsistent - should they be RM_ZK_STATE_STORE* as compared
to ZK_RM_STATE_STORE* to match the actual property names? Though the property name itself
has RM defined twice instead of just once.

Use of "This must be supplied when using org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore"
in yarn-default.xml - is the whole class name required? If yes, there are 1-2 properties which
do not use the full class name. 

LOG.debug statement not encapsulated within if isDebugEnabled() in RMStateStore.java.

{code}
+    } catch (Exception e) {
+      throw e;
+    }
{code}
   - Could you add comments so that this piece of code is removed when HA handling work is
done. 

{code}
+      try {
+        zkClient = getNewZooKeeper();
+      } catch (Exception e) {
+        if (LOG.isDebugEnabled()) {
+          LOG.debug("Failed to connect to the ZooKeeper on attempt - " +
+              (retries + 1));
+        }
+      }
{code}
  - Why are all exceptions being caught instead of an explicit set? If there is a good reason
for all exceptions, why not catch Throwable to capture everything?
  - how is a failure in closing of zk connections meant to be handled in the createConnection
function?
  - The reason for why a connection to ZK failed is never logged. The only message logged
is failed to connect on attempt X and that too only in debug level.

It seems like "throws Exception" is being used in most places as compared to known types?

The tests for basic zk connect/CRUD probably belong in a separate file.

{code}
+      Assert.assertTrue(e.getMessage().contains(
+        "not of expected form scheme:id:perm"));
{code}
   - where is this message being produced? is the RM code validating the format or ZK, and
if ZK, should we be testing this in the first place? Assuming the test is to validate that
we are using the configured value properly, how about a test for a diff perm from the default
and checking that the zkNode has that permission set.









                
> Add Zookeeper-based store implementation for RMStateStore
> ---------------------------------------------------------
>
>                 Key: YARN-353
>                 URL: https://issues.apache.org/jira/browse/YARN-353
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Hitesh Shah
>            Assignee: Karthik Kambatla
>         Attachments: YARN-353.10.patch, YARN-353.11.patch, YARN-353.12.patch, yarn-353-12-wip.patch,
YARN-353.13.patch, YARN-353.1.patch, YARN-353.2.patch, YARN-353.3.patch, YARN-353.4.patch,
YARN-353.5.patch, YARN-353.6.patch, YARN-353.7.patch, YARN-353.8.patch, YARN-353.9.patch
>
>
> Add store that write RM state data to ZK

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message