hadoop-yarn-dev mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] (YARN-6136) Registry should avoid scanning whole ZK tree for every container/application finish
Date Tue, 31 Jan 2017 21:53:51 GMT
Wangda Tan created YARN-6136:
--------------------------------

             Summary: Registry should avoid scanning whole ZK tree for every container/application finish
                 Key: YARN-6136
                 URL: https://issues.apache.org/jira/browse/YARN-6136
             Project: Hadoop YARN
          Issue Type: Sub-task
            Reporter: Wangda Tan
            Assignee: Wangda Tan
            Priority: Critical


In the existing registry service implementation, the purge operation is triggered by every container finish event:

{code}
  public void onContainerFinished(ContainerId id) throws IOException {
    LOG.info("Container {} finished, purging container-level records",
        id);
    purgeRecordsAsync("/",
        id.toString(),
        PersistencePolicies.CONTAINER);
  }
{code} 
Since this happens on every container finish, it essentially scans all (or almost all) ZK nodes
from the root.
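
To make the cost concrete, here is a minimal sketch (not the actual registry code; the class and method names are made up for illustration) of a recursive walk over a ZK subtree using the plain ZooKeeper client. The number of read operations grows with the number of nodes under the starting path, so rooting the purge at "/" makes every container finish pay for the whole tree:

{code}
import java.util.List;

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;

/**
 * Hypothetical sketch: counts the ZK read operations needed to visit every
 * node under a given path, the way a purge rooted at that path must.
 */
public class FullTreeScanSketch {
  private final ZooKeeper zk;

  public FullTreeScanSketch(ZooKeeper zk) {
    this.zk = zk;
  }

  /** Returns the number of getChildren() calls issued for the subtree. */
  public int walk(String path) throws KeeperException, InterruptedException {
    int ops = 1;                                  // this node's getChildren()
    List<String> children = zk.getChildren(path, false);
    for (String child : children) {
      String childPath = "/".equals(path) ? "/" + child : path + "/" + child;
      ops += walk(childPath);                     // recurse into every child
    }
    return ops;
  }
}
{code}

With 20K+ ZK nodes under the root, each container finish would issue on the order of 20K read operations (plus a status object per node), regardless of how many registry records actually match the container.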

We have a cluster with hundreds of ZK nodes for the service registry and 20K+ ZK nodes for other
purposes. The existing implementation can generate a massive number of ZK operations, as well as
internal Java objects (RegistryPathStatus). The RM becomes very unstable when there are batches of
container finish events, because of full GC pauses and ZK connection failures.
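
One possible direction, sketched below only for illustration (it is not necessarily the change this JIRA will make, and {{registryBasePath}} is an assumed field rather than existing code), is to root the purge at the registry's own base path instead of the ZK root, so unrelated subtrees are never visited:

{code}
  public void onContainerFinished(ContainerId id) throws IOException {
    LOG.info("Container {} finished, purging container-level records",
        id);
    // Assumed for illustration: registryBasePath points at the registry's own
    // subtree (e.g. /registry), so the async purge never touches the 20K+
    // unrelated ZK nodes.
    purgeRecordsAsync(registryBasePath,
        id.toString(),
        PersistencePolicies.CONTAINER);
  }
{code}

Narrowing further to the exact application/container path would be better still, but any starting point below "/" already avoids the full-tree scan.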



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-dev-help@hadoop.apache.org

