gobblin-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Work logged] (GOBBLIN-762) Add automatic scaling for Gobblin on YARN
Date Wed, 08 May 2019 20:19:00 GMT

     [ https://issues.apache.org/jira/browse/GOBBLIN-762?focusedWorklogId=239457&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239457 ]

ASF GitHub Bot logged work on GOBBLIN-762:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/May/19 20:18
            Start Date: 08/May/19 20:18
    Worklog Time Spent: 10m 
      Work Description: htran1 commented on pull request #2626: [GOBBLIN-762] Add automatic scaling for Gobblin on YARN
URL: https://github.com/apache/incubator-gobblin/pull/2626#discussion_r281867296
 
 

 ##########
 File path: gobblin-yarn/src/main/java/org/apache/gobblin/yarn/YarnService.java
 ##########
 @@ -468,6 +562,13 @@ private void handleContainerCompletion(ContainerStatus containerStatus) {
           containerStatus.getContainerId(), containerStatus.getDiagnostics()));
     }
 
+    if (this.releasedContainerSet.contains(containerStatus.getContainerId())) {
+      LOGGER.info("Container release requested, so not spawning a replacement for containerId {}",
+          containerStatus.getContainerId());
+      this.releasedContainerSet.remove(containerStatus.getContainerId());
 
 Review comment:
   The existing code assumes `handleContainerCompletion` is called only once per completion, since the code
   ```
   Map.Entry<Container, String> completedContainerEntry = this.containerMap.remove(containerStatus.getContainerId());
   String completedInstanceName = completedContainerEntry.getValue();
   ```
   would otherwise hit an NPE: on a second call `remove` returns `null`. I can't find in the documentation what the guarantee is on this.
   
   I changed the code to store the released container in a cache with a TTL, so I removed the `remove` call. If `handleContainerCompletion` does get called multiple times, though, that is an existing bug that needs to be fixed.
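
   To make the two points above concrete, here is a minimal sketch of a TTL-based record of released containers and a completion path that tolerates duplicate calls. It is not the code in the PR: `ReleasedContainerTracker`, `CompletionHandlerSketch`, the string container IDs, and the injected clock are all hypothetical names chosen for illustration, and it uses only the JDK instead of whatever cache the PR uses.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical helper: remembers released container IDs until a TTL elapses. */
final class ReleasedContainerTracker {
  private final Map<String, Long> expiryByContainerId = new ConcurrentHashMap<>();
  private final long ttlMillis;
  private final LongSupplier clock; // injected so expiry can be exercised in tests

  ReleasedContainerTracker(long ttlMillis, LongSupplier clock) {
    this.ttlMillis = ttlMillis;
    this.clock = clock;
  }

  /** Records that a release was requested for this container. */
  void markReleased(String containerId) {
    expiryByContainerId.put(containerId, clock.getAsLong() + ttlMillis);
  }

  /** True while the entry has not expired; lookups do not consume the entry. */
  boolean wasReleased(String containerId) {
    Long expiry = expiryByContainerId.get(containerId);
    if (expiry == null) {
      return false;
    }
    if (clock.getAsLong() >= expiry) {
      expiryByContainerId.remove(containerId, expiry); // lazily drop the expired entry
      return false;
    }
    return true;
  }
}

/** Hypothetical stand-in for the completion path discussed in the comment. */
final class CompletionHandlerSketch {
  private final Map<String, String> instanceNameByContainerId = new ConcurrentHashMap<>();
  private final ReleasedContainerTracker released;

  CompletionHandlerSketch(ReleasedContainerTracker released) {
    this.released = released;
  }

  void register(String containerId, String instanceName) {
    instanceNameByContainerId.put(containerId, instanceName);
  }

  /** Returns the instance name that needs a replacement container, if any. */
  Optional<String> handleContainerCompletion(String containerId) {
    String instanceName = instanceNameByContainerId.remove(containerId);
    if (instanceName == null) {
      return Optional.empty(); // duplicate or unknown completion: nothing to dereference
    }
    if (released.wasReleased(containerId)) {
      return Optional.empty(); // release was requested, so no replacement is spawned
    }
    return Optional.of(instanceName);
  }
}
```

   Because the lookup does not remove the entry, a repeated completion callback for the same released container gets the same answer until the TTL expires, and the null check keeps a duplicate callback from dereferencing a missing map entry.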
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 239457)

> Add automatic scaling for Gobblin on YARN
> -----------------------------------------
>
>                 Key: GOBBLIN-762
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-762
>             Project: Apache Gobblin
>          Issue Type: Task
>            Reporter: Hung Tran
>            Priority: Major
>          Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Gobblin on YARN needs a way to scale the number of containers up and down based on the workload.
> Added `YarnAutoScalingManager`, which the `GobblinApplicationMaster` can start when the class is set in the `gobblin.yarn.app.master.serviceClasses` configuration. This class runs a scheduled task, with a default interval of 60 seconds, that detects the number of partitions required by the workflows submitted to Helix. It then requests the `YarnService` to scale to a computed number of containers. If the requested count is higher than the number the `YarnService` has previously requested, it requests more containers. If the requested count is lower than the current number of allocated containers, it frees any unused containers.
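
The scale-up and scale-down rules in the description can be sketched roughly as follows. This is an illustration only, not the `YarnService` implementation: `ScalingDecisionSketch`, `Decision`, `decide`, and the parameter names are hypothetical, and capping the number of released containers at the excess over the target is an assumption the description does not state.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of one scaling pass; not the YarnService code. */
final class ScalingDecisionSketch {

  /** Outcome of a scaling pass: how many containers to request and which to release. */
  static final class Decision {
    final int containersToRequest;
    final List<String> containersToRelease;

    Decision(int containersToRequest, List<String> containersToRelease) {
      this.containersToRequest = containersToRequest;
      this.containersToRelease = containersToRelease;
    }
  }

  static Decision decide(int targetCount, int previouslyRequested, int allocated,
      List<String> unusedContainerIds) {
    // Scale up: request only the shortfall against what was already requested.
    int toRequest = Math.max(0, targetCount - previouslyRequested);

    // Scale down: release unused containers only; busy containers are left alone.
    List<String> toRelease = new ArrayList<>();
    if (targetCount < allocated) {
      int excess = allocated - targetCount; // assumption: never release more than the excess
      for (String containerId : unusedContainerIds) {
        if (toRelease.size() >= excess) {
          break;
        }
        toRelease.add(containerId);
      }
    }
    return new Decision(toRequest, toRelease);
  }
}
```

For example, with a target of 2, 4 containers allocated, and 3 of them unused, this sketch requests nothing and releases the first 2 unused containers.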



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
