hive-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-14608) LLAP: slow scheduling due to LlapTaskScheduler not removing nodes on kill
Date Thu, 08 Sep 2016 02:02:20 GMT

    [ https://issues.apache.org/jira/browse/HIVE-14608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15472433#comment-15472433 ]

Sergey Shelukhin commented on HIVE-14608:
-----------------------------------------

It mostly applies to reducers, actually. The main problem is as indicated in the description:
for whatever reason, we don't remove nodes from the node list when they die. The per-node
assignment explicitly checks the active set to get around that(?), but the other path doesn't,
so reducers with no location preference will potentially be sent to dead nodes.
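
To make the failure mode concrete, here is a minimal toy sketch of the two behaviours, not the actual LlapTaskScheduler code. Every name in it (ToyNode, ToyScheduler, pickAnyUnchecked, pickAnyChecked, workerNodeKilledButKept) is hypothetical, and the round-robin selection is an assumption made only for illustration; the real scheduler may pick nodes differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model; none of these names come from LlapTaskScheduler.
class ToyNode {
    final String id;
    boolean alive = true;

    ToyNode(String id) {
        this.id = id;
    }
}

class ToyScheduler {
    // Every registered node, keyed by worker identity, in insertion order.
    private final Map<String, ToyNode> nodes = new LinkedHashMap<>();
    private int nextIndex = 0;

    void workerNodeAdded(String id) {
        nodes.put(id, new ToyNode(id));
    }

    // Models the disabled removal: the node is marked dead but stays in the map.
    void workerNodeKilledButKept(String id) {
        ToyNode node = nodes.get(id);
        if (node != null) {
            node.alive = false;
        }
    }

    // Models the alternative: the node is dropped from the map on kill.
    void workerNodeRemoved(String id) {
        nodes.remove(id);
    }

    // No location preference, no liveness check: may return a dead node.
    String pickAnyUnchecked() {
        List<ToyNode> all = new ArrayList<>(nodes.values());
        if (all.isEmpty()) {
            return null;
        }
        return all.get(Math.floorMod(nextIndex++, all.size())).id;
    }

    // No location preference, but dead nodes are filtered out first.
    String pickAnyChecked() {
        List<ToyNode> live = new ArrayList<>();
        for (ToyNode node : nodes.values()) {
            if (node.alive) {
                live.add(node);
            }
        }
        if (live.isEmpty()) {
            return null;
        }
        return live.get(Math.floorMod(nextIndex++, live.size())).id;
    }
}
```

In this toy model, either removing the node on kill or checking liveness on the no-preference path keeps tasks away from dead nodes; which of the two is appropriate for the real scheduler is a separate question.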

> LLAP: slow scheduling due to LlapTaskScheduler not removing nodes on kill 
> --------------------------------------------------------------------------
>
>                 Key: HIVE-14608
>                 URL: https://issues.apache.org/jira/browse/HIVE-14608
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>            Priority: Critical
>         Attachments: HIVE-14608.patch
>
>
> See comments; this can result in a slowdown esp. if some critical task gets unlucky.
> {noformat}
>   public void workerNodeRemoved(ServiceInstance serviceInstance) {
>     // FIXME: disabling this for now
>     // instanceToNodeMap.remove(serviceInstance.getWorkerIdentity());
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
