hive-issues mailing list archives

From "Prasanth Jayachandran (Jira)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-23500) [Kubernetes] Use Extend NodeId for LLAP registration
Date Tue, 19 May 2020 08:11:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-23500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17110955#comment-17110955 ]

Prasanth Jayachandran commented on HIVE-23500:
----------------------------------------------

Is this the same as HIVE-23466?

> [Kubernetes] Use Extend NodeId for LLAP registration
> ----------------------------------------------------
>
>                 Key: HIVE-23500
>                 URL: https://issues.apache.org/jira/browse/HIVE-23500
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>            Reporter: Attila Magyar
>            Assignee: Attila Magyar
>            Priority: Major
>             Fix For: 4.0.0
>
>
> In a Kubernetes environment, where pods can have the same hostname and port, node trackers
> can end up retaining an old instance of a pod in their cache. In Hive LLAP, the LLAP Tez task
> scheduler maintains node membership based on ZooKeeper registry events, so a NODE_ADDED event
> followed by a NODE_REMOVED event can remove the node/host from the node trackers because the
> hostname and service port are stable. The NODE_REMOVED event in this case is an old, stale
> event for the already dead pod, but ZK sends it only after the session timeout (in the case
> of a non-graceful shutdown). If this sequence of events happens, a node/host is completely
> lost from the scheduler's perspective.
> To support this scenario, Tez can extend YARN's NodeId to include a uniqueIdentifier. The LLAP
> task scheduler can then construct the container object with this new NodeId, which includes
> the uniqueIdentifier, so that stale events like the one above only remove the host/node that
> matches the old uniqueIdentifier.
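
The proposal in the description can be illustrated with a minimal sketch. It is not the actual Tez or Hive change: UniqueNodeId and NodeTracker are hypothetical stand-ins written for this example (they do not extend YARN's real NodeId class), and the host name, port, and identifier values are made up. The sketch only shows the idea that a removal event is honored when its uniqueIdentifier matches the instance currently tracked for that host and port.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical stand-in for an extended NodeId: host and port plus a per-instance identifier. */
final class UniqueNodeId {
    final String host;
    final int port;
    final String uniqueIdentifier; // assumed to differ for every pod instance

    UniqueNodeId(String host, int port, String uniqueIdentifier) {
        this.host = host;
        this.port = port;
        this.uniqueIdentifier = uniqueIdentifier;
    }

    /** The stable address that old and new pods may share. */
    String address() {
        return host + ":" + port;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof UniqueNodeId)) return false;
        UniqueNodeId other = (UniqueNodeId) o;
        return port == other.port
                && host.equals(other.host)
                && uniqueIdentifier.equals(other.uniqueIdentifier);
    }

    @Override
    public int hashCode() {
        return Objects.hash(host, port, uniqueIdentifier);
    }
}

/** Hypothetical node tracker keyed by host:port, mirroring the scenario in the description. */
final class NodeTracker {
    private final Map<String, UniqueNodeId> nodes = new HashMap<>();

    /** NODE_ADDED: the newest instance replaces any older one at the same address. */
    void nodeAdded(UniqueNodeId id) {
        nodes.put(id.address(), id);
    }

    /** NODE_REMOVED: removes the entry only if it is the same instance; returns true if removed. */
    boolean nodeRemoved(UniqueNodeId id) {
        return nodes.remove(id.address(), id);
    }

    boolean isTracked(String host, int port) {
        return nodes.containsKey(host + ":" + port);
    }
}

class StaleEventDemo {
    public static void main(String[] args) {
        NodeTracker tracker = new NodeTracker();
        UniqueNodeId oldPod = new UniqueNodeId("llap-0", 15001, "pod-a");
        UniqueNodeId newPod = new UniqueNodeId("llap-0", 15001, "pod-b");

        tracker.nodeAdded(oldPod);  // original pod registers
        tracker.nodeAdded(newPod);  // replacement pod registers with the same host and port
        boolean removed = tracker.nodeRemoved(oldPod); // stale event arrives after session timeout

        System.out.println("stale removal applied: " + removed);
        System.out.println("node still tracked: " + tracker.isTracked("llap-0", 15001));
    }
}
```

With a tracker keyed only by host and port, the stale removal in this sequence would have dropped the live replacement pod. Comparing the full identifier on removal is one way to avoid that; the real change may differ in where the identifier is generated and how it is propagated.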



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
