hive-issues mailing list archives

From "Attila Magyar (Jira)" <j...@apache.org>
Subject [jira] [Assigned] (HIVE-23469) Use hostname + pod UID for shuffle manager caching
Date Thu, 14 May 2020 09:17:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-23469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Attila Magyar reassigned HIVE-23469:
------------------------------------


> Use hostname + pod UID for shuffle manager caching
> --------------------------------------------------
>
>                 Key: HIVE-23469
>                 URL: https://issues.apache.org/jira/browse/HIVE-23469
>             Project: Hive
>          Issue Type: Bug
>          Components: Tez
>            Reporter: Attila Magyar
>            Assignee: Attila Magyar
>            Priority: Major
>
> When a pod restarts, it comes back with the same hostname and shuffle port. Fetcher threads
that connect to download shuffle data therefore reuse the cached connection info. But when
the pod died, its shuffle data was cleaned up as well, so after the restart the daemon receives
requests from clients for specific shuffle data that it no longer has.
> In ShuffleManager.java, the key of knownSrcHosts should be changed to HostInfo, which is
a combination of host+port and the host's unique ID. The host's unique ID changes when a node
is killed or restarted.
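
The proposed key change can be illustrated with a minimal sketch. This is not the actual Hive/Tez code: the shape of HostInfo, its field names, the KnownSrcHostsSketch class, and the String standing in for cached connection info are all assumptions made for illustration. The only point it shows is that including the unique ID in equals/hashCode makes a restarted pod map to a new cache entry even though its host and port are unchanged.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical cache key: host + port + a unique ID (for example a pod UID)
// that is assumed to change whenever the node is killed or restarted.
final class HostInfo {
    private final String host;
    private final int port;
    private final String uniqueId;

    HostInfo(String host, int port, String uniqueId) {
        this.host = Objects.requireNonNull(host);
        this.port = port;
        this.uniqueId = Objects.requireNonNull(uniqueId);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof HostInfo)) {
            return false;
        }
        HostInfo other = (HostInfo) o;
        // All three parts must match, so a restarted pod is a different key.
        return port == other.port
                && host.equals(other.host)
                && uniqueId.equals(other.uniqueId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(host, port, uniqueId);
    }

    @Override
    public String toString() {
        return host + ":" + port + "/" + uniqueId;
    }
}

// Stand-in for a knownSrcHosts-style map; the String value is a placeholder
// for whatever per-host connection state the real cache holds.
final class KnownSrcHostsSketch {
    private final Map<HostInfo, String> knownSrcHosts = new ConcurrentHashMap<>();

    String connectionFor(HostInfo key) {
        // Reuse the cached entry only when host, port, and unique ID all match.
        return knownSrcHosts.computeIfAbsent(key, k -> "connection-to-" + k);
    }

    int size() {
        return knownSrcHosts.size();
    }
}
```

With a key like this, two lookups for the same host and port but different unique IDs produce two separate entries, so a fetcher would not reuse state cached for the pod instance that died. Evicting the stale entry is a separate concern that this sketch does not address.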



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
