From "jiayuhan-it (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-7017) Too many times of meaningless invocation in TaskAttemptImpl#resolveHosts
Date Mon, 04 Dec 2017 11:25:00 GMT

[ https://issues.apache.org/jira/browse/MAPREDUCE-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

jiayuhan-it updated MAPREDUCE-7017:
-----------------------------------
Description:   MRAppMaster uses {code}TaskAttemptImpl::resolveHosts{code} to determine
the dataLocalHosts for each task when the data split locations are given as IP addresses.
This calls InetAddress::getByName many times (taskNum * dfsReplication), and most of those
calls are redundant.  When the job has a large number of tasks and DNS resolution is slow,
this stage takes a long time before the job starts running.  (was:   MRAppMaster uses
{code}TaskAttemptImpl::resolveHosts{code} \to determine the dataLocalHosts for each task
when the location of data split is IP, which will call a lot of times ( taskNum *
dfsReplication) of function InetAddress::getByName and most of the function calls are
redundant.  When the job has a great number of tasks and the speed of DNS resolution is not
fast enough, it will take a lot of time at this stage before the job running.)

> Too many times of meaningless invocation in TaskAttemptImpl#resolveHosts
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7017
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7017
>          Issue Type: Improvement
>          Components: mr-am
>    Affects Versions: 3.0.0-alpha4
>            Reporter: jiayuhan-it
>
>   MRAppMaster uses {code}TaskAttemptImpl::resolveHosts{code} to determine the dataLocalHosts
> for each task when the data split locations are given as IP addresses. This calls
> InetAddress::getByName many times (taskNum * dfsReplication), and most of those calls are
> redundant. When the job has a large number of tasks and DNS resolution is slow, this stage
> takes a long time before the job starts running.
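
One way to picture the redundancy described above: the same few DataNode addresses recur across the
taskNum * dfsReplication lookups, so remembering each answer would leave one lookup per distinct
address. The following is only a minimal sketch of that idea, not the actual Hadoop code or a
proposed patch. The class CachedHostResolver and its methods are hypothetical names introduced
for illustration; the only real API assumed is java.net.InetAddress.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical helper (not part of Hadoop): memoizes address -> host name lookups. */
final class CachedHostResolver {
    // One entry per distinct address; shared by all tasks of the job.
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // The lookup is injected so the cache can be exercised without real DNS.
    private final Function<String, String> lookup;

    CachedHostResolver(Function<String, String> lookup) {
        this.lookup = lookup;
    }

    /** Assumed default lookup: resolve via InetAddress, fall back to the input on failure. */
    static String dnsLookup(String address) {
        try {
            return InetAddress.getByName(address).getHostName();
        } catch (UnknownHostException e) {
            return address;
        }
    }

    /** Runs the lookup only the first time a given address is seen. */
    String resolve(String address) {
        return cache.computeIfAbsent(address, lookup);
    }

    /** Resolves one split's locations, preserving order and dropping duplicates. */
    Set<String> resolveAll(String[] addresses) {
        Set<String> hosts = new LinkedHashSet<>();
        for (String address : addresses) {
            hosts.add(resolve(address));
        }
        return hosts;
    }
}
```

With a single shared instance, e.g. new CachedHostResolver(CachedHostResolver::dnsLookup), the
number of DNS lookups is bounded by the number of distinct addresses rather than by
taskNum * dfsReplication. Whether cached entries may go stale during a long-running job is a
design question this sketch does not address.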

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
