From "jiayuhan-it (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-7017) Too many times of meaningless invocation in TaskAttemptImpl#resolveHosts
Date Mon, 04 Dec 2017 11:25:00 GMT

Description:   MRAppMaster uses {code}TaskAttemptImpl::resolveHosts{code} to determine
the dataLocalHosts for each task when the split locations are given as IP addresses. This
results in a large number of calls (taskNum * dfsReplication) to InetAddress::getByName, and
most of those calls are redundant.  When the job has a large number of tasks and DNS
resolution is slow, this stage takes a long time before the job starts
running.  (was:   MRAppMaster uses {code}TaskAttemptImpl::resolveHosts{code} \to determine
the dataLocalHosts for each task when the location of data split is IP, which will call a
lot of times ( taskNum * dfsReplication) of function InetAddress::getByName and most of the
function calls are redundant.  When the job has a great number of tasks and the speed of
DNS resolution is not fast enough, it will take a lot of time at this stage before the job
running.)
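
The redundancy described above could be avoided by remembering each address
the first time it is resolved. The following is only a minimal sketch of that
idea, not the change attached to this issue: the class CachedHostResolver and
its methods are hypothetical names and are not part of Hadoop. The lookup is
passed in as a function so that the demo can count calls without touching DNS;
dnsLookup shows what a real lookup based on InetAddress::getByName might look
like.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical helper (not a Hadoop class): memoizes address-to-hostname lookups. */
public class CachedHostResolver {
    // One entry per distinct address; safe for concurrent callers.
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> lookup;

    public CachedHostResolver(Function<String, String> lookup) {
        this.lookup = lookup;
    }

    /** Example real lookup: resolve via DNS, falling back to the input on failure. */
    public static String dnsLookup(String src) {
        try {
            return InetAddress.getByName(src).getHostName();
        } catch (UnknownHostException e) {
            return src;
        }
    }

    /** Resolves one address, invoking the underlying lookup only on a cache miss. */
    public String resolve(String src) {
        return cache.computeIfAbsent(src, lookup);
    }

    /** Resolves every location of a split through the shared cache. */
    public String[] resolveAll(String[] locations) {
        String[] hosts = new String[locations.length];
        for (int i = 0; i < locations.length; i++) {
            hosts[i] = resolve(locations[i]);
        }
        return hosts;
    }

    public static void main(String[] args) {
        // Assumed workload: 1000 tasks, 3 replicas each, data spread over 5 nodes.
        String[] nodes = {"10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4", "10.0.0.5"};
        AtomicInteger lookups = new AtomicInteger();
        CachedHostResolver resolver = new CachedHostResolver(ip -> {
            lookups.incrementAndGet();   // stands in for a slow DNS query
            return "host-" + ip;
        });

        int calls = 0;
        for (int task = 0; task < 1000; task++) {
            for (int replica = 0; replica < 3; replica++) {
                resolver.resolve(nodes[(task + replica) % nodes.length]);
                calls++;
            }
        }
        // prints "3000 resolve calls, 5 lookups"
        System.out.println(calls + " resolve calls, " + lookups.get() + " lookups");
    }
}
```

With a cache like this, the number of DNS queries is bounded by the number of
distinct addresses rather than by taskNum * dfsReplication. To use real DNS,
the resolver could be constructed with CachedHostResolver::dnsLookup. A cache
of this kind never expires its entries, which is assumed to be acceptable for
the lifetime of a single job.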

