spark-issues mailing list archives

From "Ran Tao (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SPARK-23002) SparkUI inconsistent driver hostname compare with other executors
Date Wed, 10 Jan 2018 03:44:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-23002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16319653#comment-16319653
] 

Ran Tao edited comment on SPARK-23002 at 1/10/18 3:43 AM:
----------------------------------------------------------

I found that in YARN mode, the driver hostname and the executor container hostname
are obtained in different ways.
For the AM, the code looks like this:

{code:java}
// [yarn] ApplicationMaster.scala
private def runExecutorLauncher(securityMgr: SecurityManager): Unit = {
  val port = sparkConf.get(AM_PORT)
  rpcEnv = RpcEnv.create("sparkYarnAM", Utils.localHostName, port, sparkConf, securityMgr,
    clientMode = true)
  val driverRef = waitForSparkDriver()
  addAmIpFilter()
  registerAM(sparkConf, rpcEnv, driverRef, sparkConf.getOption("spark.driver.appUIAddress"),
    securityMgr)

  // In client mode the actor will stop the reporter thread.
  reporterThread.join()
  ....
}
{code}

The key point is the *Utils.localHostName* method.

{code:java}
// [core] Utils.scala
/**
 * Get the local machine's hostname.
 */
def localHostName(): String = {
  customHostname.getOrElse(localIpAddress.getHostAddress)
}
{code}

The method's comment says it returns the hostname, but it actually returns the IP address, so the driver name shown in the UI is an IP address.
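To make the difference concrete, here is a minimal sketch using only java.net.InetAddress. The object name, the host name node-1.example.com and the address 10.0.0.1 are made up for illustration; they are not taken from Spark or from the cluster in the screenshot.

```scala
import java.net.InetAddress

object HostNameVsAddress {
  // Returns (textual IP address, host name) for the given address.
  def describe(addr: InetAddress): (String, String) =
    (addr.getHostAddress, addr.getHostName)

  def main(args: Array[String]): Unit = {
    // Build the address with an explicit host name so no DNS lookup is needed.
    // Both the name and the IP are illustrative values.
    val addr = InetAddress.getByAddress("node-1.example.com", Array[Byte](10, 0, 0, 1))
    val (ip, host) = describe(addr)
    println(s"getHostAddress = $ip")   // the IP form, which is what localHostName falls back to
    println(s"getHostName    = $host") // the name form, which is what the executors report
  }
}
```

So when no custom hostname is set, anything built on getHostAddress shows up as an IP, which matches the driver row in the UI.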

{code:java}
// [yarn] YarnAllocator.scala
private def runAllocatedContainers(containersToUse: ArrayBuffer[Container]): Unit = {
  for (container <- containersToUse) {
    executorIdCounter += 1
    val executorHostname = container.getNodeId.getHost
    ...
  }
}
{code}

The executor containers, however, do not use this method; their hostname is reported from the container.
YarnAllocator takes the executor hostname from the container and passes it to ExecutorRunnable
and then to CoarseGrainedExecutorBackend,
which finally reports it to the AM over Netty. The AM, through ExecutorsListener.StorageStatus.BlockManagerId,
sets the hostname in the BlockManagerId,
so the Spark UI can get the executor hostPort from the BlockManagerId.

So a simple way to fix this issue is to change Utils.scala so that localHostName returns the
hostname. But I agree with [~srowen] that it does seem to use the IP address for some
reason. So, to stay compatible, we can either add a method that returns the driver hostname or, as a
third way, distinguish the driver from the executors when hostPort is used in the ExecutorPage of the UI
and resolve the driver hostname correctly there.




> SparkUI inconsistent driver hostname compare with other executors
> -----------------------------------------------------------------
>
>                 Key: SPARK-23002
>                 URL: https://issues.apache.org/jira/browse/SPARK-23002
>             Project: Spark
>          Issue Type: Bug
>          Components: Web UI
>    Affects Versions: 2.2.0
>            Reporter: Ran Tao
>            Priority: Minor
>
> As the picture shows, the driver name is an IP address, while the other executors show machine hostnames.
> !https://raw.githubusercontent.com/Lemonjing/issues-assets/master/pics/driver.png!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org