spark-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chester Chen <ches...@alpinenow.com>
Subject Re: Possible bug in ClientBase.scala?
Date Thu, 17 Jul 2014 00:18:26 GMT
Hi, Sandy

    We do have some issue with this. The difference is in Yarn-Alpha and
Yarn Stable ( I noticed that in the latest build, the module name has
changed,
     yarn-alpha --> yarn
     yarn --> yarn-stable
)

For example:  MRJobConfig.class
the field:
"DEFAULT_MAPREDUCE_APPLICATION_CLASSPATH"


In Yarn-Alpha : the field returns   java.lang.String[]

  java.lang.String[] DEFAULT_MAPREDUCE_APPLICATION_CLASSPATH;

while in Yarn-Stable, it returns a String

  java.lang.String DEFAULT_MAPREDUCE_APPLICATION_CLASSPATH;

So in ClientBaseSuite.scala

The following code:

    val knownDefMRAppCP: Seq[String] =
      getFieldValue[*String*, Seq[String]](classOf[MRJobConfig],

 "DEFAULT_MAPREDUCE_APPLICATION_CLASSPATH",
                                         Seq[String]())(a => *a.split(",")*)


works for yarn-stable, but doesn't work for yarn-alpha.

This is the only failure for the SNAPSHOT I downloaded 2 weeks ago.  I
believe this can be refactored to yarn-alpha module and make different
tests according different API signatures.

 I just update the master branch and build doesn't even compile for
Yarn-Alpha (yarn) model. Yarn-Stable compile with no error and test passed.


Does the Spark Jenkins job run against yarn-alpha ?





Here is output from yarn-alpha compilation:

I got the 40 compilation errors.

sbt/sbt clean yarn/test:compile

Using /Library/Java/JavaVirtualMachines/jdk1.7.0_51.jdk/Contents/Home as
default JAVA_HOME.

Note, this will be overridden by -java-home if it is set.

[info] Loading project definition from
/Users/chester/projects/spark/project/project

[info] Loading project definition from
/Users/chester/.sbt/0.13/staging/ec3aa8f39111944cc5f2/sbt-pom-reader/project

[warn] Multiple resolvers having different access mechanism configured with
same name 'sbt-plugin-releases'. To avoid conflict, Remove duplicate
project resolvers (`resolvers`) or rename publishing resolver (`publishTo`).

[info] Loading project definition from /Users/chester/projects/spark/project

NOTE: SPARK_HADOOP_VERSION is deprecated, please use
-Dhadoop.version=2.0.5-alpha

NOTE: SPARK_YARN is deprecated, please use -Pyarn flag.

[info] Set current project to spark-parent (in build
file:/Users/chester/projects/spark/)

[success] Total time: 0 s, completed Jul 16, 2014 5:13:06 PM

[info] Updating {file:/Users/chester/projects/spark/}core...

[info] Resolving org.fusesource.jansi#jansi;1.4 ...

[info] Done updating.

[info] Updating {file:/Users/chester/projects/spark/}yarn...

[info] Updating {file:/Users/chester/projects/spark/}yarn-stable...

[info] Resolving org.fusesource.jansi#jansi;1.4 ...

[info] Done updating.

[info] Resolving commons-net#commons-net;3.1 ...

[info] Compiling 358 Scala sources and 34 Java sources to
/Users/chester/projects/spark/core/target/scala-2.10/classes...

[info] Resolving org.fusesource.jansi#jansi;1.4 ...

[info] Done updating.

[warn]
/Users/chester/projects/spark/core/src/main/scala/org/apache/hadoop/mapred/SparkHadoopMapRedUtil.scala:43:
constructor TaskAttemptID in class TaskAttemptID is deprecated: see
corresponding Javadoc for more information.

[warn]     new TaskAttemptID(jtIdentifier, jobId, isMap, taskId, attemptId)

[warn]     ^

[warn]
/Users/chester/projects/spark/core/src/main/scala/org/apache/spark/SparkContext.scala:501:
constructor Job in class Job is deprecated: see corresponding Javadoc for
more information.

[warn]     val job = new NewHadoopJob(hadoopConfiguration)

[warn]               ^

[warn]
/Users/chester/projects/spark/core/src/main/scala/org/apache/spark/SparkContext.scala:634:
constructor Job in class Job is deprecated: see corresponding Javadoc for
more information.

[warn]     val job = new NewHadoopJob(conf)

[warn]               ^

[warn]
/Users/chester/projects/spark/core/src/main/scala/org/apache/spark/SparkHadoopWriter.scala:167:
constructor TaskID in class TaskID is deprecated: see corresponding Javadoc
for more information.

[warn]         new TaskAttemptID(new TaskID(jID.value, true, splitID),
attemptID))

[warn]                           ^

[warn]
/Users/chester/projects/spark/core/src/main/scala/org/apache/spark/SparkHadoopWriter.scala:188:
method makeQualified in class Path is deprecated: see corresponding Javadoc
for more information.

[warn]     outputPath.makeQualified(fs)

[warn]                ^

[warn]
/Users/chester/projects/spark/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala:84:
method isDir in class FileStatus is deprecated: see corresponding Javadoc
for more information.

[warn]     if (!fs.getFileStatus(path).isDir) {

[warn]                                 ^

[warn]
/Users/chester/projects/spark/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala:118:
method isDir in class FileStatus is deprecated: see corresponding Javadoc
for more information.

[warn]       val logDirs = if (logStatus != null)
logStatus.filter(_.isDir).toSeq else Seq[FileStatus]()

[warn]                                                               ^

[warn]
/Users/chester/projects/spark/core/src/main/scala/org/apache/spark/input/WholeTextFileInputFormat.scala:56:
method isDir in class FileStatus is deprecated: see corresponding Javadoc
for more information.

[warn]       if (file.isDir) 0L else file.getLen

[warn]                ^

[warn]
/Users/chester/projects/spark/core/src/main/scala/org/apache/spark/rdd/CheckpointRDD.scala:110:
method getDefaultReplication in class FileSystem is deprecated: see
corresponding Javadoc for more information.

[warn]       fs.create(tempOutputPath, false, bufferSize,
fs.getDefaultReplication, blockSize)

[warn]                                                       ^

[warn]
/Users/chester/projects/spark/core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala:267:
constructor TaskID in class TaskID is deprecated: see corresponding Javadoc
for more information.

[warn]     val taId = new TaskAttemptID(new TaskID(jobID, true, splitId),
attemptId)

[warn]                                  ^

[warn]
/Users/chester/projects/spark/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala:767:
constructor Job in class Job is deprecated: see corresponding Javadoc for
more information.

[warn]     val job = new NewAPIHadoopJob(hadoopConf)

[warn]               ^

[warn]
/Users/chester/projects/spark/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala:830:
constructor Job in class Job is deprecated: see corresponding Javadoc for
more information.

[warn]     val job = new NewAPIHadoopJob(hadoopConf)

[warn]               ^

[warn]
/Users/chester/projects/spark/core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala:185:
method isDir in class FileStatus is deprecated: see corresponding Javadoc
for more information.

[warn]           fileStatuses.filter(!_.isDir).map(_.getPath).toSeq

[warn]                                  ^

[warn]
/Users/chester/projects/spark/core/src/main/scala/org/apache/spark/scheduler/InputFormatInfo.scala:106:
constructor Job in class Job is deprecated: see corresponding Javadoc for
more information.

[warn]     val job = new Job(conf)

[warn]               ^

[warn] 14 warnings found

[warn] Note:
/Users/chester/projects/spark/core/src/main/java/org/apache/spark/api/java/JavaSparkContextVarargsWorkaround.java
uses unchecked or unsafe operations.

[warn] Note: Recompile with -Xlint:unchecked for details.

[info] Compiling 15 Scala sources to
/Users/chester/projects/spark/yarn/stable/target/scala-2.10/classes...

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:26:
object api is not a member of package org.apache.hadoop.yarn.client

[error] import org.apache.hadoop.yarn.client.api.YarnClient

[error]                                      ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:40:
not found: value YarnClient

[error]   val yarnClient = YarnClient.createYarnClient

[error]                    ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:32:
object api is not a member of package org.apache.hadoop.yarn.client

[error] import org.apache.hadoop.yarn.client.api.AMRMClient

[error]                                      ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:33:
object api is not a member of package org.apache.hadoop.yarn.client

[error] import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest

[error]                                      ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:36:
object util is not a member of package org.apache.hadoop.yarn.webapp

[error] import org.apache.hadoop.yarn.webapp.util.WebAppUtils

[error]                                      ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:63:
value RM_AM_MAX_ATTEMPTS is not a member of object
org.apache.hadoop.yarn.conf.YarnConfiguration

[error]     YarnConfiguration.RM_AM_MAX_ATTEMPTS,
YarnConfiguration.DEFAULT_RM_AM_MAX_ATTEMPTS)

[error]                       ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:65:
not found: type AMRMClient

[error]   private var amClient: AMRMClient[ContainerRequest] = _

[error]                         ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:91:
not found: value AMRMClient

[error]     amClient = AMRMClient.createAMRMClient()

[error]                ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:136:
not found: value WebAppUtils

[error]     val proxy = WebAppUtils.getProxyHostAndPort(conf)

[error]                 ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:40:
object api is not a member of package org.apache.hadoop.yarn.client

[error] import org.apache.hadoop.yarn.client.api.AMRMClient

[error]                                      ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:618:
not found: type AMRMClient

[error]       amClient: AMRMClient[ContainerRequest],

[error]                 ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:596:
not found: type AMRMClient

[error]       amClient: AMRMClient[ContainerRequest],

[error]                 ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:577:
not found: type AMRMClient

[error]       amClient: AMRMClient[ContainerRequest],

[error]                 ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:452:
value CONTAINER_ID is not a member of object
org.apache.hadoop.yarn.api.ApplicationConstants.Environment

[error]     val containerIdString = System.getenv(
ApplicationConstants.Environment.CONTAINER_ID.name())

[error]
        ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:128:
value setTokens is not a member of
org.apache.hadoop.yarn.api.records.ContainerLaunchContext

[error]     amContainer.setTokens(ByteBuffer.wrap(dob.getData()))

[error]                 ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ExecutorLauncher.scala:36:
object api is not a member of package org.apache.hadoop.yarn.client

[error] import org.apache.hadoop.yarn.client.api.AMRMClient

[error]                                      ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ExecutorLauncher.scala:37:
object api is not a member of package org.apache.hadoop.yarn.client

[error] import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest

[error]                                      ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ExecutorLauncher.scala:39:
object util is not a member of package org.apache.hadoop.yarn.webapp

[error] import org.apache.hadoop.yarn.webapp.util.WebAppUtils

[error]                                      ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ExecutorLauncher.scala:62:
not found: type AMRMClient

[error]   private var amClient: AMRMClient[ContainerRequest] = _

[error]                         ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ExecutorLauncher.scala:99:
not found: value AMRMClient

[error]     amClient = AMRMClient.createAMRMClient()

[error]                ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ExecutorLauncher.scala:158:
not found: value WebAppUtils

[error]     val proxy = WebAppUtils.getProxyHostAndPort(conf)

[error]                 ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnable.scala:31:
object ProtoUtils is not a member of package
org.apache.hadoop.yarn.api.records.impl.pb

[error] import org.apache.hadoop.yarn.api.records.impl.pb.ProtoUtils

[error]        ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnable.scala:33:
object api is not a member of package org.apache.hadoop.yarn.client

[error] import org.apache.hadoop.yarn.client.api.NMClient

[error]                                      ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnable.scala:53:
not found: type NMClient

[error]   var nmClient: NMClient = _

[error]                 ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnable.scala:59:
not found: value NMClient

[error]     nmClient = NMClient.createNMClient()

[error]                ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnable.scala:79:
value setTokens is not a member of
org.apache.hadoop.yarn.api.records.ContainerLaunchContext

[error]     ctx.setTokens(ByteBuffer.wrap(dob.getData()))

[error]         ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:35:
object ApplicationMasterProtocol is not a member of package
org.apache.hadoop.yarn.api

[error] import org.apache.hadoop.yarn.api.ApplicationMasterProtocol

[error]        ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:41:
object api is not a member of package org.apache.hadoop.yarn.client

[error] import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest

[error]                                      ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:65:
not found: type AMRMClient

[error]     val amClient: AMRMClient[ContainerRequest],

[error]                   ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:389:
not found: type ContainerRequest

[error]     ): ArrayBuffer[ContainerRequest] = {

[error]                    ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:388:
not found: type ContainerRequest

[error]       hostContainers: ArrayBuffer[ContainerRequest]

[error]                                   ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:405:
not found: type ContainerRequest

[error]     val requestedContainers = new
ArrayBuffer[ContainerRequest](rackToCounts.size)

[error]                                               ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:434:
not found: type ContainerRequest

[error]     val containerRequests: List[ContainerRequest] =

[error]                                 ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:508:
not found: type ContainerRequest

[error]     ): ArrayBuffer[ContainerRequest] = {

[error]                    ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:446:
not found: type ContainerRequest

[error]         val hostContainerRequests = new
ArrayBuffer[ContainerRequest](preferredHostToCount.size)

[error]                                                     ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:458:
not found: type ContainerRequest

[error]         val rackContainerRequests: List[ContainerRequest] =
createRackResourceRequests(

[error]                                         ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:467:
not found: type ContainerRequest

[error]         val containerRequestBuffer = new
ArrayBuffer[ContainerRequest](

[error]                                                      ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:542:
not found: type ContainerRequest

[error]     ): ArrayBuffer[ContainerRequest] = {

[error]                    ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:545:
value newInstance is not a member of object
org.apache.hadoop.yarn.api.records.Resource

[error]     val resource = Resource.newInstance(memoryRequest,
executorCores)

[error]                             ^

[error]
/Users/chester/projects/spark/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:550:
not found: type ContainerRequest

[error]     val requests = new ArrayBuffer[ContainerRequest]()

[error]                                    ^

[error] 40 errors found

[error] (yarn-stable/compile:compile) Compilation failed

[error] Total time: 98 s, completed Jul 16, 2014 5:14:44 PM













On Wed, Jul 16, 2014 at 4:19 PM, Sandy Ryza <sandy.ryza@cloudera.com> wrote:

> Hi Ron,
>
> I just checked and this bug is fixed in recent releases of Spark.
>
> -Sandy
>
>
> On Sun, Jul 13, 2014 at 8:15 PM, Chester Chen <chester@alpinenow.com>
> wrote:
>
>> Ron,
>>     Which distribution and Version of Hadoop are you using ?
>>
>>      I just looked at CDH5 (  hadoop-mapreduce-client-core-
>> 2.3.0-cdh5.0.0),
>>
>> MRJobConfig does have the field :
>>
>> java.lang.String DEFAULT_MAPREDUCE_APPLICATION_CLASSPATH;
>>
>> Chester
>>
>>
>>
>> On Sun, Jul 13, 2014 at 6:49 PM, Ron Gonzalez <zlgonzalez@yahoo.com>
>> wrote:
>>
>>> Hi,
>>>   I was doing programmatic submission of Spark yarn jobs and I saw code
>>> in ClientBase.getDefaultYarnApplicationClasspath():
>>>
>>> val field =
>>> classOf[MRJobConfig].getField("DEFAULT_YARN_APPLICATION_CLASSPATH)
>>> MRJobConfig doesn't have this field so the created launch env is
>>> incomplete. Workaround is to set yarn.application.classpath with the value
>>> from YarnConfiguration.DEFAULT_YARN_APPLICATION_CLASSPATH.
>>>
>>> This results in having the spark job hang if the submission config is
>>> different from the default config. For example, if my resource manager port
>>> is 8050 instead of 8030, then the spark app is not able to register itself
>>> and stays in ACCEPTED state.
>>>
>>> I can easily fix this by changing this to YarnConfiguration instead of
>>> MRJobConfig but was wondering what the steps are for submitting a fix.
>>>
>>> Thanks,
>>> Ron
>>>
>>> Sent from my iPhone
>>
>>
>>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message