spark-issues mailing list archives

From "liyunzhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-22660) Use position() and limit() to fix ambiguity issue in scala-2.12
Date Thu, 14 Dec 2017 08:19:01 GMT

    [ https://issues.apache.org/jira/browse/SPARK-22660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290518#comment-16290518 ]

liyunzhang commented on SPARK-22660:
------------------------------------

[~srowen]: there is another modification involving limit() in TaskSetManager.scala. Sorry for not including it in the last commit.
{code}
            abort(s"$msg Exception during serialization: $e")
            throw new TaskNotSerializableException(e)
        }
-        if (serializedTask.limit > TaskSetManager.TASK_SIZE_TO_WARN_KB * 1024 &&
+        if (serializedTask.limit() > TaskSetManager.TASK_SIZE_TO_WARN_KB * 1024 &&
          !emittedTaskSizeWarning) {
          emittedTaskSizeWarning = true
          logWarning(s"Stage ${task.stageId} contains a task of very large size " +
-            s"(${serializedTask.limit / 1024} KB). The maximum recommended task size is " +
+            s"(${serializedTask.limit() / 1024} KB). The maximum recommended task size is " +
            s"${TaskSetManager.TASK_SIZE_TO_WARN_KB} KB.")
        }
        addRunningTask(taskId)
@@ -502,7 +502,7 @@ private[spark] class TaskSetManager(
        // val timeTaken = clock.getTime() - startTime
        val taskName = s"task ${info.id} in stage ${taskSet.id}"
        logInfo(s"Starting $taskName (TID $taskId, $host, executor ${info.executorId}, " +
-          s"partition ${task.partitionId}, $taskLocality, ${serializedTask.limit} bytes)")
+          s"partition ${task.partitionId}, $taskLocality, ${serializedTask.limit()} bytes)")

        sched.dagScheduler.taskStarted(task, info)
        new TaskDescription(
{code}

Can you help review? 

> Use position() and limit() to fix ambiguity issue in scala-2.12
> ---------------------------------------------------------------
>
>                 Key: SPARK-22660
>                 URL: https://issues.apache.org/jira/browse/SPARK-22660
>             Project: Spark
>          Issue Type: Improvement
>          Components: Build
>    Affects Versions: 2.2.0
>            Reporter: liyunzhang
>            Assignee: liyunzhang
>            Priority: Minor
>             Fix For: 2.3.0
>
>
> Build with Scala 2.12 using the following steps:
> 1. Change the pom.xml to Scala 2.12:
>  ./dev/change-scala-version.sh 2.12
> 2. Build with -Pscala-2.12.
> For Hive on Spark:
> {code}
> ./dev/make-distribution.sh --tgz -Pscala-2.12 -Phadoop-2.7 -Pyarn -Pparquet-provided -Dhadoop.version=2.7.3
> {code}
> For Spark SQL:
> {code}
> ./dev/make-distribution.sh --tgz -Pscala-2.12 -Phadoop-2.7 -Pyarn -Phive -Dhadoop.version=2.7.3 > log.sparksql 2>&1
> {code}
> This produces the following errors.
> #Error1
> {code}
> /common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java:172: error: cannot find symbol
>     Cleaner cleaner = Cleaner.create(buffer, () -> freeMemory(memory));
> {code}
> This is because sun.misc.Cleaner has been moved to a new location in JDK 9; HADOOP-12760 will be the long-term fix.
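> For illustration only, a minimal sketch of what the call site could look like on the public java.lang.ref.Cleaner API introduced in JDK 9 (freeMemory is a stand-in for Platform.freeMemory; this is not necessarily the fix that HADOOP-12760 will adopt):
> {code}
> import java.lang.ref.Cleaner
> import java.nio.ByteBuffer
>
> object CleanerSketch {
>   private val cleaner = Cleaner.create()
>
>   // Stand-in for Platform.freeMemory.
>   def freeMemory(address: Long): Unit = { /* release the off-heap allocation */ }
>
>   // Run freeMemory(memory) once `buffer` becomes phantom reachable.
>   // The action must capture only the address, never `buffer` itself,
>   // or the buffer would stay reachable and the cleanup would never run.
>   def register(buffer: ByteBuffer, memory: Long): Unit =
>     cleaner.register(buffer, () => freeMemory(memory))
> }
> {code}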
> #Error2
> {code}
> spark_source/core/src/main/scala/org/apache/spark/executor/Executor.scala:455: error: ambiguous reference to overloaded definition,
> both method limit in class ByteBuffer of type (x$1: Int)java.nio.ByteBuffer
> and method limit in class Buffer of type ()Int
> match expected type ?
>      val resultSize = serializedDirectResult.limit
> {code}
> In JDK 9, ByteBuffer overrides limit(int) with a covariant return type, so under Scala 2.12 the bare reference .limit is ambiguous between Buffer's limit(): Int and the one-argument overload; it can no longer be called without parentheses. The same applies to the position method.
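> A minimal sketch of the ambiguity and the fix (buf stands in for serializedDirectResult):
> {code}
> import java.nio.ByteBuffer
>
> val buf: ByteBuffer = ByteBuffer.allocate(16)
>
> // Compiled with Scala 2.12 against the JDK 9 class library, the bare
> // reference is ambiguous between limit(): Int and limit(Int): ByteBuffer:
> // val resultSize = buf.limit          // does not compile
>
> val resultSize: Int = buf.limit()      // explicit () selects the no-arg method
> {code}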
> #Error3
> {code}
> [error] /home/zly/prj/oss/jdk9_HOS_SOURCE/spark_source/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/ScriptTransformationExec.scala:415: ambiguous reference to overloaded definition,
> [error] both method putAll in class Properties of type (x$1: java.util.Map[_, _])Unit
> [error] and method putAll in class Hashtable of type (x$1: java.util.Map[_ <: Object, _ <: Object])Unit
> [error] match argument types (java.util.Map[String,String])
> [error]     properties.putAll(propsMap.asJava)
> [error]                ^
> [error] /home/zly/prj/oss/jdk9_HOS_SOURCE/spark_source/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/ScriptTransformationExec.scala:427: ambiguous reference to overloaded definition,
> [error] both method putAll in class Properties of type (x$1: java.util.Map[_, _])Unit
> [error] and method putAll in class Hashtable of type (x$1: java.util.Map[_ <: Object, _ <: Object])Unit
> [error] match argument types (java.util.Map[String,String])
> [error]       props.putAll(outputSerdeProps.toMap.asJava)
> [error]             ^
> {code}
> This is because the key type is Object instead of String, which is unsafe.
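> One way to sidestep the ambiguous overload (a sketch only, not necessarily the patch as merged) is to set the entries one at a time instead of calling putAll:
> {code}
> import java.util.Properties
>
> val propsMap: Map[String, String] = Map("key" -> "value")  // stand-in for the real map
> val properties = new Properties()
>
> // properties.putAll(propsMap.asJava) is ambiguous under Scala 2.12 + JDK 9;
> // setProperty(String, String) has no competing overload, so it always resolves.
> propsMap.foreach { case (k, v) => properties.setProperty(k, v) }
> {code}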
> After fixing these three errors, the build compiles successfully.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

