spark-issues mailing list archives

From "Justin Miller (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-17936) "CodeGenerator - failed to compile: org.codehaus.janino.JaninoRuntimeException: Code of" method Error
Date Fri, 14 Oct 2016 13:38:20 GMT
Justin Miller created SPARK-17936:
-------------------------------------

             Summary: "CodeGenerator - failed to compile: org.codehaus.janino.JaninoRuntimeException: Code of" method Error
                 Key: SPARK-17936
                 URL: https://issues.apache.org/jira/browse/SPARK-17936
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.0.1
            Reporter: Justin Miller


Greetings. I'm migrating a project from Spark 1.6.2 to 2.0.1. The project uses Spark
Streaming to convert Thrift structs coming from Kafka into Parquet files stored in S3.
The conversion works fine in 1.6.2, but I think there may be a bug in 2.0.1. The stack
trace is below.

org.codehaus.janino.JaninoRuntimeException: Code of method "(Lorg/apache/spark/sql/catalyst/expressions/GeneratedClass;[Ljava/lang/Object;)V" of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection" grows beyond 64 KB
	at org.codehaus.janino.CodeContext.makeSpace(CodeContext.java:941)
	at org.codehaus.janino.CodeContext.write(CodeContext.java:854)
	at org.codehaus.janino.UnitCompiler.writeShort(UnitCompiler.java:10242)
	at org.codehaus.janino.UnitCompiler.writeLdc(UnitCompiler.java:9058)

Also, later on:
07:35:30.191 ERROR o.a.s.u.SparkUncaughtExceptionHandler - Uncaught exception in thread Thread[Executor task launch worker-6,5,run-main-group-0]
java.lang.OutOfMemoryError: Java heap space

I've seen similar issues posted, but those were always on the query side. I suspect this
one happens at write time, since the error occurs after batchDuration has elapsed. Here's
the write snippet.

stream
  .flatMap {
    // Keep successfully parsed rows; count and log failures, then drop them.
    case Success(row) =>
      thriftParseSuccess += 1
      Some(row)
    case Failure(ex) =>
      thriftParseErrors += 1
      logger.error("Error during deserialization: ", ex)
      None
  }
  .foreachRDD { rdd =>
    // For each micro-batch, build a DataFrame and append it as partitioned Parquet.
    val sqlContext = SQLContext.getOrCreate(rdd.context)
    transformer(sqlContext.createDataFrame(rdd, converter.schema))
      .coalesce(coalesceSize)
      .write
      .mode(Append)
      .partitionBy(partitioning: _*)
      .parquet(parquetPath)
  }

Please let me know if you can be of assistance and if there's anything I can do to help.

Best,
Justin



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
