spark-issues mailing list archives

From "ZhengYaofeng (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-10529) When creating multiple HiveContext objects in one JVM, JDBC connections to the metastore can't be released and it may cause PermGen OutOfMemoryError.
Date Thu, 10 Sep 2015 09:37:45 GMT

     [ https://issues.apache.org/jira/browse/SPARK-10529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

ZhengYaofeng updated SPARK-10529:
---------------------------------
    Description: 
Test code as follows:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object SqlTest {
  def main(args: Array[String]): Unit = {

    // Build a fresh SparkContext for each iteration.
    def createSc: SparkContext = {
      val sparkConf = new SparkConf().setAppName("SqlTest")
        .setMaster("spark://zdh221:7077")
        .set("spark.executor.memory", "4g")
        .set("spark.executor.cores", "2")
        .set("spark.cores.max", "6")
      new SparkContext(sparkConf)
    }

    // Repeatedly create a HiveContext, run a query against the metastore,
    // then stop the underlying SparkContext.
    for (index <- 1 to 200) {
      println(s"============Current Index:${index}=============")
      val hc = new HiveContext(createSc)
      hc.sql("show databases").collect().foreach(println)
      hc.sparkContext.stop()

      Thread.sleep(3000)
    }
    // Keep the JVM alive afterwards so connections and memory can be inspected.
    Thread.sleep(1000000)
  }
}
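
For comparison, a sketch of the same loop with a single SparkContext and HiveContext reused across all iterations; in this form the connection and PermGen growth described below should not occur, since only one HiveContext (and one set of Hive classes) is ever created. This is a workaround sketch only, not a fix for the underlying leak:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object SqlTestSingleContext {
  def main(args: Array[String]): Unit = {
    // Same cluster settings as the test above, but the contexts are built only once.
    val conf = new SparkConf().setAppName("SqlTestSingleContext")
      .setMaster("spark://zdh221:7077")
      .set("spark.executor.memory", "4g")
      .set("spark.executor.cores", "2")
      .set("spark.cores.max", "6")
    val sc = new SparkContext(conf)
    val hc = new HiveContext(sc)   // one HiveContext reused for every query
    for (index <- 1 to 200) {
      println(s"============Current Index:${index}=============")
      hc.sql("show databases").collect().foreach(println)
      Thread.sleep(3000)
    }
    sc.stop()
  }
}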

Tested on Spark 1.4.1 with the run commands below.
	export CLASSPATH="$CLASSPATH:/home/hadoop/spark/conf:/home/hadoop/spark/lib/*:/home/hadoop/zyf/lib/*"
	java -Xmx8096m -Xms1024m -XX:MaxPermSize=1024m -cp $CLASSPATH SqlTest

Files list:
	/home/hadoop/spark/conf:core-site.xml;hdfs-site.xml;hive-site.xml;slaves;spark-defaults.conf;spark-env.sh
	/home/hadoop/zyf/lib:hadoop-lzo-0.4.20.jar;mysql-connector-java-5.1.28-bin.jar;sqltest-1.0-SNAPSHOT.jar
	
MySQL is used as the metastore. While the test app is running, you can clearly watch the number of
JDBC connections to MySQL grow steadily via the command 'show status like 'Threads_connected';'.
Even invoking 'Hive.closeCurrent()' does not release the existing JDBC connections, and I have not
found any other way to release them. If you run the same test on Spark 1.3.1, the JDBC connections
do not grow.
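
For reference, a minimal sketch of how the connection count can be polled from the metastore database while the test runs. The JDBC URL, user, and password are placeholders for your own metastore settings; the MySQL driver is assumed to be on the classpath (mysql-connector-java-5.1.28 is already in /home/hadoop/zyf/lib):

import java.sql.DriverManager

object ConnCountMonitor {
  def main(args: Array[String]): Unit = {
    // Placeholder connection settings for the MySQL metastore host.
    Class.forName("com.mysql.jdbc.Driver")
    val conn = DriverManager.getConnection("jdbc:mysql://zdh221:3306/hive", "hive", "hive")
    val stmt = conn.createStatement()
    while (true) {
      // Threads_connected counts all open client connections on the MySQL server.
      val rs = stmt.executeQuery("show status like 'Threads_connected'")
      while (rs.next()) println(s"${rs.getString(1)} = ${rs.getString(2)}")
      rs.close()
      Thread.sleep(3000)
    }
  }
}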

Meanwhile, the run ends with 'java.lang.OutOfMemoryError: PermGen space' after 45 iterations,
i.e. after 45 HiveContext objects have been created. Interestingly, with MaxPermSize set to
'2048m' it survives 93 iterations, and with '3072m' it survives 141 iterations: roughly 48 more
iterations per additional 1024m of PermGen, or about 21 MB of class metadata per HiveContext.
This indicates that each new HiveContext loads the same amount of new classes, and those classes
are never unloaded.
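
To make the per-HiveContext class growth visible without waiting for the OOM, here is a small sketch of a helper (hypothetical, not part of the test above) that prints the loaded class count and PermGen pool usage via standard JMX beans:

import java.lang.management.ManagementFactory
import scala.collection.JavaConverters._

object PermGenStats {
  // Print the number of currently loaded classes and PermGen pool usage.
  def dump(tag: String): Unit = {
    val cl = ManagementFactory.getClassLoadingMXBean
    val permGen = ManagementFactory.getMemoryPoolMXBeans.asScala
      .find(_.getName.contains("Perm Gen"))   // pool name on HotSpot JVMs that still have a PermGen
    val usage = permGen.map(p => s"${p.getUsage.getUsed / (1024 * 1024)} MB").getOrElse("n/a")
    println(s"[$tag] loaded classes: ${cl.getLoadedClassCount}, PermGen used: $usage")
  }
}

Calling PermGenStats.dump(s"iteration ${index}") right after hc.sparkContext.stop() should show the loaded-class count climbing by a roughly constant amount each iteration.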

> When creating multiple HiveContext objects in one JVM, JDBC connections to the metastore can't be released and it may cause PermGen OutOfMemoryError.
> --------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-10529
>                 URL: https://issues.apache.org/jira/browse/SPARK-10529
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.4.1
>            Reporter: ZhengYaofeng
>         Attachments: IsolatedClientLoader.scala
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

