spark-user mailing list archives

From Shixiong Zhu <zsxw...@gmail.com>
Subject Re: [SQL] Memory leak with spark streaming and spark sql in spark 1.5.1
Date Thu, 15 Oct 2015 06:31:04 GMT
Thanks for reporting it, Terry. I submitted a PR to fix it:
https://github.com/apache/spark/pull/9132

Best Regards,
Shixiong Zhu

2015-10-15 2:39 GMT+08:00 Reynold Xin <rxin@databricks.com>:

> +dev list
>
> On Wed, Oct 14, 2015 at 1:07 AM, Terry Hoo <hujie.eagle@gmail.com> wrote:
>
>> All,
>>
>> Has anyone else run into a memory leak with Spark Streaming and Spark SQL
>> in Spark 1.5.1? Memory usage keeps increasing when I run this simple
>> sample:
>>
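>>         // Imports assumed for Spark 1.5.x; `conf` is a SparkConf defined elsewhere.
>>         import org.apache.spark.SparkContext
>>         import org.apache.spark.sql.hive.HiveContext
>>         import org.apache.spark.streaming.{Seconds, StreamingContext}
>>
>>         val sc = new SparkContext(conf)
>>         val sqlContext = new HiveContext(sc)
>>         import sqlContext.implicits._
>>         val ssc = new StreamingContext(sc, Seconds(1))
>>         // Per-batch word counts read from a local socket
>>         val s1 = ssc.socketTextStream("localhost", 9999)
>>           .map(x => (x, 1))
>>           .reduceByKey((x: Int, y: Int) => x + y)
>>         s1.print()
>>         s1.foreachRDD(rdd => {
>>           rdd.foreach(_ => ())  // plain RDD action, no SQL involved
>>           sqlContext.createDataFrame(rdd).registerTempTable("A")
>>           sqlContext.sql("""select * from A""").show(1)
>>         })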
>>
>> After dumping the Java heap, I can see about 22K entries in
>> SQLListener._stageIdToStageMetrics after 2 hours of running (the other maps
>> in this SQLListener have about 1K entries). Is this a leak in SQLListener?
>>
>> Thanks!
>> Terry
>>
>
>
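
For readers following the thread: this message does not say what the PR
changes, so the sketch below is only a toy model of one way a listener map
can grow like the one Terry describes. It assumes the listener records
metrics for every stage but removes them only when an owning SQL execution
ends. All names (ToyListener, trackOnlyExecutions, ToyListenerDemo) are
hypothetical and are not Spark's API or the actual fix.

```scala
import scala.collection.mutable

// Hypothetical toy listener; it does not use or mirror Spark's real classes.
final class ToyListener(trackOnlyExecutions: Boolean) {
  private val stageMetrics = mutable.HashMap.empty[Int, Long]
  private val executionStages = mutable.HashMap.empty[Long, Seq[Int]]

  def onJobStart(executionId: Option[Long], stageIds: Seq[Int]): Unit = {
    executionId.foreach(id => executionStages(id) = stageIds)
    // Leaky variant: record every stage. Bounded variant: skip stages
    // that no execution will ever clean up.
    if (executionId.isDefined || !trackOnlyExecutions) {
      stageIds.foreach(id => stageMetrics(id) = 0L)
    }
  }

  def onExecutionEnd(executionId: Long): Unit = {
    // Only stages owned by the finished execution are removed.
    executionStages.remove(executionId).foreach { stageIds =>
      stageIds.foreach(id => stageMetrics.remove(id))
    }
  }

  def trackedStages: Int = stageMetrics.size
}

object ToyListenerDemo {
  // Simulates `batches` streaming batches and returns the entries left behind.
  def run(trackOnlyExecutions: Boolean, batches: Int): Int = {
    val listener = new ToyListener(trackOnlyExecutions)
    var nextStage = 0
    for (batch <- 0 until batches) {
      // A plain RDD action: a job with no owning execution.
      listener.onJobStart(None, Seq(nextStage))
      nextStage += 1
      // A SQL query: its stage is cleaned up when the execution ends.
      listener.onJobStart(Some(batch.toLong), Seq(nextStage))
      nextStage += 1
      listener.onExecutionEnd(batch.toLong)
    }
    listener.trackedStages
  }

  def main(args: Array[String]): Unit = {
    println(s"leaky:   ${run(trackOnlyExecutions = false, batches = 1000)}")
    println(s"bounded: ${run(trackOnlyExecutions = true, batches = 1000)}")
  }
}
```

In this model the leaky variant keeps one entry per batch indefinitely, while
the bounded variant ends with an empty map. Whether the real SQLListener
behaves this way should be checked against the PR itself.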
