Thanks for reporting it Terry. I submitted a PR to fix it: https://github.com/apache/spark/pull/9132

Best Regards,

Shixiong Zhu

2015-10-15 2:39 GMT+08:00 Reynold Xin <rxin@databricks.com>:
+dev list 

On Wed, Oct 14, 2015 at 1:07 AM, Terry Hoo <hujie.eagle@gmail.com> wrote:
All,

Has anyone hit a memory leak with Spark Streaming and Spark SQL in Spark 1.5.1? I can see memory usage increasing all the time when running this simple sample:

        import org.apache.spark.SparkContext
        import org.apache.spark.sql.hive.HiveContext
        import org.apache.spark.streaming.{Seconds, StreamingContext}

        // conf is a SparkConf defined elsewhere
        val sc = new SparkContext(conf)
        val sqlContext = new HiveContext(sc)
        import sqlContext.implicits._
        val ssc = new StreamingContext(sc, Seconds(1))
        val s1 = ssc.socketTextStream("localhost", 9999)
          .map(x => (x, 1))
          .reduceByKey((x: Int, y: Int) => x + y)
        s1.print()
        s1.foreachRDD { rdd =>
          rdd.foreach(_ => ()) // force evaluation of the RDD
          sqlContext.createDataFrame(rdd).registerTempTable("A")
          sqlContext.sql("""select * from A""").show(1)
        }

After dumping the Java heap, I can see about 22K entries in SQLListener._stageIdToStageMetrics after 2 hours of running (the other maps in this SQLListener have about 1K entries). Is this a leak in SQLListener?
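For anyone following along, here is a minimal toy sketch of the suspected pattern: a listener that records per-stage metrics into a map but never evicts entries once the query finishes, so the map grows with every streaming batch. The class and method names below are illustrative only, not Spark's actual SQLListener internals:

```scala
import scala.collection.mutable

// Toy model of a listener whose per-stage metrics map grows without bound.
// Loosely mirrors the idea behind SQLListener._stageIdToStageMetrics;
// names and signatures here are assumptions for illustration.
class LeakyListener {
  val stageIdToMetrics = mutable.Map.empty[Long, String]

  // Called once per stage: records an entry that is never removed
  // unless onExecutionEnd is invoked.
  def onStageSubmitted(stageId: Long): Unit =
    stageIdToMetrics(stageId) = s"metrics-for-stage-$stageId"

  // The missing cleanup: evict entries when the query's execution ends.
  def onExecutionEnd(stageIds: Seq[Long]): Unit =
    stageIds.foreach(stageIdToMetrics.remove)
}
```

With one query per one-second batch, skipping the eviction step would leave roughly one entry per stage per batch behind, which matches the steady growth seen in the heap dump.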

Thanks!
Terry