spark-issues mailing list archives

From "vishal agrawal (JIRA)" <>
Subject [jira] [Commented] (SPARK-18857) SparkSQL ThriftServer hangs while extracting huge data volumes in incremental collect mode
Date Mon, 26 Dec 2016 09:48:58 GMT


vishal agrawal commented on SPARK-18857:

We built Spark from the 2.0.2 source code after reverting SparkExecuteStatementOperation.scala
to its pre-SPARK-16563 version. This build works fine and does not cause any Thrift server issues.

> SparkSQL ThriftServer hangs while extracting huge data volumes in incremental collect
> ------------------------------------------------------------------------------------------
>                 Key: SPARK-18857
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.2
>            Reporter: vishal agrawal
>         Attachments: GC-spark-1.6.3, GC-spark-2.0.2
> We are running a SQL query on our Spark cluster and extracting around 200 million
records through the SparkSQL ThriftServer interface. The query works fine on Spark 1.6.3.
On Spark 2.0.2, however, the Thrift server hangs after fetching data from a few partitions (we
are using incremental collect mode with 400 partitions). According to the documentation, the maximum
memory used by the Thrift server should be what the largest data partition requires. But we observed
that the Thrift server does not release the memory of old partitions when GC occurs, even
though it has moved on to fetching data from the next partition. This is not the case with 1.6.3.
> On further investigation we found that SparkExecuteStatementOperation.scala was modified
for "[SPARK-16563][SQL] fix spark sql thrift server FetchResults bug", and that the result set
iterator was duplicated to keep a reference to the first set:
> +      val (itra, itrb) = iter.duplicate
> +      iterHeader = itra
> +      iter = itrb
> We suspect that this prevents the memory from being reclaimed on GC. To confirm this,
we created an iterator in our test class and fetched the data twice: once without duplicating it
and once after creating a duplicate. In the first case it ran fine and
fetched the entire data set, while in the second case the driver hung after fetching data from
a few partitions.
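
The buffering behaviour suspected above can be observed without Spark. The following is a minimal sketch in plain Scala, assuming only the standard library; the object name `DuplicateBufferSketch`, the `Outcome` holder, the `run` helper and the `row-N` strings are hypothetical and are not taken from Spark's source or from the reporter's test class. It drains one half of `Iterator.duplicate` while the other half stays at the start, then checks that the untouched half can still replay every element without pulling anything more from the source, which is only possible if those elements were kept in memory in the meantime.

```scala
object DuplicateBufferSketch {
  // Hypothetical result holder used only by this sketch.
  final case class Outcome(
      pulledAfterDrain: Int,   // elements produced by the source once `rest` is drained
      pulledAfterReplay: Int,  // elements produced by the source once `header` is replayed
      drained: List[String],
      replayed: List[String])

  def run(n: Int): Outcome = {
    var pulled = 0 // counts how many elements the underlying source has produced

    // Lazy source standing in for the incremental result iterator.
    val source: Iterator[String] = Iterator.tabulate(n) { i =>
      pulled += 1
      s"row-$i"
    }

    // Same pattern as the quoted change: one copy stays at the start,
    // the other is used for the actual reads.
    val (header, rest) = source.duplicate

    // Advance only `rest`; `header` is never touched during this step.
    val drained = rest.toList
    val pulledAfterDrain = pulled

    // `header` still yields every element. The counter does not move,
    // so the elements are served from storage held by the duplicate pair.
    val replayed = header.toList

    Outcome(pulledAfterDrain, pulled, drained, replayed)
  }

  def main(args: Array[String]): Unit = {
    val o = run(5)
    println(s"pulled after drain: ${o.pulledAfterDrain}, after replay: ${o.pulledAfterReplay}")
    println(s"replayed == drained: ${o.replayed == o.drained}")
  }
}
```

This sketch only illustrates the general behaviour of `Iterator.duplicate`: elements read through one copy remain reachable for as long as the other copy has not consumed them. It is consistent with the suspicion stated above but does not by itself prove that this is the cause of the Thrift server hang.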

