spark-issues mailing list archives

From "Kazuaki Ishizaki (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-23427) spark.sql.autoBroadcastJoinThreshold causing OOM exception in the driver
Date Mon, 19 Feb 2018 00:49:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-23427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16368728#comment-16368728 ]

Kazuaki Ishizaki commented on SPARK-23427:
------------------------------------------

Thank you. I ran this program several times with a 64GB heap size. I saw the following OOM in
both cases, `-1` and the default (`10*1024*1024`). I am now running the program with other heap sizes.
Is this OOM what you are seeing? If not, I would appreciate it if you could upload the stack trace
from when the OOM occurred.

{code:java}
[info] org.apache.spark.sql.MyTest *** ABORTED *** (2 hours, 14 minutes, 36 seconds)
[info] java.lang.OutOfMemoryError:
[info] at java.lang.AbstractStringBuilder.hugeCapacity(AbstractStringBuilder.java:161)
[info] at java.lang.AbstractStringBuilder.newCapacity(AbstractStringBuilder.java:155)
[info] at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:125)
[info] at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
[info] at java.lang.StringBuilder.append(StringBuilder.java:136)
[info] at java.lang.StringBuilder.append(StringBuilder.java:131)
[info] at scala.StringContext.standardInterpolator(StringContext.scala:125)
[info] at scala.StringContext.s(StringContext.scala:95)
[info] at org.apache.spark.sql.execution.QueryExecution.toString(QueryExecution.scala:199)
[info] at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:74)
[info] at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3252)
[info] at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190)
[info] at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
[info] at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$withPlan(Dataset.scala:3295)
[info] at org.apache.spark.sql.Dataset.createOrReplaceTempView(Dataset.scala:3033)
[info] at org.apache.spark.sql.MyTest$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(MyTest.scala:87)
[info] at org.apache.spark.sql.catalyst.plans.PlanTestBase$class.withSQLConf(PlanTest.scala:176)
[info] at org.apache.spark.sql.MyTest.org$apache$spark$sql$test$SQLTestUtilsBase$$super$withSQLConf(MyTest.scala:27)
[info] at org.apache.spark.sql.test.SQLTestUtilsBase$class.withSQLConf(SQLTestUtils.scala:167)
[info] at org.apache.spark.sql.MyTest.withSQLConf(MyTest.scala:27)
[info] at org.apache.spark.sql.MyTest$$anonfun$1.apply$mcV$sp(MyTest.scala:65)
[info] at org.apache.spark.sql.MyTest$$anonfun$1.apply(MyTest.scala:65)
[info] at org.apache.spark.sql.MyTest$$anonfun$1.apply(MyTest.scala:65)
...
{code}
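
For reference, here is a minimal standalone sketch of how the two settings can be toggled for comparison. The DataFrames and the join below are placeholders of my own, not the workload from this report:

{code:scala}
import org.apache.spark.sql.SparkSession

object BroadcastThresholdRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("autoBroadcastJoinThreshold comparison")
      .master("local[*]")
      .getOrCreate()

    // Placeholder data; the real report uses Oracle tables persisted as DataFrames.
    val left  = spark.range(0, 10000000L).toDF("id")
    val right = spark.range(0, 1000L).withColumnRenamed("id", "key")

    // Case 1: broadcast joins disabled entirely.
    spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")
    left.join(right, left("id") === right("key")).count()

    // Case 2: the default threshold (10 MB = 10 * 1024 * 1024 bytes),
    // which lets the small right side be broadcast to the executors.
    spark.conf.set("spark.sql.autoBroadcastJoinThreshold", (10 * 1024 * 1024).toString)
    left.join(right, left("id") === right("key")).count()

    spark.stop()
  }
}
{code}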


> spark.sql.autoBroadcastJoinThreshold causing OOM exception in the driver 
> -------------------------------------------------------------------------
>
>                 Key: SPARK-23427
>                 URL: https://issues.apache.org/jira/browse/SPARK-23427
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>         Environment: SPARK 2.0 version
>            Reporter: Dhiraj
>            Priority: Critical
>
> We are facing an issue around the value of spark.sql.autoBroadcastJoinThreshold.
> With spark.sql.autoBroadcastJoinThreshold set to -1 (disabled) we see the driver memory usage stay flat.
> With any other value (10MB, 5MB, 2MB, 1MB, 10K, 1K) we see the driver memory usage grow at a rate that depends on the size of the autoBroadcastJoinThreshold, and we get an OOM exception. The problem is that the memory used by the auto-broadcast is not being freed in the driver.
> The application imports Oracle tables as master DataFrames, which are persisted. Each job applies a filter to these tables and then registers them as temp view tables. SQL queries are then used to process the data further. At the end, all the intermediate DataFrames are unpersisted.
>  
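
For what it's worth, a minimal sketch of the workflow described in the report above; the JDBC connection details, table name, and column names are placeholders of my own, not taken from the report:

{code:scala}
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object MasterTableJobSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("persist / tempView / unpersist sketch")
      .master("local[*]")
      .getOrCreate()

    // Master DataFrame imported from Oracle and persisted (connection details are placeholders).
    val master = spark.read
      .format("jdbc")
      .option("url", "jdbc:oracle:thin:@//dbhost:1521/SERVICE")
      .option("dbtable", "SCHEMA.MASTER_TABLE")
      .option("user", "user")
      .option("password", "password")
      .load()
      .persist(StorageLevel.MEMORY_AND_DISK)

    // Each "job": filter the master table, register a temp view, then query it with SQL.
    val filtered = master.filter("REGION = 'EMEA'").persist(StorageLevel.MEMORY_AND_DISK)
    filtered.createOrReplaceTempView("tempViewTable")
    val result = spark.sql("SELECT REGION, COUNT(*) AS cnt FROM tempViewTable GROUP BY REGION")
    result.collect()

    // At the end of the job, the intermediate DataFrames are unpersisted.
    filtered.unpersist()
    master.unpersist()

    spark.stop()
  }
}
{code}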



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


