spark-issues mailing list archives

From "Reynold Xin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-15391) Spark executor OOM during TimSort
Date Wed, 25 May 2016 18:00:16 GMT

    [ https://issues.apache.org/jira/browse/SPARK-15391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300529#comment-15300529 ]

Reynold Xin commented on SPARK-15391:
-------------------------------------

We should look into this -- do proper memory management for the TimSort buffers.

If it ends up being too difficult, we can also switch to quicksort, since most quicksort implementations
sort in place and would not need this extra buffer. However, based on our testing, quicksort is often slower.
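
As a rough illustration of the "proper memory management" direction (a hypothetical sketch, not the actual SPARK-15391 patch), the TimSort scratch buffer could be obtained through the sorter's MemoryConsumer, so the allocation is accounted for by the TaskMemoryManager and can trigger spilling instead of an uncontrolled heap allocation. Only LongArray, MemoryConsumer.allocateArray and freeArray below are existing Spark APIs; the TrackedSortBuffer wrapper and its wiring are assumptions made for illustration.

{code}
// Hypothetical sketch, not the SPARK-15391 fix: let a MemoryConsumer own the
// TimSort merge scratch space so it is tracked by the task memory manager.
import org.apache.spark.memory.MemoryConsumer;
import org.apache.spark.unsafe.array.LongArray;

final class TrackedSortBuffer {
  private final MemoryConsumer consumer; // e.g. the sorter that owns this buffer
  private LongArray buffer;              // scratch space used by TimSort merges

  TrackedSortBuffer(MemoryConsumer consumer) {
    this.consumer = consumer;
  }

  // Grow the scratch buffer through the consumer; the request goes through the
  // TaskMemoryManager, which can ask other consumers to spill instead of OOMing.
  LongArray ensureCapacity(long requiredWords) {
    if (buffer == null || buffer.size() < requiredWords) {
      if (buffer != null) {
        consumer.freeArray(buffer);
      }
      buffer = consumer.allocateArray(requiredWords);
    }
    return buffer;
  }

  void free() {
    if (buffer != null) {
      consumer.freeArray(buffer);
      buffer = null;
    }
  }
}
{code}

The in-place argument for quicksort is that it needs no such scratch buffer at all, whereas TimSort's merge step can request a temporary buffer of up to half the size of the array being sorted.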


> Spark executor OOM during TimSort
> ---------------------------------
>
>                 Key: SPARK-15391
>                 URL: https://issues.apache.org/jira/browse/SPARK-15391
>             Project: Spark
>          Issue Type: Bug
>            Reporter: Sital Kedia
>
> While running a query, we are seeing a lot of executor OOMs during TimSort.
> Stack trace - 
> {code}
> at org.apache.spark.util.collection.unsafe.sort.UnsafeSortDataFormat.allocate(UnsafeSortDataFormat.java:86)
> 	at org.apache.spark.util.collection.unsafe.sort.UnsafeSortDataFormat.allocate(UnsafeSortDataFormat.java:32)
> 	at org.apache.spark.util.collection.TimSort$SortState.ensureCapacity(TimSort.java:951)
> 	at org.apache.spark.util.collection.TimSort$SortState.mergeLo(TimSort.java:699)
> 	at org.apache.spark.util.collection.TimSort$SortState.mergeAt(TimSort.java:525)
> 	at org.apache.spark.util.collection.TimSort$SortState.mergeCollapse(TimSort.java:453)
> 	at org.apache.spark.util.collection.TimSort$SortState.access$200(TimSort.java:325)
> 	at org.apache.spark.util.collection.TimSort.sort(TimSort.java:153)
> 	at org.apache.spark.util.collection.Sorter.sort(Sorter.scala:37)
> 	at org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter.getSortedIterator(UnsafeInMemorySorter.java:235)
> 	at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.spill(UnsafeExternalSorter.java:198)
> 	at org.apache.spark.memory.MemoryConsumer.spill(MemoryConsumer.java:58)
> 	at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.insertRecord(UnsafeExternalSorter.java:356)
> 	at org.apache.spark.sql.execution.UnsafeExternalRowSorter.insertRow(UnsafeExternalRowSorter.java:91)
> {code}
> Out of total 32g available to the executors, we are allocating 24g onheap and 8g offheap
> memory. Looking at the code (https://github.com/apache/spark/blob/master/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSortDataFormat.java#L87),
> we see that during TimSort we are allocating the memory buffer onheap irrespective of the
> memory mode, which is the reason for the executor OOM.
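
For context, the allocation path referenced above looks roughly like the sketch below: the buffer handed to TimSort is always backed by a fresh on-heap long[] and is never registered with the task memory manager, regardless of whether the executor runs in off-heap mode. This is a simplified paraphrase of the linked UnsafeSortDataFormat code, not a verbatim copy.

{code}
// Simplified paraphrase of the allocation the reporter points at: the TimSort
// scratch buffer is a plain on-heap long[] wrapped in a LongArray, so it
// bypasses the configured memory mode and Spark's memory accounting.
import org.apache.spark.unsafe.array.LongArray;
import org.apache.spark.unsafe.memory.MemoryBlock;

final class OnHeapScratchAllocation {
  static LongArray allocate(int length) {
    // Two longs per record: one for the record pointer, one for the key prefix.
    return new LongArray(MemoryBlock.fromLongArray(new long[length * 2]));
  }
}
{code}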



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
