spark-commits mailing list archives

From hvanhov...@apache.org
Subject spark git commit: [SPARK-18208][SHUFFLE] Executor OOM due to a growing LongArray in BytesToBytesMap
Date Wed, 07 Dec 2016 12:33:55 GMT
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 51754d6df -> 4432a2a83


[SPARK-18208][SHUFFLE] Executor OOM due to a growing LongArray in BytesToBytesMap

## What changes were proposed in this pull request?

BytesToBytesMap currently does not release the in-memory storage (the longArray variable)
after it spills to disk. This is typically not a problem during aggregation because the longArray
should be much smaller than the pages, and because we grow the longArray at a conservative
rate.

However, this can lead to an OOM when an already running task is allocated more than its fair
share of memory, which can happen because of a scheduling delay. In this case the longArray can
grow beyond the task's fair share. This becomes a problem when the task spills and the
longArray is not freed: subsequent memory allocation requests are then denied by the memory
manager, resulting in an OOM.

This PR fixes this issue by freeing the longArray when the BytesToBytesMap spills.
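
The effect of the change can be illustrated with a small, self-contained sketch. This is not
Spark code: `Budget` and `ToyMap` are hypothetical stand-ins for the memory manager and for
BytesToBytesMap, and the 8-bytes-per-slot sizing is an assumption made only to keep the
arithmetic simple. It compares a reset that keeps the grown array with one that frees it and
re-allocates at the initial capacity.

```java
/** Toy model of the spill/reset behaviour; all names here are hypothetical. */
final class SpillSketch {

  /** Stand-in for a per-task memory budget (not Spark's TaskMemoryManager). */
  static final class Budget {
    private final long limit;
    private long used = 0;

    Budget(long limit) { this.limit = limit; }

    /** Grants the request only if it fits in the remaining budget. */
    boolean acquire(long bytes) {
      if (used + bytes > limit) return false;
      used += bytes;
      return true;
    }

    void release(long bytes) { used -= bytes; }

    long used() { return used; }
  }

  /** Tracks only the sizes of a pointer array and of data pages. */
  static final class ToyMap {
    private final Budget budget;
    private final int initialCapacity;
    private long arrayBytes = 0;
    private long pageBytes = 0;

    ToyMap(Budget budget, int initialCapacity) {
      this.budget = budget;
      this.initialCapacity = initialCapacity;
      allocate(initialCapacity);
    }

    /** Assumption: 8 bytes per slot, purely for illustration. */
    private void allocate(int capacity) {
      long bytes = capacity * 8L;
      if (!budget.acquire(bytes)) {
        throw new IllegalStateException("cannot allocate array");
      }
      arrayBytes = bytes;
    }

    /** Doubles the array if the budget allows it. */
    boolean growArray() {
      if (!budget.acquire(arrayBytes)) return false;
      arrayBytes *= 2;
      return true;
    }

    boolean addPage(long bytes) {
      if (!budget.acquire(bytes)) return false;
      pageBytes += bytes;
      return true;
    }

    /** Behaviour before the patch: pages are freed, the grown array is kept. */
    void resetKeepingArray() {
      budget.release(pageBytes);
      pageBytes = 0;
    }

    /** Behaviour after the patch: free the array too, then start small again. */
    void resetFreeingArray() {
      budget.release(arrayBytes);
      arrayBytes = 0;
      budget.release(pageBytes);
      pageBytes = 0;
      allocate(initialCapacity);
    }

    long arrayBytes() { return arrayBytes; }
  }

  public static void main(String[] args) {
    Budget budget = new Budget(1024);
    ToyMap map = new ToyMap(budget, 16);
    map.growArray();
    map.growArray();
    map.addPage(256);
    map.resetKeepingArray();
    System.out.println("held after old-style reset: " + budget.used());
    map.resetFreeingArray();
    System.out.println("held after new-style reset: " + budget.used());
  }
}
```

In this toy model the old-style reset leaves the grown array charged against the budget, while
the new-style reset returns everything except the initial allocation.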

## How was this patch tested?

Existing tests, plus testing on real-world workloads.

Author: Jie Xiong <jiexiong@fb.com>
Author: jiexiong <jiexiong@gmail.com>

Closes #15722 from jiexiong/jie_oom_fix.

(cherry picked from commit c496d03b5289f7c604661a12af86f6accddcf125)
Signed-off-by: Herman van Hovell <hvanhovell@databricks.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4432a2a8
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4432a2a8
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4432a2a8

Branch: refs/heads/branch-2.1
Commit: 4432a2a8386f951775957f352e4ba223c6ce4fa3
Parents: 51754d6
Author: Jie Xiong <jiexiong@fb.com>
Authored: Wed Dec 7 04:33:30 2016 -0800
Committer: Herman van Hovell <hvanhovell@databricks.com>
Committed: Wed Dec 7 04:33:50 2016 -0800

----------------------------------------------------------------------
 .../java/org/apache/spark/unsafe/map/BytesToBytesMap.java     | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/4432a2a8/core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java
----------------------------------------------------------------------
diff --git a/core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java b/core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java
index d2fcdea..44120e5 100644
--- a/core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java
+++ b/core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java
@@ -170,6 +170,8 @@ public final class BytesToBytesMap extends MemoryConsumer {
 
   private long peakMemoryUsedBytes = 0L;
 
+  private final int initialCapacity;
+
   private final BlockManager blockManager;
   private final SerializerManager serializerManager;
   private volatile MapIterator destructiveIterator = null;
@@ -202,6 +204,7 @@ public final class BytesToBytesMap extends MemoryConsumer {
      throw new IllegalArgumentException("Page size " + pageSizeBytes + " cannot exceed " +
         TaskMemoryManager.MAXIMUM_PAGE_SIZE_BYTES);
     }
+    this.initialCapacity = initialCapacity;
     allocate(initialCapacity);
   }
 
@@ -902,12 +905,12 @@ public final class BytesToBytesMap extends MemoryConsumer {
   public void reset() {
     numKeys = 0;
     numValues = 0;
-    longArray.zeroOut();
-
+    freeArray(longArray);
     while (dataPages.size() > 0) {
       MemoryBlock dataPage = dataPages.removeLast();
       freePage(dataPage);
     }
+    allocate(initialCapacity);
     currentPage = null;
     pageCursor = 0;
   }



