spark-issues mailing list archives

From "Xing Shi (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SPARK-17465) Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak
Date Mon, 03 Oct 2016 09:17:23 GMT

    [ https://issues.apache.org/jira/browse/SPARK-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15541946#comment-15541946
] 

Xing Shi edited comment on SPARK-17465 at 10/3/16 9:17 AM:
-----------------------------------------------------------

Resolved.

In every task, the method _currentUnrollMemory_ is called several times. Each call scans all
keys of _unrollMemoryMap_ and _pendingUnrollMemoryMap_, so its processing time is proportional
to the size of those maps.
https://github.com/apache/spark/blob/v1.6.0/core/src/main/scala/org/apache/spark/storage/MemoryStore.scala#L540-L542
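
The scan described above can be sketched as follows (a simplified Scala sketch, not the verbatim MemoryStore source; the map names mirror the real fields):

```scala
import scala.collection.mutable

// Simplified sketch: currentUnrollMemory sums the values of both
// per-task maps, so every call scans all entries -- O(n) in the
// number of keys that have accumulated in the maps.
val unrollMemoryMap = mutable.HashMap[Long, Long]()        // taskAttemptId -> bytes
val pendingUnrollMemoryMap = mutable.HashMap[Long, Long]() // taskAttemptId -> bytes

def currentUnrollMemory: Long =
  unrollMemoryMap.values.sum + pendingUnrollMemoryMap.values.sum
```

Because leaked zero-valued keys are never removed, each call gets slower as the maps grow, which matches the observed increase in task processing time.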

I have measured the processing time of _currentUnrollMemory_; it accounts exactly for the
observed increase.

Hope this helps anyone who sees a similar increase in processing time after upgrading
Spark to 1.6.0 :)



> Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak
> -------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-17465
>                 URL: https://issues.apache.org/jira/browse/SPARK-17465
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.6.0, 1.6.1, 1.6.2
>            Reporter: Xing Shi
>            Assignee: Xing Shi
>             Fix For: 1.6.3, 2.0.1, 2.1.0
>
>
> After updating Spark from 1.5.0 to 1.6.0, I found what appears to be a memory leak in my
> Spark Streaming application.
> Here is the head of the heap histogram of my application, which had been running for about
> 160 hours:
> {code:borderStyle=solid}
>  num     #instances         #bytes  class name
> ----------------------------------------------
>    1:         28094       71753976  [B
>    2:       1188086       28514064  java.lang.Long
>    3:       1183844       28412256  scala.collection.mutable.DefaultEntry
>    4:        102242       13098768  <methodKlass>
>    5:        102242       12421000  <constMethodKlass>
>    6:          8184        9199032  <constantPoolKlass>
>    7:            38        8391584  [Lscala.collection.mutable.HashEntry;
>    8:          8184        7514288  <instanceKlassKlass>
>    9:          6651        4874080  <constantPoolCacheKlass>
>   10:         37197        3438040  [C
>   11:          6423        2445640  <methodDataKlass>
>   12:          8773        1044808  java.lang.Class
>   13:         36869         884856  java.lang.String
>   14:         15715         848368  [[I
>   15:         13690         782808  [S
>   16:         18903         604896  java.util.concurrent.ConcurrentHashMap$HashEntry
>   17:            13         426192  [Lscala.concurrent.forkjoin.ForkJoinTask;
> {code}
> It shows that *scala.collection.mutable.DefaultEntry* and *java.lang.Long* have unexpectedly
> large instance counts. In fact, the counts started growing as soon as the streaming process
> began, and they keep growing in proportion to the total number of tasks.
> After some further investigation, I found that the problem is caused by inappropriate
> memory management in the _releaseUnrollMemoryForThisTask_ and _unrollSafely_ methods of class [org.apache.spark.storage.MemoryStore|https://github.com/apache/spark/blob/branch-1.6/core/src/main/scala/org/apache/spark/storage/MemoryStore.scala].
> In Spark 1.6.x, a _releaseUnrollMemoryForThisTask_ operation is performed only when the
> parameter _memoryToRelease_ is greater than 0:
> https://github.com/apache/spark/blob/branch-1.6/core/src/main/scala/org/apache/spark/storage/MemoryStore.scala#L530-L537
> But in fact, if a task successfully unrolls all of its blocks in memory via the _unrollSafely_
> method, the memory recorded in _unrollMemoryMap_ is set to zero:
> https://github.com/apache/spark/blob/branch-1.6/core/src/main/scala/org/apache/spark/storage/MemoryStore.scala#L322
> As a result, the memory recorded in _unrollMemoryMap_ is released, but the corresponding key
> is never removed from the hash map. The hash table therefore keeps growing as new tasks arrive.
> Although the growth is comparatively slow (dozens of bytes per task), it can eventually
> result in an OOM after weeks or months.
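
The leak described in the quoted report can be illustrated with a small Scala sketch (simplified from the 1.6.x logic; this is not the verbatim MemoryStore source, and the "fixed" variant only sketches the cleanup the actual patch performs):

```scala
import scala.collection.mutable

val unrollMemoryMap = mutable.HashMap[Long, Long]() // taskAttemptId -> bytes

// Buggy 1.6.x-style release: when unrollSafely has already set a task's
// entry to 0, memoryToRelease is 0, the guard skips the body, and the
// zero-valued key stays in the map forever.
def releaseBuggy(taskId: Long, memory: Long): Unit = {
  if (unrollMemoryMap.contains(taskId)) {
    val memoryToRelease = math.min(memory, unrollMemoryMap(taskId))
    if (memoryToRelease > 0) {            // zero-valued entries are skipped...
      unrollMemoryMap(taskId) -= memoryToRelease
      if (unrollMemoryMap(taskId) == 0) {
        unrollMemoryMap.remove(taskId)    // ...so this cleanup never fires for them
      }
    }
  }
}

// Sketch of a fix: remove the key whenever its value reaches zero,
// even if nothing was actually released.
def releaseFixed(taskId: Long, memory: Long): Unit = {
  if (unrollMemoryMap.contains(taskId)) {
    val memoryToRelease = math.min(memory, unrollMemoryMap(taskId))
    unrollMemoryMap(taskId) -= memoryToRelease
    if (unrollMemoryMap(taskId) == 0) {
      unrollMemoryMap.remove(taskId)
    }
  }
}
```

With a zero-valued entry in the map, the buggy release leaves the key behind while the fixed variant removes it, which is exactly the per-task residue the heap histogram shows as growing counts of *java.lang.Long* and *scala.collection.mutable.DefaultEntry*.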



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

