spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-20237) Spark-1.6 current and later versions of memory management issues
Date Thu, 06 Apr 2017 07:34:41 GMT

    [ https://issues.apache.org/jira/browse/SPARK-20237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958465#comment-15958465 ]

Apache Spark commented on SPARK-20237:
--------------------------------------

User 'zhangwei72' has created a pull request for this issue:
https://github.com/apache/spark/pull/17547

> Spark-1.6 current and later versions of memory management issues
> ----------------------------------------------------------------
>
>                 Key: SPARK-20237
>                 URL: https://issues.apache.org/jira/browse/SPARK-20237
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.6.0, 1.6.1, 1.6.2, 1.6.3, 2.0.0, 2.0.1, 2.0.2, 2.1.0
>         Environment: java 1.7.0  scala-2.10.5   maven-3.3.9    hadoop-2.2.0  spark-1.6.2
>            Reporter: zhangwei72
>            Priority: Critical
>              Labels: security
>             Fix For: 1.6.2
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> In Spark 1.6 and later, there is a problem in the UnifiedMemoryManager memory manager.
> The spark.memory.storageFraction setting is supposed to define the minimum amount of storage memory.
> In UnifiedMemoryManager, the amount of memory that execution can borrow from storage is computed as
> val memoryReclaimableFromStorage = math.max(storageMemoryPool.memoryFree, storageMemoryPool.poolSize - storageRegionSize).
> When storageMemoryPool.memoryFree > storageMemoryPool.poolSize - storageRegionSize, the first argument is chosen, that is, the storage pool can shrink by as much as storageMemoryPool.memoryFree.
> Because storageMemoryPool.memoryFree > storageMemoryPool.poolSize - storageRegionSize, it follows that storageMemoryPool.poolSize - storageMemoryPool.memoryFree < storageRegionSize.
> After the borrow, storageMemoryPool.poolSize < storageRegionSize. Since storageRegionSize is the minimum storage share defined by the framework, this is a problem.
> To solve this problem, we define the value as val memoryReclaimableFromStorage = storageMemoryPool.poolSize - storageRegionSize.
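
The argument above can be traced with a small stand-alone model. This is only a sketch: `Pools`, `BorrowSketch`, and the numbers are hypothetical and are not Spark's actual classes. It assumes the worst case in which execution borrows the full reclaimable amount, and it compares the formula quoted in the report with the one the report proposes.

```scala
// Hypothetical, simplified model of the two pools; NOT Spark's real classes.
object BorrowSketch {
  final case class Pools(storagePoolSize: Long, storageUsed: Long, executionPoolSize: Long) {
    def storageFree: Long = storagePoolSize - storageUsed
  }

  // Formula quoted in the report.
  def reclaimableCurrent(p: Pools, storageRegionSize: Long): Long =
    math.max(p.storageFree, p.storagePoolSize - storageRegionSize)

  // Formula proposed in the report. It can be negative when the storage
  // pool is already smaller than the region; a caller would have to
  // treat that case as "nothing to reclaim".
  def reclaimableProposed(p: Pools, storageRegionSize: Long): Long =
    p.storagePoolSize - storageRegionSize

  // Worst case assumed here: execution takes the whole reclaimable amount.
  def borrowAll(p: Pools, reclaimable: Long): Pools =
    p.copy(
      storagePoolSize = p.storagePoolSize - reclaimable,
      executionPoolSize = p.executionPoolSize + reclaimable)

  def main(args: Array[String]): Unit = {
    val storageRegionSize = 1000L          // made-up units
    val start = Pools(1000L, 100L, 4000L)  // storage pool mostly free
    val afterCurrent = borrowAll(start, reclaimableCurrent(start, storageRegionSize))
    val afterProposed = borrowAll(start, reclaimableProposed(start, storageRegionSize))
    println(s"quoted formula:   storage pool = ${afterCurrent.storagePoolSize}")
    println(s"proposed formula: storage pool = ${afterProposed.storagePoolSize}")
  }
}
```

With these made-up numbers the quoted formula allows 900 units to be reclaimed, which leaves the storage pool at 100, below the 1000-unit region. The proposed formula allows 0, so the pool stays at 1000.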
> Experimental proof:
> I added some log information to the UnifiedMemoryManager file as follows:
> logInfo("storageMemoryPool.memoryFree %f".format(storageMemoryPool.memoryFree/1024.0/1024.0))
> logInfo("onHeapExecutionMemoryPool.memoryFree %f".format(onHeapExecutionMemoryPool.memoryFree/1024.0/1024.0))
> logInfo("storageMemoryPool.memoryUsed %f".format(storageMemoryPool.memoryUsed/1024.0/1024.0))
> logInfo("onHeapExecutionMemoryPool.memoryUsed %f".format(onHeapExecutionMemoryPool.memoryUsed/1024.0/1024.0))
> logInfo("storageMemoryPool.poolSize %f".format(storageMemoryPool.poolSize/1024.0/1024.0))
> logInfo("onHeapExecutionMemoryPool.poolSize %f".format(onHeapExecutionMemoryPool.poolSize/1024.0/1024.0))
> I then ran the PageRank example. The 676 MB input file was generated with BigDataBench, a benchmark suite from the Chinese Academy of Sciences for evaluating big data analysis systems. The job was submitted as follows:
> ./bin/spark-submit --class org.apache.spark.examples.SparkPageRank \
>     --master yarn \
>     --deploy-mode cluster \
>     --num-executors 1 \
>     --driver-memory 4g \
>     --executor-memory 7g \
>     --executor-cores 6 \
>     --queue thequeue \
>     ./examples/target/scala-2.10/spark-examples-1.6.2-hadoop2.2.0.jar \
>      /test/Google_genGraph_23.txt 6
> The configuration is as follows:
> spark.memory.useLegacyMode=false
> spark.memory.fraction=0.75
> spark.memory.storageFraction=0.2
> Log information is as follows:
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: storageMemoryPool.memoryFree 0.000000
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: onHeapExecutionMemoryPool.memoryFree 5663.325877
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: storageMemoryPool.memoryUsed 0.299123 M
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: onHeapExecutionMemoryPool.memoryUsed 0.000000
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: storageMemoryPool.poolSize 0.299123
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: onHeapExecutionMemoryPool.poolSize 5663.325877
> According to the configuration, storageMemoryPool.poolSize should be at least 1 GB, but the log shows only 0.299123 MB, so there is an error.
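
The "at least 1 GB" figure can be checked from the logged values alone. The sketch below assumes that the unified memory region equals the sum of the two logged pool sizes and that the storage region is that sum multiplied by spark.memory.storageFraction. The object and value names are hypothetical; the numbers are copied from the log and configuration in the report.

```scala
// Hypothetical helper; values are copied from the report's log and config.
object StorageRegionCheck {
  val storagePoolSizeMb = 0.299123      // storageMemoryPool.poolSize
  val executionPoolSizeMb = 5663.325877 // onHeapExecutionMemoryPool.poolSize
  val storageFraction = 0.2             // spark.memory.storageFraction

  // Assumption: the two pools together make up the whole unified region.
  val unifiedMemoryMb: Double = storagePoolSizeMb + executionPoolSizeMb

  // Storage share that the reporter expects to be guaranteed.
  val expectedStorageRegionMb: Double = unifiedMemoryMb * storageFraction

  def main(args: Array[String]): Unit = {
    println(f"unified memory:          $unifiedMemoryMb%.3f MB")
    println(f"expected storage region: $expectedStorageRegionMb%.3f MB")
    println(f"logged storage pool:     $storagePoolSizeMb%.6f MB")
  }
}
```

Under these assumptions the unified region is 5663.625 MB and the expected storage region is about 1132.7 MB, which is the value the reporter compares against the logged 0.299123.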



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

