Mailing-List: contact issues-help@spark.apache.org; run by ezmlm
Precedence: bulk
Date: Thu, 6 Apr 2017 07:34:41 +0000 (UTC)
From: "Apache Spark (JIRA)" <jira@apache.org>
To: issues@spark.apache.org
Message-ID: <JIRA.13062011.1491461185000.221241.1491464081740@Atlassian.JIRA>
In-Reply-To: <JIRA.13062011.1491461185000@Atlassian.JIRA>
References: <JIRA.13062011.1491461185000@Atlassian.JIRA> <JIRA.13062011.1491461185256@jira-lw-us.apache.org>
Subject: [jira] [Commented] (SPARK-20237) Spark-1.6 current and later
 versions of memory management issues
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
archived-at: Thu, 06 Apr 2017 07:34:46 -0000


    [ https://issues.apache.org/jira/browse/SPARK-20237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958465#comment-15958465 ]

Apache Spark commented on SPARK-20237:
--------------------------------------

User 'zhangwei72' has created a pull request for this issue:
https://github.com/apache/spark/pull/17547

> Spark-1.6 current and later versions of memory management issues
> ----------------------------------------------------------------
>
>                 Key: SPARK-20237
>                 URL: https://issues.apache.org/jira/browse/SPARK-20237
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.6.0, 1.6.1, 1.6.2, 1.6.3, 2.0.0, 2.0.1, 2.0.2, 2.1.0
>         Environment: java 1.7.0  scala-2.10.5   maven-3.3.9    hadoop-2.2.0  spark-1.6.2
>            Reporter: zhangwei72
>            Priority: Critical
>              Labels: security
>             Fix For: 1.6.2
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> In Spark 1.6 and later versions, there is a problem in the memory management class UnifiedMemoryManager.
> The spark.memory.storageFraction setting should guarantee a minimum amount of storage memory.
> In UnifiedMemoryManager, the amount of memory that execution can borrow from storage is calculated as val memoryReclaimableFromStorage = math.max(storageMemoryPool.memoryFree, storageMemoryPool.poolSize - storageRegionSize).
> When storageMemoryPool.memoryFree > storageMemoryPool.poolSize - storageRegionSize, the first argument is chosen, that is, the storage pool can shrink by as much as storageMemoryPool.memoryFree.
> Because storageMemoryPool.memoryFree > storageMemoryPool.poolSize - storageRegionSize, it follows that storageMemoryPool.poolSize - storageMemoryPool.memoryFree < storageRegionSize.
> After the borrow, storageMemoryPool.poolSize < storageRegionSize. Since storageRegionSize is the minimum storage share defined by the framework, this is a problem.
> To solve this problem, we define the value as val memoryReclaimableFromStorage = storageMemoryPool.poolSize - storageRegionSize.
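
The argument above can be sketched in a few lines of Scala. This is only an illustration under stated assumptions, not the Spark source: `Pool`, `ReclaimSketch`, and `poolSizeAfterBorrow` are hypothetical names, and the sizes are made-up values in MB.

```scala
// Hypothetical, simplified model of the storage pool; sizes are in MB.
// This is not the Spark class, only an illustration of the arithmetic.
final case class Pool(poolSize: Long, memoryUsed: Long) {
  def memoryFree: Long = poolSize - memoryUsed
}

object ReclaimSketch {
  // Rule quoted in the report: execution may take the larger of the two amounts.
  def reclaimableCurrent(storage: Pool, storageRegionSize: Long): Long =
    math.max(storage.memoryFree, storage.poolSize - storageRegionSize)

  // Rule proposed in the report: only the part of the pool above the storage region.
  def reclaimableProposed(storage: Pool, storageRegionSize: Long): Long =
    storage.poolSize - storageRegionSize

  // Storage pool size left after execution borrows up to `wanted` MB.
  // The borrowed amount is clamped at zero so the pool never grows here.
  def poolSizeAfterBorrow(storage: Pool, reclaimable: Long, wanted: Long): Long =
    storage.poolSize - math.max(0L, math.min(wanted, reclaimable))
}
```

With a 1000 MB pool, 100 MB used, and a 1000 MB storage region, the quoted rule allows 900 MB to be borrowed and leaves a 100 MB pool, while the proposed rule allows 0 MB and leaves the pool at 1000 MB.
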
> Experimental proof:
> I added some log information to the UnifiedMemoryManager file as follows:
> logInfo("storageMemoryPool.memoryFree %f".format(storageMemoryPool.memoryFree/1024.0/1024.0))
> logInfo("onHeapExecutionMemoryPool.memoryFree %f".format(onHeapExecutionMemoryPool.memoryFree/1024.0/1024.0))
> logInfo("storageMemoryPool.memoryUsed %f".format(storageMemoryPool.memoryUsed/1024.0/1024.0))
> logInfo("onHeapExecutionMemoryPool.memoryUsed %f".format(onHeapExecutionMemoryPool.memoryUsed/1024.0/1024.0))
> logInfo("storageMemoryPool.poolSize %f".format(storageMemoryPool.poolSize/1024.0/1024.0))
> logInfo("onHeapExecutionMemoryPool.poolSize %f".format(onHeapExecutionMemoryPool.poolSize/1024.0/1024.0))
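
The six statements above repeat the same bytes-to-MB conversion. A small helper could hold that conversion in one place; this is a sketch only, and `PoolLogSketch` and `formatMb` are hypothetical names that do not exist in Spark.

```scala
import java.util.Locale

object PoolLogSketch {
  private val BytesPerMb = 1024.0 * 1024.0

  // Formats a byte count as "<name> <size in MB>" with six decimals,
  // matching the "%f" pattern used in the log statements above.
  // Locale.ROOT keeps the decimal separator a dot on every machine.
  def formatMb(name: String, bytes: Long): String =
    "%s %f".formatLocal(Locale.ROOT, name, bytes / BytesPerMb)
}
```

Each call site would then reduce to something like `logInfo(PoolLogSketch.formatMb("storageMemoryPool.poolSize", storageMemoryPool.poolSize))`.
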
> I ran the PageRank example. The input file was generated with BigDataBench (from the Chinese Academy of Sciences), a benchmark suite for evaluating big data analysis systems, and is 676M in size. The job was submitted as follows:
> ./bin/spark-submit --class org.apache.spark.examples.SparkPageRank \
>     --master yarn \
>     --deploy-mode cluster \
>     --num-executors 1 \
>     --driver-memory 4g \
>     --executor-memory 7g \
>     --executor-cores 6 \
>     --queue thequeue \
>     ./examples/target/scala-2.10/spark-examples-1.6.2-hadoop2.2.0.jar \
>      /test/Google_genGraph_23.txt 6
> The configuration is as follows:
> spark.memory.useLegacyMode=false
> spark.memory.fraction=0.75
> spark.memory.storageFraction=0.2
> Log information is as follows:
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: storageMemoryPool.memoryFree 0.000000
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: onHeapExecutionMemoryPool.memoryFree 5663.325877
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: storageMemoryPool.memoryUsed 0.299123 M
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: onHeapExecutionMemoryPool.memoryUsed 0.000000
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: storageMemoryPool.poolSize 0.299123
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: onHeapExecutionMemoryPool.poolSize 5663.325877
> According to the configuration, storageMemoryPool.poolSize should be at least 1G, but the log shows only 0.299123 M, so there is an error.
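
The "at least 1G" figure can be checked roughly from the logged values. The sketch below rests on two assumptions that are not stated in the report: that the two logged pool sizes together make up the whole unified memory region, and that the storage region equals that total multiplied by spark.memory.storageFraction. `StorageRegionCheck` is a hypothetical name.

```scala
object StorageRegionCheck {
  // Values copied from the log above, in MB.
  val storagePoolSizeMb: Double = 0.299123
  val executionPoolSizeMb: Double = 5663.325877

  // spark.memory.storageFraction from the configuration above.
  val storageFraction: Double = 0.2

  // Assumption: the two pools together make up the whole unified region.
  val totalUnifiedMb: Double = storagePoolSizeMb + executionPoolSizeMb

  // Assumption: storage region = total unified memory * storageFraction.
  val expectedStorageRegionMb: Double = totalUnifiedMb * storageFraction

  def main(args: Array[String]): Unit =
    println(f"expected storage region: $expectedStorageRegionMb%.3f MB, " +
      f"logged storage pool size: $storagePoolSizeMb%.6f MB")
}
```

Under those assumptions the total is 5663.625 MB and the storage region is about 1132.7 MB, which is consistent with the reporter's "at least 1G" and far above the logged 0.299123 MB.
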

