spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-6157) Unrolling with MEMORY_AND_DISK should always release memory
Date Wed, 09 Mar 2016 21:43:40 GMT

    [ https://issues.apache.org/jira/browse/SPARK-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188102#comment-15188102
] 

Apache Spark commented on SPARK-6157:
-------------------------------------

User 'JoshRosen' has created a pull request for this issue:
https://github.com/apache/spark/pull/11613

> Unrolling with MEMORY_AND_DISK should always release memory
> -----------------------------------------------------------
>
>                 Key: SPARK-6157
>                 URL: https://issues.apache.org/jira/browse/SPARK-6157
>             Project: Spark
>          Issue Type: Bug
>          Components: Block Manager
>    Affects Versions: 1.2.1
>            Reporter: SuYan
>
> === EDIT by andrewor14 ===
> The existing description was somewhat confusing, so here's a more succinct version of it.
> If unrolling a block with MEMORY_AND_DISK was unsuccessful, we will drop the block to disk
> directly. After doing so, however, we don't need the underlying array that held the partial
> values anymore, so we should release the pending unroll memory for other tasks on the same
> executor. Otherwise, other tasks may unnecessarily drop their blocks to disk due to the lack
> of unroll space, resulting in worse performance.
> === Original comment ===
> Current behavior when caching a MEMORY_AND_DISK level block:
> 1. We try to put the block in memory, and unrolling fails. The unroll memory stays reserved,
> because we obtained an iterator over the partially unrolled array.
> 2. The block is then written to disk.
> 3. Reading the value via get(blockId) yields an iterator that no longer touches the unroll
> array. The reserved unroll memory should be released at this point, but instead it is held
> until the task ends.
> Also, somebody has already opened a pull request so that when a MEMORY_AND_DISK level block
> is cached back into memory from disk, we use file.length to check whether it fits in the
> memory store, instead of just allocating a buffer of file.length bytes, which may lead to OOM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

