spark-issues mailing list archives

From "Sean Owen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-4808) Spark fails to spill with small number of large objects
Date Sun, 22 Feb 2015 08:58:11 GMT

    [ https://issues.apache.org/jira/browse/SPARK-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14332079#comment-14332079 ]

Sean Owen commented on SPARK-4808:
----------------------------------

[~andrewor14] Is this resolved for 1.2.2 / 1.3.0?
https://github.com/apache/spark/pull/4420#issuecomment-75178396

> Spark fails to spill with small number of large objects
> -------------------------------------------------------
>
>                 Key: SPARK-4808
>                 URL: https://issues.apache.org/jira/browse/SPARK-4808
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.0.2, 1.1.0, 1.2.0, 1.2.1
>            Reporter: Dennis Lawler
>
> Spillable's maybeSpill does not allow a spill to occur until at least 1000 elements
> have been read, and thereafter only evaluates spilling on every 32nd element. When a
> small number of very large objects is being tracked, out-of-memory conditions may
> occur before the first spill is ever considered.
> I suspect that the 1000-element threshold and the every-32nd-element behavior were
> meant to reduce the impact of the estimateSize() call. That logic has since been
> extracted into SizeTracker, which implements its own exponential backoff for size
> estimation, so the gating in maybeSpill now merely avoids using an already-computed
> size estimate.
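
For illustration, a minimal self-contained sketch of the gating described above
(not the actual Spark source; the object name, the 5 MB initial threshold, and the
100 MB object sizes below are assumptions made for the example):

    object SpillGateSketch {
      // Constants per the description above.
      val trackMemoryThreshold = 1000              // elements read before spilling is considered
      val myMemoryThreshold    = 5L * 1024 * 1024  // assumed initial memory threshold (5 MB)

      private var elementsRead = 0L

      // No spill until more than 1000 elements have been read, and then
      // spilling is only evaluated on every 32nd element.
      def maybeSpill(currentMemory: Long): Boolean = {
        elementsRead += 1
        elementsRead > trackMemoryThreshold &&
          elementsRead % 32 == 0 &&
          currentMemory >= myMemoryThreshold
      }

      def main(args: Array[String]): Unit = {
        // Ten 100 MB objects: estimated memory far exceeds the threshold,
        // but the element-count gate never opens, so no spill is triggered.
        val spills = (1 to 10).map(_ => maybeSpill(currentMemory = 100L * 1024 * 1024))
        println(spills.contains(true)) // prints "false" -> OOM risk before the first spill
      }
    }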



