spark-issues mailing list archives

From "Yuri Makhno (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-8147) Add ability to decorate RDD iterators
Date Sun, 07 Jun 2015 13:11:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-8147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14576230#comment-14576230 ]

Yuri Makhno commented on SPARK-8147:
------------------------------------

[~srowen] we want something similar to the following logic applied to every iterator:
{code}
iterator.map { x =>
  // Fail fast instead of letting the executor run out of memory
  if (!jvmHasEnoughFreeMemory()) {
    throw new NotEnoughExecutorMemoryException()
  }
  x
}
{code}

I can wrap iterators in my own RDD implementations, but when I use Spark SQL it creates RDD
chains for queries behind the scenes, and it's impossible to handle situations where:
 * there is enough memory on the executor to load data through our custom RDD implementations
 * there is not enough memory to do a join (or another SQL query)
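
As a rough illustration of the check described above, here is a minimal sketch in plain Scala with no Spark dependency. Everything in it is an assumption made for illustration: the `MemoryGuard` object, the 64 MB threshold, the way free memory is estimated, and the body of `NotEnoughExecutorMemoryException` (only the exception and predicate names come from the snippet above).

```scala
// Hypothetical exception type; only its name comes from the snippet above.
class NotEnoughExecutorMemoryException(msg: String) extends RuntimeException(msg)

// Hypothetical helper object; not part of Spark.
object MemoryGuard {
  // Assumed threshold, chosen only for illustration.
  val MinFreeBytes: Long = 64L * 1024 * 1024

  // Rough estimate of how much heap the JVM could still hand out:
  // the maximum heap minus what is currently in use.
  def freeBytes(): Long = {
    val rt = Runtime.getRuntime
    rt.maxMemory() - (rt.totalMemory() - rt.freeMemory())
  }

  def jvmHasEnoughFreeMemory(minFree: Long = MinFreeBytes): Boolean =
    freeBytes() >= minFree

  // Wraps an iterator so the memory check runs lazily, once per element,
  // as each element is pulled. The check is a parameter so it can be
  // replaced, for example in tests.
  def guard[T](
      it: Iterator[T],
      hasMemory: () => Boolean = () => jvmHasEnoughFreeMemory()
  ): Iterator[T] =
    it.map { x =>
      if (!hasMemory()) {
        throw new NotEnoughExecutorMemoryException(
          s"Only ${freeBytes()} bytes of heap left on this JVM")
      }
      x
    }
}
```

On an RDD we create ourselves this could be applied with something like `rdd.mapPartitions(it => MemoryGuard.guard(it))`, but that covers only our own RDDs and not the ones Spark SQL builds internally, which is the gap described above.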


> Add ability to decorate RDD iterators
> -------------------------------------
>
>                 Key: SPARK-8147
>                 URL: https://issues.apache.org/jira/browse/SPARK-8147
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 1.3.1
>            Reporter: Yuri Makhno
>
> In Spark, all computations are done through iterators created by the RDD.iterator
method. It would be good if we could specify an RDDIteratorDecoratorFactory in SparkConf and
decorate all RDD iterators created in the executor JVM.
> For us this would be extremely useful because we want to control the executor's memory:
instead of hitting an OutOfMemory error on the executor, we want to fail the job with a NotEnoughMemory
reason when we see that there is no memory left to continue. We also want to collect some computation
statistics on the executor.
> I can provide a PR if this improvement is approved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


