spark-user mailing list archives

From Enrico Rotundo <enrico.rotu...@gmail.com>
Subject Re: Saving intermediate results in mapPartitions
Date Sat, 19 Mar 2016 00:03:40 GMT
Try setting MEMORY_AND_DISK as the RDD’s storage persistence level; partitions that don’t fit in memory are then spilled to disk instead of being recomputed or causing out-of-memory errors.
http://spark.apache.org/docs/latest/programming-guide.html#rdd-persistence
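
A minimal sketch of what that could look like in Scala; the input path, RDD names, and per-record transform are placeholders, not from this thread:

import org.apache.spark.storage.StorageLevel

// Assumes an existing SparkContext `sc`; the path and transform are illustrative only.
val rawRdd = sc.textFile("hdfs:///path/to/input")

// MEMORY_AND_DISK keeps partitions in memory when they fit and spills the
// rest to local disk rather than recomputing them or failing with an OOM.
val persisted = rawRdd.persist(StorageLevel.MEMORY_AND_DISK)

val lengths = persisted.mapPartitions { iter =>
  iter.map(_.length)   // stand-in for the real per-partition work
}

println(lengths.count())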
> On 19 Mar 2016, at 00:55, Krishna <research800@gmail.com> wrote:
> 
> Hi,
> 
> I have a situation where the elements output by each partition in mapPartitions don't fit
> into RAM, even with the smallest possible number of rows per partition (there is a hard lower
> limit on that value). What's the best way to address this? During the mapPartitions phase,
> is there a way to convert intermediate results to a DataFrame and save them to a database?
> Rows saved to the database don't need to be part of the output of mapPartitions.
> 
> 
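
For the database side of the question, a common pattern is to open one connection per partition and write rows as they are produced, so they never need to be part of the mapPartitions output. A rough sketch only, with the JDBC URL, credentials, table, and data all as placeholders; foreachPartition is used here since the written rows are not needed downstream:

import java.sql.DriverManager

// Placeholder data and connection details, assuming an existing SparkContext `sc`.
val rows = sc.parallelize(1 to 1000)
val jdbcUrl = "jdbc:postgresql://dbhost:5432/mydb"

rows.foreachPartition { iter =>
  // One connection per partition, created on the executor side.
  val conn = DriverManager.getConnection(jdbcUrl, "user", "password")
  val stmt = conn.prepareStatement("INSERT INTO intermediate_results (value) VALUES (?)")
  try {
    iter.foreach { n =>
      stmt.setInt(1, n)   // stand-in for the real intermediate row
      stmt.addBatch()
    }
    stmt.executeBatch()
  } finally {
    stmt.close()
    conn.close()
  }
}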

