spark-user mailing list archives

From Andrew Ehrlich <and...@aehrlich.com>
Subject Re: send transformed RDD to s3 from slaves
Date Sat, 14 Nov 2015 17:24:16 GMT
Maybe you want to use rdd.saveAsTextFile()?
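
Something like this (a rough sketch — the s3n:// paths and the map step are placeholders for your actual bucket and transformation chain, and it assumes S3 credentials are already in the Hadoop configuration):

    // sketch: build the transformed RDD lazily, then write each partition
    // straight from the executors to S3; nothing is collected to the driver
    val transformed = sc.textFile("s3n://my-bucket/input/")   // placeholder input path
      .map(line => line.toUpperCase)                          // placeholder for your transformation chain

    // saveAsTextFile writes one part-NNNNN file per partition from the executors
    transformed.saveAsTextFile("s3n://my-bucket/output/")     // placeholder output path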

> On Nov 13, 2015, at 4:56 PM, Walrus theCat <walrusthecat@gmail.com> wrote:
> 
> Hi,
> 
> I have an RDD which crashes the driver when being collected. I want to send the data
> on its partitions out to S3 without bringing it back to the driver. I try calling
> rdd.foreachPartition, but the data that gets sent has not gone through the chain of
> transformations that I need. It's the data as it was ingested initially. After
> specifying my chain of transformations, but before calling foreachPartition, I call
> rdd.count in order to force the RDD to transform. The data it sends out is still not
> transformed. How do I get the RDD to send out transformed data when calling
> foreachPartition?
> 
> Thanks



