spark-user mailing list archives

From Kan Zhang <kzh...@apache.org>
Subject Re: hdfs replication on saving RDD
Date Wed, 16 Jul 2014 01:00:48 GMT
Andrew, there are overloaded versions of saveAsHadoopFile and
saveAsNewAPIHadoopFile that allow you to pass in a per-job Hadoop conf.
saveAsTextFile is just a convenience wrapper on top of saveAsHadoopFile.
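
For example, here is a minimal sketch against the Scala RDD API. It assumes
a running SparkContext named sc; the output path, the sample data, and the
replication factor of 2 are made up for illustration (dfs.replication is the
standard HDFS property):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{NullWritable, Text}
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat
import org.apache.spark.SparkContext._  // pair-RDD save methods on Spark 1.x

// Per-job Hadoop conf: ask HDFS for 2 replicas instead of the cluster default.
val jobConf = new Configuration(sc.hadoopConfiguration)
jobConf.set("dfs.replication", "2")

// saveAsNewAPIHadoopFile takes the Configuration as its last argument,
// so the replication setting applies to this save only.
val pairs = sc.parallelize(Seq("a", "b", "c"))
  .map(line => (NullWritable.get(), new Text(line)))
pairs.saveAsNewAPIHadoopFile(
  "hdfs:///tmp/example-output",
  classOf[NullWritable],
  classOf[Text],
  classOf[TextOutputFormat[NullWritable, Text]],
  jobConf)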


On Mon, Jul 14, 2014 at 11:22 PM, Andrew Ash <andrew@andrewash.com> wrote:

> In general it would be nice to be able to configure replication on a
> per-job basis.  Is there a way to do that without changing the config
> values in the Hadoop conf/ directory between jobs?  Maybe by modifying
> OutputFormats or the JobConf?
>
>
> On Mon, Jul 14, 2014 at 11:12 PM, Matei Zaharia <matei.zaharia@gmail.com>
> wrote:
>
>> You can change this setting through SparkContext.hadoopConfiguration, or
>> put the conf/ directory of your Hadoop installation on the CLASSPATH when
>> you launch your app so that it reads the config values from there.
>>
>> Matei
>>
>> On Jul 14, 2014, at 8:06 PM, valgrind_girl <124411065@qq.com> wrote:
>>
>> > Eager to know about this issue too. Does anyone know how?
>> >
>> >
>> >
>> > --
>> > View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/hdfs-replication-on-saving-RDD-tp289p9700.html
>> > Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>>
>
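
A minimal sketch of Matei's SparkContext.hadoopConfiguration suggestion
above, again assuming a running SparkContext named sc (the path and
replication factor are made up; note this changes the default for every
subsequent Hadoop write in the application, not just a single job):

// Set the HDFS replication factor for everything this SparkContext writes.
sc.hadoopConfiguration.set("dfs.replication", "2")

// Later saves pick up the setting; saveAsTextFile goes through saveAsHadoopFile.
sc.parallelize(1 to 100).saveAsTextFile("hdfs:///tmp/replicated-output")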
