spark-user mailing list archives

From Richard Eggert <richard.egg...@gmail.com>
Subject Re: Pass spark partition explicitly ?
Date Sun, 18 Oct 2015 18:05:06 GMT
If you want to override the default partitioning behavior, you have to do
so in your code, at the point where you create each RDD. Different RDDs
usually have different numbers of partitions (except when one RDD is
directly derived from another without shuffling) because they usually have
different sizes, so it wouldn't make sense to have some sort of "global"
notion of how many partitions to create. You could, of course, pass
partition counts in as command-line options to your application and use
those values in the code that creates the RDDs.
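
A minimal sketch of that last suggestion, assuming PySpark. The flag names
(--input, --input-partitions, --shuffle-partitions), their defaults, and the
word-count job are hypothetical and only there for illustration; the Spark
calls used (textFile with minPartitions, reduceByKey with numPartitions,
getNumPartitions) are the standard RDD API.

```python
import argparse
import sys


def parse_args(argv):
    """Parse hypothetical per-RDD partition counts from the command line."""
    parser = argparse.ArgumentParser(
        description="Per-RDD partition counts (illustrative only)")
    parser.add_argument("--input", required=True,
                        help="path to the input text file")
    parser.add_argument("--input-partitions", type=int, default=8,
                        help="minimum number of partitions for the input RDD")
    parser.add_argument("--shuffle-partitions", type=int, default=4,
                        help="number of partitions for the shuffled RDD")
    return parser.parse_args(argv)


def main(argv):
    # Imported here so the argument parsing can be tried without Spark installed.
    from pyspark import SparkContext

    args = parse_args(argv)
    sc = SparkContext(appName="explicit-partitions-sketch")
    try:
        # Partition count for the RDD that is read from storage.
        lines = sc.textFile(args.input, minPartitions=args.input_partitions)

        # Partition count for the RDD produced by the shuffle.
        counts = (lines.flatMap(lambda line: line.split())
                       .map(lambda word: (word, 1))
                       .reduceByKey(lambda a, b: a + b,
                                    numPartitions=args.shuffle_partitions))

        print("input partitions:", lines.getNumPartitions())
        print("shuffled partitions:", counts.getNumPartitions())
    finally:
        sc.stop()


# Only run the job when arguments were actually supplied.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1:])
```

Anything placed after the application file on the spark-submit command line
is handed to the application as its own arguments, so the counts can be
changed per run without touching the code. Note that minPartitions is a
lower bound, so the input RDD may end up with more partitions than requested.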

Rich
On Oct 18, 2015 1:57 PM, "Kali.tummala@gmail.com" <Kali.tummala@gmail.com>
wrote:

> Hi All,
>
> Can I pass the number of partitions to all the RDDs explicitly while
> submitting the Spark job, or do I need to specify it in my Spark code itself?
>
> Thanks
> Sri
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Pass-spark-partition-explicitly-tp25113.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> For additional commands, e-mail: user-help@spark.apache.org
>
>
