spark-user mailing list archives

From Jags Ramnarayanan <jramnara...@pivotal.io>
Subject Re: Broadcast table
Date Mon, 26 Oct 2015 20:43:21 GMT
If you are using Spark SQL and joining two DataFrames, the optimizer will
automatically broadcast the smaller table when it is below a size threshold
(you can raise it via spark.sql.autoBroadcastJoinThreshold if the default is
too small).
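
As a rough sketch of the configuration mentioned above, assuming a
SparkSession named `spark` (Spark 2.0 or later; older releases expose the same
property through SQLContext) and a hypothetical helper name of my own:

```python
MB = 1024 * 1024


def set_auto_broadcast_threshold(spark, megabytes):
    """Set the size limit (in MB) below which Spark SQL broadcasts a join side.

    `spark` is assumed to be a SparkSession. A negative value is passed
    through as -1, which disables automatic broadcast joins.
    """
    threshold_bytes = -1 if megabytes < 0 else int(megabytes * MB)
    # The property is expressed in bytes and is set as a string.
    spark.conf.set("spark.sql.autoBroadcastJoinThreshold", str(threshold_bytes))
    return threshold_bytes
```

From a SQL session (for example through the thrift server) the same property
can be set with a statement such as
SET spark.sql.autoBroadcastJoinThreshold=52428800; the value 52428800 (50 MB)
is only an illustration.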

Otherwise, in code, you can collect any RDD to the driver and broadcast it
with the SparkContext.broadcast method.
http://ampcamp.berkeley.edu/wp-content/uploads/2012/06/matei-zaharia-amp-camp-2012-advanced-spark.pdf
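
A minimal sketch of the collect-and-broadcast approach in PySpark, assuming
`sc` is a SparkContext and both RDDs hold (key, value) pairs; the function
names are hypothetical, and the small RDD must fit in driver memory:

```python
def broadcast_lookup(sc, small_rdd):
    """Collect a small (key, value) RDD to the driver and broadcast it as a dict."""
    return sc.broadcast(small_rdd.collectAsMap())


def lookup_join(large_rdd, lookup):
    """Map-side inner join of a large (key, value) RDD against a broadcast dict."""
    def attach(pair):
        key, value = pair
        table = lookup.value  # read-only copy available on each executor
        # Emit one joined record for a matching key, nothing otherwise.
        return [(key, (value, table[key]))] if key in table else []
    return large_rdd.flatMap(attach)
```

With a real context this would be used as
lookup_join(big, broadcast_lookup(sc, small)).collect(), where `big` and
`small` are RDDs created elsewhere (for instance with sc.parallelize). The
join then runs without shuffling the large RDD.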

-- Jags
(www.snappydata.io)


On Mon, Oct 26, 2015 at 11:17 AM, Younes Naguib <
Younes.Naguib@tritondigital.com> wrote:

> Hi all,
>
>
>
> I use the thrift server, and I cache a table using “cache table mytab”.
>
> Is there any sql to broadcast it too?
>
>
>
> *Thanks*
>
> *Younes Naguib*
>
> Triton Digital | 1440 Ste-Catherine W., Suite 1200 | Montreal, QC  H3G 1R8
>
> Tel.: +1 514 448 4037 x2688 | Tel.: +1 866 448 4037 x2688 |
> younes.naguib@tritondigital.com <younes.naguib@streamtheworld.com>
>
>
>
