cassandra-user mailing list archives

From Carl Yeksigian <c...@yeksigian.com>
Subject Re: RDD partitions per executor in Cassandra Spark Connector
Date Tue, 03 Mar 2015 14:50:19 GMT
These questions would be better addressed to the Spark Cassandra Connector
mailing list, which can be found here:
https://github.com/datastax/spark-cassandra-connector/#community

Thanks,
Carl

On Tue, Mar 3, 2015 at 4:42 AM, Pavel Velikhov <pavel.velikhov@gmail.com>
wrote:

> Hi, is there a paper or a document where one can read how Spark reads
> Cassandra data in parallel? And how it writes data back from RDDs? It's a
> bit hard to have a clear picture in mind.
>
> Thank you,
> Pavel Velikhov
>
> On Mar 3, 2015, at 1:08 AM, Rumph, Frens Jan <mail@frensjan.nl> wrote:
>
> Hi all,
>
> I didn't find the *issues* button on
> https://github.com/datastax/spark-cassandra-connector/ so posting here.
>
> Does anyone have an idea why token ranges are grouped into one partition per
> executor? I expected at least one per core. Any suggestions on how to work
> around this? Doing a repartition is way too expensive, as I just want more
> partitions for parallelism, not a reshuffle ...
>
> Thanks in advance!
> Frens Jan
>
>
>
