cassandra-user mailing list archives

From Nitan Kainth <nitankai...@gmail.com>
Subject Re: Partition key with 300K rows can it be queried and distributed using Spark
Date Thu, 17 Jan 2019 20:29:22 GMT
Not sure about Spark's data distribution, but yes, Spark can be used to retrieve such data
from Cassandra.
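
A minimal sketch of what that read could look like with the Spark Cassandra
Connector's DataFrame API; the keyspace, table, column, and host names below
are made up for illustration:

    # Minimal sketch: pull one Cassandra partition into Spark through the
    # Spark Cassandra Connector (assumes the connector package is on the
    # classpath, e.g. via --packages). All names here are hypothetical.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("read-large-partition")
        .config("spark.cassandra.connection.host", "127.0.0.1")
        .getOrCreate()
    )

    df = (
        spark.read
        .format("org.apache.spark.sql.cassandra")
        .options(keyspace="my_ks", table="my_table")
        .load()
        # The filter on the partition key is pushed down to Cassandra as a
        # CQL WHERE clause, so only that one partition is read.
        .filter("partition_key = 'some_value'")
    )

    df.count()  # forces the read; the 300k rows now live in Spark

One caveat: the connector splits its work by token range, and a single
partition key maps to a single token, so a one-partition read is served by a
single Spark task rather than spread across executors. It still tends to beat
a plain client read, because the connector pages through the partition
internally instead of asking for everything in one request.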


Regards,
Nitan
Cell: 510 449 9629

> On Jan 17, 2019, at 2:15 PM, Goutham reddy <goutham.chirutha@gmail.com> wrote:
> 
> Hi,
> A partition key can hold up to 2 billion rows, but even so it is an anti-pattern
> to keep such a huge data set under one partition key. In our case it is only 300k
> rows, yet when we try to query one particular key we get a timeout exception. If I
> use Spark to fetch the 300k rows for that key, does it solve the timeout problem
> and distribute the data across the Spark nodes, or will it still throw timeout
> exceptions? Can you please help me with the best practice for retrieving the data
> for a key with 300k rows. Any help is highly appreciated.
> 
> Regards
> Goutham.
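
If Spark ends up being overkill here, the timeout on a single large partition
can often be avoided by paging through it with the driver instead of fetching
all 300k rows at once. A rough sketch with the Python driver; keyspace, table,
and column names are again hypothetical:

    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    # Rough sketch: stream one large partition page by page.
    cluster = Cluster(["127.0.0.1"])
    session = cluster.connect("my_ks")

    # fetch_size caps the rows returned per page, so no single request
    # has to materialize all 300k rows before the read timeout fires.
    query = SimpleStatement(
        "SELECT * FROM my_table WHERE partition_key = %s",
        fetch_size=1000,
    )

    # The result set pages transparently as the loop advances.
    for row in session.execute(query, ["some_value"]):
        print(row)  # stand-in for real per-row processing

    cluster.shutdown()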
