cassandra-user mailing list archives

From Goutham reddy <>
Subject Partition key with 300K rows can it be queried and distributed using Spark
Date Thu, 17 Jan 2019 20:15:02 GMT
Although a Cassandra partition can hold up to 2 billion rows, it is an
anti-pattern to store such a huge data set under a single partition key. In
our case a partition holds only 300K rows, but when we query for one
particular key we get a timeout exception. If I use Spark to fetch the 300K
rows for a particular key, will that solve the timeout problem and
distribute the data across the Spark nodes, or will it still throw timeout
exceptions? Can you please help me with the best practice for retrieving
the data for a key with 300K rows? Any help is highly appreciated.
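
For what it's worth, a minimal sketch of reading one partition with the DataStax spark-cassandra-connector might look like the following. The keyspace, table, key column (`pk`), and key value are hypothetical placeholders, and the fetch-size setting shown is an assumption about tuning, not a verified fix:

```scala
import com.datastax.spark.connector._
import org.apache.spark.{SparkConf, SparkContext}

// Illustrative names: replace keyspace/table/column/key with your own.
val conf = new SparkConf()
  .setAppName("read-one-partition")
  .set("spark.cassandra.connection.host", "127.0.0.1")
  // Page through the partition in smaller chunks instead of one huge read.
  .set("spark.cassandra.input.fetch.sizeInRows", "1000")

val sc = new SparkContext(conf)

// .where() pushes the predicate down to Cassandra; the connector then
// reads the partition via paged CQL queries rather than a single request
// that must complete within the coordinator's read timeout.
val rows = sc.cassandraTable("my_keyspace", "my_table")
  .where("pk = ?", "some_key")

println(rows.count())
```

One caveat: a single partition lives on one replica set and maps to one token range, so Spark will not split it across many tasks; the relief comes mainly from paging. The same paging is available in the plain drivers (fetch size / automatic paging), and the longer-term fix is usually to bucket the data with a compound partition key.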

