cassandra-user mailing list archives

From DuyHai Doan <doanduy...@gmail.com>
Subject Re: estimated row count for a pk range
Date Sun, 20 Jul 2014 21:03:35 GMT
1) Use a separate counter to track the number of entries in each column
family, but you will have to maintain the count manually.
2) SELECT DISTINCT partitionKey FROM ....  This query is normally optimized
and much faster than a SELECT *. However, if you have a very large number
of distinct partitions, it can still be slow.
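
A rough sketch of both options, in case it helps. The counter table
family_row_counts and the column item_id are made-up names for illustration,
and I am assuming family is the first component of a composite partition key
(family, item_id); adjust to your real schema.

```cql
-- Option 1: a manually maintained counter per (shard, family).
-- family_row_counts is a hypothetical table, not part of the original schema.
CREATE TABLE family_row_counts (
    shard     text,
    family    text,
    row_count counter,
    PRIMARY KEY (shard, family)
);

-- Run alongside every insert into cf_shard1 (use "- 1" on delete).
UPDATE family_row_counts
   SET row_count = row_count + 1
 WHERE shard = 'cf_shard1' AND family = '1';

-- Cheap single-partition read of the current count.
SELECT row_count
  FROM family_row_counts
 WHERE shard = 'cf_shard1' AND family = '1';

-- Option 2: list the distinct partition keys only.
-- SELECT DISTINCT must name every partition key column;
-- item_id is an assumed column name.
SELECT DISTINCT family, item_id FROM cf_shard1;
```

Keep in mind that counter increments are not idempotent, so a retried write
can overcount and the value should be treated as an estimate. The DISTINCT
query returns one result per partition, not per row, so you would tally the
results for each family on the client side.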


On Sun, Jul 20, 2014 at 6:48 PM, tommaso barbugli <tbarbugli@gmail.com>
wrote:

> Hello,
> Lately I collapsed several (around 1k) column families into a smaller set
> (100) of column families.
> To keep the data separated I have added an extra column (family) which is
> part of the PK.
>
> While the previous approach always gave me a clear picture of every
> column family's size, now I have no option other than selecting all the
> rows and estimating the overall size used by one of the families grouped
> in these CFs.
>
> eg.
> SELECT * FROM cf_shard1 WHERE family = '1';
>
> Of course this does not work really well when cf_shard1 has some data in
> it; is there some way perhaps to get an estimated count for rows matching
> this query?
>
> Thanks,
> Tommaso
>
