You should read multiple "batches", specifying the last key received from the previous batch as the first key for the next one. For large databases I'd recommend a statistical approach (if it's feasible). With the random partitioner it works well, because keys are spread evenly over the token range.
Don't read the whole DB. Since you know the size of the whole key space, you can read only a part of it, compute the number of records per possible key (a density below 1), then multiply that by the key space size to get your total.
You can even implement an algorithm that runs until the required precision is reached (simply compare your previous and current estimates after each batch).
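A minimal sketch of that estimate-until-stable idea, in Python rather than against the Thrift API. Everything here is an assumption for illustration: `count_in_range` is a hypothetical stand-in for a real range query (it just searches an in-memory sorted list of tokens), the ring size assumes the random partitioner's token range of 0 to 2^127, and `slice_fraction`, `tolerance` and `min_slices` are made-up tuning knobs.

```python
from bisect import bisect_left

RING_SIZE = 2 ** 127  # assumed token space of the random partitioner


def count_in_range(sorted_tokens, lo, hi):
    """Hypothetical stand-in for a range query: rows with lo <= token < hi."""
    return bisect_left(sorted_tokens, hi) - bisect_left(sorted_tokens, lo)


def estimate_total(sorted_tokens, slice_fraction=0.001, tolerance=0.01, min_slices=5):
    """Scan the ring slice by slice and stop once the estimate stabilises.

    Returns (estimated_row_count, fraction_of_ring_read).
    """
    slice_width = max(1, int(RING_SIZE * slice_fraction))
    seen = 0         # rows counted so far
    covered = 0      # width of the token range scanned so far
    slices = 0
    previous = None  # estimate after the previous slice
    lo = 0
    while lo < RING_SIZE:
        hi = min(lo + slice_width, RING_SIZE)
        seen += count_in_range(sorted_tokens, lo, hi)
        covered += hi - lo
        slices += 1
        fraction = covered / RING_SIZE
        estimate = seen / fraction  # density scaled up to the whole ring
        # Stop when two consecutive estimates agree within the tolerance.
        # min_slices guards against stopping on a lucky early match.
        if (previous is not None and seen > 0 and slices >= min_slices
                and abs(estimate - previous) <= tolerance * previous):
            return estimate, fraction
        previous = estimate
        lo = hi
    return float(seen), 1.0  # whole ring scanned: the count is exact
```

Comparing only two consecutive estimates is a crude stopping rule. It matches the description above, but a stricter variant could require several stable estimates in a row.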
For me, reading ~1% of the DB is enough to get a good result.
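If you do need the exact count, the batch-by-batch reading from the first paragraph can be sketched roughly like this. Again, this is only an illustration in Python: `fetch_keys` is a hypothetical stand-in for the real range query, working on an in-memory sorted list, and it assumes the start key is inclusive, so each batch after the first repeats the last key of the previous one.

```python
from bisect import bisect_left


def fetch_keys(sorted_keys, start_key, limit):
    """Hypothetical stand-in for a range query: up to `limit` keys >= start_key."""
    i = bisect_left(sorted_keys, start_key)
    return sorted_keys[i:i + limit]


def count_keys(sorted_keys, batch_size=100):
    """Count all keys by paging; the last key of a batch starts the next one."""
    if batch_size < 2:
        # With an inclusive start key, a batch of 1 would never advance.
        raise ValueError("batch_size must be at least 2")
    total = 0
    start = ""  # empty start key means "from the beginning"
    first = True
    while True:
        batch = fetch_keys(sorted_keys, start, batch_size)
        # Skip the first key of every later batch: it was already counted.
        total += len(batch) if first else len(batch) - 1
        if len(batch) < batch_size:
            break  # a short batch means the end of the range was reached
        start = batch[-1]
        first = False
    return total
```

With a batch size of 100, counting 250 keys takes three queries this way: 100 keys, then 100 with one repeated, then 52 with one repeated.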
Best regards, Vitalii Tymchyshyn
2012/5/24 Prakrati Agrawal <Prakrati.Agrawal@mu-sigma.com>
I am trying to learn Cassandra and I have one doubt. I am using the Thrift API. To count the number of row keys, I am using KeyRange to specify the row keys. To count all of them, I specify the start and end as “new byte”. But the count
is set to 100 by default. How do I use this method to count the keys if I don’t know the actual number of keys in my Cassandra database? Please help me.