incubator-cassandra-user mailing list archives

From Stephen Connolly <stephen.alan.conno...@gmail.com>
Subject Re: newbie question: how do I know the total number of rows of a cf?
Date Mon, 28 Mar 2011 13:20:43 GMT
for #2 you could pipe through wc -l to get the answer

sort -n keys.txt | uniq | wc -l

but both examples are just refinements of iterate.

#1 is just a distributed iterate
#2 is just an optimized iterate based on knowledge of the on-disk
format (and may give inaccurate results... tombstones...)
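
A minimal sketch of #2 end to end, assuming the key dumps from every node
have already been copied to one place. The file names and the sample keys
below are made up for illustration; in practice each file would hold the
keys dumped from one machine. sort -u stands in for sort -n | uniq, since
only exact duplicates need to be collapsed.

```shell
#!/bin/sh
# Hypothetical stand-ins for per-node key dumps; real files would come
# from dumping the keys of the relevant SSTables on each machine.
tmpdir=$(mktemp -d)
printf '%s\n' 6b657931 6b657932 > "$tmpdir/node1_keys.txt"
printf '%s\n' 6b657932 6b657933 > "$tmpdir/node2_keys.txt"

# Merge every dump, drop duplicate keys (replicas and multiple SSTables
# repeat the same key), and count what is left.
count=$(cat "$tmpdir"/node*_keys.txt | sort -u | wc -l | tr -d ' ')
echo "distinct keys: $count"

# Clean up the sample data.
rm -rf "$tmpdir"
```

The same caveat applies: keys that are only present as tombstones may
still be counted, so treat the number as an estimate.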

On 28 March 2011 14:16, Or Yanay <or@peer39.com> wrote:
> I use one of two ways to achieve that:
>  1. run a map reduce. Pig is really helpful in these cases. Make sure you
>     run your MR using Hadoop task tracker on your nodes - or your
>     performance will take a hit.
>  2. dump all keys using sstablekeys script from relevant files on all
>     machines and count unique values. I do that using
>     "sort -n keys.txt | uniq >> unique_keys.txt"
>
> Dumping all keys is much faster but less elegant, and can be more annoying
> if you want to do that from your application.
>
> Hope that does the trick for you.
> -Orr
>
> -----Original Message-----
> From: Joshua Partogi [mailto:joshua.java@gmail.com]
> Sent: Monday, March 28, 2011 2:39 PM
> To: user@cassandra.apache.org
> Subject: Re: newbie question: how do I know the total number of rows of a cf?
>
> Not all NoSQL is like that. Or perhaps the term NoSQL has become vague
> these days.
>
> On Mon, Mar 28, 2011 at 6:16 PM, Stephen Connolly
> <stephen.alan.connolly@gmail.com> wrote:
>> iterate.
>>
>> otherwise if that will be too slow and you will do it often, the nosql way
>> is to create a separate column family updated with each row add/delete to
>> hold the answer for you.
>>
>> - Stephen
>>
>> ---
>> Sent from my Android phone, so random spelling mistakes, random nonsense
>> words and other nonsense are a direct result of using swype to type on the
>> screen
>>
>> On 28 Mar 2011 07:40, "Sheng Chen" <chensheng2010@gmail.com> wrote:
>>> Hi all,
>>> I want to know how many records I am holding in Cassandra, just like
>>> count(*) in sql.
>>> What can I do ? Thank you.
>>>
>>> Sheng
>>
>
>
>
> --
> http://twitter.com/jpartogi
>
