Message-ID: <4C51FAD1.3010309@gmail.com>
Date: Thu, 29 Jul 2010 15:04:01 -0700
From: Mark <static.void.dev@gmail.com>
To: user@cassandra.apache.org
Subject: Re: Index/Count/Order by syntax

Ok so basically an "array" of words grouped by their count?

Something like this?

{
    SearchLogs : {
       ALL : {
            999: { word1:word1, word2:word2, word3:word3 }
            998: { word1:word1, word2:word2, word3:word3 }
       }
    }
}
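
To check that this layout really gives back the words ordered by count, here is a minimal sketch in plain Python. It only models the nesting in memory (row -> count -> words) using the sample counts from my original question; it makes no Cassandra client calls, and the helper names build_row and words_by_count are made up for illustration.

```python
from collections import defaultdict

# Sample word counts from the original question.
word_counts = {"cassandra": 100, "foo": 999, "bar": 1, "baz": 500, "fooz": 999}


def build_row(counts):
    """Group words by count as {count: {word: word}},
    mirroring super column -> columns in the layout above."""
    row = defaultdict(dict)
    for word, count in counts.items():
        row[count][word] = word
    return dict(row)


def words_by_count(row, descending=True):
    """Yield (count, words) pairs ordered by count; words are sorted within a count."""
    for count in sorted(row, reverse=descending):
        yield count, sorted(row[count])


# One row named ALL inside SearchLogs, as in the layout above.
search_logs = {"ALL": build_row(word_counts)}
ordered = list(words_by_count(search_logs["ALL"]))

for count, words in ordered:
    print(count, words)
```

Words that share a count ("foo" and "fooz") end up under the same count key, which is the grouping I am asking about. In this model the ordering comes from sorting the numeric keys, and a numeric comparator on the super columns would play that role in the real store.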

On 7/29/10 2:50 PM, Aaron Morton wrote:
> One method would be to use a Super Column Family. Have one row, and in 
> that row create a super column for each count value you have, and then 
> in each super column create a column for each word.
>
> Set the CompareWith for the super column to be LongType and the 
> CompareSubcolumnsWith to be AsciiType or UTF8Type.
>
> You could then use get_slice to read super columns in that row.
>
> This may not be the most efficient model; it will depend on how much 
> data you have and what your read patterns are like. Also remember 
> that pre-0.7 you cannot atomically increment counters in Cassandra.
>
> Have a play and see what works for you.
>
> Aaron
>
> On 29 Jul, 2010, at 02:36 PM, Mark <static.void.dev@gmail.com> wrote:
>
>> I know there is no native support for "order by", "group by", etc., but I
>> was wondering how it could be accomplished with some custom indexes?
>>
>> For example, say I have a list of word counts like (notice 2 words have
>> the same count):
>>
>> "cassandra" => 100
>> "foo" => 999
>> "bar" => 1
>> "baz" => 500
>> "fooz" => 999
>>
>> How can I store and then retrieve these words ordered by their counts?
>>
>> Thanks.