Message-ID: <4C51FAD1.3010309@gmail.com>
Date: Thu, 29 Jul 2010 15:04:01 -0700
From: Mark
To: user@cassandra.apache.org
Subject: Re: Index/Count/Order by syntax

OK, so basically an "array" of words grouped by their count? Something like this?

{
  SearchLogs : {
    ALL : {
      999: { word1:word1, word2:word2, word3:word3 }
      998: { word1:word1, word2:word2, word3:word3 }
    }
  }
}

On 7/29/10 2:50 PM, Aaron Morton wrote:
> One method would be to use a Super Column Family. Have one row, and in
> that row create a super column for each count value you have, and then
> in each super column create a column for each word.
>
> Set the CompareWith for the super column to LongType and the
> CompareSubcolumnsWith to AsciiType or UTF8Type.
>
> You could then use get_slice to read super columns in that row.
>
> This may not be the most efficient model; it will depend on how much
> data you have and what your read patterns are like. Also remember
> that pre-0.7 you cannot atomically increment counters in Cassandra.
>
> Have a play and see what works for you.
>
> Aaron
>
> On 29 Jul, 2010, at 02:36 PM, Mark wrote:
>
>> I know there is no native support for "order by", "group by" etc but I
>> was wondering how it could be accomplished with some custom indexes?
>>
>> For example, say I have a list of word counts like (notice 2 words have
>> the same count):
>>
>> "cassandra" => 100
>> "foo" => 999
>> "bar" => 1
>> "baz" => 500
>> "fooz" => 999
>>
>> How can I store then retrieve these words ordered by their count/values?
>>
>> Thanks.
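
The layout discussed in this thread (one row, a super column per count, a column per word) can be tried out in plain Python before touching a cluster. The following is only an in-memory sketch under stated assumptions: it does not call any Cassandra client, and the function names build_count_index, words_by_count and move_word are hypothetical helpers made up for illustration. Sorting the outer keys numerically stands in for a LongType comparator on the super columns, and sorting the inner keys as text stands in for an AsciiType or UTF8Type comparator on the subcolumns.

```python
from collections import defaultdict


def build_count_index(word_counts):
    """Group words by count as {count: {word: word}}.

    Outer keys play the role of super columns, inner keys the role of
    subcolumns (hypothetical in-memory model, not a Cassandra API).
    """
    index = defaultdict(dict)
    for word, count in word_counts.items():
        index[count][word] = word
    return dict(index)


def words_by_count(index, descending=True):
    """Yield (word, count) pairs ordered by count, then by word.

    Numeric ordering of the outer keys mimics a LongType comparator;
    text ordering of the inner keys mimics an AsciiType/UTF8Type one.
    """
    for count in sorted(index, reverse=descending):
        for word in sorted(index[count]):
            yield word, count


def move_word(index, word, old_count, new_count):
    """Re-file a word under a new count.

    This is a delete followed by an insert, so it is not atomic, which
    is the caveat raised in the thread about incrementing counts.
    """
    bucket = index.get(old_count, {})
    bucket.pop(word, None)
    if not bucket:
        # Drop the now-empty "super column".
        index.pop(old_count, None)
    index.setdefault(new_count, {})[word] = word


if __name__ == "__main__":
    # Sample data taken from the original question.
    counts = {"cassandra": 100, "foo": 999, "bar": 1, "baz": 500, "fooz": 999}
    idx = build_count_index(counts)
    for word, count in words_by_count(idx):
        print(count, word)
```

With the sample data from the question, the two words that share the count 999 land in the same bucket and come out first. Note that every change to a word's count means removing it from one bucket and adding it to another, so write-heavy workloads may find this model expensive.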