Subject: Re: need some help with counters
From: aaron morton
Date: Mon, 13 Jun 2011 07:10:07 +1200
To: user@cassandra.apache.org

> I am wondering how to index on the most recent hour as well. (ie show me top 5 URLs type query)..

AFAIK that's not a great application for counters. You would need range support in the secondary indexes so you could get the first X rows ordered by a column value.

To be honest, depending on scale, I'd consider a sorted set in redis for that.

Hope that helps.

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 11 Jun 2011, at 00:36, Ian Holsman wrote:

>
> On Jun 9, 2011, at 10:04 PM, aaron morton wrote:
>
>> I may be missing something but could you use a column for each of the last 48 hours all in the same row for a url ?
>>
>> e.g.
>> {
>>   "/url.com/hourly" : {
>>     "20110609T01:00:00" : 456,
>>     "20110609T02:00:00" : 4567,
>>   }
>> }
>
> yes.. that would work better... I was storing all the different times in the same row.
> {
>   "/url.com" : {
>     "H-20110609T01:00:00" : 456,
>     "H-0110609T02:00:00" : 4567,
>     "D-0110609" : 5678,
>   }
> }
>
> I am wondering how to index on the most recent hour as well. (ie show me top 5 URLs type query)..
>
>>
>> Increment the current hour only. Delete the older columns either when a read detects there are old values or as a maintenance job. Or as part of writing values for the first 5 minutes of any hour.
>
> yes.. I thought of that. The problem with doing it on read is there may be a case where a old URL never gets read.. so it will just sit there taking up space.. the maintenance job is the route I went down.
>
>>
>> The row will get spread out over a lot of sstables which may reduce read speed. If this is a problem consider a separate CF with more aggressive GC and compaction settings.
>
> Thanks!
>>
>> Cheers
>>
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 10 Jun 2011, at 09:28, Ian Holsman wrote:
>>
>>> So would doing something like storing it in reverse (so I know what to delete) work? Or is storing a million columns in a supercolumn impossible.
>>>
>>> I could always use a logfile and run the archiver off that as a worst case I guess.
>>> Would doing so many deletes screw up the db/cause other problems?
>>>
>>> ---
>>> Ian Holsman - 703 879-3128
>>>
>>> I saw the angel in the marble and carved until I set him free -- Michelangelo
>>>
>>> On 09/06/2011, at 4:22 PM, Ryan King wrote:
>>>
>>>> On Thu, Jun 9, 2011 at 1:06 PM, Ian Holsman wrote:
>>>>> Hi Ryan.
>>>>> you wouldn't have your version of cassandra up on github would you??
>>>>
>>>> No, and the patch isn't in our version yet either. We're still working on it.
>>>>
>>>> -ryan
>>
>
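
For anyone reading this thread later, here is a rough sketch of the scheme discussed above: one counter column per hour in a row per URL, a maintenance pass that deletes old hours, and a "top 5 URLs for this hour" query. It is plain in-memory Python, not Cassandra or Redis client code. The class name HourlyCounters, its methods, and the keep_hours parameter are all made up for illustration; the dict of dicts only stands in for the counter row, and top_urls stands in for what a Redis sorted set would answer directly.

```python
from collections import defaultdict
from datetime import datetime, timedelta
import heapq

# Column name format copied from the thread, e.g. 20110609T01:00:00.
# Names in this format sort lexicographically in chronological order.
HOUR_FMT = "%Y%m%dT%H:00:00"


class HourlyCounters:
    """Hypothetical in-memory stand-in: one row per URL, one column per hour."""

    def __init__(self, keep_hours=48):
        self.keep_hours = keep_hours
        self.rows = defaultdict(dict)  # url -> {hour column name -> count}

    def increment(self, url, when, amount=1):
        # Only the bucket for the hour containing `when` is touched.
        column = when.strftime(HOUR_FMT)
        row = self.rows[url]
        row[column] = row.get(column, 0) + amount

    def prune(self, now):
        # Maintenance pass: drop columns older than the retention window.
        # This also covers URLs that are never read again.
        cutoff = (now - timedelta(hours=self.keep_hours)).strftime(HOUR_FMT)
        for url in list(self.rows):
            row = self.rows[url]
            for column in [c for c in row if c < cutoff]:
                del row[column]
            if not row:
                del self.rows[url]

    def top_urls(self, when, n=5):
        # Scan every row for the given hour and keep the n largest counts.
        column = when.strftime(HOUR_FMT)
        scored = (
            (row[column], url)
            for url, row in self.rows.items()
            if column in row
        )
        return [(url, count) for count, url in heapq.nlargest(n, scored)]


counters = HourlyCounters(keep_hours=48)
now = datetime(2011, 6, 9, 2, 15)
counters.increment("/url.com", now, 4567)
counters.increment("/other.com", now, 12)
print(counters.top_urls(now, n=5))
```

Note that top_urls has to walk every row, which is the reason the reply above suggests a sorted set rather than counters for that query.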