From: Aaron Morton
To: user@cassandra.apache.org
Subject: Re: deletion
Date: Thu, 14 Oct 2010 19:45:16 GMT
Message-id: <314068c7-a762-e6cb-afcb-b7b9063bf600@me.com>
In-Reply-To: <6143522CDE27514BB78C92BCD7190003032DA81224@DNLEXCH01>

I would recommend using epoch time for your timestamp and comparing as LongType. A version 1 UUID includes the MAC address of the machine that generated it, so two different machines will create different UUIDs for the same time. They are meant to be unique, after all: http://en.wikipedia.org/wiki/Universally_Unique_Identifier#Version_1_.28MAC_address.29
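
To make the difference concrete, here is a minimal Python sketch (standard library only, nothing Cassandra-specific). The helper names `epoch_micros`, `long_column_name` and `v1_uuid` and the two node values are made up for illustration, and the 8-byte big-endian packing is my assumption about what a LongType comparator expects, so check it against your client.

```python
import struct
import uuid
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
# Offset between the UUID epoch (1582-10-15) and the Unix epoch, in 100 ns units.
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def epoch_micros(dt):
    """Microseconds since the Unix epoch for a timezone-aware datetime."""
    delta = dt - EPOCH
    return (delta.days * 86400 + delta.seconds) * 1_000_000 + delta.microseconds


def long_column_name(dt):
    """Pack the epoch time as a signed 8-byte big-endian long (assumed LongType layout)."""
    return struct.pack(">q", epoch_micros(dt))


def v1_uuid(dt, node, clock_seq=0):
    """Build a version 1 UUID for a fixed time and a given 48-bit node (MAC) value."""
    ts = epoch_micros(dt) * 10 + UUID_EPOCH_OFFSET  # 60-bit count of 100 ns ticks
    fields = (
        ts & 0xFFFFFFFF,                   # time_low
        (ts >> 32) & 0xFFFF,               # time_mid
        ((ts >> 48) & 0x0FFF) | 0x1000,    # time_hi with version 1
        ((clock_seq >> 8) & 0x3F) | 0x80,  # clock_seq_hi with RFC 4122 variant
        clock_seq & 0xFF,                  # clock_seq_low
        node,
    )
    return uuid.UUID(fields=fields)


tick = datetime(2010, 10, 14, 14, 0, tzinfo=timezone.utc)
later = tick + timedelta(minutes=1)

# Same instant, two hypothetical machines: the UUIDs differ in the node part.
uuid_a = v1_uuid(tick, node=0x001122334455)
uuid_b = v1_uuid(tick, node=0x66778899AABB)
print(uuid_a, uuid_b)

# The epoch-based name depends only on the time, and the packed bytes sort in time order.
print(long_column_name(tick).hex(), long_column_name(later).hex())
```

With the long encoding, any client can recompute the exact column name for a given time, which is what makes deleting by time straightforward.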

You may also want to adjust your model; see the discussion on super column limitations here: http://wiki.apache.org/cassandra/CassandraLimitations. Your current model is going to create very big super columns, and performance will degrade over time. Perhaps use a standard CF with "ticker:measure" as the row key; then you can add 2 billion (I think) columns, one for each time. You may still want to break the rows up further depending on your use case, e.g. ticker:measure:day, so that you can pull back the entire row to get every value for the day, or delete the entire day easily.
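
A small sketch of how such composite row keys could be built, in plain Python. The `row_key` and `keys_for_day` helpers and the YYYYMMDD day format are my own assumptions for illustration, and it assumes the ticker and measure names never contain a colon.

```python
from datetime import date


def row_key(ticker, measure, day=None):
    """Join the parts with ':'; the day bucket is optional."""
    parts = [ticker, measure]
    if day is not None:
        parts.append(day.strftime("%Y%m%d"))
    return ":".join(parts)


def keys_for_day(tickers, measures, day):
    """Every row key that holds data for one day."""
    return [row_key(t, m, day) for t in tickers for m in measures]


keys = keys_for_day(["IBM", "MSFT"], ["price", "volume"], date(2010, 10, 14))
print(keys)
```

Wiping a day then comes down to deleting the rows whose keys this list contains, rather than hunting for individual columns inside every super column.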

For your deletion issue, batch_mutate is your friend. The Deletion struct lets you delete:
- a row, by excluding the predicate and super_column
- a super_column, by including super_column and not predicate
- a column, by including a predicate that names it
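
The three cases can be sketched with stand-in classes. These dataclasses are NOT the generated Thrift classes; they only mirror the field names as I understand them (timestamp, super_column, predicate, and a Mutation wrapper), so check them against the cassandra.thrift file for your version. The `deletion_scope` and `build_mutation_map` helpers are hypothetical.

```python
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SlicePredicate:  # stand-in for the Thrift struct
    column_names: List[bytes]


@dataclass
class Deletion:  # stand-in for the Thrift struct
    timestamp: int
    super_column: Optional[bytes] = None
    predicate: Optional[SlicePredicate] = None


@dataclass
class Mutation:  # stand-in for the Thrift struct
    deletion: Optional[Deletion] = None


def deletion_scope(d):
    """Report what a Deletion removes, following the three cases listed above."""
    if d.predicate is not None:
        return "column"
    if d.super_column is not None:
        return "super_column"
    return "row"


def build_mutation_map(keys, column_family, deletion):
    """Shape assumed for batch_mutate: {row key: {column family: [Mutation, ...]}}."""
    return {key: {column_family: [Mutation(deletion=deletion)]} for key in keys}


now = int(time.time() * 1_000_000)  # microsecond timestamps are a common convention
whole_row = Deletion(timestamp=now)
one_super = Deletion(timestamp=now, super_column=b"price")
one_column = Deletion(
    timestamp=now,
    super_column=b"price",
    predicate=SlicePredicate(column_names=[b"2010-10-14 14:00"]),
)

mutation_map = build_mutation_map(["IBM", "MSFT"], "tickdata", one_super)
print(deletion_scope(whole_row), deletion_scope(one_super), deletion_scope(one_column))
```

With a real client you would build the same nested map from the generated classes and pass it to batch_mutate in a single call covering all keys.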

Some of the things that were not implemented were fixed in 0.6.4, I think. Anyway, they all work AFAIK.

Hope that helps.
Aaron


On 15 Oct, 2010, at 07:55 AM, Koert Kuipers <Koert.Kuipers@diamondnotch.com> wrote:

Hello All,

I am testing Cassandra 0.7 with the Avro API on a single machine as a financial time series server, so my setup looks something like this:
keyspace = timeseries, column family = tickdata, key = ticker, super column = field (price, volume, high, low), column = timestamp.

So a single value, say a price of 140.72 for IBM today at 14:00, would be stored as
tickdata[“IBM”][“price”][“2010-10-14 14:00”] = 140.72 (well, of course everything needs to be encoded properly, but you get the point).

My subcomparator type is TimeUUIDType so that I can do queries over time ranges. Inserting and querying all work reasonably well so far.

But sometimes I have a need to wipe out all the data for a whole day. To be more precise: I need to delete the stored values for all keys (tickers) and all super-columns (fields) for a given time period (condition on column). How would I go about doing that? First a multiget_slice and then a remove command for each value? Or am I missing an easier way?

Is slice deletion within batch_mutate still scheduled to be implemented?

Thanks for your help,

Koert
