From: Richard Grossman
To: cassandra-user@incubator.apache.org
Date: Tue, 22 Dec 2009 15:28:58 +0200
Subject: Re: TimeUUID Partitioning
Message-ID: <468b21170912220528o11555999u7dbffedd94b50a24@mail.gmail.com>
In-Reply-To: <4B30C011.5070801@eintr.org>

Same problem here.
But I can't understand why one would use TimeUUID instead of just a long. It does the same job and is much simpler.

On Tue, Dec 22, 2009 at 2:48 PM, Daniel Lundin <dln@eintr.org> wrote:
> I'm pondering order preservation and TimeUUID keys, in particular how to
> get distribution across the cluster while maintaining "rangeability".
>
> Basically, I'm working on a logging app, where rows are TimeUUIDs. To be
> able to do range scans we're using OrderPreservingPartitioner.
>
> To get partitioning working, I've currently transformed keys, prepending
> a partitioning token (in my testcase, the day-of-week).
> Basically, this means two range queries to get data for a set spanning
> two days. Crude, but kinda works, and the specialization is alright for
> my case. But it feels a bit hackish, so I began studying the partitioner
> code a bit, seeking enlightenment.
>
> Has anybody already spent energy + time thinking about generic TimeUUID
> partitioning? Seems like it could be a useful thing, since time series
> data is quite common.
>
> Perhaps a TimeUUIDPartitioner with configurable time resolution for
> tokenization (token = uuid.time % resolution, more or less) would be
> sufficient?
>
> Or could it be even more general, i.e. no configuration necessary?
>
> /d
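
For illustration, here is a rough sketch of the two ideas from the quoted message: prefixing keys with the day-of-week, and deriving a token as uuid.time % resolution. It uses only Python's standard uuid module and is not Cassandra code. The helper names (timeuuid_at, day_prefixed_key, prefixes_for_span, modulo_token) and the key layout are assumptions made up for this example.

```python
import uuid
from datetime import datetime, timezone

# 100 ns ticks between the UUID epoch (1582-10-15) and the Unix epoch (RFC 4122).
UUID_EPOCH_OFFSET = 0x01B21DD213814000
TICKS_PER_SECOND = 10_000_000
SECONDS_PER_DAY = 86_400


def timeuuid_at(unix_seconds: int, node: int = 0, clock_seq: int = 0) -> uuid.UUID:
    """Build a version 1 UUID for a Unix time (hypothetical helper, for repeatable examples)."""
    ticks = unix_seconds * TICKS_PER_SECOND + UUID_EPOCH_OFFSET
    return uuid.UUID(fields=(
        ticks & 0xFFFFFFFF,                 # time_low
        (ticks >> 32) & 0xFFFF,             # time_mid
        ((ticks >> 48) & 0x0FFF) | 0x1000,  # time_hi plus version 1
        ((clock_seq >> 8) & 0x3F) | 0x80,   # clock_seq_hi plus RFC 4122 variant
        clock_seq & 0xFF,                   # clock_seq_low
        node,                               # 48-bit node id
    ))


def unix_seconds_of(u: uuid.UUID) -> int:
    """Whole Unix seconds embedded in a version 1 UUID."""
    return (u.time - UUID_EPOCH_OFFSET) // TICKS_PER_SECOND


def weekday_of(unix_seconds: int) -> int:
    """Day of week in UTC, Monday = 0."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc).weekday()


def day_prefixed_key(u: uuid.UUID) -> str:
    """Assumed key layout: <weekday>-<60-bit timestamp as 15 hex digits>-<uuid>.

    The fixed-width hex timestamp makes keys under one prefix sort by time
    when compared as strings, which an order-preserving partitioner relies on.
    """
    return f"{weekday_of(unix_seconds_of(u))}-{u.time:015x}-{u}"


def prefixes_for_span(start_seconds: int, end_seconds: int) -> list:
    """Weekday prefixes touched by a time span: one range query per prefix."""
    prefixes = []
    t = start_seconds - start_seconds % SECONDS_PER_DAY  # midnight UTC of the first day
    while t <= end_seconds:
        day = weekday_of(t)
        if day not in prefixes:
            prefixes.append(day)
        t += SECONDS_PER_DAY
    return prefixes


def modulo_token(u: uuid.UUID, resolution_ticks: int) -> int:
    """token = uuid.time % resolution, as suggested in the quoted message."""
    return u.time % resolution_ticks
```

With this layout, a span that crosses midnight yields two prefixes and therefore two range queries, matching the behaviour described in the quoted message. Note that a modulo token wraps around every resolution_ticks, so ordering by token only matches ordering by time within a single window. On the long-versus-TimeUUID question, one difference visible here is that a version 1 UUID carries node and clock-sequence bits next to the timestamp, so two writers generating an ID in the same tick still get distinct keys.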