From: Carlos Pérez Miguel <cperezmig@gmail.com>
Date: Thu, 28 Mar 2013 00:05:15 +0800
Subject: Re: TimeUUID Order Partitioner
To: user@cassandra.apache.org

Thanks, Lanny. That is what I am doing.

Actually I'm having another problem. My UUIDOrderedPartitioner doesn't order by time. Instead, it orders by byte order and I cannot find why. Which functions control the ordering between tokens? I have implemented time ordering in the "compareTo" function of my UUID token class, but it seems that Cassandra is ignoring it. For example:

Let's suppose that I have a Users CF where each row represents a user in a cluster of 1 node. Rows are ordered by TimeUUID. I create some users in the following order:

user a created with user_id: eac850fa-96f4-11e2-9f22-72ad6af0e500
user b created with user_id: f17f9ae8-96f4-11e2-98aa-421151417092
user c created with user_id: f82fccfa-96f4-11e2-8d99-26f8461d074c
user d created with user_id: fee21cec-96f4-11e2-945b-f9a2a2e32308
user e created with user_id: 058ec180-96f5-11e2-8c88-4aaf94e4f04e
user f created with user_id: 0c5032ba-96f5-11e2-95a5-60a128c0b3f4
user g created with user_id: 13036b86-96f5-11e2-80dd-566654c686cb
user h created with user_id: 19b245f6-96f5-11e2-9c8f-b315f455e5e0

That is the order I would expect to find if I read the CF, but if I do, I obtain (with any client or library I've tried):

user_id: 058ec180-96f5-11e2-8c88-4aaf94e4f04e name:"e"
user_id: 0c5032ba-96f5-11e2-95a5-60a128c0b3f4 name:"f"
user_id: 13036b86-96f5-11e2-80dd-566654c686cb name:"g"
user_id: 19b245f6-96f5-11e2-9c8f-b315f455e5e0 name:"h"
user_id: eac850fa-96f4-11e2-9f22-72ad6af0e500 name:"a"
user_id: f17f9ae8-96f4-11e2-98aa-421151417092 name:"b"
user_id: f82fccfa-96f4-11e2-8d99-26f8461d074c name:"c"
user_id: fee21cec-96f4-11e2-945b-f9a2a2e32308 name:"d"

Any idea what's happening?
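
The two listings above match what you get by sorting the same eight keys in two different ways. The following is a minimal standalone Java sketch (no Cassandra classes; the class, field, and method names are made up for illustration) that reproduces both orders. Raw unsigned byte order puts e..h before a..d, because the first four bytes of a version 1 UUID hold only the low 32 bits of the timestamp. Comparing `UUID.timestamp()` gives the creation order. It only shows that the observed result is plain byte order; it does not identify which Cassandra code path performs that comparison.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

public class TimeUuidOrder {
    // Unsigned comparison of the 16 raw bytes (big-endian), i.e. plain byte order.
    static final Comparator<UUID> BYTE_ORDER = (a, b) -> {
        int c = Long.compareUnsigned(a.getMostSignificantBits(), b.getMostSignificantBits());
        return c != 0 ? c : Long.compareUnsigned(a.getLeastSignificantBits(), b.getLeastSignificantBits());
    };

    // Comparison by the embedded 60-bit timestamp, with byte order as a tie-breaker.
    // UUID.timestamp() throws UnsupportedOperationException for non-version-1 UUIDs.
    static final Comparator<UUID> TIME_ORDER = (a, b) -> {
        int c = Long.compare(a.timestamp(), b.timestamp());
        return c != 0 ? c : BYTE_ORDER.compare(a, b);
    };

    // The eight user IDs from the message, in creation order.
    static Map<UUID, String> users() {
        Map<UUID, String> m = new LinkedHashMap<>();
        m.put(UUID.fromString("eac850fa-96f4-11e2-9f22-72ad6af0e500"), "a");
        m.put(UUID.fromString("f17f9ae8-96f4-11e2-98aa-421151417092"), "b");
        m.put(UUID.fromString("f82fccfa-96f4-11e2-8d99-26f8461d074c"), "c");
        m.put(UUID.fromString("fee21cec-96f4-11e2-945b-f9a2a2e32308"), "d");
        m.put(UUID.fromString("058ec180-96f5-11e2-8c88-4aaf94e4f04e"), "e");
        m.put(UUID.fromString("0c5032ba-96f5-11e2-95a5-60a128c0b3f4"), "f");
        m.put(UUID.fromString("13036b86-96f5-11e2-80dd-566654c686cb"), "g");
        m.put(UUID.fromString("19b245f6-96f5-11e2-9c8f-b315f455e5e0"), "h");
        return m;
    }

    // Returns the user names concatenated in the order given by the comparator.
    static String namesSortedBy(Comparator<UUID> order) {
        Map<UUID, String> users = users();
        List<UUID> ids = new ArrayList<>(users.keySet());
        ids.sort(order);
        StringBuilder sb = new StringBuilder();
        for (UUID id : ids) {
            sb.append(users.get(id));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("byte order: " + namesSortedBy(BYTE_ORDER)); // efghabcd
        System.out.println("time order: " + namesSortedBy(TIME_ORDER)); // abcdefgh
    }
}
```

If a read returns the "byte order" line, the comparison being applied is presumably not the token's own time-based one, so it may be worth checking every place where keys or tokens are compared or serialized, not only `compareTo`.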


Carlos Pérez Miguel


2013/3/27 Lanny Ripple <lanny@spotright.com>
Ah. TimeUUID. Not as useful for you then but still something for the toolbox.

On Mar 27, 2013, at 8:42 AM, Lanny Ripple <lanny@spotright.com> wrote:

> A type 4 UUID can be created from two Longs. You could MD5 your strings giving you 128 hashed bits and then make UUIDs out of that. Using Scala:
>
>   import java.nio.ByteBuffer
>   import java.security.MessageDigest
>   import java.util.UUID
>
>   val key = "Hello, World!"
>
>   val md = MessageDigest.getInstance("MD5")
>   val dig = md.digest(key.getBytes("UTF-8"))
>   val bb = ByteBuffer.wrap(dig)
>
>   val msb = bb.getLong
>   val lsb = bb.getLong
>
>   val uuid = new UUID(msb, lsb)
>
>
> On Mar 26, 2013, at 3:22 PM, aaron morton <aaron@thelastpickle.com> wrote:
>
>>> Any idea?
>> Not off the top of my head.
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Consultant
>> New Zealand
>>
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 26/03/2013, at 2:13 AM, Carlos Pérez Miguel <cperezmig@gmail.com> wrote:
>>
>>> Yes it does. Thank you Aaron.
>>>
>>> Now I realized that the system keyspace uses strings as keys, like "Ring" or "ClusterName", and I don't know how to convert these types of keys into UUIDs. Any idea?
>>>
>>>
>>> Carlos Pérez Miguel
>>>
>>>
>>> 2013/3/25 aaron morton <aaron@thelastpickle.com>
>>> The best thing to do is start with a look at ByteOrderedPartitioner and AbstractByteOrderedPartitioner.
>>>
>>> You'll want to create a new TimeUUIDToken extends Token<UUID> and a new UUIDPartitioner that extends AbstractPartitioner<>
>>>
>>> Usual disclaimer that ordered partitioners cause problems with load balancing.
>>>
>>> Hope that helps.
>>>
>>> -----------------
>>> Aaron Morton
>>> Freelance Cassandra Consultant
>>> New Zealand
>>>
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 25/03/2013, at 1:12 AM, Carlos Pérez Miguel <cperezmig@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I store rows in my system where the key is a version 1 UUID (TimeUUID). I would like to keep rows ordered by time. I know that in this case it is recommended to use an external CF where column names are UUIDs ordered by time. But in my use case this is not possible, so I would like to use a custom Partitioner to do this. If I use ByteOrderedPartitioner, rows are not correctly ordered because of the way a UUID stores the timestamp. What is needed in order to implement my own Partitioner?
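
To make "the way a UUID stores the timestamp" concrete: in a version 1 UUID the 60-bit timestamp is split into three fields, and the least significant one comes first in the byte layout. Below is a small Java sketch (class and method names are hypothetical) that rebuilds the timestamp from those fields and checks two of the user IDs quoted earlier in this message.

```java
import java.util.UUID;

public class TimeUuidLayout {
    // Rebuilds the 60-bit timestamp of a version 1 UUID from its three fields.
    static long timestampOf(UUID u) {
        if (u.version() != 1) {
            throw new IllegalArgumentException("not a version 1 UUID: " + u);
        }
        long msb = u.getMostSignificantBits();
        long timeLow = msb >>> 32;              // bytes 0-3: low 32 bits of the timestamp
        long timeMid = (msb >>> 16) & 0xFFFFL;  // bytes 4-5: middle 16 bits
        long timeHi = msb & 0x0FFFL;            // bytes 6-7: high 12 bits (top nibble is the version)
        return (timeHi << 48) | (timeMid << 32) | timeLow;
    }

    public static void main(String[] args) {
        UUID a = UUID.fromString("eac850fa-96f4-11e2-9f22-72ad6af0e500"); // created first
        UUID e = UUID.fromString("058ec180-96f5-11e2-8c88-4aaf94e4f04e"); // created later

        // By timestamp, a is older than e.
        System.out.println(timestampOf(a) < timestampOf(e)); // true

        // By raw bytes, a sorts after e, because time_low (the leading bytes) has wrapped.
        System.out.println(Long.compareUnsigned(
                a.getMostSignificantBits(), e.getMostSignificantBits()) > 0); // true
    }
}
```

A time-ordered token would therefore need to compare on the reassembled timestamp (or on bytes rearranged as time_hi, time_mid, time_low) rather than on the key bytes as stored.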
>>>>
>>>> Thank you.
>>>>
>>>> Carlos Pérez Miguel
>>>
>>>
>>
>

