From: Sylvain Lebresne
To: "user@cassandra.apache.org"
Date: Thu, 6 Jun 2013 12:08:41 +0200
Subject: Re: smallest/largest UUIDs for LexicalUUIDType

> I'm trying to use composite column names to organize 10**8 records. Each
> record has a unique pair of UUIDs. The first UUID is often repeated, so I
> want to use column_start and column_finish to find all the records that
> have a given UUID as the first UUID in the pair.
>
> I thought a simple way to get *all* of the columns would be to use
>
>  start  = uuid.UUID(int=0)        -> 00000000-0000-0000-0000-000000000000
>  finish = uuid.UUID(int=2**128-1) -> ffffffff-ffff-ffff-ffff-ffffffffffff
>
> But strangely, this fails to find *any* of the columns, and it requires
> that column_reversed=True -- otherwise it raises an error about range
> finish not coming after start. If I use ints that are much larger/smaller
> than these extremes, then reversed is not required!
>
> Can anyone explain why LexicalUUIDType() does not treat these extremal
> UUIDs like other UUIDs?

LexicalUUIDType compares UUIDs using the Java UUID compareTo method
(http://docs.oracle.com/javase/6/docs/api/java/util/UUID.html#compareTo(java.util.UUID)).
As it happens, this method considers a UUID as 2 longs, and when comparing
2 UUIDs it compares those longs lexicographically: the most significant
long first, then the least significant one. But Java longs are signed, so
for that method

  00000000-0000-0000-0000-000000000000 > ffffffff-ffff-ffff-ffff-ffffffffffff

whereas, for instance,

  00000000-0000-0000-0000-000000000000 < 7fffffff-ffff-ffff-ffff-ffffffffffff

because the first "long" of that second UUID is now positive.

That's a historical accident: LexicalUUIDType should probably not have used
that comparison, as it's arguably not very intuitive. However, it's too late
to change it, since changing it now would basically corrupt all data for
people using LexicalUUIDType today.

I'll note that if you have the choice, you can use UUIDType rather than
LexicalUUIDType. UUIDType fixes that behavior and uses a proper lexical
comparison for non-type-1 UUIDs. The other behavior of UUIDType is that for
type 1 UUIDs it compares them by time first, i.e. it is equivalent to
TimeUUIDType for type 1 UUIDs.

--
Sylvain
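For anyone who wants to check this from the Python side, here is a minimal sketch that mimics the signed comparison described above. It is not Cassandra or Java code: the helper names (to_signed_64, java_style_key) are made up for illustration, and the sketch assumes the ordering is exactly "most significant signed long first, then least significant signed long". Under that assumption it also derives the UUIDs that would sort first and last.

```python
import uuid

MASK_64 = (1 << 64) - 1


def to_signed_64(value):
    """Reinterpret an unsigned 64-bit integer as a two's-complement signed one."""
    return value - (1 << 64) if value >= (1 << 63) else value


def java_style_key(u):
    """Hypothetical sort key: the UUID's two 64-bit halves, each read as a signed long."""
    most = to_signed_64(u.int >> 64)
    least = to_signed_64(u.int & MASK_64)
    return (most, least)


zero = uuid.UUID(int=0)
all_ones = uuid.UUID(int=2**128 - 1)
positive_first_long = uuid.UUID("7fffffff-ffff-ffff-ffff-ffffffffffff")

# The all-ones UUID reads as (-1, -1), so it sorts *before* the all-zero UUID.
print(java_style_key(zero) > java_style_key(all_ones))             # True
print(java_style_key(zero) < java_style_key(positive_first_long))  # True

# Under the stated assumption, the extremes set the sign bit of both halves
# (smallest) or clear it while setting every other bit (largest).
smallest = uuid.UUID(int=((1 << 63) << 64) | (1 << 63))
largest = uuid.UUID(int=(((1 << 63) - 1) << 64) | ((1 << 63) - 1))
print(smallest)  # 80000000-0000-0000-8000-000000000000
print(largest)   # 7fffffff-ffff-ffff-7fff-ffffffffffff
```

If that assumption holds, those last two values, rather than int=0 and int=2**128-1, would be the candidates for a start/finish pair that brackets every column; worth verifying against your own cluster before relying on it.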

LexicalUUIDType compares the uuid using Java UUID compare method (