From: Matthew Dennis
To: user@cassandra.apache.org
Reply-To: user@cassandra.apache.org
Date: Tue, 1 Mar 2011 10:35:07 -0600
Subject: Re: I: Re: Are row-keys sorted by the compareWith?

I'm not really familiar with the Pelops code, but I found two implementations
(~ line 454 and ~ line 559) of getColumnsFromRows in Selector.java in Pelops
trunk.

The first uses a HashMap, so it clearly isn't ordered. The second uses a
LinkedHashMap, but it inserts the keys in the order returned by C*, which we
already know isn't ordered.

See http://bit.ly/egZaXi for the relevant code.

Like I said, I'm not really familiar with Pelops, so I could be completely off
on this, but it looks like, if Pelops was intending to preserve the order of
the requested keys, it's not actually doing it...

On Wed, Feb 23, 2011 at 3:44 PM, Dan Washusen <dan@reactive.org> wrote:

> Hi Matthew,
> As you mention, the map returned from multiget_slice is not order
> preserving; Pelops is doing this on the client side...
>
> Cheers,
> Dan
>
> --
> Dan Washusen
>
> On Wednesday, 23 February 2011 at 8:38 PM, Matthew Dennis wrote:
>
> > The map returned by multiget_slice (what I suspect is the underlying
> > Thrift call for getColumnsFromRows) is not an order-preserving map; it's
> > a HashMap, so the order of the returned results cannot be depended on.
> > Even if it were an order-preserving map, not all languages would be able
> > to make use of the results, since not all languages have ordered maps
> > (though many, including Java, certainly do).
> >
> > That being said, it would be fairly easy to change this on the C* side to
> > preserve the order the keys were requested in, though, as mentioned, not
> > all clients could take advantage of it.
> >
> > On Mon, Feb 21, 2011 at 4:09 PM, cbertu81@libero.it <cbertu81@libero.it>
> > wrote:
> >
> > > *As Jonathan mentions, the compareWith on a column family definition
> > > defines the order of the columns *within* a row... In order to control
> > > the ordering of rows you'll need to use the OrderPreservingPartitioner
> > > (http://www.datastax.com/docs/0.7/operations/clustering#tokens-partitioners-ring).*
> > >
> > > Thanks for your answer and for your time, I will take a look at this.
> > >
> > > *As for getColumnsFromRows: it should be returning you a map of lists.
> > > The map is insertion-order-preserving and populated based on the
> > > provided list of row keys (so if you iterate over the entries in the
> > > map, they should be in the same order as the list of row keys).*
> > >
> > > mmm ... well, it didn't happen like this. In my code I had a CF named
> > > Comments and also a CF called UserComments. UserComments uses a UUID as
> > > row key to keep the "pointers" to the user's comments, sorted by
> > > TimeUUID. When I get the sorted list of keys from UserComments and use
> > > this list as the row-key list in getColumnsFromRows, I don't get the
> > > data back sorted as I expect.
> > >
> > > It looks as if Cassandra/Pelops does not care how I provide the row-key
> > > list. I am sure about that because I did something different: I
> > > iterated over my row-key list and made many getColumnFromRow calls
> > > instead of one getColumnsFromRows call, and when I iterate, the data
> > > are correctly sorted. But this cannot be a solution ...
> > >
> > > I am using Cassandra 0.6.9.
> > >
> > > I'd like to take advantage of your knowledge of Pelops to ask you
> > > something: I am evaluating the migration to Cassandra 0.7 ... as far as
> > > you know, in terms of written code, is it a heavy job?
> > >
> > > Best Regards
> > >
> > > Carlo
> > >
> > > ----Original message----
> > > From: dan@reactive.org
> > >
> > > On Saturday, 19 February 2011 at 8:16 AM, cbertu81@libero.it wrote:
> > >
> > > > Hi all,
> > > > I created a CF in which I need to get the rows sorted by time. Each
> > > > row represents a comment.
> > > >
> > > > <ColumnFamily name="Comments" compareWith="TimeUUIDType" />
> > > >
> > > > I've created a few rows using a generated TimeUUID as the row key,
> > > > but when I call the Pelops method "getColumnsFromRows" I don't get
> > > > the data back as I expect: the rows are not sorted by TimeUUID.
> > > > I thought it was probably because of the random part of the TimeUUID,
> > > > so I created a new CF ...
> > > >
> > > > <ColumnFamily name="Comments2" compareWith="LongType" />
> > > >
> > > > This time I created a few rows using the Java
> > > > System.currentTimeMillis(), which returns a long. I called
> > > > "getColumnsFromRows" again, with the same result: the data are not
> > > > sorted!
> > > > I've read many times that rows are sorted as specified in the
> > > > compareWith, but I can't see it.
> > > > To solve this problem for the moment I've used a SuperColumnFamily
> > > > with a UNIQUE ROW ... but I think this is just a workaround and not
> > > > the solution.
> > > >
> > > > <ColumnFamily name="Comments" type="Super" compareWith="TimeUUIDType"
> > > > CompareSubcolumnsWith="BytesType" />
> > > >
> > > > Now when I call "getSuperColumnsFromRow" I get all the SuperColumns
> > > > as I expected: sorted by TimeUUID. Why doesn't the same happen with
> > > > the rows?
> > > > I'm confused.
> > > >
> > > > TIA for any help.
> > > >
> > > > Best Regards
> > > >
> > > > Carlo
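
For anyone hitting the same issue: since the map returned by multiget_slice cannot be relied on for ordering, one way to get rows back in the requested order is to rebuild the map on the client, walking the original key list rather than the returned map. The following is only a minimal sketch of that idea, not Pelops code. The class name OrderedMultiget and the method reorder are hypothetical, and it assumes the row keys have value-based equals/hashCode (for example String keys).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Pelops or Cassandra.
public class OrderedMultiget {

    /**
     * Copies an unordered result map into a LinkedHashMap whose iteration
     * order follows requestedKeys. Keys absent from the result are skipped.
     */
    public static <K, V> Map<K, V> reorder(List<K> requestedKeys, Map<K, V> unordered) {
        Map<K, V> ordered = new LinkedHashMap<K, V>();
        // Iterate over the requested keys, NOT over the returned map:
        // iterating the returned map is what loses the order.
        for (K key : requestedKeys) {
            if (unordered.containsKey(key)) {
                ordered.put(key, unordered.get(key));
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        // Stand-in for the keys passed to a multi-row read.
        List<String> requested = Arrays.asList("c", "a", "b");

        // Stand-in for the unordered map that comes back from the server.
        Map<String, String> fromServer = new HashMap<String, String>();
        fromServer.put("a", "comment A");
        fromServer.put("b", "comment B");
        fromServer.put("c", "comment C");

        // Prints the keys in the requested order: [c, a, b]
        System.out.println(reorder(requested, fromServer).keySet());
    }
}
```

The extra pass costs one lookup per requested key, so it avoids the many single-row reads used as a workaround earlier in the thread.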