From user-return-14119-apmail-cassandra-user-archive=cassandra.apache.org@cassandra.apache.org Tue Mar 01 20:46:24 2011
Return-Path:
Delivered-To: apmail-cassandra-user-archive@www.apache.org
Received: (qmail 4847 invoked from network); 1 Mar 2011 20:46:24 -0000
Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) by minotaur.apache.org with SMTP; 1 Mar 2011 20:46:24 -0000
Received: (qmail 65760 invoked by uid 500); 1 Mar 2011 20:46:22 -0000
Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org
Received: (qmail 65712 invoked by uid 500); 1 Mar 2011 20:46:21 -0000
Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: user@cassandra.apache.org
Delivered-To: mailing list user@cassandra.apache.org
Received: (qmail 65704 invoked by uid 99); 1 Mar 2011 20:46:21 -0000
Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 01 Mar 2011 20:46:21 +0000
X-ASF-Spam-Status: No, hits=2.2 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_NEUTRAL
X-Spam-Check-By: apache.org
Received-SPF: neutral (athena.apache.org: local policy)
Received: from [209.85.213.172] (HELO mail-yx0-f172.google.com) (209.85.213.172) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 01 Mar 2011 20:46:17 +0000
Received: by yxk30 with SMTP id 30so2424406yxk.31 for ; Tue, 01 Mar 2011 12:45:55 -0800 (PST)
Received: by 10.151.19.10 with SMTP id w10mr9662362ybi.39.1299012355548; Tue, 01 Mar 2011 12:45:55 -0800 (PST)
Received: from marvin.local (CPE-58-160-88-92.bqyn1.lon.bigpond.net.au [58.160.88.92]) by mx.google.com with ESMTPS id z5sm826802yhc.35.2011.03.01.12.45.52 (version=TLSv1/SSLv3 cipher=OTHER); Tue, 01 Mar 2011 12:45:54 -0800 (PST)
Date: Wed, 2 Mar 2011 07:45:48 +1100
From: Dan Washusen
To: user@cassandra.apache.org
Message-ID: <23366D9EEE8B4C23A53B9F918187D147@reactive.org>
In-Reply-To:
References:
<6368527.1405601298326178208.JavaMail.defaultUser@defaultHost>
Subject: Re: I: Re: Are row-keys sorted by the compareWith?
X-Mailer: sparrow 1.0.1 (build 589.15)
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="4d6d5afc_6b94764_1e9"
Content-Transfer-Encoding: 8bit

--4d6d5afc_6b94764_1e9
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
Content-Disposition: inline

Pelops moved to github several months ago...

https://github.com/s7/scale7-pelops/blob/master/src/main/java/org/scale7/cassandra/pelops/Selector.java#L1179

Cheers,
-- 
Dan Washusen

On Wednesday, 2 March 2011 at 3:35 AM, Matthew Dennis wrote:
> I'm not really familiar with pelops code, but I found two implementations (~ line 454 and ~ line 559) of getColumnsFromRows in Selector.java in pelops trunk.
>
> The first uses a HashMap so it clearly isn't ordered, the second uses a LinkedHashMap but it inserts the keys in the order returned by C* which we already know isn't ordered.
>
> See http://bit.ly/egZaXi for relevant code.
>
> Like I said, I'm not really familiar with pelops so I could be completely off on this, but it looks like if pelops was intending to preserve the order of the requested keys that it's not actually doing it...
>
> On Wed, Feb 23, 2011 at 3:44 PM, Dan Washusen wrote:
> > Hi Matthew,
> > As you mention the map returned from multiget_slice is not order preserving, Pelops is doing this on the client side...
> >
> > Cheers,
> > Dan
> >
> > -- 
> > Dan Washusen
> > Sent with Sparrow
> > On Wednesday, 23 February 2011 at 8:38 PM, Matthew Dennis wrote:
> > > The map returned by multiget_slice (what I suspect is the underlying thrift call for getColumnsFromRows) is not an order-preserving map, it's a HashMap so the order of the returned results cannot be depended on. Even if it was an order-preserving map, not all languages would be able to make use of the results since not all languages have ordered maps (though many, including Java, certainly do).
> > >
> > > That being said, it would be fairly easy to change this on the C* side to preserve the order the keys were requested in, though as mentioned not all clients could take advantage of it.
> > >
> > > On Mon, Feb 21, 2011 at 4:09 PM, cbertu81@libero.it wrote:
> > > > >
> > > > > As Jonathan mentions the compareWith on a column family def. defines the order for the columns *within* a row... In order to control the ordering of rows you'll need to use the OrderPreservingPartitioner (http://www.datastax.com/docs/0.7/operations/clustering#tokens-partitioners-ring).
> > > > >
> > > > > Thanks for your answer and for your time, I will take a look at this.
> > > > >
> > > > > As for getColumnsFromRows; it should be returning you a map of lists. The map is insertion-order-preserving and populated based on the provided list of row keys (so if you iterate over the entries in the map they should be in the same order as the list of row keys).
> > > > >
> > > > > mmm ... well, it didn't happen like this. In my code I had a CF named comments and also a CF called usercomments. UserComments uses a UUID as the row key and keeps the "pointers" to the user's comments sorted by TimeUUID. When I get the sorted list of keys from UserComments and use this list as the row-keys list in GetColumnsFromRows, I don't get the data back sorted as I expect.
> > > > > It looks as if Cassandra/Pelops does not care how I order the row-keys list. I am sure about that because I tried something different: I iterated over my row-keys list and made many GetColumnFromRow calls instead of one GetColumnsFromRows call, and when I iterate that way the data are correctly sorted. But this cannot be a solution ...
> > > > >
> > > > > I am using Cassandra 0.6.9
> > > > >
> > > > > I'd like to take advantage of your knowledge of Pelops to ask you something: I am evaluating the migration to Cassandra 0.7 ... as far as you know, in terms of written code, is it a heavy job?
> > > > >
> > > > > Best Regards
> > > > >
> > > > > Carlo
> > > > >
> > > > > > ----Original message----
> > > > > > From: dan@reactive.org
> > > > > >
> > > > > > On Saturday, 19 February 2011 at 8:16 AM, cbertu81@libero.it wrote:
> > > > > > > Hi all,
> > > > > > > I created a CF in which I need to get, sorted by time, the Rows inside. Each
> > > > > > > Row represents a comment.
> > > > > > >
> > > > > > > <ColumnFamily name="Comments" compareWith="TimeUUIDType" />
> > > > > > >
> > > > > > > I've created a few rows using as Row Key a generated TimeUUID but when I call
> > > > > > > the Pelops method "GetColumnsFromRows" I don't get the data back as I expect:
> > > > > > > rows are not sorted by TimeUUID.
> > > > > > > I thought it was probably because of the random part of the TimeUUID so I created
> > > > > > > a new CF ...
> > > > > > >
> > > > > > > <ColumnFamily name="Comments2" compareWith="LongType" />
> > > > > > >
> > > > > > > This time I created a few rows using the Java System.currentTimeMillis(), which
> > > > > > > returns a long. I called "GetColumnsFromRows" again and got the same
> > > > > > > result: data are not sorted!
> > > > > > > I've read many times that Rows are sorted as specified in the compareWith but
> > > > > > > I can't see it.
> > > > > > > To solve this problem for the moment I've used a SuperColumnFamily with a
> > > > > > > UNIQUE ROW ... but I think this is just a workaround and not the solution.
> > > > > > >
> > > > > > > <ColumnFamily name="Comments" type="Super" compareWith="TimeUUIDType"
> > > > > > > CompareSubcolumnsWith="BytesType"/>
> > > > > > >
> > > > > > > Now when I call "GetSuperColumnsFromRow" I get all the SuperColumns as I
> > > > > > > expected: sorted by TimeUUID. Why does the same not happen with the Rows?
> > > > > > > I'm confused.
> > > > > > >
> > > > > > > TIA for any help.
> > > > > > >
> > > > > > > Best Regards
> > > > > > >
> > > > > > > Carlo

--4d6d5afc_6b94764_1e9
Content-Type: text/html; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

Pelops moved to github several months ago...

https://github.com/s7/scale7-pelops/blob/master/src/main/java/org/scale7/cassandra/pelops/Selector.java#L1179

Cheers,
-- 
Dan Washusen


On Wednesday, 2 March 2011 at 3:35 AM, Matthew Dennis wrote:

I'm not really familiar with pelops code, but I found two implementations (~ line 454 and ~ line 559) of getColumnsFromRows in Selector.java in pelops trunk.

The first uses a HashMap so it clearly isn't ordered, the second uses a LinkedHashMap but it inserts the keys in the order returned by C* which we already know isn't ordered.

See http://bit.ly/egZaXi for relevant code.

Like I said, I'm not really familiar with pelops so I could be completely off on this, but it looks like if pelops was intending to preserve the order of the requested keys that it's not actually doing it...
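
For what it's worth, here is a minimal Java sketch of what client-side order preservation could look like: it walks the list of requested keys and copies the matching entries of an unordered result map into a LinkedHashMap. The HashMap only stands in for what multiget_slice hands back, and the names KeyOrder and reorderByRequestedKeys are hypothetical; this is not the Pelops code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class KeyOrder {
    // Hypothetical helper, not part of Pelops or Thrift: rebuilds the result
    // map so that iteration follows the order of the requested keys instead
    // of whatever order the unordered map happens to yield.
    static <K, V> Map<K, V> reorderByRequestedKeys(List<K> requestedKeys, Map<K, V> unordered) {
        Map<K, V> ordered = new LinkedHashMap<>();
        for (K key : requestedKeys) {
            if (unordered.containsKey(key)) { // skip keys that returned nothing
                ordered.put(key, unordered.get(key));
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<String> requested = Arrays.asList("row-c", "row-a", "row-b");

        // Stand-in for the unordered map returned by the server.
        Map<String, String> fromServer = new HashMap<>();
        fromServer.put("row-a", "comment A");
        fromServer.put("row-b", "comment B");
        fromServer.put("row-c", "comment C");

        // Iteration order now matches the requested key list.
        System.out.println(reorderByRequestedKeys(requested, fromServer).keySet());
    }
}
```

The point is only that the LinkedHashMap has to be filled by iterating over the requested keys, not over the entries of the returned map.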

On Wed, Feb 23, 2011 at 3:44 PM, Dan Washusen <dan@reactive.org> wrote:
Hi Matthew,
As you mention the map returned from multiget_slice is not order preserving, Pelops is doing this on the client side...

Cheers,
Dan
-- 
Dan Washusen
Sent with Sparrow

On Wednesday, 23 February 2011 at 8:38 PM, Matthew Dennis wrote:

The map returned by multiget_slice (what I suspect is the underlying thrift call for getColumnsFromRows) is not an order-preserving map, it's a HashMap so the order of the returned results cannot be depended on. Even if it was an order-preserving map, not all languages would be able to make use of the results since not all languages have ordered maps (though many, including Java, certainly do).

That being said, it would be fairly easy to change this on the C* side to preserve the order the keys were requested in, though as mentioned not all clients could take advantage of it.

On Mon, Feb 21, 2011 at 4:09 PM, cbertu81@libero.it <cbertu81@libero.it> wrote:

As Jonathan mentions the compareWith on a column family def. defines the order for the columns *within* a row... In order to control the ordering of rows you'll need to use the OrderPreservingPartitioner (http://www.datastax.com/docs/0.7/operations/clustering#tokens-partitioners-ring).
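
To illustrate why rows do not come back in key order when the RandomPartitioner is in use, the following Java sketch sorts a few row keys by an MD5-derived token, which is roughly how that partitioner places rows on the ring. It is a simplified approximation under that assumption, not Cassandra's actual code, and the names TokenOrder, token and sortByToken are hypothetical.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class TokenOrder {
    // Approximation of a RandomPartitioner token (an assumption of this
    // sketch): the absolute value of the MD5 digest of the row key, read
    // as a big integer.
    static BigInteger token(String rowKey) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(rowKey.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(digest).abs();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Returns a copy of the keys sorted by token, i.e. roughly the order in
    // which a scan over the ring would visit them.
    static List<String> sortByToken(List<String> rowKeys) {
        List<String> sorted = new ArrayList<>(rowKeys);
        sorted.sort(Comparator.comparing(TokenOrder::token));
        return sorted;
    }

    public static void main(String[] args) {
        // Millisecond timestamps used as row keys, as in the original question.
        List<String> keys = Arrays.asList("1299012355000", "1299012355001", "1299012355002");
        for (String key : sortByToken(keys)) {
            System.out.println(token(key) + "  " + key);
        }
    }
}
```

Because the token comes from a hash of the key, keys that are adjacent in time end up scattered across the ring, and compareWith plays no part in that placement.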

Thanks for your answer and for your time, I will take a look at this.

As for getColumnsFromRows; it should be returning you a map of lists. The map is insertion-order-preserving and populated based on the provided list of row keys (so if you iterate over the entries in the map they should be in the same order as the list of row keys).


mmm ... well, it didn't happen like this. In my code I had a CF named comments and also a CF called usercomments. UserComments uses a UUID as the row key and keeps the "pointers" to the user's comments sorted by TimeUUID. When I get the sorted list of keys from UserComments and use this list as the row-keys list in GetColumnsFromRows, I don't get the data back sorted as I expect.

It looks as if Cassandra/Pelops does not care how I order the row-keys list. I am sure about that because I tried something different: I iterated over my row-keys list and made many GetColumnFromRow calls instead of one GetColumnsFromRows call, and when I iterate that way the data are correctly sorted. But this cannot be a solution ...


I am using Cassandra 0.6.9


I'd like to take advantage of your knowledge of Pelops to ask you something: I am evaluating the migration to Cassandra 0.7 ... as far as you know, in terms of written code, is it a heavy job?


Best Regards


Carlo


----Original message----
From: dan@reactive.org

On Saturday, 19 February 2011 at 8:16 AM, cbertu81@libero.it wrote:

Hi all,
I created a CF in which I need to get, sorted by time, the Rows inside. Each
Row represents a comment.

<ColumnFamily name="Comments" compareWith="TimeUUIDType" />

I've created a few rows using as Row Key a generated TimeUUID but when I call
the Pelops method "GetColumnsFromRows" I don't get the data back as I expect:
rows are not sorted by TimeUUID.
I thought it was probably because of the random part of the TimeUUID so I created
a new CF ...

<ColumnFamily name="Comments2" compareWith="LongType" />

This time I created a few rows using the Java System.currentTimeMillis(), which
returns a long. I called "GetColumnsFromRows" again and got the same
result: data are not sorted!
I've read many times that Rows are sorted as specified in the compareWith but
I can't see it. To solve this problem for the moment I've used a SuperColumnFamily with a
UNIQUE ROW ... but I think this is just a workaround and not the solution.

<ColumnFamily name="Comments" type="Super" compareWith="TimeUUIDType"
CompareSubcolumnsWith="BytesType"/>

Now when I call "GetSuperColumnsFromRow" I get all the SuperColumns as I
expected: sorted by TimeUUID. Why does the same not happen with the Rows?
I'm confused.

TIA for any help.

Best Regards

Carlo

--4d6d5afc_6b94764_1e9--