incubator-cassandra-user mailing list archives

From Jeremiah Jordan <JEREMIAH.JOR...@morningstar.com>
Subject RE: Composite keys and range queries
Date Wed, 14 Mar 2012 19:53:39 GMT
Right, so until the new CQL stuff exists to actually query with something smart enough to know
about "composite keys", you have to define and query them on your own.

Row Key = UUID
Column = CompositeColumn(string, string)

You then want to use COLUMN slicing, not row ranges, to query the data: make the priority
the first part of the Composite Column Name and slice on it.

See the "Under the hood and historical notes" section of the blog post.  You want to lay out
your data per the "Physical representation of the denormalized timeline rows" diagram,
where your UUID is the "user_id" from the example, and your priority is the "tweet_id".
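
As a rough illustration of that layout (a sketch only, not Hector or Cassandra code), the
snippet below models one wide row as a sorted map keyed by a composite column name and
slices it by priority.  The names CompositeSliceSketch, ColumnName, Row and slice are made
up for the example, and it assumes the priority is an Int in the first component with a
hypothetical route name in the second.

```scala
import scala.collection.immutable.TreeMap

object CompositeSliceSketch {
  // Hypothetical column name: (priority, routeName), sorted by priority first.
  type ColumnName = (Int, String)

  // One wide row per UUID. The TreeMap keeps columns in comparator order,
  // which stands in for Cassandra's sorted column storage within a row.
  type Row = TreeMap[ColumnName, String]

  // Slice the columns of a single row whose priority lies in [start, end].
  def slice(row: Row, start: Int, end: Int): List[(ColumnName, String)] =
    row.iteratorFrom((start, "")) // "" sorts before every other string
      .takeWhile { case ((priority, _), _) => priority <= end }
      .toList

  // Sample row for one UUID (values are placeholders).
  val row: Row = TreeMap(
    (1, "a") -> "v1",
    (2, "b") -> "v2",
    (2, "c") -> "v3",
    (5, "d") -> "v4"
  )

  def main(args: Array[String]): Unit =
    slice(row, 2, 4).foreach(println)
}
```

Because the whole row lives under a single row key (the UUID), the partitioner never gets
involved in the ordering; only the column comparator does.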

-Jeremiah


________________________________
From: John Laban [john@pagerduty.com]
Sent: Wednesday, March 14, 2012 12:37 PM
To: user@cassandra.apache.org
Subject: Re: Composite keys and range queries

Hmm, now I'm really confused.

> This may be of use to you http://www.datastax.com/dev/blog/schema-in-cassandra-1-1

This article is what I actually used to come up with my schema here.  In the "Clustering,
composite keys, and more" section they're using a schema very similar to the one I'm trying
to use.  They define a composite key with two parts, expecting the first part to be used
as the partition key and the second part to be used for ordering.

> The hash for (uuid-1 , p1) may be 100 and the hash for (uuid-1, p2) may be 1 .

Why?  Shouldn't only "uuid-1" be used as the partition key?  (So shouldn't those two hash
to the same location?)

I'm thinking of using supercolumns for this instead as I know they'll work (where the row
key is the uuid and the supercolumn name is the priority), but aren't composite row keys supposed
to essentially replace the need for supercolumns?

Thanks, and sorry if I'm getting this all wrong,
John



On Wed, Mar 14, 2012 at 12:52 AM, aaron morton <aaron@thelastpickle.com> wrote:
You are seeing this http://wiki.apache.org/cassandra/FAQ#range_rp

The hash for (uuid-1 , p1) may be 100 and the hash for (uuid-1, p2) may be 1 .

You cannot do what you want to. Even if you passed a start of (uuid1,<empty>) and no
finish, you would not get only the rows where the key starts with uuid1.
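
To see why, here is a small sketch that treats the MD5 of the entire row key as the token,
which is roughly what RandomPartitioner does.  The object name TokenSketch and the string
encoding "uuid-1:1" of the composite key are assumptions for illustration; Cassandra hashes
the serialized key bytes, not this string.

```scala
import java.math.BigInteger
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

object TokenSketch {
  // Simplified stand-in for a RandomPartitioner token: the MD5 of the WHOLE
  // row key, read as a non-negative integer.
  def token(rowKey: String): BigInteger = {
    val digest = MessageDigest.getInstance("MD5")
      .digest(rowKey.getBytes(StandardCharsets.UTF_8))
    new BigInteger(1, digest)
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical string encodings of (uuid-1, 1) and (uuid-1, 2).
    val t1 = token("uuid-1:1")
    val t2 = token("uuid-1:2")
    println(s"token(uuid-1, 1) = $t1")
    println(s"token(uuid-1, 2) = $t2")
    // Both parts of the key feed the hash, so sharing a prefix does not
    // make the tokens equal or even close to each other.
    println(s"same token? ${t1 == t2}")
  }
}
```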

This may be of use to you http://www.datastax.com/dev/blog/schema-in-cassandra-1-1

Or you can store all the priorities that are valid for an ID in another row.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/03/2012, at 1:05 PM, John Laban wrote:

> Forwarding to the Cassandra mailing list as well, in case this is more of an issue on
> how I'm using Cassandra.
>
> Am I correct to assume that I can use range queries on composite row keys, even when
> using a RandomPartitioner, if I make sure that the first part of the composite key is fixed?
>
> Any help would be appreciated,
> John
>
>
>
> On Tue, Mar 13, 2012 at 12:15 PM, John Laban <john@pagerduty.com> wrote:
> Hi,
>
> I have a column family that uses a composite key:
>
> (ID, priority) -> ...
>
> Where the ID is a UUID and the priority is an integer.
>
> I'm trying to perform a range query now:  I want all the rows where the ID matches some
> fixed UUID, but within a range of priorities.  This is supported even if I'm using a
> RandomPartitioner, right?  (Because the first key in the composite key is the partition
> key, and the second part of the composite key is automatically ordered?)
>
> So I perform a range slices query:
>
> val rangeQuery = HFactory.createRangeSlicesQuery(keyspace, new CompositeSerializer,
>     StringSerializer.get, BytesArraySerializer.get)
> rangeQuery.setColumnFamily(RouteColumnFamilyName).
>             setKeys(new Composite(id, priorityStart), new Composite(id, priorityEnd)).
>             setRange(null, null, false, Int.MaxValue)
>
>
> But I get this error:
>
> me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:start key's md5 sorts after end key's md5.  this is not allowed; you probably should not specify end key at all, under RandomPartitioner)
>
> Shouldn't they have the same md5, since they have the same partition key?
>
> Am I using the wrong query here, or does Hector not support composite range queries, or
> am I making some mistake in how I think Cassandra's composite keys work?
>
> Thanks,
> John
>
>


