cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Composite keys and range queries
Date Thu, 15 Mar 2012 08:57:47 GMT
>  is there any disadvantage to using supercolumns here? 
There are some; see http://wiki.apache.org/cassandra/CassandraLimitations

I would avoid them if you can. The one thing you cannot do when using CompositeTypes for column
names is a range delete. If you delete a super column, you delete all of its sub columns.
With a two-part column name, however, you cannot delete everything that matches "foo:*".
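
As a rough illustration of that limitation, here is a minimal Scala sketch that models a row as an in-memory sorted map of two-part column names. It does not use Hector or Cassandra at all; `CompositeDeleteSketch`, `slicePrefix`, and `deletePrefix` are hypothetical names. The read-then-delete step is only one possible client-side workaround and is not something this thread prescribes.

```scala
import scala.collection.immutable.TreeMap

object CompositeDeleteSketch {
  // A row modelled as columns sorted by a two-part composite name.
  type Row = TreeMap[(String, String), String]

  // Find every column whose first name component equals `prefix`,
  // i.e. the read a client would need to learn what "foo:*" matches.
  def slicePrefix(row: Row, prefix: String): Row =
    row.filter { case ((first, _), _) => first == prefix }

  // Emulate a range delete: one delete per column found by the slice.
  def deletePrefix(row: Row, prefix: String): Row =
    slicePrefix(row, prefix).keys.foldLeft(row)((r, name) => r - name)

  // Toy data: two columns under "foo", one under "bar".
  val row: Row = TreeMap(
    ("bar", "a") -> "1",
    ("foo", "a") -> "2",
    ("foo", "b") -> "3"
  )
}
```

In this toy model `deletePrefix(row, "foo")` issues two separate deletes, whereas with a super column named "foo" a single delete would remove both sub columns.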

> They seem a little cleaner and more straightforward for my use case, since I don't have
the advantage of the CQL composite key thing.
If they scratch your itch, grab the 1.1 beta, give them a try, and let us know how they work
for you.
http://cassandra.apache.org/download/

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 15/03/2012, at 10:23 AM, John Laban wrote:

> Ahhh, ok, I thought that CQL was just being brought up to date with the functionality
already built into composite keys, but I guess I was mistaken there.  
> 
> But I guess it's just providing a convenient abstraction, using composite column names
under the hood.  That's where I was confused, thanks.
> 
> So, in terms of composite column names vs supercolumns:  is the only advantage to composite
column names that you can do column slicing on subsets of the "subcolumns"? I.e. if I don't
mind loading all of the subcolumns for a given supercolumn name in memory at once (since I
need them all anyway), is there any disadvantage to using supercolumns here?  They seem a
little cleaner and more straightforward for my use case, since I don't have the advantage
of the CQL composite key thing.
> 
> Thanks,
> John
> 
> 
> On Wed, Mar 14, 2012 at 12:53 PM, Jeremiah Jordan <JEREMIAH.JORDAN@morningstar.com>
wrote:
> Right, so until the new CQL stuff exists to actually query with something smart enough
to know about "composite keys", you have to define and query on your own.
> 
> Row Key = UUID
> Column = CompositeColumn(string, string)
> 
> You then want to use COLUMN slicing, not row ranges, to query the data, slicing on
priority as the first part of a Composite Column Name.
> 
> See the "Under the hood and historical notes" section of the blog post.  You want to
lay out your data per the "Physical representation of the denormalized timeline rows" diagram,
> where your UUID is the "user_id" from the example and your priority is the "tweet_id".
> 
> -Jeremiah
> 
> 
> From: John Laban [john@pagerduty.com]
> Sent: Wednesday, March 14, 2012 12:37 PM
> To: user@cassandra.apache.org
> Subject: Re: Composite keys and range queries
> 
> Hmm, now I'm really confused.
> 
> > This may be of use to you http://www.datastax.com/dev/blog/schema-in-cassandra-1-1
> 
> This article is what I actually used to come up with my schema here.  In the "Clustering,
composite keys, and more" section they're using a schema very similar to how I'm trying
to use it.  They define a composite key with two parts, expecting the first part to be used
as the partition key and the second part to be used for ordering.
> 
> > The hash for (uuid-1 , p1) may be 100 and the hash for (uuid-1, p2) may be 1 .
> 
> Why?  Shouldn't only "uuid-1" be used as the partition key?  (So shouldn't those two
hash to the same location?)
> 
> I'm thinking of using supercolumns for this instead as I know they'll work (where the
row key is the uuid and the supercolumn name is the priority), but aren't composite row keys
supposed to essentially replace the need for supercolumns?
> 
> Thanks, and sorry if I'm getting this all wrong,
> John
> 
> 
> 
> On Wed, Mar 14, 2012 at 12:52 AM, aaron morton <aaron@thelastpickle.com> wrote:
> You are seeing this http://wiki.apache.org/cassandra/FAQ#range_rp
> 
> The hash for (uuid-1 , p1) may be 100 and the hash for (uuid-1, p2) may be 1 .
> 
> You cannot do what you want to. Even if you passed a start of (uuid1,<empty>) and
no finish, you would not get only the rows whose key starts with uuid1.
> 
> This may be of use to you http://www.datastax.com/dev/blog/schema-in-cassandra-1-1
> 
> Or you can store all the priorities that are valid for an ID in another row.
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 14/03/2012, at 1:05 PM, John Laban wrote:
> 
> > Forwarding to the Cassandra mailing list as well, in case this is more of an issue
on how I'm using Cassandra.
> >
> > Am I correct to assume that I can use range queries on composite row keys, even
when using a RandomPartitioner, if I make sure that the first part of the composite key is
fixed?
> >
> > Any help would be appreciated,
> > John
> >
> >
> >
> > On Tue, Mar 13, 2012 at 12:15 PM, John Laban <john@pagerduty.com> wrote:
> > Hi,
> >
> > I have a column family that uses a composite key:
> >
> > (ID, priority) -> ...
> >
> > Where the ID is a UUID and the priority is an integer.
> >
> > I'm trying to perform a range query now:  I want all the rows where the ID matches
some fixed UUID, but within a range of priorities.  This is supported even if I'm using a
RandomPartitioner, right?  (Because the first key in the composite key is the partition key,
and the second part of the composite key is automatically ordered?)
> >
> > So I perform a range slices query:
> >
> > val rangeQuery = HFactory.createRangeSlicesQuery(keyspace, new CompositeSerializer, StringSerializer.get, BytesArraySerializer.get)
> > rangeQuery.setColumnFamily(RouteColumnFamilyName).
> >             setKeys( new Composite(id, priorityStart), new Composite(id, priorityEnd) ).
> >             setRange( null, null, false, Int.MaxValue )
> >
> >
> > But I get this error:
> >
> > me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:start
key's md5 sorts after end key's md5.  this is not allowed; you probably should not specify
end key at all, under RandomPartitioner)
> >
> > Shouldn't they have the same md5, since they have the same partition key?
> >
> > Am I using the wrong query here, or does Hector not support composite range queries,
or am I making some mistake in how I think Cassandra's composite keys work?
> >
> > Thanks,
> > John
> >
> >
> 
> 
> 
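
The two ideas discussed in this thread can be sketched in a few lines of Scala. This is a simplified, self-contained model, not Hector or Cassandra code. It assumes the partitioner's token comes from the MD5 of the whole serialized row key, and it uses a hypothetical "uuid:priority" string encoding in place of the real composite serialization. `PartitionSketch`, `token`, and `slice` are illustrative names only.

```scala
import java.security.MessageDigest
import scala.collection.immutable.TreeMap

object PartitionSketch {
  // Simplified stand-in for RandomPartitioner: hash the entire row key.
  // The string encoding of the composite key is an assumption.
  def token(rowKey: String): BigInt =
    BigInt(1, MessageDigest.getInstance("MD5").digest(rowKey.getBytes("UTF-8")))

  // Layout suggested in the thread: row key = UUID, and each column name
  // is a composite whose first part is the priority, so the columns
  // inside one row are kept sorted by priority.
  val row: TreeMap[(Int, String), String] = TreeMap(
    (1, "a") -> "route-a",
    (5, "b") -> "route-b",
    (9, "c") -> "route-c"
  )

  // Column slice over an inclusive priority range within that single row.
  def slice(from: Int, to: Int): List[((Int, String), String)] =
    row.filter { case ((priority, _), _) => priority >= from && priority <= to }.toList
}
```

In this model `token("uuid-1:1")` and `token("uuid-1:2")` are unrelated values, which is why a row range between two composite keys has no meaningful order under RandomPartitioner. `slice(2, 9)` stays inside one row and returns the columns with priorities 5 and 9 in sorted order.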

