incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Doubts related to composite type column names/values
Date Wed, 21 Dec 2011 21:17:25 GMT
Keys are sorted by their token. When using the RandomPartitioner the token is an MD5 hash
of the key, so rows are stored in an essentially random order.
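As a rough illustration of why the order looks random, the sketch below sorts a few keys by a token derived from their MD5 digest. The helper name `md5_token` and the sample keys are made up for this example, and the conversion (digest read as a signed big-endian integer, then made non-negative) is an assumption about how the token is derived, not code taken from Cassandra.

```python
import hashlib

def md5_token(key: bytes) -> int:
    """Hypothetical helper: turn the MD5 digest of a key into a non-negative integer token."""
    digest = hashlib.md5(key).digest()
    # Assumption: the 16-byte digest is read as a signed big-endian integer
    # and its absolute value is used as the token.
    return abs(int.from_bytes(digest, byteorder="big", signed=True))

keys = [b"row1", b"row2", b"row3", b"row4"]

# Order in which the rows would be laid out when sorted by token
# rather than by the key bytes themselves.
by_token = sorted(keys, key=md5_token)

for key in by_token:
    print(key, md5_token(key))
```

The token order generally has no relation to the lexical order of the keys, which is why range scans over row keys are not meaningful with this partitioner.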

I would use CompositeTypes as keys if they make sense for your app, e.g. you are storing
time series data and the row key is the time stamp and the length of the time span. In this
case you have a stable, known format of <int : str>. The advantage here is the same
as any time you introduce type awareness into a system: some code somewhere will notice if you
try to store a key of the wrong form.

If you have keys with a variable number of elements, such as a path hierarchy, it would
not make sense to model them as a CompositeType (IMHO).
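The kind of check that type awareness buys can be sketched as follows. This is a client-side stand-in only: `KEY_FORMAT` and `make_row_key` are hypothetical names, and the real validation would be done by the CompositeType itself rather than by code like this.

```python
from typing import Tuple

# Assumed key layout for illustration: <timestamp : span>, i.e. <int : str>.
KEY_FORMAT = (int, str)

def make_row_key(*components) -> Tuple:
    """Hypothetical helper: reject keys that do not match KEY_FORMAT."""
    if len(components) != len(KEY_FORMAT):
        raise ValueError(
            f"expected {len(KEY_FORMAT)} components, got {len(components)}"
        )
    for position, (value, expected) in enumerate(zip(components, KEY_FORMAT)):
        if not isinstance(value, expected):
            raise TypeError(
                f"component {position} must be {expected.__name__}, "
                f"got {type(value).__name__}"
            )
    return tuple(components)

key = make_row_key(1324500000, "1h")  # well-formed <int : str> key
```

A key with the wrong component type, or a path-like key with a varying number of elements, fails immediately instead of being stored silently.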

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 22/12/2011, at 1:26 AM, R. Verlangen wrote:

> Is it true that you can get the same results by picking a UTF8 key with
> this content:
> keyA:keyB
> 
> Or should you really use the composite keys? If so, what is the big advantage of composite
> keys over combined UTF-8 keys?
> 
> Robin
> 
> 2011/12/21 Sylvain Lebresne <sylvain@datastax.com>
> On Tue, Dec 20, 2011 at 9:33 PM, Maxim Potekhin <potekhin@bnl.gov> wrote:
> > Thank you Aaron! As long as I have plain strings, would you say that I would
> > do almost as well with catenation?
> 
> Not without a concatenation-aware comparator. The padding Aaron is talking about
> is not only a mixed-type problem. What I mean is that if you use a simple
> string comparator (UTF8Type, AsciiType or even BytesType), then you will get
> the following sort order:
> "foo24:bar"
> "foo:bar"
> "foobar:bar"
> because ':' sorts between '2' and 'b' in ASCII. You could use another separator, but
> you get the point. In other words, concatenating strings doesn't make the
> comparator aware of where one component ends and the next begins.
> CompositeType on the other hand sorts each component separately, so it will
> sort:
> "foo"      : "bar"
> "foo24"  : "bar"
> "foobar" : "bar"
> which is usually what you want.
> 
> --
> Sylvain
> 
> >
> > Of course I realize that mixed types are a very different case where the
> > composite is very useful.
> >
> > Thanks
> >
> > Maxim
> >
> >
> >
> > On 12/20/2011 2:44 PM, aaron morton wrote:
> >
> > Component values are compared in a type-aware fashion: an Integer is an
> > Integer, not a 10-character zero-padded string.
> >
> > You can also slice on the components, just like with string concatenation but
> > nicer. E.g. if your app is storing comments for a thing, and the column
> > names have the form <comment_id, field> or <Integer, String>, you can slice
> > for all properties of a comment, or all properties for comments between two
> > comment_ids.
> >
> > Finally, the client library knows what's going on.
> >
> > Hope that helps.
> >
> > -----------------
> > Aaron Morton
> > Freelance Developer
> > @aaronmorton
> > http://www.thelastpickle.com
> >
> > On 21/12/2011, at 7:43 AM, Maxim Potekhin wrote:
> >
> > With regard to the static composite, what are the major benefits compared with
> > string concatenation (with some convenient separator inserted)?
> >
> > Thanks
> >
> > Maxim
> >
> >
> > On 12/20/2011 1:39 PM, Richard Low wrote:
> >
> > On Tue, Dec 20, 2011 at 5:28 PM, Ertio Lew<ertiop93@gmail.com>  wrote:
> >
> > > With regard to the composite columns stuff in Cassandra, I have the
> > > following doubts:
> > >
> > > 1. What is the storage overhead of the composite type column names/values?
> >
> > The values are the same. For each dimension, there are 3 bytes of overhead.
> >
> > > 2. What exactly is the difference between the DynamicComposite and the static
> > > composite?
> >
> > Static composite type has the types of each dimension specified in the
> > column family definition, so all names within that column family have
> > the same type. Dynamic composite type lets you specify the type for
> > each column, so they can be different. There is extra storage
> > overhead for this and care must be taken to ensure all column names
> > remain comparable.
> >
> >
> >
> >
> >
> 
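The sort orders Sylvain describes, and the component slicing Aaron mentions, can be reproduced outside Cassandra with plain Python. This is only a sketch: strings stand in for a UTF8Type comparator, tuples stand in for composite column names, and `slice_comments` is a hypothetical helper, not a client library call.

```python
import bisect

# Concatenated names compared as plain strings, as a string comparator would.
concatenated = ["foo:bar", "foobar:bar", "foo24:bar"]
string_order = sorted(concatenated)
# ':' sorts between '2' and 'b', so "foo24:bar" comes before "foo:bar".

# The same names compared component by component, as a composite would.
composite_order = sorted(tuple(name.split(":")) for name in concatenated)

# Column names of the form <comment_id, field>, i.e. <Integer, String>.
columns = sorted([
    (1, "author"), (1, "body"),
    (2, "author"), (2, "body"),
    (3, "author"), (3, "body"),
    (10, "body"),
])

def slice_comments(names, first_id, last_id):
    """Hypothetical helper: all columns with first_id <= comment_id <= last_id."""
    start = bisect.bisect_left(names, (first_id,))
    end = bisect.bisect_left(names, (last_id + 1,))
    return names[start:end]

print(string_order)
print(composite_order)
print(slice_comments(columns, 2, 3))
```

Because the first component is compared as an integer, comment 10 sorts after comment 3 without any zero padding, and a slice from 2 to 3 returns exactly the four columns of those two comments.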


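The 3 bytes of overhead per dimension mentioned by Richard can be illustrated with a small encoder. The layout assumed here (a 2-byte big-endian length, the component bytes, then a single end-of-component byte) is an assumption consistent with that figure, and `encode_composite` is a hypothetical helper rather than a Cassandra or client API.

```python
import struct

def encode_composite(components):
    """Hypothetical encoder: length-prefix each component and terminate it with one byte."""
    out = bytearray()
    for value in components:
        out += struct.pack(">H", len(value))  # assumed 2-byte unsigned length
        out += value                          # the component bytes, unchanged
        out += b"\x00"                        # assumed end-of-component byte
    return bytes(out)

parts = [b"foo", b"bar"]
encoded = encode_composite(parts)

# Bytes added on top of the raw component values.
overhead = len(encoded) - sum(len(p) for p in parts)
print(len(encoded), overhead)
```

Under these assumptions a two-component name carries 6 bytes of overhead, i.e. 3 bytes per component, regardless of how long the components are.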