incubator-cassandra-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: Doubts related to composite type column names/values
Date Mon, 26 Dec 2011 16:12:25 GMT
I would go with composites because Cassandra can do better validation. Also,
with composites you have a few more options for your slice start: start key
inclusive, start key exclusive, etc. If you are going to concatenate, tilde is
a better option than ':' because of its ASCII value (0x7E, which sorts after
every letter and digit).

On Wednesday, December 21, 2011, aaron morton <aaron@thelastpickle.com>
wrote:
> Keys are sorted by their token; when using the RandomPartitioner this is
> an MD5 hash, so they are essentially randomly sorted.
> I would use CompositeTypes as keys if they make sense for your app, e.g.
> you are storing time series data and the row key is the time stamp and the
> length of the time span. In this case you have a stable, known format of
> <int : str>. The advantage here is the same as any time you introduce type
> awareness into a system: some code somewhere will notice if you try to
> store a key of the wrong form.
> If you have keys that have a variable number of elements, such as a path
> hierarchy, it would not make sense to model that as a CompositeType (IMHO).
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> On 22/12/2011, at 1:26 AM, R. Verlangen wrote:
>
> Is it true that you can get the same results by picking a UTF8 key with
> this content:
> keyA:keyB
> Or should you really use the composite keys? If so, what is the big
> advantage of composite over combined UTF-8 keys?
> Robin
>
> 2011/12/21 Sylvain Lebresne <sylvain@datastax.com>
>>
>> On Tue, Dec 20, 2011 at 9:33 PM, Maxim Potekhin <potekhin@bnl.gov> wrote:
>> > Thank you Aaron! As long as I have plain strings, would you say that I
>> > would do almost as well with catenation?
>>
>> Not without a concatenation-aware comparator. The padding Aaron is
>> talking about is not only a mixed-type problem. What I mean here is that
>> if you use a simple string comparator (UTF8Type, AsciiType or even
>> BytesType), then you will have the following sorting:
>> "foo24:bar"
>> "foo:bar"
>> "foobar:bar"
>> because ':' is between '2' and 'b' in ASCII. You could use another
>> separator, but you get the point. In other words, concatenating strings
>> doesn't make the comparator aware of that fact.
>> CompositeType, on the other hand, sorts each component separately, so it
>> will sort:
>> "foo"    : "bar"
>> "foo24"  : "bar"
>> "foobar" : "bar"
>> which is usually what you want.
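
The two orderings above can be reproduced with a minimal sketch. Python tuples stand in for CompositeType here only as an illustration of component-wise comparison of string components; this is an assumption for demonstration, not Cassandra's comparator code.

```python
# Component pairs from the example above.
pairs = [("foo", "bar"), ("foo24", "bar"), ("foobar", "bar")]

# Plain string comparator: the joined key is compared character by character,
# so the ':' separator takes part in the ordering.
concatenated = sorted(f"{first}:{second}" for first, second in pairs)

# Component-wise comparison (tuples stand in for a composite comparator):
# the first component is compared on its own, then the second.
component_wise = sorted(pairs)

print(concatenated)
print(component_wise)
```

The concatenated list comes out in the first order shown above, the tuple list in the second.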
>>
>> --
>> Sylvain
>>
>> >
>> > Of course I realize that mixed types are a very different case where
>> > the composite is very useful.
>> >
>> > Thanks
>> >
>> > Maxim
>> >
>> >
>> >
>> > On 12/20/2011 2:44 PM, aaron morton wrote:
>> >
>> > Component values are compared in a type-aware fashion: an Integer is an
>> > Integer, not a 10-character zero-padded string.
>> >
>> > You can also slice on the components, just like with string concat, but
>> > nicer. E.g. if your app is storing comments for a thing, and the column
>> > names have the form <comment_id, field> or <Integer, String>, you can
>> > slice for all properties of a comment or all properties for comments
>> > between two comment_ids.
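
A minimal sketch of the slicing idea, assuming hypothetical column names of the form (comment_id, field) held in a sorted Python list. The `slice_comments` helper and the sample data are invented for illustration; a real client would issue a slice query against Cassandra instead.

```python
import bisect

# Hypothetical column names <comment_id, field>, ordered component-wise.
columns = sorted([
    (9, "author"), (9, "body"),
    (10, "author"), (10, "body"),
    (100, "author"), (100, "body"),
])

def slice_comments(cols, first_id, last_id):
    """Return every column whose comment_id lies in [first_id, last_id]."""
    # (n,) sorts before every (n, field), so it marks the start of id n.
    start = bisect.bisect_left(cols, (first_id,))
    stop = bisect.bisect_left(cols, (last_id + 1,))
    return cols[start:stop]

one_comment = slice_comments(columns, 10, 10)  # all properties of comment 10
id_range = slice_comments(columns, 9, 10)      # comments 9 through 10

# The same ids as unpadded strings no longer sort numerically.
ids_as_strings = sorted({str(comment_id) for comment_id, _ in columns})
print(one_comment, id_range, ids_as_strings)
```

The last line illustrates the padding point: as strings, "9" sorts after "100" unless every id is zero-padded to a fixed width.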
>> >
>> > Finally, the client library knows what's going on.
>> >
>> > Hope that helps.
>> >
>> > -----------------
>> > Aaron Morton
>> > Freelance Developer
>> > @aaronmorton
>> > http://www.thelastpickle.com
>> >
>> > On 21/12/2011, at 7:43 AM, Maxim Potekhin wrote:
>> >
>> > With regard to static composites, what are the major benefits compared
>> > with string catenation (with some convenient separator inserted)?
>> >
>> > Thanks
>> >
>> > Maxim
>> >
>> >
>> > On 12/20/2011 1:39 PM, Richard Low wrote:
>> >
>> > On Tue, Dec 20, 2011 at 5:28 PM, Ertio Lew <ertiop93@gmail.com> wrote:
>> >
>> > With regard to the composite columns stuff in Cassandra, I have the
>> > following doubts:
>> >
>> > 1. What is the storage overhead of the composite type column
>> > names/values?
>> >
>> > The values are the same. For each dimension, there are 3 bytes of
>> > overhead.
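
A rough sketch of where 3 bytes per dimension could come from, assuming each component of a static composite name is laid out as a 2-byte big-endian length, the value bytes, and a 1-byte end-of-component marker. That layout is an assumption not stated in this thread, and `encode_composite` is a hypothetical helper, not a client library call.

```python
import struct

def encode_composite(*components: bytes) -> bytes:
    """Encode components as <2-byte length><value><end-of-component byte>."""
    out = bytearray()
    for value in components:
        out += struct.pack(">H", len(value))  # 2-byte unsigned length
        out += value                          # raw component bytes
        out += b"\x00"                        # end-of-component marker
    return bytes(out)

name = encode_composite(b"foo", b"bar")
overhead = len(name) - len(b"foo") - len(b"bar")
print(len(name), overhead)
```

Under that assumption a two-component name carries 6 bytes beyond the raw values, i.e. 3 per dimension.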
>> >
>> >
>> > 2. What exactly is the difference between the DynamicComposite and
>> > Static Composite?
>> >
>> > Static composite type has the types of each dimension specified in the
>> > column family definition, so all names within that column family have
>> > the same type. Dynamic composite type lets you specify the type for
>> > each column, so they can be different. There is extra storage
>> > overhead for this and care must be taken to ensure all column names
>> > remain comparable.
>> >
