cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Data Modelling Suggestions
Date Sun, 26 Aug 2012 22:27:58 GMT
> I'm finding that only the first component is used... is this understanding correct?
The result is correct. The end of your range,

> to (end)component1=timestamp3,component2=123

sorts before the column

> Timestamp3: 777

so that column falls outside the slice.

Example:

CREATE COLUMN FAMILY 
    Foo
WITH 
    key_validation_class = UTF8Type
AND 
    comparator = 'CompositeType(IntegerType, IntegerType)'
AND 
    default_validation_class = UTF8Type
;


set Foo['bar']['1:1'] = 'baz1';
set Foo['bar']['2:2'] = 'baz2';
set Foo['bar']['3:3'] = 'baz3';
set Foo['bar']['4:4'] = 'baz4';


aarons-MBP-2011:pycassa aaron$ ./pycassaShell -k dev
In [2]: FOO.get("bar")
Out[2]: OrderedDict([((1, 1), u'baz1'), ((2, 2), u'baz2'), ((3, 3), u'baz3'), ((4, 4), u'baz4')])

In [6]: FOO.get("bar", column_start=(2,2))
Out[6]: OrderedDict([((2, 2), u'baz2'), ((3, 3), u'baz3'), ((4, 4), u'baz4')])

In [8]: FOO.get("bar", column_start=(2,2), column_finish=(3,3))
Out[8]: OrderedDict([((2, 2), u'baz2'), ((3, 3), u'baz3')])

In [9]: FOO.get("bar", column_start=(2,2), column_finish=(3,1))
Out[9]: OrderedDict([((2, 2), u'baz2')])

In [10]: FOO.get("bar", column_start=(2,), column_finish=(3,))
Out[10]: OrderedDict([((2, 2), u'baz2'), ((3, 3), u'baz3')])
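
The outputs above follow from comparing composite names component by component. Below is a minimal plain-Python sketch of that ordering, not pycassa's or Cassandra's actual implementation. The names `ROW` and `slice_columns` are hypothetical, and the sketch assumes both bounds are inclusive and that a bound with fewer components is compared only on the components it supplies.

```python
from collections import OrderedDict

# Columns of row 'bar', kept in the order the composite comparator sorts them.
ROW = OrderedDict([
    ((1, 1), "baz1"),
    ((2, 2), "baz2"),
    ((3, 3), "baz3"),
    ((4, 4), "baz4"),
])


def slice_columns(row, column_start=None, column_finish=None):
    """Hypothetical model of an inclusive composite slice (an assumption,
    not the real client code)."""
    result = OrderedDict()
    for name, value in row.items():
        # Compare only as many components as the bound supplies, so a
        # one-component bound such as (3,) covers every (3, x) column.
        if column_start is not None and name[:len(column_start)] < column_start:
            continue
        if column_finish is not None and name[:len(column_finish)] > column_finish:
            continue
        result[name] = value
    return result


# (3, 3) sorts after the finish (3, 1), so only (2, 2) is returned.
print(list(slice_columns(ROW, column_start=(2, 2), column_finish=(3, 1))))
# One-component bounds select on the first component only.
print(list(slice_columns(ROW, column_start=(2,), column_finish=(3,))))
```

In this model the second component always takes part in the comparison, but the result is still one contiguous range of the sorted columns. It cannot pick out only the columns whose second component equals a given value.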

> We see a lot of examples about Timeseries modelling ...

Sorry I do not understand this question. 

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 24/08/2012, at 11:17 PM, Roshni Rajagopal <Roshni.Rajagopal@wal-mart.com> wrote:

> Thank you Aaron & Guillermo,
> 
> I find composite columns very confusing :(
> To reconfirm:
> 
> 1.  We can only search for a range of columns using the first component of the composite column.
> 2.  After specifying a range for the first component, we cannot further filter on the second component. I found this link, http://doanduyhai.wordpress.com/2012/07/05/apache-cassandra-tricks-and-traps/, which seems to suggest that filtering by the second component in addition to the first is possible. I tried the same example but couldn't get it to work. Does anyone have an example? Suppose I have data like this in my column names:
> 
> Timestamp1: 123, Timestamp2: 456, Timestamp3: 777, Timestamp4: 654
> 
> A range of columns from (start)component1=timestamp1,component2=123 to (end)component1=timestamp3,component2=123 should give me only one column.
> I'm finding that only the first component is used... is this understanding correct?
> 
> 
> We see a lot of examples of time series modelling with TimeUUID as column names. But how does updating or deleting columns work here? How are the columns found, to know which ones to delete or modify? Does one always need a separate column family to handle updates and deletions for time series, is it usually handled by setting a TTL for data outside the archival period, or does time series modelling usually not involve any manipulation of past records?
> 
> Regards,
> Roshni
> 
> 
> 
> From: aaron morton <aaron@thelastpickle.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Re: Data Modelling Suggestions
> 
> I was trying to find Hector examples where we search for the second column in a composite column, but I couldn't find a good one. I'm not sure if it's possible... if you have any example, please share.
> It's not. When slicing columns you can only return one contiguous range.
> 
> Anyway I would prefer storing the item-ids as column names in the main column family and having a second CF for the order-by-date query only with the pair timestamp_itemid. That way you can add later other query strategies without messing with how you store the item
> +1
> Have the orders somewhere, and build a time ordered custom index to show them in order.
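
A rough sketch of that layout, using plain Python dictionaries as stand-ins for the two column families. Every name here (`items_cf`, `order_index_cf`, `add_item`, `update_qty`, `delete_item`, `items_by_time`) is hypothetical, and integer timestamps are assumed in place of TimeUUIDs. It is an illustration of the idea, not Hector or pycassa code.

```python
# Hypothetical in-memory stand-ins for two column families, keyed by user id.
items_cf = {}        # user_id -> {item_id: item details}
order_index_cf = {}  # user_id -> {(timestamp, item_id): ""}


def add_item(user_id, item_id, details, timestamp):
    # Store the item under its id, and remember when it was added so the
    # matching index column can be located later.
    items_cf.setdefault(user_id, {})[item_id] = dict(details, added=timestamp)
    order_index_cf.setdefault(user_id, {})[(timestamp, item_id)] = ""


def update_qty(user_id, item_id, qty):
    # The item is addressed directly by id; the index is left untouched.
    items_cf[user_id][item_id]["qty"] = qty


def delete_item(user_id, item_id):
    # Read the stored timestamp first; without it the index column name
    # cannot be reconstructed.
    details = items_cf[user_id].pop(item_id)
    order_index_cf[user_id].pop((details["added"], item_id), None)


def items_by_time(user_id):
    # Index column names sort by timestamp first, then by item id.
    names = sorted(order_index_cf.get(user_id, {}))
    return [(item_id, items_cf[user_id][item_id]) for _, item_id in names]


add_item("user1", "itemA", {"qty": 1}, timestamp=10)
add_item("user1", "itemB", {"qty": 2}, timestamp=20)
update_qty("user1", "itemA", 5)
print(items_by_time("user1"))
```

Updates and deletes go to the item by id, and the index row is only read to list items in time order. Keeping the added timestamp with the item is one way to find the index column on delete. A real implementation would also have to deal with the two writes not being atomic.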
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 24/08/2012, at 6:28 AM, Guillermo Winkler <gwinkler@inconcertcc.com> wrote:
> 
> I think you need another CF as index.
> 
> user_itemid -> timestamped column_name
> 
> Otherwise you can't guess what's the timestamp to use in the column name.
> 
> Anyway I would prefer storing the item-ids as column names in the main column family and having a second CF for the order-by-date query only with the pair timestamp_itemid. That way you can add later other query strategies without messing with how you store the item information.
> 
> Maybe you can solve it with a secondary index by timestamp too.
> 
> Guille
> 
> 
> On Thu, Aug 23, 2012 at 7:26 AM, Roshni Rajagopal <Roshni.Rajagopal@wal-mart.com> wrote:
> Hi,
> 
> Need some help on a data modelling question. We're using Hector & Datastax Enterprise 2.1.
> 
> 
> I want to associate a list of items with a user. It should be sorted on the time added. Items can be updated (the quantity of an item can be changed), and items can be deleted.
> I can model it like this so that it's denormalized and I get all my information in one go from one row, sorted by time added. I can use composite columns.
> 
> Row key: User Id
> Column Name: TimeUUID:item ID: Item Name: Item Description: Item Price: Item Qty
> Column Value : Null
> 
> Now, how do I handle manipulations?
> 
> 1.  Add new item: easy, just a new column.
> 2.  Add existing item or modify qty: I want to get to the correct column to update. Can I search by the second column in the composite column (equals condition) and update the column name itself to reflect the new TimeUUID and qty? Or would it be better to just add it as a new column, always use the latest column for an item in the application code, and delete duplicates in the background?
> 3.  Delete item: Can I search by the second column in the composite column to find the correct column to delete?
> 
> I was trying to find Hector examples where we search for the second column in a composite column, but I couldn't find a good one. I'm not sure if it's possible... if you have any example, please share.
> 
> Regards,
> Roshni
> 
> 
> This email and any files transmitted with it are confidential and intended solely for
the individual or entity to whom they are addressed. If you have received this email in error
destroy it immediately. *** Walmart Confidential ***

