incubator-cassandra-user mailing list archives

From Roshni Rajagopal <Roshni.Rajago...@wal-mart.com>
Subject Re: Data Modelling Suggestions
Date Fri, 24 Aug 2012 11:17:20 GMT
Thank you Aaron & Guillermo,

I find composite columns very confusing :(
To reconfirm:

 1.  We can only search for a range of columns using the first component of the composite column.
 2.  After specifying a range for the first component, we cannot further filter on the second
component.  I found this link http://doanduyhai.wordpress.com/2012/07/05/apache-cassandra-tricks-and-traps/
 which seems to suggest that filtering by the second component in addition to the first is possible.
I tried the same example but I couldn't get it to work. Does anyone have a working example?
Suppose I have data like this in my column names:

Timestamp1: 123, Timestamp2: 456, Timestamp3: 777, Timestamp4: 654

Getting the range of columns from (start) component1=timestamp1, component2=123 to
(end) component1=timestamp3, component2=123 should give me only one column.
I'm finding that only the first component is used. Is this understanding correct?


We see a lot of examples of time series modelling with TimeUUID as column names. But how
does updating or deleting columns work here? How are the columns found, to know which
ones to delete or modify? Does one always need a separate column family to handle updating/deletion
for time series, or is it usually handled by setting a TTL for data outside the archival period,
or does time series modelling usually not involve any manipulation of past records?

Regards,
Roshni



From: aaron morton <aaron@thelastpickle.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Data Modelling Suggestions

> I was trying to find Hector examples where we search on the second component of a composite column,
> but I couldn't find any good ones. I'm not sure if it's possible. If you do have an
> example, please share.
It's not. When slicing columns you can only return one contiguous range.
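
To make the "one contiguous range" point concrete, here is a minimal sketch in plain Python. It is not Hector or Cassandra code: it assumes composite column names can be modelled as tuples that sort component by component, and it uses the small integers 1 to 4 as stand-ins for Timestamp1 to Timestamp4 from the question. The helper name slice_columns is hypothetical.

```python
from bisect import bisect_left, bisect_right

# Composite column names modelled as (timestamp, value) tuples.
# Python compares tuples component by component, first component first,
# which is the ordering assumed here for composite columns.
columns = sorted([(1, 123), (2, 456), (3, 777), (4, 654)])

def slice_columns(cols, start, end):
    """Return the single contiguous run of names with start <= name <= end."""
    lo = bisect_left(cols, start)
    hi = bisect_right(cols, end)
    return cols[lo:hi]

# Range from (Timestamp1, 123) to (Timestamp3, 123), as in the question.
result = slice_columns(columns, (1, 123), (3, 123))
print(result)
```

In this model the second component only matters at the two ends of the range: (3, 777) is left out because it sorts after (3, 123), but (2, 456) is returned because it sits between the start and end names. The second component never acts as an equality filter across the whole range.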

> Anyway I would prefer storing the item-ids as column names in the main column family and having
> a second CF for the order-by-date query only with the pair timestamp_itemid. That way you
> can add other query strategies later without messing with how you store the item information.
+1
Have the orders somewhere, and build a time ordered custom index to show them in order.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 24/08/2012, at 6:28 AM, Guillermo Winkler <gwinkler@inconcertcc.com> wrote:

I think you need another CF as index.

user_itemid -> timestamped column_name

Otherwise you can't tell which timestamp to use in the column name.

Anyway I would prefer storing the item-ids as column names in the main column family and having
a second CF for the order-by-date query only with the pair timestamp_itemid. That way you
can add other query strategies later without messing with how you store the item information.
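
A rough sketch of that layout, using plain Python dicts as stand-ins for the column families of a single user row. Nothing here is Hector or Cassandra API: the names items, by_time, added_at and the helper functions are all hypothetical, and the timestamps are small integers rather than TimeUUIDs.

```python
# Hypothetical in-memory stand-ins for the column families of one user row.
items = {}      # main CF: item_id -> item data
by_time = {}    # index CF: (timestamp, item_id) -> no value, read in sorted order
added_at = {}   # lookup CF: item_id -> timestamp used in the index column name

def remove_item(item_id):
    """Delete an item and its index entry, found via the lookup CF."""
    ts = added_at.pop(item_id, None)
    if ts is not None:
        del by_time[(ts, item_id)]
    items.pop(item_id, None)

def add_item(item_id, data, ts):
    """Add an item, or re-add it under a new timestamp."""
    remove_item(item_id)  # drop any older index entry first
    items[item_id] = data
    by_time[(ts, item_id)] = None
    added_at[item_id] = ts

def update_qty(item_id, qty):
    """Change the quantity; the time-ordered index is not touched."""
    items[item_id]["qty"] = qty

def items_by_time():
    """Return (item_id, data) pairs in the order the items were added."""
    return [(item_id, items[item_id]) for _, item_id in sorted(by_time)]

add_item("pen", {"qty": 1}, ts=10)
add_item("ink", {"qty": 2}, ts=20)
update_qty("pen", 5)
remove_item("ink")
listing = items_by_time()
print(listing)
```

The point of the split is that updates and deletes go straight to a known item id, and the timestamp needed to find the index column is looked up rather than guessed.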

Maybe you can solve it with a secondary index by timestamp too.

Guille


On Thu, Aug 23, 2012 at 7:26 AM, Roshni Rajagopal <Roshni.Rajagopal@wal-mart.com> wrote:
Hi,

I need some help with a data modelling question. We're using Hector & DataStax Enterprise 2.1.


I want to associate a list of items with a user. It should be sorted by the time added.
Items can be updated (the quantity of an item can be changed), and items can be deleted.
I can model it like this so that it's denormalized and I get all my information in one go from
one row, sorted by time added. I can use composite columns.

Row key: User Id
Column Name: TimeUUID:item ID: Item Name: Item Description: Item Price: Item Qty
Column Value : Null

Now, how do I handle manipulations?

 1.  Add new item: Easy, just a new column.
 2.  Add existing item or modify qty: I want to get to the correct column to update. Can I
search by the second component of the composite column (equals condition) & update the column
name itself to reflect the new TimeUUID and qty?  Or would it be better to just add it as a new
column, always use the latest column for an item in the application code, and delete duplicates
in the background?
 3.  Delete item: Can I search by the second component of the composite column to find the correct
column to delete?
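
The "add a new column and use the latest one" option from point 2 could look roughly like this in application code. This is a plain Python sketch, not Hector code: column names are modelled as (time_added, item_id, qty) tuples with small integers in place of TimeUUIDs, and the function names are hypothetical.

```python
# One user's row, with column names modelled as (time_added, item_id, qty).
# "pen" appears twice because its quantity was changed by adding a new column.
row = [(1, "pen", 1), (2, "ink", 2), (3, "pen", 5)]

def latest_per_item(columns):
    """Keep only the most recent column for each item id, in time order."""
    latest = {}
    for col in sorted(columns):
        latest[col[1]] = col  # later columns overwrite earlier ones
    return sorted(latest.values())

def stale_columns(columns):
    """Columns a background job could delete as superseded duplicates."""
    keep = set(latest_per_item(columns))
    return [col for col in sorted(columns) if col not in keep]

current = latest_per_item(row)
stale = stale_columns(row)
print(current, stale)
```

Note that this approach still reads the whole row to find an item, so it only avoids the lookup problem rather than solving it.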

I was trying to find Hector examples where we search on the second component of a composite column,
but I couldn't find any good ones. I'm not sure if it's possible. If you do have an
example, please share.

Regards,
Roshni


This email and any files transmitted with it are confidential and intended solely for the
individual or entity to whom they are addressed. If you have received this email in error
destroy it immediately. *** Walmart Confidential ***

