incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Boris Yen <yulin...@gmail.com>
Subject Re: Re: Cassandra DataModeling recommendations
Date Mon, 05 Dec 2011 10:01:39 GMT
I think most of the book for cassandra are outdated, try to get information
from http://www.datastax.com/docs/1.0/index

As for ttl, you could read
http://www.datastax.com/dev/blog/whats-new-cassandra-07-expiring-columns for
more information.

for composite type, you could read
http://www.slideshare.net/edanuff/indexing-in-cassandra

On Mon, Dec 5, 2011 at 5:36 PM, <pcohen@cegetel.net> wrote:

> Hi
> Thanks for the answer, as I read the book on Cassandra, I was not aware at
> that time on Composite Key which I recently discovered.
>
> You mentioned a TTL and let the database remove the date for me. I never
> read about that. Is it possible without an external batch ?
>
> I will try to rephrase in any case my goal:
>
> Storage:
> - I would like to store for a user (identified by its id) several carts
> (BLOB).
> - Associated to these carts, I would like to attach metadata like
> expiration date and possibly others.
>
> Queries/tasks:
> - I would like to be able to retrieve all the carts of a given userId.
> - I would like to have a mean to remove expired carts.
>
> I think that this use case is not very complicated but I was wondering if
> there was any kind of modelling recommendation.
>
> Thanks for your answer.
>
> Best Regards
>
> ========================================
>
>
>
> Not sure I understand your use case, but I think you could use a composite
> column instead of composite key.
>
> For example,
>
> UserID:{
>     TimeUUID1:CartID1,
>     TimeUUID2:CartID2,
>     TimeUUID3:CartID3,
> }
>
> This way, you could do a slice query on the time if you do not need all
> the carts, and you could also get all the carts in one query.
>
> For expired carts, maybe you could attach TTL to each column that has time
> constraint, let the database remove the data for you.
>
> Hi all,
>> In order to evaluate NoSQL solutions and to gain knowledge, I am
>> currently working on a kind of prototype.
>> Here is a brief overview of the scope:
>>
>> I would like to manage user carts. Lets keep things simple:
>> A user can have up to n (lets say 3 for example) carts. Each cart will
>> contain metadata and among them an expiration date and a blob containing
>> stuff (xml in fact but I really don't care of the content).
>>
>> A user can save, retrieve or delete his carts. Additionally, a dedicated
>> batch process would remove carts who are expired.
>>
>> Basically I was thinking of two ways to model the data:
>> 1- A ColumnFamily with the userid as a key and having several
>> SuperColumns each one describing a Cart and its content.
>> This has the advantage that I can get all the Carts in a single get or
>> can do some slice queries to get only some Carts. The problem is that I
>> cannot if I am right create a secondary index on the expired date column
>> inside each Cart.
>> 2- A ColumnFamily with a composite key like userid::cartId containing the
>> expiration date column and the blob. I can in that case create an index to
>> perform a query on the expiration timestamp. The drawback is that if I want
>> to get all the Carts I need to create either a secondary ColumnFamily
>> listing the carts associated to a userid or use a kind of
>> OrderPreservingPartitionner if I want to perform a Key-Range Query.
>>
>> I made some tests and I had some problems
>> First I was unable to perform queries in the case 2 like:
>> get Carts where timestamp < xxxxxxx; The (ugly, really!) workaround was
>> to create a fake column always set to true and the query that worked was:
>> get Carts where dummy=true and timestamp < xxxxxxx; But I really dislike
>> this solution and I am almost sure this is not the right way to go.
>>
>> I tried something different like creating a dedicated timestamp
>> columnfamily associating a key based on a timestamp and columns related to
>> user and carts. In that case if I want outdated entries I could perform a
>> range query on keys of this columnfamily. But again in that case I need an
>> OrderPreservingPartionner and I fear that using a timestamp as a key would
>> lead to a bad repartition scheme among the nodes. If I fit to the second
>> proposal (with Standard Columns), columns could be directly the key like
>> userId::cartId and there is no logic in the removal process. If I fit to
>> first solution solution, I need to have some logic to analyze the column
>> key or value to get userid + cartid.
>> Another point, if I use this column family I have to manage "updates". If
>> for example I replace Cart2 of user1, I need to remove the corresponding
>> entry and add a new one. This is honestly probably not the hardest part.
>>
>> I have the feeling that having a userId based ColumnFamily with
>> SuperColumns inside and a dedicated timestamp table is the best choice. In
>> fact I think that basically my requests will be:
>> - Give me all the carts of a userId
>> - Remove all the expired carts: which is probably in fact 2 requests:
>> Find all carts whose expiry date is before a given date. Delete the found
>> stuff.
>>
>> I am fairly new to NoSQL and especially to Cassandra so I would like to
>> get any advice on:
>> 1- Is Cassandra suited to this kind of storage ? I would say yes
>> 2- What is the right way to model the data and the related constraints.
>>
>> If my description is unclear or anyone does need more details, do not
>> hesitate to ask
>> Thanks in advance for any help or advice
>>
>> Regards
>>
>> Pascal
>>
>
>
>

Mime
View raw message