incubator-cassandra-user mailing list archives

From Aditya Narayan <ady...@gmail.com>
Subject Re: Schema Design Question : Supercolumn family or just a Standard column family with columns containing serialized aggregate data?
Date Wed, 02 Feb 2011 15:43:35 GMT
I think you got exactly what I wanted to convey, except for a few
things I want to clarify:

I was thinking of a single row containing all reminders (and not split
by day). The history of the reminders needs to be maintained for some time.
After a certain time (say 3 or 6 months) they may be deleted by the TTL
facility.
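
Since column TTLs are specified in seconds, the retention windows above work out roughly as in this small sketch. It assumes 30-day months, and the names SECONDS_PER_DAY and ttl_seconds are made up for illustration:

```python
# Rough TTL values for the retention periods mentioned above.
# Assumption: a "month" is treated as 30 days.
SECONDS_PER_DAY = 24 * 60 * 60

def ttl_seconds(days):
    """Convert a retention period in days to a TTL in seconds."""
    return days * SECONDS_PER_DAY

THREE_MONTHS_TTL = ttl_seconds(90)    # 7776000
SIX_MONTHS_TTL = ttl_seconds(180)     # 15552000
print(THREE_MONTHS_TTL, SIX_MONTHS_TTL)
```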

"While presenting the reminders timeline to the user, the latest
supercolumns (around 50, from the start_end) will be picked up and
their subcolumns will be compared to the tags the user has chosen
to see. For the supercolumns whose tags match, the corresponding
rows of reminder details would then be picked up."

Is a supercolumn a preferable choice for this? Can there be a better
schema than this?
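
To make the comparison concrete, here is a minimal in-memory sketch of the two layouts discussed in this thread. It does not talk to Cassandra at all: plain dicts stand in for a timeline row, integer ids stand in for TimeUUIDs (both sort chronologically), and all function names and sample data are hypothetical.

```python
import json

def build_super_row(reminders):
    """Supercolumn layout: {reminder_id: {tag_name: ""}}."""
    return {rid: {tag: "" for tag in tags} for rid, tags in reminders}

def build_serialized_row(reminders):
    """Standard layout: {reminder_id: JSON-serialized list of tags}."""
    return {rid: json.dumps(sorted(tags)) for rid, tags in reminders}

def latest_ids(row, count=50):
    """Return up to `count` reminder ids, newest first."""
    return sorted(row, reverse=True)[:count]

def filter_super(row, wanted, count=50):
    """Keep the latest reminders whose subcolumn names match a wanted tag."""
    return [rid for rid in latest_ids(row, count) if wanted & set(row[rid])]

def filter_serialized(row, wanted, count=50):
    """Same filter, but the tags are deserialized from the column value."""
    return [rid for rid in latest_ids(row, count)
            if wanted & set(json.loads(row[rid]))]

# Hypothetical sample data: (reminder_id, tags)
reminders = [(1, {"work"}), (2, {"home", "bills"}), (3, {"work", "bills"})]
wanted = {"bills"}

print(filter_super(build_super_row(reminders), wanted))            # [3, 2]
print(filter_serialized(build_serialized_row(reminders), wanted))  # [3, 2]
```

Both layouts return the same reminder keys for the same filter; in this sketch the only difference is whether the tags are read as subcolumn names or deserialized from a single column value. The returned ids would then be used to look up the reminder detail rows.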


-Aditya Narayan



On Wed, Feb 2, 2011 at 8:54 PM, William R Speirs <bill.speirs@gmail.com> wrote:
> To reiterate, so I know we're both on the same page, your schema would be
> something like this:
>
> - A column family (as you describe) to store the details of a reminder. One
> reminder per row. The row key would be a TimeUUID.
>
> - A super column family to store the reminders for each user, for each day.
> The row key would be something like: YYYYMMDD:user_id. The column names
> would simply be the TimeUUID of the messages. The sub column names would be
> the tag names of the various reminders.
>
> The idea is that you would then get a slice of each row for a user, for a
> day, that would only contain sub column names with the tags you're looking
> for? Then based upon the column names returned, you'd look-up the reminders.
>
> That seems like a solid schema to me.
>
> Bill-
>
> On 02/02/2011 09:37 AM, Aditya Narayan wrote:
>>
>> Actually, I am trying to use Cassandra to display to the users of my
>> application the list of all Reminders they have set for
>> themselves in the application.
>>
>> I need to store rows containing the timeline of daily Reminders put by
>> the users, for themselves, on the application. The reminders need to be
>> presented to the user in chronological order, like a news feed.
>> Each reminder has certain tags associated with it (so that, at
>> times, the user may also choose to see the reminders filtered by tags in
>> chronological order).
>>
>> So I thought of a schema something like this:-
>>
>> - The details of each Reminder may be stored as a separate row in a
>> column family.
>> - To present the timeline of reminders to the user, the timeline row
>> of each user would contain the Id/Key(s) (of the Reminder rows) as the
>> supercolumn names, and the subcolumns inside each supercolumn would
>> contain the list of tags associated with that particular reminder. All
>> tags are set at once during the first write. The number of tags
>> (subcolumns) will be around 8 at most.
>>
>> Any comments, suggestions and feedback on the schema design are
>> welcome.
>>
>> Thanks
>> Aditya Narayan
>>
>>
>> On Wed, Feb 2, 2011 at 7:49 PM, Aditya Narayan<adynnn@gmail.com>  wrote:
>>>
>>> Hey all,
>>>
>>> I need to store supercolumns, each with around 8 subcolumns.
>>> All the data for a supercolumn is written at once, and all subcolumns
>>> need to be retrieved together. The data in each subcolumn is not big;
>>> it just contains keys to other rows.
>>>
>>> Would it be preferable to have a supercolumn family, or just a standard
>>> column family containing all the subcolumn data serialized into a
>>> single column (or columns)?
>>>
>>> Thanks
>>> Aditya Narayan
>>>
>
