incubator-cassandra-user mailing list archives

From Aditya <ady...@gmail.com>
Subject Re: Seeking advice on Schema and Caching
Date Thu, 17 Nov 2011 05:34:28 GMT
On Thu, Nov 17, 2011 at 10:25 AM, samal <samalgorai@gmail.com> wrote:

> >> Edanuff + Beautiful People
>
> I think the "row cache" could be the best fit, but it can consume resources
> depending on row size. It will only touch disk once (the first time) in the
> case of an SSTable; the rest of the requests for that row will be served from
> memory. Try increasing the row cache size and decreasing the save period to
> an appropriate value:
> *Row cache size / save period in seconds:* 200/30
>

Very nice. I didn't know that we could have a "save period" setting
as well. This makes the job easier. Now I can reduce the period to 30 sec &
set the row cache size to a good enough limit. Thanks :)

Yes, there may be rows that will be very wide. I'll need to figure out if I
can do something better for those, but even this won't be problematic as long
as my cache period is reasonable and the cache size is set to a good limit, right?
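
As a rough illustration of the behaviour being asked about (a row is read from
the database once and then served from memory for at most about 30 seconds,
within a bounded number of rows), here is a minimal application-side sketch.
It is not Cassandra's row cache and does not model its save period setting;
the class name `ShortLivedRowCache`, the `fetch_row` callback and the injected
`clock` are hypothetical names introduced only for this example.

```python
import time
from collections import OrderedDict


class ShortLivedRowCache:
    """Hypothetical application-side cache: bounded size, short expiry."""

    def __init__(self, fetch_row, max_rows=200, ttl_seconds=30.0,
                 clock=time.monotonic):
        self._fetch_row = fetch_row   # assumed callback that reads one row from the DB
        self._max_rows = max_rows     # upper bound on cached rows
        self._ttl = ttl_seconds       # how long a cached row stays valid
        self._clock = clock           # injectable so expiry can be tested
        self._rows = OrderedDict()    # row_key -> (expires_at, row)

    def get(self, row_key):
        now = self._clock()
        entry = self._rows.get(row_key)
        if entry is not None and entry[0] > now:
            self._rows.move_to_end(row_key)   # mark as most recently used
            return entry[1]
        # Miss or expired entry: read through to the database once.
        row = self._fetch_row(row_key)
        self._rows[row_key] = (now + self._ttl, row)
        self._rows.move_to_end(row_key)
        while len(self._rows) > self._max_rows:
            self._rows.popitem(last=False)    # evict the least recently used row
        return row
```

With this shape, repeated lookups of the same 3-letter key during one search
hit the database only once, and a very wide row occupies a single slot, so its
memory cost depends on the row's size rather than on the row count limit.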

> One catch: this is only good for small rows. As each row contains
> all entries sharing the same first 3 characters, one row could
> become very large while others remain very thin.
> e.g.:
>  many people can have the name aditya
> adi{
> {tya,1}
> .
> .
> }
>
> but only a few people will have names starting with x or y.
>
>
>
> On Thu, Nov 17, 2011 at 3:29 AM, Aditya <adynnn@gmail.com> wrote:
>
>> Thanks to samal, who pointed me to composite columns. I am now
>> using composite column names containing username+userId, with valueless
>> columns. Column names are now unique even for users with the same name, as
>> the userId is also part of the composite column name. Thus the
>> supercolumn issue is resolved.
>> But I am still seeking some advice on the caching strategy for these
>> rows. While a user is doing a search, the DB will be
>> queried multiple times because I'm not keeping the retrieved columns in
>> the application layer. So I am thinking of caching this row so that
>> further queries are served from the cache. However, the important
>> point here is that I want to use very few resources for this cache, so that
>> rows remain in the cache for a very short time, just long enough to serve
>> a single search interval, at most 30 seconds. Is this
>> approach correct? That way I won't be keeping unnecessary data in the cache
>> for a long time, thus saving resources for other needs.
>>
>>
>> On Wed, Nov 16, 2011 at 11:20 AM, samal <samalgorai@gmail.com> wrote:
>>
>>> I think you can, but I am not sure; I haven't tried that yet. There is no
>>> harm in keeping a value as well; it will be read in a single query anyway.
>>>
>>> In the 2nd case, yes, 2 or more queries are required to get specific user
>>> details, as the username maps to user_id keys (unique, like a UUID) and the
>>> user_id key stores the actual details.
>>>
>>>
>>> On Wed, Nov 16, 2011 at 11:10 AM, Aditya Narayan <adynnn@gmail.com> wrote:
>>>
>>>> Regarding the first option that you suggested through composite
>>>> columns, can I store the username & id both in the column name and keep
>>>> the column valueless?
>>>> Will I be able to retrieve both the username and id from the composite
>>>> col name?
>>>>
>>>> Thanks a lot
>>>>
>>>> On Wed, Nov 16, 2011 at 10:56 AM, Aditya Narayan <adynnn@gmail.com> wrote:
>>>>
>>>>> Got the first option that you suggested.
>>>>>
>>>>> However, in the second one, are you suggesting to use, e.g.,
>>>>> key='Marcos' & store columns, one for each user of that name, containing
>>>>> the userId inside that row? That way it would have to read multiple rows
>>>>> while the user is doing a single search.
>>>>>
>>>>>
>>>>> On Wed, Nov 16, 2011 at 10:47 AM, samal <samalgorai@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>>> > I need to add 'search users' functionality to my application. (The
>>>>>>>> > trigger for fetching searched items (like Google instant search) is
>>>>>>>> > made when 3 letters have been typed in.)
>>>>>>>> >
>>>>>>>> > For this, I make a CF with String type keys. Each such key is
>>>>>>>> > made of the first 3 letters of a user's name.
>>>>>>>> >
>>>>>>>> > Thus all names starting with 'Mar-' are stored in a single row
>>>>>>>> > (with key="Mar").
>>>>>>>> > The column names are framed as the remaining letters of the names.
>>>>>>>> > Thus, a name 'Marcos' will be stored within rowkey "Mar" & col name
>>>>>>>> > "cos". The id will be stored as the column value. Since there could
>>>>>>>> > be many users with the same name, I would have multiple userIds (of
>>>>>>>> > users named "Marcos") to be stored inside column name "cos" under
>>>>>>>> > key "Mar". Thus,
>>>>>>>> >
>>>>>>>> > 1. Supercolumn seems to be a better fit for my use case (so that
>>>>>>>> > ids of users with the same name may fit as sub-columns inside a
>>>>>>>> > super-column), but since supercolumns are not encouraged I want to
>>>>>>>> > use an alternative schema for this use case if possible. Could you
>>>>>>>> > suggest some ideas on this?
>>>>>>>> >
>>>>>>>>
>>>>>>>
>>>>>> Aditya,
>>>>>>
>>>>>> Have you given any thought to composite columns [1]? I think they can
>>>>>> help you solve your problem of multiple users with the same name.
>>>>>>
>>>>>> mar:{
>>>>>>   {cos,unique_user_id}:unique_user_id,
>>>>>>   {cos,1}:1,
>>>>>>   {cos,2}:2,
>>>>>>   {cos,3}:3,
>>>>>>
>>>>>> //  {utf8,timeUUID}:timeUUID,
>>>>>> }
>>>>>> OR
>>>>>> you can try wide rows indexing user names to IDs
>>>>>>
>>>>>> marcos{
>>>>>>    user1:' ',
>>>>>>    user2:' ',
>>>>>>    user3:' '
>>>>>> }
>>>>>>
>>>>>> [1] http://www.slideshare.net/edanuff/indexing-in-cassandra
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
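
The composite-column layout discussed in this thread (row key = first 3
letters of the name, column name = remaining letters + userId, no column
value) can be sketched in memory as follows. This is only a model of the data
layout, not Cassandra client code: `UserNameIndex`, `add_user`, `search` and
`row_sizes` are hypothetical names, lowercasing of names is an added
assumption, and userIds are assumed to be mutually comparable (all ints or
all strings) so that columns can be kept sorted.

```python
from bisect import bisect_left, insort
from collections import defaultdict

PREFIX_LEN = 3  # a search is triggered once 3 letters have been typed


class UserNameIndex:
    """In-memory model of one wide row per 3-letter name prefix."""

    def __init__(self):
        # row key -> sorted list of composite column names (rest_of_name, user_id)
        self._rows = defaultdict(list)

    def add_user(self, name, user_id):
        name = name.lower()  # assumption: case-insensitive search
        row_key, rest = name[:PREFIX_LEN], name[PREFIX_LEN:]
        # The userId in the column name keeps columns unique for identical names.
        insort(self._rows[row_key], (rest, user_id))

    def search(self, typed):
        typed = typed.lower()
        if len(typed) < PREFIX_LEN:
            return []  # fewer than 3 letters: no lookup yet
        row_key, rest = typed[:PREFIX_LEN], typed[PREFIX_LEN:]
        columns = self._rows.get(row_key, [])
        # Columns are sorted, so matches form one contiguous slice of the row.
        start = bisect_left(columns, (rest,))
        matches = []
        for col_rest, user_id in columns[start:]:
            if not col_rest.startswith(rest):
                break
            matches.append((row_key + col_rest, user_id))
        return matches

    def row_sizes(self):
        # Column count per row; shows how unevenly rows can grow.
        return {key: len(cols) for key, cols in self._rows.items()}
```

For example, after adding two users named "Marcos" and one named "Maria",
`search("marc")` returns only the two "marcos" entries with their ids, and
`row_sizes()` makes the skew between popular and rare prefixes visible.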
