cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Aditya Narayan <>
Subject Re: Seeking advice on Schema and Caching
Date Wed, 16 Nov 2011 04:22:19 GMT
Hi Ben,

Solr, as I understood is for implementing full text
search capability within documents, but in my case, as of now I just need
to implement search on user names which seems to be easily provided
by Cassandra as user names (as column names) may be sorted alphabetically
within rows. I am splitting these rows by the first three characters of the
username. Thus all user names starting with 'Mar' are stored in a row with
key 'Mar'.  Column values store the userId of that user.

So Cassandra seems to fully satisfy my needs for this. Only issue i m
having is how to deal with multiple users of same name. Thus super
columns seem to fit appropriately but I really want to avoid them since
they are seriously discouraged by everyone.

On Wed, Nov 16, 2011 at 3:19 AM, Ben Gambley <>wrote:

> Hi Aditya
> Not sure the best way to do in Cassandra but have you considered using
> apache solr - you could then include just the row keys pointing back
> to Cassandra where the actual data is.
> Solr seems quite capable of performing google like searches and is fast.
> Cheers
> Ben
> On 16/11/2011, at 1:50 AM, Aditya Narayan <> wrote:
> > Hi
> >
> > I need to add 'search users' functionality to my application. (The
> trigger for fetching searched items(like google instant search) is made
> when 3 letters have been typed in).
> >
> > For this, I make a CF with String type keys. Each such key is made of
> first 3 letters of a user's name.
> >
> > Thus all names starting with 'Mar-' are stored in single row (with
> key="Mar").
> > The column names are framed as remaining letters of the names. Thus, a
> name 'Marcos' will be stored within rowkey "Mar" & col name "cos". The id
> will be stored as column value. Since there could be many users with same
> name. Thus I would have multple userIds(of users named "Marcos") to be
> stored inside columnname "cos" under key "Mar". Thus,
> >
> > 1. Supercolumn seems to be a better fit for my use case(so that ids of
> users with same name may fit as sub-columns inside a super-column) but
> since supercolumns are not encouraged thus I want to use an alternative
> schema for this usecase if possible. Could you suggest some ideas on this ?
> >
> > 2. Another thing, I would like to row cache this CF so that when the
> user types in the next character & the query is made consequently, then
> this row be retrieved from the cache without touching DB. It is expected
> while searching for a single username, the query(as a part of making
> instantaneous suggestions) will be made at least 2-3 times. One may also
> suggest to fetch all the columns starting with queried string to be
> retrieved & then filter out at application level but what about just
> fecthing the exact no of columns(ids/names of users) I need to show to the
> user. Thus instead of keeping all the hundreds of cols in the application
> layer what about keeping it within the DB cache.!?
> > The space alloted for the cache will be very small so that row remains
> in cache for a very short time(enough to serve only for the time duration
> while user is making a single search!?) ?
> >
> >

View raw message