incubator-cassandra-user mailing list archives

From Aditya Narayan <ady...@gmail.com>
Subject Re: Programmatically allow only one out of two types of rows in a CF to enter the CACHE
Date Sat, 29 Oct 2011 20:35:01 GMT
Thanks Zach, nice idea!

And what about looking at some custom caching solution instead, leaving
Cassandra's own caching aside?
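
To make the question concrete, a custom cache on the application side might
look roughly like the sketch below. It is only an illustration, not a tested
design: the class name, the `admit` predicate, the `:overview` / `:details`
key suffixes and the `fake_loader` function are all hypothetical, and the
loader stands in for a real read from the column family.

```python
from collections import OrderedDict


class SelectiveLRUCache:
    """LRU cache that only stores entries accepted by an admission predicate."""

    def __init__(self, capacity, admit):
        self.capacity = capacity
        self.admit = admit            # callable(key) -> bool
        self._data = OrderedDict()    # ordered from least to most recently used

    def get(self, key, loader):
        """Return the cached value, or load it and cache it only if admitted."""
        if key in self._data:
            self._data.move_to_end(key)          # mark as most recently used
            return self._data[key]
        value = loader(key)
        if self.admit(key):
            self._data[key] = value
            if len(self._data) > self.capacity:
                self._data.popitem(last=False)   # evict least recently used
        return value

    def __contains__(self, key):
        return key in self._data


def is_overview(key):
    # Assumed key convention: "<entity>:overview" and "<entity>:details".
    return key.endswith(":overview")


def fake_loader(key):
    # Stand-in for a real read from the column family.
    return {"row_key": key}


cache = SelectiveLRUCache(capacity=2, admit=is_overview)
cache.get("user1:overview", fake_loader)   # loaded and cached
cache.get("user1:details", fake_loader)    # loaded but never cached
```

With something like this, the large detail rows are still readable through
the same call, but they never take up space in the cache.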



On Sun, Oct 30, 2011 at 2:00 AM, Zach Richardson <
j.zach.richardson@gmail.com> wrote:

> Aditya,
>
> Depending on how often you have to write to the database, you could
> perform dual writes to two different column families, one that has
> summary + details in it, and one that only has the summary.
>
> This way you can get everything with one query, or just the summary with
> one query. This should also help optimize your caching.
>
> The question here would of course be whether you have a read-heavy or
> write-heavy workload.  Since you seem to be concerned about caching, it
> sounds like your workload is more read-heavy, so you wouldn't pay too
> heavily for the dual writes.
>
> Zach
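
The dual-write idea above can be sketched roughly as follows. This is a
minimal illustration only: the two dictionaries are in-memory stand-ins for
the two column families, and the names `summary_cf`, `full_cf`,
`write_entity`, `read_summary` and `read_full` are hypothetical. A real
application would issue the same two writes through its Cassandra client.

```python
# In-memory stand-ins for two column families (assumption for illustration).
summary_cf = {}   # row key -> summary columns only
full_cf = {}      # row key -> summary + detail columns


def write_entity(entity_id, summary, details):
    """Dual write: the summary goes to both CFs, the details only to full_cf."""
    summary_cf[entity_id] = dict(summary)
    full_cf[entity_id] = {**summary, **details}


def read_summary(entity_id):
    """One query against the small CF returns just the summary."""
    return summary_cf[entity_id]


def read_full(entity_id):
    """One query against the large CF returns summary and details together."""
    return full_cf[entity_id]


write_entity("entity1", {"name": "Widget"}, {"description": "long text"})
```

Because the two kinds of rows now live in separate column families, caching
could presumably be tuned for each one separately, for example by enabling
the row cache on the summary CF only.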
>
>
> On Sat, Oct 29, 2011 at 2:21 PM, Mohit Anchlia <mohitanchlia@gmail.com>
> wrote:
> > On Sat, Oct 29, 2011 at 11:23 AM, Aditya Narayan <adynnn@gmail.com>
> wrote:
> >> @Mohit:
> >> I have stated the example scenarios in my first post under this heading.
> >> I have also stated above why I want to split that data into two rows. Like
> >> Ikeda stated below, I too am trying to prevent the frequently accessed
> >> rows from being bloated with large data, and I want to keep that data out
> >> of the cache as well.
> >
> > I think you are missing the point. You don't get any benefit
> > (performance, access) from keeping them in one CF, since you are already
> > breaking it into 2 rows.
> >
> > Also, I don't know of any way to selectively keep rows or keys in the
> > cache. Other than having some background job that keeps the cache hot
> > with those keys/rows, your only option is to keep them in a different CF,
> > since you are already breaking a row into 2 rows.
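
The background job mentioned above could be as simple as the following
sketch, run periodically (for example from cron). It is only an
illustration: `warm_cache` and `read_row` are hypothetical names, and
`read_row` stands in for whatever normal read call the application already
uses. Note that this only keeps the chosen rows warm; it does not by itself
stop other rows from entering the cache.

```python
def warm_cache(read_row, hot_keys):
    """Read every hot key once so those rows stay in the server-side cache.

    read_row is a hypothetical callable(key) that performs a normal read.
    Returns the list of keys that could not be read.
    """
    failed = []
    for key in hot_keys:
        try:
            read_row(key)
        except Exception:
            # Keep going: one failed read should not stop the warming pass.
            failed.append(key)
    return failed
```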
> >
> >>
> >>> Okay, so as most know, this practice is called a wide row; we use them
> >>> quite a lot. However, as your schema shows, it will cache the whole row
> >>> in memory (while it is active). One way we got around this issue was to
> >>> create materialized views of the more commonly used data, so we can
> >>> easily get the minimum amount of information required without using too
> >>> much memory on the larger representations.
> >>
> >> Yes, this is exactly the problem I am facing, but I want to keep both
> >> types of data (common + large/detailed) in a single CF so that it can
> >> serve as 'two materialized views'.
> >>
> >>>
> >>> My perspective is that indexing some of the higher levels of data would
> >>> be the way to go: Solr or Elasticsearch for a distributed setup, or, if
> >>> you know you only need it locally, just use a caching solution like
> >>> Ehcache.
> >>
> >> What exactly do you mean by "indexing some of the higher levels of
> >> data"?
> >>
> >> Thanks, guys!
> >>
> >>
> >>
> >>>
> >>> Anthony
> >>>
> >>>
> >>> On 28/10/2011, at 21:42 PM, Aditya Narayan wrote:
> >>>
> >>> > I need to keep the data of some entities in a single CF, but split into
> >>> > two rows for each entity. One row contains overview information for the
> >>> > entity, and the other row contains detailed information about it. I want
> >>> > to keep both rows in a single CF so they can be retrieved in a single
> >>> > query when both are required.
> >>> >
> >>> > The problem I am facing is that I want to cache only the first type of
> >>> > rows (i.e., the overview rows) and keep the second type of rows (which
> >>> > contain large data) out of the cache.
> >>> >
> >>> > Is there a way to filter which rows from a single CF enter the cache?
> >>> >
> >>> >
> >>>
> >>
> >>
> >
>
