incubator-cassandra-user mailing list archives

From Michael Dürgner <mich...@duergner.de>
Subject Re: advice, is cassandra suitable for a multi-tanency vBulletin type application?
Date Tue, 13 Jul 2010 06:35:59 GMT
It's true that joins are slow (we see that ourselves), but I still wonder why
you use Cassandra for the indices. Couldn't you just store them in MySQL as well?

On 13.07.2010 at 08:26, Sandeep Kalidindi at PaGaLGuY.com wrote:

> @paul - Cassandra is really good for storing indices. But I like Redis because it
> provides some really useful data structures, such as sorted sets. So we use each
> for its strengths. For example, in a forum, all the posts and replies in a thread,
> which user is following which threads, and so on are all stored in Cassandra.
> 
> But for something like which threads make up the top page (which has to be sortable
> both by latest-posted thread first and by most active thread first), sorted sets
> make a lot of sense, because recomputing the whole first page every time a post is
> made is too expensive. So in such cases I use Redis, but for storing posts,
> follower lists, and the like I use Cassandra.
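
A rough illustration of the sorted-set idea described above. This is a pure-Python
stand-in, not Redis itself and not the poster's code; `ThreadRanking`, `touch`,
`top`, and `record_post` are hypothetical names. In Redis the same roles would be
played by a sorted set updated with ZADD and read with ZREVRANGE.

```python
class ThreadRanking:
    """In-memory stand-in for one sorted set: member -> score."""

    def __init__(self):
        self._scores = {}  # thread_id -> score

    def touch(self, thread_id, score):
        # Like ZADD: insert the member or overwrite its score.
        self._scores[thread_id] = score

    def top(self, n):
        # Like ZREVRANGE 0 n-1: highest score first.
        # This stand-in sorts on every read; a real sorted set keeps its
        # members ordered as they are updated.
        ranked = sorted(self._scores.items(), key=lambda kv: kv[1], reverse=True)
        return [thread_id for thread_id, _ in ranked[:n]]


latest = ThreadRanking()       # scored by time of the last post
most_active = ThreadRanking()  # scored by number of posts
post_counts = {}               # thread_id -> posts seen so far


def record_post(thread_id, timestamp):
    """Update both rankings for one new post; nothing else is recomputed."""
    latest.touch(thread_id, timestamp)
    post_counts[thread_id] = post_counts.get(thread_id, 0) + 1
    most_active.touch(thread_id, post_counts[thread_id])


record_post("t1", 100)
record_post("t2", 101)
record_post("t1", 102)
print(latest.top(2), most_active.top(1))
```

Each new post touches only its own thread's entry, so the front page can be read
straight from the ranking without rebuilding it.
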
> 
> @michael - Well, with MySQL writes are fast but reads are slow because of joins.
> My main criterion is that reads should be really fast, which means I have to
> precompute indices so that my front-end code just reads the index and outputs it,
> with no complex joins. So you need a store that is really good at holding lots of
> indices, and in my opinion Cassandra is one of the best stores for that. Hence the
> use of Cassandra.

> 
> Also, complete denormalization isn't so easy. There will be many cases where it is
> a pain in the ass. I could handle most such cases with the help of Varnish's ESI
> module. For example: you have a timeline of posts stored in Cassandra, but you
> cannot store the author information in every post, as it is subject to change at
> any time. This is the classic case of a join. If you want to use a NoSQL store for
> that, you end up making multiple calls to the database (first to fetch the posts
> and then to fetch the user information). But if you use ESI, there is no need to
> make so many calls, because you can cache that small user-info module for as long
> as the information doesn't change. This seems a much more elegant solution than a
> MySQL join. Hope I explained the point.
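
To make the ESI point above concrete, here is a minimal sketch of the pattern in
plain Python. It does not use Varnish; it only imitates what an ESI processor does
with `<esi:include>` tags. The `/user/<id>` URL scheme, the sample data, and all
function names are assumptions made for illustration.

```python
import re

# Stand-ins for the stores; sample data is invented.
posts_by_thread = {
    "t1": [
        {"author_id": "u1", "body": "hello"},
        {"author_id": "u1", "body": "again"},
    ]
}
users = {"u1": {"name": "Deepu"}}

fragment_cache = {}           # user_id -> rendered user-info fragment
stats = {"user_lookups": 0}   # counts trips to the user store


def render_thread(thread_id):
    # The page template carries no author data, only an include tag per post.
    return "".join(
        f'<p><esi:include src="/user/{p["author_id"]}"/>: {p["body"]}</p>'
        for p in posts_by_thread[thread_id]
    )


def user_fragment(user_id):
    # Render the small user-info module once, then serve it from the cache.
    if user_id not in fragment_cache:
        stats["user_lookups"] += 1
        fragment_cache[user_id] = users[user_id]["name"]
    return fragment_cache[user_id]


def assemble(page):
    # Replace every include tag with the (possibly cached) fragment.
    return re.sub(
        r'<esi:include src="/user/(\w+)"/>',
        lambda m: user_fragment(m.group(1)),
        page,
    )


def invalidate_user(user_id):
    # Called when the user's profile changes.
    fragment_cache.pop(user_id, None)


print(assemble(render_thread("t1")))
```

The thread is stored once without author details, and the user store is consulted
only when a fragment is missing from the cache or has been invalidated.
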
> 
> Cheers,
> Deepu.
> 
> On Tue, Jul 13, 2010 at 11:38 AM, Michael Dürgner <michael@duergner.de> wrote:
> Are your page views mostly reads or writes? If they are mostly reads, I'd think you
> wouldn't need Cassandra-like storage, which is tuned for writes.
> 
> On 12.07.2010 at 23:40, Sandeep Kalidindi at PaGaLGuY.com wrote:
> 
> > Well, we were going down constantly with vB running on 3-4 dedicated servers due
> > to huge traffic (a couple of tens of millions of page views). We are also planning
> > some major new features, hence the shift to Cassandra with the future in mind.
> >
> > Roughly, the architecture is like this (in the order a request proceeds):
> >
> > 1) Varnish - PHP reads from Cassandra, and the performance isn't always good (I
> > have yet to master it, so this is probably my lack of expertise). So we make heavy
> > use of Varnish to cache as much as possible. VCL means we can cache the same page
> > differently for different logged-in users. ESI means no need to worry about joins.
> > Varnish really is quite a good companion for NoSQL.
> >
> > 2) Front-end PHP servers - contain most of the template code and read directly
> > from Cassandra and Redis.
> >
> > 3) Middleware (written in Scala + Python; we plan to move it to Scala completely
> > to reduce the number of languages in production) - all writes from PHP go directly
> > to the middleware. Since Cassandra is in fact mostly a store of indices, you need
> > to change your strategy from the MySQL one (computing afterwards, at query time)
> > to precomputing all the needed indices and storing them in Cassandra. So the
> > middleware takes care of computing the indices and storing them in Cassandra and
> > Redis accordingly. This way PHP just submits the write to the middleware and the
> > request can complete, while the middleware might take a couple of seconds at most
> > to compute the indices and finish the work completely.
> >
> > 4) Cassandra + redis clusters.
> >
> >
> > So writes are taken care of by the middleware and hence complete very fast, and
> > reads are also quite fast thanks to using Varnish wherever it helps.
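
The write path described above (the front end hands the write off, the middleware
computes the indices afterwards) could be sketched roughly as follows. This is an
assumption-laden illustration, not the poster's middleware: plain dicts stand in for
the Cassandra index rows, an in-process queue stands in for the hand-off, and every
name here is hypothetical.

```python
import queue
import threading

write_queue = queue.Queue()

# Stand-ins for precomputed indices that would live in Cassandra.
posts_by_thread = {}   # thread_id -> list of post ids, in arrival order
threads_by_user = {}   # user_id -> set of thread ids the user posted in


def submit_write(post):
    """Front-end side: enqueue the write and return immediately."""
    write_queue.put(post)
    return "accepted"


def index_worker():
    """Middleware side: update every index that reads will need later."""
    while True:
        post = write_queue.get()
        posts_by_thread.setdefault(post["thread_id"], []).append(post["post_id"])
        threads_by_user.setdefault(post["user_id"], set()).add(post["thread_id"])
        write_queue.task_done()


# Daemon thread so the sketch exits cleanly when the main program ends.
threading.Thread(target=index_worker, daemon=True).start()

submit_write({"post_id": "p1", "thread_id": "t1", "user_id": "u1"})
submit_write({"post_id": "p2", "thread_id": "t1", "user_id": "u2"})

write_queue.join()  # only for the demo: wait until the indices are built
print(posts_by_thread, threads_by_user)
```

A read then becomes a single lookup in `posts_by_thread` or `threads_by_user`, with
no join at request time. The trade-off is that an index may lag briefly behind the
write that triggered it.
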
> >
> > Still not in production, though. Hope it helped. I would welcome anybody's
> > suggestions on the way I am using Cassandra and whether I can do anything better.
> >
> > Cheers,
> > Deepu.
> >
> > On Tue, Jul 13, 2010 at 2:48 AM, S Ahmed <sahmed1020@gmail.com> wrote:
> > What sort of traffic levels made you port the application to Cassandra?
> >
> > Very interested in seeing this go live.
> >
> > What sort of server setup are you looking at using?
> >
> >
> > On Mon, Jul 12, 2010 at 4:39 PM, Sandeep Kalidindi at PaGaLGuY.com <sandeep.kalidindi@pagalguy.com> wrote:
> > No, we re-coded from scratch with most of the needed functionality.
> >
> > Cheers,
> > Deepu.
> >
> >
> > On Mon, Jul 12, 2010 at 7:49 PM, S Ahmed <sahmed1020@gmail.com> wrote:
> > Very interesting!
> >
> > What kind of integration do you have between vB and Cassandra? It's not a port, then?
> >
> >
> > On Mon, Jul 12, 2010 at 3:34 AM, Sandeep Kalidindi at PaGaLGuY.com <sandeep.kalidindi@pagalguy.com> wrote:
> > We were one of the vBulletin customers, and our forums have been facing some bad
> > scaling issues.
> >
> > We coded our forum software to work with Cassandra. We are still testing for bugs
> > and might go live in a couple of weeks. You can ask any specific questions about
> > vBulletin and Cassandra, and I will answer to the best of my knowledge.
> >
> > In our case, a combination of Cassandra and Redis took care of most of the
> > functionality that vBulletin offers, and much more.
> >
> > Cheers,
> > Deepu.
> >
> >
> > On Mon, Jul 12, 2010 at 9:58 AM, Paul Prescod <prescod@gmail.com> wrote:
> > On Sun, Jul 11, 2010 at 8:39 AM, S Ahmed <sahmed1020@gmail.com> wrote:
> > > I want to build a vBulletin type application (forums, threads, posts, user
> > > management, etc).
> > > It should support multi-tenancy for a SaaS-type environment.
> > > Would Cassandra be suitable for this type of application?
> > >
> > >
> > > Thanks in advance.
> >
> > Most likely, it is technically a fine fit. But Cassandra is very early
> > stage software, so you should expect that the documentation will not
> > always be clear and things will change from version to version. If you
> > are not extremely self-reliant, you may find it a frustrating
> > experience. Unless you are confident you will have trouble scaling
> > traditional technologies, it might not make business sense.
> >
> >  Paul Prescod
> >
> >
> >
> >
> >
> 
> 

