cassandra-user mailing list archives

From David McNelis <dmcne...@agentisenergy.com>
Subject Re: data agility
Date Sun, 20 Nov 2011 19:22:34 GMT
Dotan,

I think that if you're in the early stages you have a basic idea of what
your product is going to be, architecturally speaking.  While you may
change your business model, or features at the display layer, I would think
the data model itself would remain relatively similar
throughout...otherwise you'd have another product on your hands, no?

But, even if your requirements radically shift, Cassandra is schemaless, so
you'd be able to make 'structural' changes to your data without as much
risk as in a traditional RDBMS, e.g. MySQL.
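
To make the 'structural changes' point concrete, here is a minimal sketch in plain Python. It is a toy in-memory model, not the Cassandra client API: the `users` dictionary, the `insert` and `get` helpers, and the column names are all made up for illustration. It only shows the idea that rows in the same column family need not share the same columns.

```python
# Toy in-memory model of a schemaless column family (NOT a Cassandra client).
# Every name here is hypothetical and exists only for illustration.

users = {}  # row key -> {column name: value}


def insert(row_key, columns):
    """Upsert columns into a row; unseen column names need no migration."""
    users.setdefault(row_key, {}).update(columns)


def get(row_key, column, default=None):
    """Read one column, tolerating rows that never wrote it."""
    return users.get(row_key, {}).get(column, default)


# Early version of the product: only a name and an email per user.
insert("user:1", {"name": "Ada", "email": "ada@example.com"})

# Requirements shift: newer rows carry an extra column,
# and the older row is left untouched.
insert("user:2", {"name": "Bob", "email": "bob@example.com", "plan": "pro"})

print(get("user:1", "plan", "free"))
print(get("user:2", "plan", "free"))
```

The flip side is that the application has to cope with older rows that lack the newer columns, which is why the reader supplies a default.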

At the end of the day, I don't think you've given enough information about
your proposed data models for anyone to say, "Yes, Cassandra would or would
not be the right choice for your startup."  If well administered, depending
on the services offered, MySQL or Oracle could support a site with 200M
users, and a poorly designed Cassandra data store could work very poorly
for a site supporting 200 users.

I will say that I think it makes a lot of sense to use traditional RDBMS
systems for relational data and a Cassandra-like system when there is a
need for larger data storage, or something that lends itself well to a
structureless design.  If you are using a framework that supports a good
ORM layer (e.g. Hibernate for Java), you can have your build process
update your database schema as you build out your application.  I haven't
done much work in Rails or Django, but I understand those support
transparent schema updates as well.  That sort of setup can work very
effectively in early development...but that is more a discussion for other
communities.
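
As a rough illustration of what automatic schema updating amounts to, here is a minimal sketch using Python's standard sqlite3 module. It is not how Hibernate, Rails, or Django implement it; the `sync_schema` helper, the `users` table, and the two model versions are hypothetical, and only additive changes (new columns) are handled.

```python
import sqlite3


def sync_schema(conn, table, model):
    """Create the table if missing, then add any columns the model gained.

    `model` maps column name -> SQL type. Renames and drops are out of
    scope for this sketch. Returns the list of columns that were added.
    """
    cols = ", ".join(f"{name} {sqltype}" for name, sqltype in model.items())
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({cols})")

    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}

    added = []
    for name, sqltype in model.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {sqltype}")
            added.append(name)
    return added


conn = sqlite3.connect(":memory:")

# Hypothetical model at two points in the application's life.
model_v1 = {"id": "INTEGER PRIMARY KEY", "name": "TEXT"}
model_v2 = {"id": "INTEGER PRIMARY KEY", "name": "TEXT", "plan": "TEXT"}

first = sync_schema(conn, "users", model_v1)   # creates the table
second = sync_schema(conn, "users", model_v2)  # adds the new column
print(first, second)
```

Running the same step on every build keeps the database in line with the model during early development, at the cost of handling only the simple cases.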

If you're interested in doing Map/Reduce jobs with Cassandra, look into
Brisk, the system created by DataStax (which is also open source) that
allows you to run Hadoop on top of your Cassandra cluster.  This may not be
exactly what you're looking for when asking this question...but it might
give you the insights you're looking for.

Hope this has been at least somewhat helpful.

David

On Sun, Nov 20, 2011 at 1:06 PM, Dotan N. <dipidi@gmail.com> wrote:

> Hi all,
> my question may be more philosophical than related technically
> to Cassandra, but please bear with me.
>
> Given that a young startup may not know its product fully at the early
> stages, but that it definitely points to ~200M users,
> would Cassandra be the right way to go?
>
> That is, the requirement is for a large data store, that can move with
> product changes and requirements swiftly.
>
> Given that in Cassandra one thinks hard about the queries, and then builds
> a model to suit it best, I was thinking of
> this situation as problematic.
>
> So here are some questions:
>
> - would it be wiser to start with a more agile data store (such as
> mongodb) and then progress onto Cassandra, when the product itself
> solidifies?
> - given that we start with Cassandra from the get go, what is a common
> (and quick in terms of development) way or practice to change data, change
> schemas, as the product evolves?
> - is it even smart to start with Cassandra? would only startups whose core
> business is big data start with it from the get go?
> - how would you do map/reduce with Cassandra? how agile is that? (for
> example, can you run map/reduce _very_ frequently?)
>
> Thanks!
>
> --
> Dotan, @jondot <http://twitter.com/jondot>
>
>


-- 
*David McNelis*
Lead Software Engineer
Agentis Energy
www.agentisenergy.com
c: 219.384.5143

*A Smart Grid technology company focused on helping consumers of energy
control an often under-managed resource.*
