incubator-cassandra-user mailing list archives

From Bill <b...@dehora.net>
Subject Re: commodity server spec
Date Tue, 06 Sep 2011 19:47:18 GMT
MongoDB, last time I looked, does not scale horizontally.

I've seen reasonable behaviour putting Cassandra database tables onto
remote filers, but you absolutely have to test against the SAN
configuration and carefully manage things like concurrent reader/writer
settings, the fs and Cassandra caches, etc. A NAS/SAN is generally not
recommended for this class of system.

The commit logs work best on a directly attached, dedicated disk.

Bill

On 04/09/11 14:08, China Stoffen wrote:
> Then what will be the sweetspot for Cassandra? I am more interested in
> Cassandra because my application is write heavy.
>
> So far, my understanding is that Cassandra does not work well on SANs
> either?
>
> P.S.
> MongoDB is also a NoSQL database designed for horizontal scaling, so
> how is it a good fit for the same hardware for which Cassandra is not a
> good candidate?
>
>
> ----- Original Message -----
> From: Bill <bill@dehora.net>
> To: user@cassandra.apache.org
> Cc:
> Sent: Sunday, September 4, 2011 4:34 AM
> Subject: Re: commodity server spec
>
> [100% agree with Chris]
>
> China, the machines you're describing sound nice for
> mongodb/postgres/mysql, but probably not the sweetspot for Cassandra.
>
> Obviously (well, depending on near-term load) you don't want to get
> burned on excess footprint. But a realistic "don't lose data, be fairly
> available" deployment is going to span at least 2 racks/power supplies
> and have data replicated offsite (at least as passive for DR). So I
> would consider 6-9 relatively weaker servers rather than 3 scale up
> joints. You'll save some capex, and the amount of opex overhead is
> probably worth it traded off against the operational risk. 3 is an
> awkward number to operate for anything that needs to be available
> (although many people seem to start with that, I am guessing because
> triplication is traditionally understood under failure) as it
> immediately puts 50% extra load on the remaining 2 when one node goes
> away. One will go away, even transiently, when it is upgraded, crashes,
> gets into a funk due to compaction or garbage collection, and load will
> then be shunted onto the other 2 - remember Cassandra has no
> backoff/throttling in place. I'd allow for something breaking at some
> point (dbs, even the mature ones, fail from time to time) and 2 doesn't
> give you much room to maneuver in production.
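
As a rough check of the 50% figure above, here is a minimal sketch of how much extra load each surviving node picks up when one node drops out. It assumes load is spread evenly across nodes, which is a simplification (replica placement and consistency level are ignored), and the function name `extra_load_pct` is made up for illustration.

```python
def extra_load_pct(nodes: int, failed: int = 1) -> float:
    """Percent extra load per surviving node, assuming evenly spread load."""
    survivors = nodes - failed
    if survivors <= 0:
        raise ValueError("at least one node must survive")
    # Each node normally carries 1/nodes of the load; after the failure
    # each survivor carries 1/survivors of it.
    return (nodes / survivors - 1) * 100


# Compare the 3-node case with the 6 and 9 node clusters suggested above.
for n in (3, 6, 9):
    print(n, "nodes:", round(extra_load_pct(n), 1), "% extra per survivor")
```

Under that assumption, a 3-node cluster pushes 50% more load onto each survivor, while 6 and 9 nodes bring it down to roughly 20% and 12.5%.
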
>
> Bill
>
>
> On 03/09/11 23:05, Chris Goffinet wrote:
>  > It will also depend on how long you can handle recovery time. So imagine
>  > this case:
>  >
>  > 3 nodes w/ RF of 3
>  > Each node has 30TB of space used (you never want to fill up an entire node).
>  > If one node fails and you must recover, that will take over 3.6 days in
>  > just transferring data alone. That's with a sustained 800 megabit/s
>  > (100MB/s). In the real world it's going to fluctuate so add some
>  > padding. Also, since you will be saturating one of the other nodes, now
>  > your network latency is suffering and you only have 1
>  > machine to handle the remaining traffic while you're recovering. And if
>  > you want to expand the cluster in the future (more nodes), the amount of
>  > data to transfer is going to be very large and most likely days to add
>  > machines. From my experience it's much better to have a larger cluster
>  > set up upfront for future growth than getting by with 6-12 nodes at the
>  > start. You will feel less pain, and node failures (bad disks, mem,
>  > etc.) are easier to manage.
>  >
>  > 3 nodes with RF of 1 wouldn't make sense.
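
The recovery estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes binary units (1 TB = 1024 * 1024 MB), which is what makes the "over 3.6 days" figure come out, and a perfectly sustained transfer rate. The function name `recovery_days` is invented for illustration.

```python
SECONDS_PER_DAY = 86400


def recovery_days(data_tb: float, throughput_mb_s: float) -> float:
    """Days needed to stream data_tb terabytes at throughput_mb_s MB/s."""
    megabytes = data_tb * 1024 * 1024  # assumption: 1 TB = 1024 * 1024 MB
    seconds = megabytes / throughput_mb_s
    return seconds / SECONDS_PER_DAY


# 800 megabit/s is 100 MB/s (8 bits per byte).
throughput_mb_s = 800 / 8
print(round(recovery_days(30, throughput_mb_s), 2), "days for 30TB")
print(round(recovery_days(5, throughput_mb_s), 2), "days for 5TB")
```

Smaller nodes shrink the transfer proportionally, which is the argument for spreading the same data over more machines.
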
>  >
>  >
>  > On Sat, Sep 3, 2011 at 4:05 AM, China Stoffen <chinastoffen@yahoo.com> wrote:
>  >
>  > Many small servers would drive up the hosting cost way too high, so
>  > we want to avoid this solution if we can.
>  >
>  > ----- Original Message -----
>  > From: Radim Kolar <hsn@sendmail.cz>
>  > To: user@cassandra.apache.org
>  > Cc:
>  > Sent: Saturday, September 3, 2011 9:37 AM
>  > Subject: Re: commodity server spec
>  >
>  > Many smaller servers are way better.
>  >
>  >
>

