Interesting idea.
If this is like dividing the entire load on the system by 6, then as long as the effective load stays the same and we use SSDs for the commit volume, we could get away with 1 commitlog SSD. Even if these 6 instances can handle only 80% of the load (compared to 1 instance on this machine), that might be acceptable. Could that help?
I mean, the benefits of smaller Cassandra nodes do sound very enticing. Sure, we would probably have to throw more memory/CPU at it to get performance comparable to 1 instance on that box (or reduce the load), but it does look better than 6 boxes.
The major downside is you're going to want to let each instance have
its own dedicated commitlog spindle too, unless you just don't have
On Tue, Dec 7, 2010 at 8:25 PM, Edward Capriolo <email@example.com> wrote:
> I am quite ready to be stoned for this thread but I have been thinking
> about this for a while and I just wanted to bounce these ideas off some
> Cassandra does allow multiple data directories, but as far as I can
> tell no one runs in this configuration. This is something that is very
> different between the hbase architecture and the Cassandra
> architecture. HBase borrows the concept from hadoop of JBOD
> configurations. HBase has many smallish (~256 MB) regions managed
> with Zookeeper. Cassandra has a few (1 per node) large, node-sized
> token ranges managed by Gossip consensus.
> Let's say a node has six 300 GB disks. You have the options of RAID5,
> RAID6, RAID10, or RAID0. The problem I have found with these
> configurations is that major compactions (or even large minor ones) can
> take a long time. Even if your disk is not heavily utilized this is a
> lot of data to move through. Thus node joins take a long time. Node
> moves take a long time.
> The idea behind "micrandra" is, for a 6-disk system, to run 6 instances of
> Cassandra, one per disk. Use the RackAwareSnitch to make sure no
> two replicas of the same data live on the same node.
> The downsides
> 1) we would have to manage 6x the instances of Cassandra
> 2) we would have some overhead for each JVM.
> The upsides ?
> 1) A disk/instance failure only degrades overall performance by
> 1/6th (with RAID0 you lose the entire node; RAID5 still takes a hit
> when down a disk)
> 2) Moves and joins have less work to do
> 3) Can scale up a single node by adding a single disk to an existing
> system (assuming the RAM and CPU load is light)
> 4) OPP would be "easier" to balance out hot spots (maybe not on this
> one in not an OPP)
> What does everyone think? Does it ever make sense to run this way?
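For the six 300 GB disks in the quoted example, the trade-off can be put in rough numbers. The Python sketch below uses only the standard RAID capacity rules and assumes that the data one instance has to compact or stream is bounded by the volume it sits on. The names `usable_gb` and `instances` are made up for illustration and are not part of Cassandra.

```python
DISKS = 6        # disks per box, from the quoted example
DISK_GB = 300    # size of each disk in GB, from the quoted example


def usable_gb(layout, disks=DISKS, disk_gb=DISK_GB):
    """Usable capacity in GB of the volume a single instance sits on."""
    if layout == "RAID0":
        return disks * disk_gb            # striping, no redundancy
    if layout == "RAID5":
        return (disks - 1) * disk_gb      # one disk's worth of parity
    if layout == "RAID6":
        return (disks - 2) * disk_gb      # two disks' worth of parity
    if layout == "RAID10":
        return (disks // 2) * disk_gb     # mirrored pairs
    if layout == "one instance per disk":
        return disk_gb                    # each instance owns one bare disk
    raise ValueError("unknown layout: %s" % layout)


def instances(layout, disks=DISKS):
    """Number of Cassandra instances on the box for a given layout."""
    return disks if layout == "one instance per disk" else 1


if __name__ == "__main__":
    layouts = ("RAID0", "RAID5", "RAID6", "RAID10", "one instance per disk")
    for layout in layouts:
        per_instance = usable_gb(layout)
        total = per_instance * instances(layout)
        print("%-22s instances=%d  max GB per instance=%4d  total GB=%4d"
              % (layout, instances(layout), per_instance, total))
```

The per-instance figure is only an upper bound on how much data a join, move, or major compaction might have to touch; actual work depends on how full the disks are.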
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
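The requirement that no two replicas land on the same physical box can be illustrated with a toy ring model. The sketch below is a simplified stand-in, not Cassandra's actual snitch or replication strategy code: it assumes a 0 to 2**127 token space (as with RandomPartitioner), gives every instance an evenly spaced token, and picks replicas by walking the ring and skipping instances whose host has already been used. `Instance`, `build_ring`, and `replicas_for` are hypothetical names.

```python
from collections import namedtuple

# One Cassandra instance: which box it runs on, which disk it owns, its token.
Instance = namedtuple("Instance", ["host", "disk", "token"])

RING_SIZE = 2 ** 127  # assumed token space (RandomPartitioner-style)


def build_ring(num_hosts, disks_per_host):
    """Assign evenly spaced tokens, interleaving hosts around the ring."""
    total = num_hosts * disks_per_host
    ring = []
    for i in range(total):
        host = i % num_hosts    # ring neighbours sit on different hosts
        disk = i // num_hosts
        ring.append(Instance(host, disk, i * RING_SIZE // total))
    return ring


def replicas_for(ring, key_token, replication_factor):
    """Walk clockwise from the key's token, taking at most one instance per host.

    Returns fewer than replication_factor instances if there are not
    enough distinct hosts in the ring.
    """
    # First instance whose token is >= key_token, wrapping to index 0.
    start = next((i for i, inst in enumerate(ring)
                  if inst.token >= key_token), 0)
    chosen, seen_hosts = [], set()
    for step in range(len(ring)):
        inst = ring[(start + step) % len(ring)]
        if inst.host not in seen_hosts:
            chosen.append(inst)
            seen_hosts.add(inst.host)
        if len(chosen) == replication_factor:
            break
    return chosen


if __name__ == "__main__":
    ring = build_ring(num_hosts=4, disks_per_host=6)
    for inst in replicas_for(ring, key_token=12345, replication_factor=3):
        print("host %d, disk %d" % (inst.host, inst.disk))
```

Interleaving hosts when assigning tokens keeps the skip rule cheap, since consecutive ring positions already belong to different boxes; with tokens grouped by host the walk would still work but would skip up to five instances per replica.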