On 27.09.2010, at 19:30, Marc Canaleta <mcanaleta@gmail.com> wrote:

What do you mean by "running live"? I am also planning to use Cassandra on EC2 with small nodes. Small nodes have 1/4 of the CPU of the large ones at 1/4 of the cost, but their I/O is more than 1/4 (Amazon does not publish explicit I/O numbers), so I think 4 small instances should perform better than 1 large one for the same cost. Am I wrong?

Based on the results we saw, and on what various sources around the web report, EC2 small instances deliver less than 1/4 of the I/O performance of a large instance.

On 27 September 2010 18:09:14 UTC+2, Jonathan Ellis <jbellis@gmail.com> wrote:
I strongly recommend not running live on Small nodes.  So in your case
I would recommend starting up Large instances with raid0'd disks, shut
down cassandra on the Small ones, rsync to the Large, and start up on
Large.
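
The order of operations above can be sketched as a small Python helper that only builds the commands for one node and does not run them. This is a rough sketch, not a tested procedure: the host names, the data directory under /mnt, and the stop/start commands are all placeholders that depend on how Cassandra is installed, and the RAID0 setup on the Large instance is assumed to be done already.

```python
import shlex


def migration_plan(small_host, large_host, data_dir, stop_cmd, start_cmd):
    """Return the ordered shell commands for moving one node's data.

    Nothing is executed here; the caller reviews the commands and runs
    them by hand. All arguments are placeholders for illustration.
    """
    # A trailing slash makes rsync copy the directory contents
    # rather than nesting the directory inside the destination.
    src = data_dir.rstrip("/") + "/"
    dest = f"{large_host}:{src}"
    return [
        # 1. Stop Cassandra on the Small node so its files stop changing.
        f"ssh {small_host} {shlex.quote(stop_cmd)}",
        # 2. Copy the data directory to the Large node (-a keeps
        #    permissions, ownership and timestamps).
        f"ssh {small_host} rsync -a {shlex.quote(src)} {shlex.quote(dest)}",
        # 3. Start Cassandra on the Large node.
        f"ssh {large_host} {shlex.quote(start_cmd)}",
    ]


if __name__ == "__main__":
    # Hypothetical host names, path and commands.
    for cmd in migration_plan("small-1", "large-1", "/mnt/cassandra",
                              "stop-cassandra", "start-cassandra"):
        print(cmd)
```

Cassandra is stopped before the copy so that rsync sees a quiescent data directory; the Large node also has to be configured to take over the Small node's place in the ring, which this sketch does not cover.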

On Mon, Sep 27, 2010 at 6:46 AM, Utku Can Topçu <utku@topcu.gen.tr> wrote:
> Hi All,
>
> We're currently running a cassandra cluster with Replication Factor 3,
> consisting of 4 nodes.
>
> The current situation is:
>
> - The nodes are all identical (AWS small instances)
> - The data directory is on the partition (/mnt), which has 150 GB capacity; each
> node has around 90 GB of load, so 60 GB of free space is left per node.
>
> So adding a new node to the cluster seems likely to cause problems for us. I
> think the node that streams the data to the new bootstrapping node
> will not have enough disk space to anticompact its data.
>
> What is the best practice for such scenarios?
>
> Regards,
>
> Utku
>
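
The disk-space concern in the quoted message can be checked with some quick arithmetic. The Python sketch below uses the 150 GB and 90 GB figures from the message; the model itself is an assumption (that anticompaction writes a temporary copy equal to the share of data being streamed), and the streamed fractions are hypothetical values chosen for illustration, not measurements.

```python
def free_space_gb(capacity_gb, load_gb):
    """Free space left on the data partition."""
    return capacity_gb - load_gb


def fits_anticompaction(capacity_gb, load_gb, streamed_fraction):
    """True if a temporary copy of the streamed share fits in free space.

    Assumed model: anticompaction needs extra space equal to
    load_gb * streamed_fraction on the streaming node.
    """
    temp_copy_gb = load_gb * streamed_fraction
    return temp_copy_gb <= free_space_gb(capacity_gb, load_gb)


if __name__ == "__main__":
    capacity, load = 150, 90          # figures from the message
    print(free_space_gb(capacity, load))
    for fraction in (0.5, 1.0):       # hypothetical streamed shares
        print(fraction, fits_anticompaction(capacity, load, fraction))
```

Under this model the node runs out of space once the streamed share exceeds 60/90, that is two thirds of its load, so whether the bootstrap fits depends on how much data the streaming node actually has to hand over.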



--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com