Some quick thoughts that might be helpful:
- Use the ephemeral (instance-store) volumes, striped with RAID0, for both Cassandra's data and the log directory. The log directory matters because if the JVM crashes after running out of heap, the heap dump is written there, and you don't want that filling up your root/OS partition.
- You probably want to spread your nodes across AZs so that a single AZ failure doesn't affect you as much.
- For seeds, it's nice to use Elastic IPs so that your seed configuration doesn't have to change when a node is replaced.
- The Ec2Snitch makes each AZ appear as a rack in the cluster topology. It's the simpler option because it reads the EC2 instance metadata for you. If you need more than one DC in your cluster (we need a second virtual DC for analytics), you'll probably want to use the PropertyFileSnitch instead. There's a cross-region EC2 snitch coming in 1.0.
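To make the AZ point concrete, here is a minimal Python sketch of spreading nodes across zones round-robin, so that each AZ (and so each rack, as the Ec2Snitch sees it) ends up with about the same number of nodes. The node names, the zone names, and the stripe_across_azs helper are all made up for illustration; this is not a Cassandra or AWS API, just a way to plan placement before launching instances.

```python
from collections import Counter
from itertools import cycle


def stripe_across_azs(node_names, azs):
    """Assign each node to an availability zone in round-robin order.

    Hypothetical helper for planning only; it does not talk to EC2.
    """
    if not azs:
        raise ValueError("need at least one availability zone")
    # cycle() repeats the zone list; zip() stops once the nodes run out.
    return dict(zip(node_names, cycle(azs)))


# Assumed example values, not taken from any real cluster.
nodes = [f"cass-{i}" for i in range(6)]
zones = ["us-east-1a", "us-east-1b", "us-east-1c"]

placement = stripe_across_azs(nodes, zones)
per_zone = Counter(placement.values())

for node, az in placement.items():
    print(node, az)
print(dict(per_zone))
```

With six nodes and three zones, each zone gets two nodes, so losing one AZ takes out a third of the ring rather than all of it.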
It would probably be good to add some EC2-specific tips to the wiki. The page that Dave mentioned is a good step-by-step, but a lot of community knowledge about best practices has accumulated in the year since it was written.
On Aug 3, 2011, at 8:28 AM, Eldad Yamin wrote:
Is there any manual or important notes I should know before I try to install Cassandra on EC2?