Thanks Milind for sharing!

As Sasha already brought up, EC2 sends data across regions over the internet without any encryption, so you may want to tunnel the traffic through SSH.

I don't know how to do that with Cassandra. Any ideas?
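
For what it's worth, the tunnel part on its own would just be ordinary ssh local port forwarding (ssh -N -L). Below is a rough sketch in Python that only builds and prints such a command. The user name, the local port 17000 and the helper name are made up for illustration, and I have not tried this with Cassandra's storage port (7000), so whether the nodes would actually gossip through it is an open question.

```python
import shlex


def ssh_tunnel_command(local_port, remote_host, remote_port, gateway, user="ec2-user"):
    """Build an ssh command that forwards local_port to remote_host:remote_port.

    -N tells ssh not to run a remote command (forwarding only).
    -L sets up a local port forward in the form local:host:remote.
    All argument values are supplied by the caller; nothing here is
    specific to Cassandra.
    """
    forward = f"{local_port}:{remote_host}:{remote_port}"
    return ["ssh", "-N", "-L", forward, f"{user}@{gateway}"]


if __name__ == "__main__":
    # Hypothetical values: forward local port 17000 to port 7000 on the
    # east-region node, using the placeholder host name from this thread.
    cmd = ssh_tunnel_command(17000, "localhost", 7000,
                             "ec2-1-2-3-4.compute-1.amazonaws.com")
    print(" ".join(shlex.quote(part) for part in cmd))
```

This only covers the tunnel itself. How to get each node to talk to the forwarded port instead of the other node's real address is exactly the part I don't know.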

Regards, mike

On Tue, Mar 22, 2011 at 5:29 AM, Milind Parikh <milindparikh@gmail.com> wrote:
Patch is attached... I don't have access to Jira.
 
A cautionary note: this is NOT a general solution and is not intended as one. It could be included as part of a larger patch. I will explain in the limitations section why it is not a general solution, as I find time.

Regards
Milind
 
On Mon, Mar 21, 2011 at 11:42 PM, Jeremy Hanna <jeremy.hanna1234@gmail.com> wrote:
Sorry if I was presumptuous earlier.  I created a ticket so that the patch could be submitted and reviewed - that is if it can be generalized so that it works across regions and doesn't adversely affect the common case.
https://issues.apache.org/jira/browse/CASSANDRA-2362

> On Mar 21, 2011, at 12:20 PM, Jeremy Hanna wrote:
>
>> I talked to Matt Dennis in the channel about it and I think everyone would like to make sure that cassandra works great across multiple regions.  He sounded like he didn't know why it wouldn't work after having looked at the patches.  I would like to try it both ways - with and without the patches later today if I can and I'd like to help out with getting it working out of the box.
>>
>> Thanks for the investigative work and documentation Milind!
>>
>> Jeremy
>>
>> On Mar 21, 2011, at 12:12 PM, Dave Viner wrote:
>>
>>> Hi Milind,
>>>
>>> Great work here.  Can you provide the patch against the 2 files?
>>>
>>> Perhaps there's some way to incorporate it into the trunk of cassandra so that this is feasible (in a future release) without patching the source code.
>>>
>>> Dave Viner
>>>
>>>
>>> On Mon, Mar 21, 2011 at 9:41 AM, A J <s5alye@gmail.com> wrote:
>>> Thanks for sharing the document, Milind !
>>> Followed the instructions and it worked for me.
>>>
>>> On Mon, Mar 21, 2011 at 5:01 AM, Milind Parikh <milindparikh@gmail.com> wrote:
>>>> Here's the document on Cassandra (0.7.4) across EC2 regions. Clearly this is
>>>> work in progress.... but wanted to share what I have. PDF is the working
>>>> copy.
>>>>
>>>>
>>>> https://docs.google.com/document/d/175duUNIx7m5mCDa2sjXVI04ekyMa5bdiWdu-AFgisaY/edit?hl=en
>>>>
>>>> On Sun, Mar 20, 2011 at 7:49 PM, aaron morton <aaron@thelastpickle.com>
>>>> wrote:
>>>>>
>>>>> Recent discussion on the dev list
>>>>> http://www.mail-archive.com/dev@cassandra.apache.org/msg01832.html
>>>>> Aaron
>>>>> On 19 Mar 2011, at 06:46, A J wrote:
>>>>>
>>>>> Just to add, all the telnet (port 7000) and cassandra-cli (port 9160)
>>>>> connections are done using the public DNS (that goes like
>>>>> ec2-.....compute.amazonaws.com)
>>>>>
>>>>> On Fri, Mar 18, 2011 at 1:37 PM, A J <s5alye@gmail.com> wrote:
>>>>>
>>>>> I am able to telnet from one region to another on 7000 port without
>>>>> issues. (I get the expected Connected to .....Escape character is
>>>>> '^]'.)
>>>>>
>>>>> Also I am able to execute cassandra client on 9160 port from one
>>>>> region to another without issues (this is when I run cassandra
>>>>> separately on each region without forming a cluster).
>>>>>
>>>>> So I think the ports 7000 and 9160 are not the issue.
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Mar 18, 2011 at 1:26 PM, Dave Viner <daveviner@gmail.com> wrote:
>>>>>
>>>>> From the us-west instance, are you able to connect to the us-east instance
>>>>> using telnet on port 7000 and 9160?
>>>>>
>>>>> If not, then you need to open those ports for communication (via your
>>>>> Security Group)
>>>>>
>>>>> Dave Viner
>>>>>
>>>>> On Fri, Mar 18, 2011 at 10:20 AM, A J <s5alye@gmail.com> wrote:
>>>>>
>>>>> That's exactly what I am doing.
>>>>>
>>>>> I was able to do the first two scenarios without any issues (i.e. 2
>>>>> nodes in the same availability zone, followed by an additional node in a
>>>>> different zone but the same region).
>>>>>
>>>>> I am stuck at the third scenario of separate regions.
>>>>>
>>>>> (I did read the "Cassandra nodes on EC2 in two different regions not
>>>>> communicating" thread but it did not seem to end with a resolution.)
>>>>>
>>>>>
>>>>> On Fri, Mar 18, 2011 at 1:15 PM, Dave Viner <daveviner@gmail.com> wrote:
>>>>>
>>>>> Hi AJ,
>>>>>
>>>>> I'd suggest getting to a multi-region cluster step-by-step.  First, get 2
>>>>> nodes running in the same availability zone.  Make sure that works
>>>>> properly.  Second, add a node in a separate availability zone, but in the
>>>>> same region.  Make sure that's working properly.  Third, add a node that's
>>>>> in a separate region.
>>>>>
>>>>> Taking it step-by-step will ensure that any issues are specific to the
>>>>> region-to-region communication, rather than intra-zone connectivity or
>>>>> cassandra cluster configuration.
>>>>>
>>>>> Dave Viner
>>>>>
>>>>> On Fri, Mar 18, 2011 at 8:34 AM, A J <s5alye@gmail.com> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> I am trying to set up a cassandra cluster across regions.
>>>>>
>>>>> For testing I am keeping it simple and just having one node in US-EAST
>>>>> (say ec2-1-2-3-4.compute-1.amazonaws.com) and one node in US-WEST (say
>>>>> ec2-2-2-3-4.us-west-1.compute.amazonaws.com).
>>>>>
>>>>> Using Cassandra 0.7.4
>>>>>
>>>>>
>>>>> The one in east region is the seed node and has the values as:
>>>>>
>>>>> auto_bootstrap: false
>>>>> seeds: ec2-1-2-3-4.compute-1.amazonaws.com
>>>>> listen_address: ec2-1-2-3-4.compute-1.amazonaws.com
>>>>> rpc_address: 0.0.0.0
>>>>>
>>>>> The one in west region is non seed and has the values as:
>>>>>
>>>>> auto_bootstrap: true
>>>>> seeds: ec2-1-2-3-4.compute-1.amazonaws.com
>>>>> listen_address: ec2-2-2-3-4.us-west-1.compute.amazonaws.com
>>>>> rpc_address: 0.0.0.0
>>>>>
>>>>> I first fire the seed node (east region instance) and it comes up
>>>>> without issues.
>>>>>
>>>>> When I fire the non-seed node (west region instance) it fails after
>>>>> some time with the error:
>>>>>
>>>>> DEBUG 15:09:08,844 Created HHOM instance, registered MBean.
>>>>> INFO 15:09:08,844 Joining: getting load information
>>>>> INFO 15:09:08,845 Sleeping 90000 ms to wait for load information...
>>>>> DEBUG 15:09:09,822 attempting to connect to ec2-1-2-3-4.compute-1.amazonaws.com/1.2.3.4
>>>>> DEBUG 15:09:10,825 Disseminating load info ...
>>>>> DEBUG 15:10:10,826 Disseminating load info ...
>>>>> DEBUG 15:10:38,845 ... got load info
>>>>> INFO 15:10:38,845 Joining: getting bootstrap token
>>>>> ERROR 15:10:38,847 Exception encountered during startup.
>>>>> java.lang.RuntimeException: No other nodes seen!  Unable to bootstrap
>>>>>      at org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:164)
>>>>>      at org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:146)
>>>>>      at org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.java:141)
>>>>>      at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:450)
>>>>>      at org.apache.cassandra.service.StorageService.initServer(StorageService.java:404)
>>>>>      at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:192)
>>>>>      at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:314)
>>>>>      at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
>>>>>
>>>>>
>>>>> The seed node seems to somewhat acknowledge the non-seed node:
>>>>>
>>>>> attempting to connect to /2.2.3.4
>>>>> attempting to connect to /10.170.190.31
>>>>>
>>>>> Can you suggest how I can fix it? (I did see a few threads on a similar
>>>>> issue but did not really follow the chain.)
>>>>>
>>>>> Thanks, AJ
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>