incubator-cassandra-user mailing list archives

From Ben Bromhead <...@instaclustr.com>
Subject Re: Ec2 Network I/O
Date Tue, 20 May 2014 20:36:10 GMT
Also, once you've got your phi_convict_threshold sorted, if you see these errors again, check:

http://status.aws.amazon.com/ 

AWS does occasionally have the odd increased latency issue / outage. 

Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359
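
A host that answers ping can still be slow or unreachable on the port Cassandra uses for inter-node traffic, so it may help to time a plain TCP connect to that port alongside checking the AWS status page. The following is a minimal sketch, not a diagnostic tool from this thread: it assumes the default storage_port of 7000, and the function name and the address in the usage example are made up for illustration.

```python
import socket
import time


def tcp_connect_ms(host, port=7000, timeout_s=2.0):
    """Return the TCP connect time to host:port in milliseconds.

    Returns None if the connection fails or times out.
    Port 7000 is assumed here because it is Cassandra's default
    storage_port; adjust it if your cluster uses a different one.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return (time.monotonic() - start) * 1000.0
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return None


if __name__ == "__main__":
    # Hypothetical peer address; replace with a real node IP.
    peer = "10.0.0.1"
    for _ in range(5):
        elapsed = tcp_connect_ms(peer)
        if elapsed is None:
            print(f"{peer}: connect failed or timed out")
        else:
            print(f"{peer}: connected in {elapsed:.1f} ms")
        time.sleep(1.0)
```

Running this from each node against its peers while the errors are occurring gives a rough picture of whether connects are slow or failing; it says nothing about gossip itself.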


On 19/05/2014, at 1:15 PM, Nate McCall <nate@thelastpickle.com> wrote:

> It's a good idea to increase phi_convict_threshold to at least 12 on EC2. Using placement groups and single-tenant systems will certainly help.
> 
> Another optimization would be dedicating an Elastic Network Interface (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html) specifically for gossip traffic.
> 
> 
> On Mon, May 19, 2014 at 1:36 PM, Phil Burress <philburresseme@gmail.com> wrote:
> Has anyone experienced network I/O issues with EC2? We are seeing a lot of these in our logs:
> 
> HintedHandOffManager.java (line 477) Timed out replaying hints to /10.0.x.xxx; aborting (15 delivered)
> 
> and these...
> 
> Cannot handshake version with /10.0.x.xxx
> 
> and these...
> 
> java.io.IOException: Cannot proceed on repair because a neighbor (/10.0.x.xxx) is dead: session failed
> 
> This occurs on all of our nodes, even though in every case the host that is reported as down or unavailable is up and readily 'pingable'.
> 
> We are using shared tenancy on all our nodes (instance type m1.xlarge) with Cassandra 2.0.7. Any suggestions on how to debug these errors?
> 
> Is there a recommendation to move to Placement Groups for Cassandra?
> 
> Thanks!
> 
> Phil 
> 
> 
> 
> -- 
> -----------------
> Nate McCall
> Austin, TX
> @zznate
> 
> Co-Founder & Sr. Technical Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
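
To give a feel for what raising phi_convict_threshold from the default of 8 to 12 buys, here is a rough sketch of the arithmetic behind a phi accrual failure detector. It assumes heartbeat inter-arrival times are modelled as exponential, so that phi grows linearly with the time since the last heartbeat, and it assumes a mean heartbeat interval of about one second (the gossip period). The function names are made up for illustration and the numbers are approximations, not measurements from the cluster discussed above.

```python
import math

# Assumption: exponential inter-arrival model, which gives
#   phi(t) = t / (mean_interval * ln 10)
PHI_FACTOR = 1.0 / math.log(10.0)


def phi(silence_s, mean_interval_s):
    """Suspicion level after silence_s seconds without a heartbeat."""
    return PHI_FACTOR * silence_s / mean_interval_s


def silence_until_convict(threshold, mean_interval_s):
    """Seconds of silence needed for phi to reach the threshold."""
    return threshold * mean_interval_s / PHI_FACTOR


if __name__ == "__main__":
    # 8 is the default threshold; 12 is the value suggested for EC2.
    for threshold in (8, 12):
        secs = silence_until_convict(threshold, mean_interval_s=1.0)
        print(f"phi_convict_threshold={threshold}: ~{secs:.1f}s of silence")
```

Under these assumptions a node is convicted after roughly 18.4 seconds of silence at a threshold of 8 and roughly 27.6 seconds at 12, so the higher setting tolerates longer network stalls at the cost of slower detection of nodes that really are down. If observed heartbeat intervals are longer than one second, both figures scale up proportionally.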

