cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: All host pools Marked Down
Date Thu, 31 May 2012 01:01:56 GMT
I would remove the load balancer from the equation.
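
One way to do that is to test each node directly from the web app host, which is the scripted equivalent of a manual telnet check. The following is a minimal sketch, assuming the nodes listen on the default Thrift rpc_port of 9160; the node addresses in the usage comment are placeholders, not the real ring.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def check_nodes(nodes, port=9160, timeout=3.0):
    """Map each node address to whether its port accepted a TCP connection."""
    return {host: can_connect(host, port, timeout) for host in nodes}


# Usage (placeholder addresses, an assumption; substitute the real node IPs):
#   for host, ok in check_nodes(["10.0.0.1", "10.0.0.2"]).items():
#       print(host, "reachable" if ok else "UNREACHABLE")
```

If every node is reachable directly but the load balancer address is not, that points at the balancer rather than at Cassandra. A successful TCP connect only shows that the port is open, not that the node is serving requests correctly.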

Compactions do not stop the world. They may degrade performance for a while, but that's about
it.
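
To put the compaction log line quoted further down in this thread into numbers, here is a small worked calculation. The function name is made up for illustration; the three input values are taken from that log line.

```python
def compaction_rate(bytes_in, bytes_out, elapsed_ms):
    """Return the output/input size ratio and the input rate in MiB/s."""
    seconds = elapsed_ms / 1000.0
    return {
        "ratio": bytes_out / bytes_in,
        "mib_per_s": bytes_in / seconds / (1024 * 1024),
    }


# Values from the log line: 36,986,932 bytes in, 36,961,554 bytes out, 112,910 ms.
stats = compaction_rate(36_986_932, 36_961_554, 112_910)
print(f"output/input = {stats['ratio']:.4f}, rate = {stats['mib_per_s']:.2f} MiB/s")
```

This works out to roughly 0.31 MiB/s for about 35 MiB of input. The figure only quantifies the log line; on its own it does not show whether the node was short on I/O or whether the compaction was being throttled.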

Look in the logs on the servers: are the nodes logging that other nodes are going DOWN?
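
A rough way to scan a node's log for such messages is sketched below. The message shapes in the pattern ("InetAddress /<ip> is now dead", "is now DOWN", "is now UP") are an assumption about how gossip state changes are logged and may differ between versions, so check them against a real system.log first.

```python
import re

# Assumed message shapes; verify against your own system.log before relying on them.
STATE_CHANGE = re.compile(r"InetAddress (/\S+) is now (dead|DOWN|UP)", re.IGNORECASE)


def find_state_changes(lines):
    """Return (node, state, line) tuples for log lines that report a node state change."""
    hits = []
    for line in lines:
        match = STATE_CHANGE.search(line)
        if match:
            hits.append((match.group(1), match.group(2).upper(), line.rstrip("\n")))
    return hits


# Usage (the log path is an assumption; use wherever your nodes write system.log):
#   with open("/var/log/cassandra/system.log") as fh:
#       for node, state, line in find_state_changes(fh):
#           print(state, node, "|", line)
```

Running this on each node and comparing timestamps with the client-side exceptions should show whether the cluster itself saw nodes drop out at the time the web app reported all hosts down.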

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 30/05/2012, at 2:25 AM, cem wrote:

> It should retry, but it doesn't. The exception message itself says that the retry is
> delegated to the client ("Retry burden pushed out to client"); you can also check the
> Hector code. I wrote a separate service that retries when this exception occurs.
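
A client-side retry of this kind can be sketched as follows. This is not the service described above and not Hector's API: `AllHostsDownError` is a hypothetical stand-in for the client's "all host pools marked down" exception, and the attempt count and delays are arbitrary illustrative defaults.

```python
import time


class AllHostsDownError(Exception):
    """Hypothetical stand-in for the client's 'all host pools marked down' error."""


def call_with_retry(operation, attempts=5, base_delay=0.5, max_delay=10.0,
                    sleep=time.sleep):
    """Call `operation`, retrying with exponential backoff while all hosts are down.

    Re-raises the last error once `attempts` calls have failed.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except AllHostsDownError:
            if attempt == attempts:
                raise
            sleep(delay)                       # wait before the next attempt
            delay = min(delay * 2, max_delay)  # double the wait, up to a cap
```

The backoff gives the client's host pools time to recover before the next attempt, rather than exhausting all retries within a few milliseconds against hosts that are still marked down.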
> 
> I think you have a problem with your load balancer. Try to connect with telnet.  
> 
> Cem.
> 
> On Tue, May 29, 2012 at 3:06 PM, Shubham Srivastava <Shubham.Srivastava@makemytrip.com> wrote:
> My webapp connects to the LoadBalancer IP which has the actual nodes in its pool.
> 
> If the connection breaks for any reason, will Hector not try to re-establish it? I would
> expect it to retry every XX seconds, based on retryDownedHostsDelayInSeconds.
> 
> 
> Regards,
> Shubham
> From: cem [cayiroglu@gmail.com]
> Sent: Tuesday, May 29, 2012 6:13 PM
> To: user@cassandra.apache.org
> Subject: Re: All host pools Marked Down
> 
> Since all hosts seem to be down, Hector will not retry. At least one node in the cluster
> must be up. Make sure that your webapps have a working connection to your cluster.
> 
> Cem. 
> 
> On Tue, May 29, 2012 at 1:46 PM, Shubham Srivastava <Shubham.Srivastava@makemytrip.com> wrote:
> Any takers on this? It is hitting us badly right now.
> 
> Regards,
> Shubham
> From: Shubham Srivastava
> Sent: Tuesday, May 29, 2012 12:55 PM
> To: user@cassandra.apache.org
> Subject: All host pools Marked Down
> 
> I am getting this exception a lot:
>
> me.prettyprint.hector.api.exceptions.HectorException: All host pools marked down. Retry burden pushed out to client.
>
> As a result, my web app cannot read from or write to the ring.
>
> I have retries set to 3, and I can see all 3 retries being exhausted with the same error
> as above.
> 
>  
> I checked cfstats and tpstats; nothing seems to be a problem.
>
> However, the logs show a lot of time being spent in compactions, for example:
>
> INFO [CompactionExecutor:73] 2012-05-29 11:03:01,605 CompactionManager.java (line 608) Compacted to /opt/cassandra-data/data/LH/UserPrefrences-tmp-g-8906-Data.db.  36,986,932 to 36,961,554 (~99% of original) bytes for 132,743 keys.  Time: 112,910ms.
>
> The time taken here seems pretty high. Could this cause a pause, read timeouts, etc.?
> 
>  
> My web app connects through a hardware load balancer. The Cassandra version is 0.8.6, in a
> multi-DC ring with 6 nodes in each DC.
>
> CL: 1 and RF: 3.
>
> Memory: 8 GB heap on servers with 14 GB of RAM and 8-core CPUs.
>
> How do I move ahead with this?
> 
>  
> Shubham Srivastava | Technical Lead - Technology Development
> 
> +91 124 4910 548   |  MakeMyTrip.com, 243 SP Infocity, Udyog Vihar Phase 1, Gurgaon, Haryana - 122 016, India

