incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Request timeout and host marked down
Date Sun, 08 Apr 2012 21:15:59 GMT
First you need to work out whether the timeout is happening between the client and the server, or between the server nodes.
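
As a rough sketch of how client code could tell the two cases apart, you can walk the cause chain of the exception the client hands back. This assumes a timeout between server nodes surfaces as Thrift's TimedOutException somewhere in the chain, and that a client-to-server timeout shows up as a java.net.SocketTimeoutException. The TimeoutOrigin class and its nested stand-in exception are hypothetical names used only for illustration; they are not part of Hector or Cassandra.

```java
import java.net.SocketTimeoutException;

// Hypothetical helper for illustration; not part of Hector or Cassandra.
class TimeoutOrigin {
    enum Origin { SERVER_SIDE, CLIENT_SIDE, UNKNOWN }

    // Walk the cause chain of a failed request and guess where the timeout came from.
    static Origin classify(Throwable error) {
        int depth = 0;
        // The depth cap guards against an unexpectedly cyclic cause chain.
        for (Throwable t = error; t != null && depth < 32; t = t.getCause(), depth++) {
            // Assumption: a timeout between server nodes arrives as Thrift's
            // TimedOutException. It is matched by simple class name so that this
            // sketch compiles without the Cassandra jars on the classpath.
            if (t.getClass().getSimpleName().equals("TimedOutException")) {
                return Origin.SERVER_SIDE;
            }
            // Assumption: the client's own socket read timed out.
            if (t instanceof SocketTimeoutException) {
                return Origin.CLIENT_SIDE;
            }
        }
        return Origin.UNKNOWN;
    }

    // Stand-in for the real Thrift exception, used by the demo below only.
    static class TimedOutException extends Exception {}

    public static void main(String[] args) {
        Throwable server = new RuntimeException("request failed", new TimedOutException());
        Throwable client = new RuntimeException("request failed",
                new SocketTimeoutException("Read timed out"));
        System.out.println(classify(server));
        System.out.println(classify(client));
        System.out.println(classify(new IllegalStateException("something else")));
    }
}
```

In a real application you would call classify from the catch block around the request and log the result next to the host name, so you can see which kind of timeout dominates.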


If it's server side, Thrift will throw a TimedOutException. Take a look at nodetool
tpstats on the servers; you will probably see lots of "Pending" tasks, which basically
means the cluster is overloaded. Consider:

* checking the IO, CPU and GC state on the servers.
* ensuring the data and requests are evenly spread around the cluster.
* reducing the number of columns read in a select.
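
If you want to watch the "Pending" column over time rather than eyeballing it, something like the following could flag the backed-up pools. It is only a sketch: it assumes each pool row of the tpstats output starts with the pool name followed by the Active and Pending counts, which you should check against your Cassandra version. The TpstatsPending class, the threshold and the sample text are all made up for illustration; the numbers are not from a real cluster.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; the column layout is an assumption.
class TpstatsPending {
    // Returns pool name -> pending count for every pool whose Pending value
    // is greater than the given threshold, in the order the pools appear.
    static Map<String, Long> backedUpPools(String tpstatsOutput, long threshold) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (String line : tpstatsOutput.split("\\r?\\n")) {
            String[] cols = line.trim().split("\\s+");
            if (cols.length < 3) {
                continue; // blank line or a row with too few columns
            }
            try {
                Long.parseLong(cols[1]); // Active; parsed only to confirm this is a pool row
                long pending = Long.parseLong(cols[2]);
                if (pending > threshold) {
                    result.put(cols[0], pending);
                }
            } catch (NumberFormatException e) {
                // Header or other non-pool line; ignore it.
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample text, not output from a real cluster.
        String sample = String.join("\n",
                "Pool Name                    Active   Pending      Completed",
                "ReadStage                        32       417         981234",
                "MutationStage                     2         0        1203311",
                "RequestResponseStage              0         0        2210041");
        System.out.println(backedUpPools(sample, 100)); // prints {ReadStage=417}
    }
}
```

A pool that stays above the threshold across several samples is the one to look at first when checking IO, CPU and GC on that node.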

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/04/2012, at 5:30 AM, Daning Wang wrote:

> Hi all,
> 
> We are using Hector and often we see lots of timeout exceptions in the log. I know that
> Hector can fail over to another node, but I want to reduce the number of timeouts.
> 
> Is there any Hector parameter I should change to reduce this error?
> 
> Also, on the server side, is there any tuning I need to do for the timeout?
>  
> 
> Thanks in advance.
> 
> 
> 12/04/04 15:13:20 ERROR com.netseer.services.keywordstat.io.KeywordServiceImpl: Timout 10000 ms
> 12/04/04 15:13:25 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.28.78.123(10.28.78.123):9160
> 12/04/04 15:13:25 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: <ConcurrentCassandraClientPoolByHost>:{10.28.78.123(10.28.78.123):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19
> 12/04/04 15:13:44 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.240.113.171(10.240.113.171):9160
> 12/04/04 15:13:44 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: <ConcurrentCassandraClientPoolByHost>:{10.240.113.171(10.240.113.171):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19
> 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.28.78.123(10.28.78.123):9160
> 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: <ConcurrentCassandraClientPoolByHost>:{10.28.78.123(10.28.78.123):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19
> 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.123.83.114(10.123.83.114):9160
> 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: <ConcurrentCassandraClientPoolByHost>:{10.123.83.114(10.123.83.114):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19
> 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.6.115.239(10.6.115.239):9160
> 12/04/04 15:13:46 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: <ConcurrentCassandraClientPoolByHost>:{10.6.115.239(10.6.115.239):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19
> 12/04/04 15:13:49 ERROR com.netseer.services.keywordstat.io.KeywordServiceImpl: Timout 10000 ms
> 12/04/04 15:13:49 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.120.205.48(10.120.205.48):9160
> 12/04/04 15:13:49 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: <ConcurrentCassandraClientPoolByHost>:{10.120.205.48(10.120.205.48):9160}; IsActive?: true; Active: 3; Blocked: 0; Idle: 3; NumBeforeExhausted: 17
> 12/04/04 15:13:50 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.28.20.200(10.28.20.200):9160
> 12/04/04 15:13:50 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: <ConcurrentCassandraClientPoolByHost>:{10.28.20.200(10.28.20.200):9160}; IsActive?: true; Active: 2; Blocked: 0; Idle: 4; NumBeforeExhausted: 18
> 12/04/04 15:13:51 ERROR com.netseer.services.keywordstat.io.KeywordServiceImpl: Timout 10000 ms

