incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From aaron morton <aa...@thelastpickle.com>
Subject Re: HTimedOutException and cluster not working
Date Wed, 19 Sep 2012 22:56:30 GMT

> No, all keyspaces that we created do not have secondary indexes. So probably the settings
'memtable_flush_queue_size' is not relevant? 
It may be. 
If you had a lot a CF's and cassandra tried to flush more than memtable_flush_queue_size at
once. 

> one would think that the compaction would probably hitting
> the limit I/O of the system and hence, the FlushWriter has to be block.
Maybe sort of. 
The flush writer will block if it cannot flush data fast enough. So it depends on the number
of CF's being flushed and the speed it is written it. It's unlikely that compaction is hurting
that much on one machine. 

> nodetool has a command 'setcompactionthroughput', would setting
> the throughput to a lower number to a number lower than 16 helps?
Cannot hurt to try. 

> What other methods to find out if all the nodes in the system is in OK or problem from
cassandra point of view? 
Monitoring of whatever flavour. e.g. http://www.datastax.com/products/opscenter

Look in the logs for nodes thinking node-3 is DOWN

> when the logging in cassandra is enable with DEBUG, here is the log I got for node-1
This is node-1 failing to write hints to node-3. 


I forgot to ask, what consistency level are you using for writes ? 
Have you checked the disk health on node-3 ?

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 19/09/2012, at 1:10 AM, Jason Wee <peichieh@gmail.com> wrote:

> Hi Aaron, thank you for your reply. Please read inline comment.
> 
> On Tue, Sep 18, 2012 at 7:36 PM, aaron morton <aaron@thelastpickle.com> wrote:
> What version are you on ?
> cassandra version 1.0.8 
> 
>>  HTimedOutException is logged for all the nodes. 
> TimedOutException happens when less than CL replica nodes respond to the coordinator
in time. 
> You could get the error from all nodes in your cluster if the 3 nodes that store the
key are having problems. 
> we have check the system load and tpstats for all the nodes in the cluster. Only node
3 having high active and pending 
> count for MutationStage and FlushWriter. 
> 
> 
>> MutationStage 16 2177067 879092633 0 0
> This looks like mutations are blocked or running very very slowly. 
> 
>> FlushWriter 0 0 5616 0 1321
> The All Timed Blocked number means there were 1,321 times a thread tried to flush a memtable
but the queue of flushers was full. Do you use secondary indexes ? If so take a look at the
comments for memtable_flush_queue_size in the yaml file. 
> 
> No, all keyspaces that we created do not have secondary indexes. So probably the settings
'memtable_flush_queue_size' is not relevant? 
> 
> What I interpret based on the output as above and the output of compactionstats on node-3,
one would think that the compaction would probably hitting
> the limit I/O of the system and hence, the FlushWriter has to be block. I read further,
nodetool has a command 'setcompactionthroughput', would setting
> the throughput to a lower number to a number lower than 16 helps?
>>   and cluster settings, it should be possible in this scenario, write success 
>>   on one of the nodes even though node-3 is too busy or failing for any reason? 
> Yes. 
> If only one node fails to respond the write should succeed. If you got a TimedOut with
CL ONE it sounds like more nodes were having problems. 
> We check the nodes by running command like top or nodetool cfstats on all the nodes in
the cluster. The load for the rest of the nodes all below 1 and there is no
> active or pending count for all the stages. What other methods to find out if all the
nodes in the system is in OK or problem from cassandra point of view? Because 
> if there is a concrete method to gauge all the nodes, we will have better clues where
exactly is the problem.
> 
>> * when hector client failover to other nodes, basically all the nodes fail, why
>>   is this so?
> Sorry I don't understand this question. 
> 
> 14.Sep 12:14:04,430 WARN HConnectionManager [operateWithFailover]: Could not fullfill
request on this host CassandraClient<node-1.example.com:9160-88909>
> 14.Sep 12:14:04,430 WARN HConnectionManager [operateWithFailover]: Exception:
> me.prettyprint.hector.api.exceptions.HTimedOutException: TimedOutException()
> at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:42)
> at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:252)
> ..
> ..
> ..
> 14.Sep 12:14:04,302 WARN HConnectionManager [operateWithFailover]: Could not fullfill
request on this host CassandraClient<node-2.example.com:9160-89196>
> 14.Sep 12:14:04,302 WARN HConnectionManager [operateWithFailover]: Exception:
> me.prettyprint.hector.api.exceptions.HTimedOutException: TimedOutException()
> at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:42)
> at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:252)

> 
> the log show that similar log with hector trying to failover to all the nine nodes.
> 
>> * what factors that increase MutationStage active and pending values?
> Check the log for ERRORs
> Check for failing or overloaded IO. 
> See comment above about memtable flush queue size. 
> 
> when the logging in cassandra is enable with DEBUG, here is the log I got for node-1
> 
> INFO [HintedHandoff:1] 2012-09-14 12:27:38,955 HintedHandOffManager.java (line 354) Timed
out replaying hints to /node-3; aborting further deliveries
> INFO [HintedHandoff:1] 2012-09-14 12:27:38,956 HintedHandOffManager.java (line 373) Finished
hinted handoff of 0 rows to endpoint /node-3
> 
> DEBUG [RPC-Thread:577363] 2012-09-14 12:52:12,484 StorageProxy.java (line 212) Write
timeout java.util.concurrent.TimeoutException for one (or more) of:
> DEBUG [RPC-Thread:577363] 2012-09-14 12:52:12,484 CassandraServer.java (line 648) ...
timed out
>  
> Hope that helps. 
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 18/09/2012, at 4:24 PM, Jason Wee <peichieh@gmail.com> wrote:
> 
>> Hello,
>> 
>> A context to our environment, we have a clusters of 9 nodes with a few keyspaces.
The client write to the cluster with consistency level of one to a keyspace in the cluster
with a replication factor of 3. The hector client is configured such that all the nodes in
cluster is specified and so that we would want to ensure that at any write request, two nodes,
can fail and one write is succcess to the cluster node.
>> 
>> However, under certain situation, we seen in the log, HTimedOutException is logged
during writing to the cluster. Hector client thus failover to the next node in the cluster
but what we noticed is that, the same exception, HTimedOutException is logged for all the
nodes. This result that the cluster is not working as a whole. Logically, we checked all the
nodes in the cluster for load. Only node-3 seem to have high pending MutationStage when nodetool
tpstats is run. Other nodes are fine with 0 active and 0 pending for all the stages. 
>> 
>> /nodetool -h localhost tpstats
>> Pool Name Active Pending Completed Blocked All time blocked
>> ReadStage 0 0 11116983 0 0
>> RequestResponseStage 0 0 1252368951 0 0
>> MutationStage 16 2177067 879092633 0 0
>> ReadRepairStage 0 0 3648106 0 0
>> ReplicateOnWriteStage 0 0 33722610 0 0
>> GossipStage 0 0 20504608 0 0
>> AntiEntropyStage 0 0 1197 0 0
>> MigrationStage 0 0 89 0 0
>> MemtablePostFlusher 0 0 5659 0 0
>> StreamStage 0 0 296 0 0
>> FlushWriter 0 0 5616 0 1321
>> MiscStage 0 0 5964 0 0
>> AntiEntropySessions 0 0 88 0 0
>> InternalResponseStage 0 0 27 0 0
>> HintedHandoff 1 2 5976 0 0
>> 
>> Message type Dropped
>> RANGE_SLICE 0
>> READ_REPAIR 0
>> BINARY 0
>> READ 178
>> MUTATION 17467
>> REQUEST_RESPONSE 0
>> 
>> We proceed to check if there is any compaction in node-3 and found out the 
>> following:
>> 
>> ./nodetool -hlocalhost compactionstats
>> pending tasks: 196
>> compaction type keyspace column family bytes compacted bytes total progress
>> Cleanup MyKeyspace MyCF 6946398685 10230720119 67.90%
>> 
>> 
>> Question:
>> * with a replication factor of 3 in the keyspace and client write consistency 
>>   level of one, in the situation above, and the current hector client settings 
>>   and cluster settings, it should be possible in this scenario, write success 
>>   on one of the nodes even though node-3 is too busy or failing for any reason? 
>>   
>> * when hector client failover to other nodes, basically all the nodes fail, why
>>   is this so?
>>   
>> * what factors that increase MutationStage active and pending values?
>> 
>> Thank you for any comments and insight
>> 
>> Regards,
>> Jason
> 
> 


Mime
View raw message