cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: nodetool repair - when is it not needed ?
Date Thu, 23 Aug 2012 22:15:03 GMT
> Also when hints are replayed they are sent off as mutations, which may still be dropped
> by the target if they are not serviced before rpc_timeout. Sending nodes throttle their requests
> so it's unlikely but possible.

My bad there. I thought the mutations were sent one way.

When a node is sending hints, it waits the normal rpc_timeout for each one. If there is a
timeout, hint delivery for that endpoint is aborted. It will be retried in the next HH round,
which runs every 10 minutes.
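
As a rough illustration only (this is not the Cassandra code, which is Java), the behaviour
described above can be sketched like this. The names deliver_hints and send_mutation and the
10 second timeout value are assumptions made for the example; send_mutation is a hypothetical
callable that raises TimeoutError when the target does not ack in time.

```python
RPC_TIMEOUT_S = 10.0       # assumed value standing in for rpc_timeout
HH_ROUND_INTERVAL_S = 600  # hint delivery rounds run every 10 minutes


def deliver_hints(endpoint, hints, send_mutation, rpc_timeout=RPC_TIMEOUT_S):
    """Try to deliver hints to one endpoint and return the hints still pending.

    send_mutation(endpoint, hint, timeout) is a hypothetical callable that
    raises TimeoutError if the target does not ack within `timeout` seconds.
    """
    for i, hint in enumerate(hints):
        try:
            send_mutation(endpoint, hint, rpc_timeout)
        except TimeoutError:
            # Abort delivery for this endpoint. The hint that timed out and
            # everything after it stay queued for the next HH round.
            return hints[i:]
    # Every hint was acked, so nothing is left pending.
    return []
```

A scheduler would call deliver_hints once per endpoint every HH_ROUND_INTERVAL_S seconds and
keep whatever it returns as the pending list for the next round.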

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 23/08/2012, at 9:36 PM, aaron morton <aaron@thelastpickle.com> wrote:

> HH works to a point. Specifically, it only collects hints for the first hour the node
> is down, and it has a safety valve to keep the node collecting hints from getting overwhelmed.
> Looking at the code, it takes a bit for that to trip, and you would get a TimeoutException
> coming back.
> 
> Also when hints are replayed they are sent off as mutations, which may still be dropped
> by the target if they are not serviced before rpc_timeout. Sending nodes throttle their requests
> so it's unlikely but possible.
> 
> HH is much more robust, but AFAIK repair is still _the_ way to ensure on-disk consistency.

> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 23/08/2012, at 6:59 AM, Rob Coli <rcoli@palominodb.com> wrote:
> 
>> On Wed, Aug 22, 2012 at 8:37 AM, Senthilvel Rangaswamy
>> <senthilvel@gmail.com> wrote:
>>> We are running Cassandra 1.1.2 on EC2. Our database is primarily all
>>> counters and we don't do any
>>> deletes.
>>> 
>>> Does nodetool repair do anything for such a database? All the docs I read
>>> for nodetool repair suggest
>>> that nodetool repair is needed only if there are deletes.
>> 
>> Since 1.0, repair is only needed if a node crashes. If a node crashes,
>> my understanding is that a cluster-wide repair (with -pr on each node)
>> is required, because the crashed node could have lost a hint for any
>> other node.
>> 
>> https://issues.apache.org/jira/browse/CASSANDRA-2034
>> 
>> =Rob
>> 
>> -- 
>> =Robert Coli
>> AIM&GTALK - rcoli@palominodb.com
>> YAHOO - rcoli.palominob
>> SKYPE - rcoli_palominodb
> 

