incubator-cassandra-user mailing list archives

From Yan Chunlu <springri...@gmail.com>
Subject Re: Corrupted data
Date Mon, 11 Jul 2011 03:17:09 GMT
It has already been running for about 20 hours...

On Mon, Jul 11, 2011 at 1:36 AM, aaron morton <aaron@thelastpickle.com> wrote:

> 1) Do I need to treat every node as failed and do a rolling replacement,
> since there might be inconsistencies in the cluster that I have no way to
> find?
>
> see
> http://wiki.apache.org/cassandra/Operations#Dealing_with_the_consequences_of_nodetool_repair_not_running_within_GCGraceSeconds
>
> 2) Is that what caused the node repair to hang? The log message
> says:
> Jul 10, 2011 4:40:35 AM ClientCommunicatorAdmin Checker-run
> WARNING: Failed to check the connection: java.net.SocketTimeoutException:
> Read timed out
>
> I cannot find that anywhere in the code base. Can you provide some more
> information?
>
> Cheers
>
>  -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 10 Jul 2011, at 03:26, Yan Chunlu wrote:
>
> I am running RF=2 (I changed it from 2 to 3 and back to 2) on 3 nodes
> and had not run node repair for more than 10 days; I was not aware that
> this is critical. I ran node repair recently and one of the nodes always
> hangs... From the log, it seems to be doing nothing related to the repair.
>
> So I have two problems:
>
> 1) Do I need to treat every node as failed and do a rolling replacement,
> since there might be inconsistencies in the cluster that I have no way to
> find?
> 2) Is that what caused the node repair to hang? The log message
> says:
> Jul 10, 2011 4:40:35 AM ClientCommunicatorAdmin Checker-run
> WARNING: Failed to check the connection: java.net.SocketTimeoutException:
> Read timed out
>
> then nothing.
>
> thanks!
>
> On Sat, Jul 9, 2011 at 10:16 PM, Peter Schuller <
> peter.schuller@infidyne.com> wrote:
>
>> >> - Have you been running repair consistently?
>> >
>> > Nope, only when something breaks
>>
>> This is unrelated to the problem you were asking about, but if you
>> never run delete, make sure you are aware of:
>>
>> http://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair
>> http://wiki.apache.org/cassandra/DistributedDeletes
>>
>>
>> --
>> / Peter Schuller
>>
>
>
>
> --
> 闫春路
>
>
>


-- 
Charles
