incubator-cassandra-user mailing list archives

From Brian Fleming <bigbrianflem...@gmail.com>
Subject Re: Read-repair working, repair not working?
Date Mon, 11 Feb 2013 08:13:09 GMT

Hi Aaron,
 
Many thanks for your reply - answers below.
 
Cheers,
 
Brian
 
 
> What CL are you using for reads and writes?
> I would first build a test case to ensure correct operation when using strong consistency. i.e. QUORUM write and read. Because you are using RF 2 per DC I assume you are not using LOCAL_QUORUM because that is 2 and you would not have any redundancy in the DC.
CL.ONE : this is primarily for performance reasons, but also because there are only three local nodes, as you suggest, and we need at least some resiliency.
In the context of this issue, I considered increasing this to CL.LOCAL_QUORUM, but the behaviour suggests that none of the 3 local nodes have the data (say I make 100 requests: all 100 initially fail and subsequently all 100 succeed), so I'm not sure it'll help?
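As a sanity check on the quorum arithmetic, a throwaway sketch (topology numbers are from the original mail quoted below; a quorum is just floor(RF/2) + 1):

# Throwaway sketch: quorum sizes for this topology (5 DCs x RF 2 each).
def quorum(replicas):
    # a quorum is a strict majority of the replicas counted
    return replicas // 2 + 1

local_rf = 2               # RF per DC
total_rf = 5 * local_rf    # 5 data centres in the ring

print("LOCAL_QUORUM:", quorum(local_rf), "of", local_rf, "local replicas")     # 2 of 2
print("QUORUM:      ", quorum(total_rf), "of", total_rf, "replicas in total")  # 6 of 10

i.e. with RF 2 per DC, LOCAL_QUORUM needs both local replicas to answer, which is exactly the "no redundancy in the DC" point above.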
 
 
> Dropped mutations in a multi DC setup may be a sign of network congestion or overloaded nodes.
This DC is remote in terms of network topology - it's in Asia (Hong Kong) while the rest of
the cluster is in Europe/North America, so network latency rather than congestion could be
a cause?  However I see some pretty aggressive data transfer speeds during the initial repairs
& the data footprint approximately matches the nodes elsewhere in the ring, so something
doesn't add up?
 
Here are the tpstats for one of these nodes :
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                         0         0       4919185         0                 0
RequestResponseStage              0         0      16869994         0                 0
MutationStage                     0         0      16764910         0                 0
ReadRepairStage                   0         0           3703         0                 0
ReplicateOnWriteStage             0         0              0         0                 0
GossipStage                       0         0         845225         0                 0
AntiEntropyStage                  0         0          52441         0                 0
MigrationStage                    0         0           4362         0                 0
MemtablePostFlusher               0         0            952         0                 0
StreamStage                       0         0             24         0                 0
FlushWriter                       0         0            960         0                 5
MiscStage                         0         0           3592         0                 0
AntiEntropySessions               4         4            121         0                 0
InternalResponseStage             0         0              0         0                 0
HintedHandoff                     1         2             55         0                 0
 
Message type           Dropped
RANGE_SLICE                  0
READ_REPAIR             150597
BINARY                       0
READ                    781490
MUTATION                853846
REQUEST_RESPONSE             0
The numbers of dropped READ_REPAIR, READ & MUTATION operations are non-negligible.  The nodes in Europe/North America have effectively zero dropped messages.  This suggests network latency is probably a significant factor?
[the network ping from Europe to a HK node is ~250ms, so I wouldn’t have expected it to be such a problem?]
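For what it's worth, here is a rough sketch of one way to watch whether those counters are still growing (e.g. while a repair is streaming) rather than being historical; it just shells out to nodetool, so it assumes nodetool is on the PATH and the tpstats layout shown above:

# Rough sketch: poll `nodetool tpstats` once a minute and print how much the
# dropped-message counters grew, to separate live drops from historical ones.
import subprocess
import time

DROP_TYPES = ("MUTATION", "READ", "READ_REPAIR")

def dropped_counts(host="localhost"):
    out = subprocess.check_output(["nodetool", "-h", host, "tpstats"],
                                  universal_newlines=True)
    counts = {}
    for line in out.splitlines():
        parts = line.split()
        # dropped-message lines have two columns, e.g. "MUTATION   853846"
        if len(parts) == 2 and parts[0] in DROP_TYPES:
            counts[parts[0]] = int(parts[1])
    return counts

previous = dropped_counts()
while True:
    time.sleep(60)
    current = dropped_counts()
    print(time.strftime("%H:%M:%S"),
          {name: current[name] - previous.get(name, 0) for name in current})
    previous = current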
 

> It would, but the INFO logging for the AES is pretty good. I would hold off for now.
Ok.
 
> [AES session logging]
Yes, I see the expected start/end logs, so that's another thing off the list.
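For reference, a quick sketch of one way to cross-check that every session that logs a start also logs a successful completion (the log path is an assumption for a default install; adjust as needed):

# Quick sketch: scan system.log for the AES markers quoted below and report any
# repair session that did not end with "session completed successfully".
import re

LOG_PATH = "/var/log/cassandra/system.log"   # assumption: default log location

sessions = {}  # session name -> last state seen for that session
pattern = re.compile(r"\[repair #([^\]]+)\] (new session|session completed successfully|"
                     r"session completed with the following error)")

with open(LOG_PATH) as log:
    for line in log:
        match = pattern.search(line)
        if match:
            sessions[match.group(1)] = match.group(2)

for name, state in sessions.items():
    if state != "session completed successfully":
        print(name, "->", state)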
 
 


On 10 Feb 2013, at 20:12, aaron morton <aaron@thelastpickle.com> wrote:

>> I’d request data, nothing would be returned, I would then re-request the data and it would correctly be returned:
> What CL are you using for reads and writes?
> 
>> I see a number of dropped ‘MUTATION’ operations : just under 5% of the total ‘MutationStage’ count.
> Dropped mutations in a multi DC setup may be a sign of network congestion or overloaded nodes.
> 
> 
>> -          Could anybody suggest anything specific to look at to see why the repair operations aren’t having the desired effect?
> I would first build a test case to ensure correct operation when using strong consistency. i.e. QUORUM write and read. Because you are using RF 2 per DC I assume you are not using LOCAL_QUORUM because that is 2 and you would not have any redundancy in the DC.
> 
> 
>> 
>> -          Would increasing logging level to ‘DEBUG’ show read-repair activity (to confirm that this is happening, when & for what proportion of total requests)?
> It would, but the INFO logging for the AES is pretty good. I would hold off for now.

> 
>> 
>> -          Is there something obvious that I could be missing here?
> When a new AES session starts it logs this
> 
>            logger.info(String.format("[repair #%s] new session: will sync %s on range %s for %s.%s", getName(), repairedNodes(), range, tablename, Arrays.toString(cfnames)));
> 
> When it completes it logs this
> 
> logger.info(String.format("[repair #%s] session completed successfully", getName()));
> 
> Or this on failure 
> 
> logger.error(String.format("[repair #%s] session completed with the following error", getName()), exception);
> 
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 10/02/2013, at 9:56 PM, Brian Fleming <bigbrianfleming@gmail.com> wrote:
> 
>> 
>> 
>> 
>> Hi,
>> 
>> 
>> 
>> I have a 20 node cluster running v1.0.7 split between 5 data centres, each with an RF of 2, containing a ~1TB unique dataset/~10TB of total data.
>> 
>> 
>> 
>> I’ve had some intermittent issues with a new data centre (3 nodes, RF=2) I brought online late last year with data consistency & availability: I’d request data, nothing would be returned, I would then re-request the data and it would correctly be returned: i.e. read-repair appeared to be occurring.  However running repairs on the nodes didn’t resolve this (I tried general ‘repair’ commands as well as targeted keyspace commands) – this didn’t alter the behaviour.
>> 
>> 
>> 
>> After a lot of fruitless investigation, I decided to wipe & re-install/re-populate the nodes.  The re-install & repair operations are now complete: I see the expected amount of data on the nodes, however I am still seeing the same behaviour, i.e. I only get data after one failed attempt.
>> 
>> 
>> 
>> When I run repair commands, I don’t see any errors in the logs. 
>> 
>> I see the expected ‘AntiEntropySessions’ count in ‘nodetool tpstats’ during repair sessions.
>> 
>> I see a number of dropped ‘MUTATION’ operations : just under 5% of the total ‘MutationStage’ count.
>> 
>> 
>> 
>> Questions :
>> 
>> -          Could anybody suggest anything specific to look at to see why the repair operations aren’t having the desired effect?
>> 
>> -          Would increasing logging level to ‘DEBUG’ show read-repair activity (to confirm that this is happening, when & for what proportion of total requests)?
>> 
>> -          Is there something obvious that I could be missing here?
>> 
>> 
>> 
>> Many thanks,
>> 
>> Brian
> 
