incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Token Ring Gaps in a 2 DC Setup
Date Sun, 25 Mar 2012 17:01:00 GMT
What about for writes ? 

If you are seeing read repair it means that fewer than RF nodes got the mutation.  If you are writing at a low CL you may be overloading the cluster; check for dropped messages.  If this is the case, increase the CL to improve the chances that reads are not issued until the write has been applied.
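For readers following along, the overlap rule behind this advice can be sketched as follows (a simplified editorial model, not code from the thread; the consistency-level names are the standard Cassandra ones). A read is only guaranteed to see a prior write when the number of write acks plus the number of read responses exceeds RF:

```python
# Minimal sketch, assuming the standard Cassandra consistency-level semantics.
# A read overlaps at least one replica that acked the write iff W + R > RF.

CL_NODES = {
    "ONE": lambda rf: 1,            # one replica must respond
    "QUORUM": lambda rf: rf // 2 + 1,  # a majority of replicas
    "ALL": lambda rf: rf,           # every replica
}

def read_sees_write(rf, write_cl, read_cl):
    """True if the write CL and read CL together guarantee the read
    contacts at least one replica that applied the write."""
    w = CL_NODES[write_cl](rf)
    r = CL_NODES[read_cl](rf)
    return w + r > rf

# Caleb's keyspace has total RF = 3 (DC1:2 + DC2:1), reads at ONE:
print(read_sees_write(3, "ONE", "ONE"))        # no overlap guarantee
print(read_sees_write(3, "QUORUM", "ONE"))     # still not guaranteed
print(read_sees_write(3, "QUORUM", "QUORUM"))  # guaranteed
```

With RF = 3 and both reads and writes at ONE, no overlap is guaranteed, which is consistent with nearly every read-after-write triggering a read repair.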

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 23/03/2012, at 8:03 PM, Caleb Rackliffe wrote:

> Yup, all repairs are complete.  I'm reading at a CL of ONE pretty much everywhere.
> 
> Caleb Rackliffe | Software Developer	
> M 949.981.0159 | caleb@steelhouse.com
> 
> From: aaron morton <aaron@thelastpickle.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Tue, 20 Mar 2012 13:15:27 -0400
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Re: Token Ring Gaps in a 2 DC Setup
> 
> mmm, has repair completed on all nodes ? 
> 
>> Also, while I was digging around, I noticed that we do a LOT of reads immediately after writes, and almost every read from the first DC was bringing a read-repair along with it.
> What CL are you using ? 
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 20/03/2012, at 7:39 AM, Caleb Rackliffe wrote:
> 
>> Hey Aaron,
>> 
>> I've run cleanup jobs across all 15 nodes, and after that, I still have about a 24 million to 15 million key ratio between the data centers.  The first DC is a few months older than the second, and it also began its life before 1.0.7 was out, whereas the second started at 1.0.7.  I wonder if running an upgradesstables would be interesting?
>> 
>> Also, while I was digging around, I noticed that we do a LOT of reads immediately after writes, and almost every read from the first DC was bringing a read-repair along with it.  (Possibly because the distant DC had not yet received certain mutations?)  I ended up turning RR off entirely, since I've got HH in place to handle short-duration failures :)
>> 
>> Caleb Rackliffe | Software Developer	
>> M 949.981.0159 | caleb@steelhouse.com
>> 
>> From: aaron morton <aaron@thelastpickle.com>
>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Date: Mon, 19 Mar 2012 13:34:38 -0400
>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Subject: Re: Token Ring Gaps in a 2 DC Setup
>> 
>>> I've also run repair on a few nodes in both data centers, but the sizes are still vastly different.
>> If repair is completing on all the nodes then the data is fully distributed. 
>> 
>> If you want to dig around…
>> 
>> Take a look at the data files on disk. Do the nodes in DC 1 have some larger, older data files ? These may be waiting for compaction to catch up with them.
>> 
>> If you have done any token moves, did you run cleanup afterwards ?
>> 
>> 
>> Cheers
>> 
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 18/03/2012, at 8:35 PM, Caleb Rackliffe wrote:
>> 
>>> More detail…
>>> 
>>> I'm running 1.0.7 on these boxes, and the keyspace readout from the CLI looks like this:
>>> 
>>> create keyspace Users
>>>   with placement_strategy = 'NetworkTopologyStrategy'
>>>   and strategy_options = {DC2 : 1, DC1 : 2}
>>>   and durable_writes = true;
>>> 
>>> Thanks!
>>> 
>>> Caleb Rackliffe | Software Developer	
>>> M 949.981.0159 | caleb@steelhouse.com
>>> 
>>> From: Caleb Rackliffe <caleb@steelhouse.com>
>>> Date: Sun, 18 Mar 2012 02:47:05 -0400
>>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>> Subject: Token Ring Gaps in a 2 DC Setup
>>> 
>>> Hi Everyone,
>>> 
>>> I have a cluster using NetworkTopologyStrategy that looks like this:
>>> 
>>> 10.41.116.22     DC1         RAC1         Up     Normal  13.21 GB        10.00%  0
>>> 10.54.149.202    DC2         RAC1         Up     Normal  6.98 GB          0.00%  1
>>> 10.41.116.20     DC1         RAC2         Up     Normal  12.75 GB        10.00%  17014118300000000000000000000000000000
>>> 10.41.116.16     DC1         RAC3         Up     Normal  12.62 GB        10.00%  34028236700000000000000000000000000000
>>> 10.54.149.203    DC2         RAC1         Up     Normal  6.7 GB           0.00%  34028236700000000000000000000000000001
>>> 10.41.116.18     DC1         RAC4         Up     Normal  10.8 GB         10.00%  51042355000000000000000000000000000000
>>> 10.41.116.14     DC1         RAC5         Up     Normal  10.27 GB        10.00%  68056473400000000000000000000000000000
>>> 10.54.149.204    DC2         RAC1         Up     Normal  6.7 GB           0.00%  68056473400000000000000000000000000001
>>> 10.41.116.12     DC1         RAC6         Up     Normal  10.58 GB        10.00%  85070591700000000000000000000000000000
>>> 10.41.116.10     DC1         RAC7         Up     Normal  10.89 GB        10.00%  102084710000000000000000000000000000000
>>> 10.54.149.205    DC2         RAC1         Up     Normal  7.51 GB          0.00%  102084710000000000000000000000000000001
>>> 10.41.116.8      DC1         RAC8         Up     Normal  10.48 GB        10.00%  119098828000000000000000000000000000000
>>> 10.41.116.24     DC1         RAC9         Up     Normal  10.89 GB        10.00%  136112947000000000000000000000000000000
>>> 10.54.149.206    DC2         RAC1         Up     Normal  6.37 GB          0.00%  136112947000000000000000000000000000001
>>> 10.41.116.26     DC1         RAC10        Up     Normal  11.17 GB        10.00%  153127065000000000000000000000000000000
>>> 
>>> There are two data centers, one with 10 nodes/2 replicas and one with 5 nodes/1 replica.  What I've attempted to do with my token assignments is have each node in the smaller DC handle 20% of the keyspace, and this would mean that I should see roughly equal usage on all 15 boxes.  It just doesn't seem to be happening that way, though.  It looks like the "1 replica" nodes are carrying about half the data the "2 replica" nodes are.  It's almost as if those nodes are only handling 10% of the keyspace instead of 20%.
>>> 
>>> Does anybody have any suggestions as to what might be going on?  I've run nodetool getendpoints against a bunch of keys, and I always get back three nodes, so I'm pretty confused.  I've also run repair on a few nodes in both data centers, but the sizes are still vastly different.
>>> 
>>> Thanks!
>>> 
>>> Caleb Rackliffe | Software Developer	
>>> M 949.981.0159 | caleb@steelhouse.com
>> 
> 

