cassandra-commits mailing list archives

From "Vladimir Barinov (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-5178) Sometimes repair process doesn't work properly
Date Mon, 21 Jan 2013 13:58:12 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vladimir Barinov updated CASSANDRA-5178:
----------------------------------------

    Description: 
Pre-conditions:
1. We have two separate datacenters, "DC1" and "DC2". Each of them contains 6 nodes.
2. DC2 is disabled.
3. Tokens for DC1 are calculated via https://raw.github.com/riptano/ComboAMI/2.2/tokentoolv2.py.
Tokens for DC2 are the same as for DC1 but with an offset of +100: for token 0 in DC1
we have token 100 in DC2, and so on (see the sketch below).
4. We have a test data set (1 000 000 keys).
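
For reference, the token layout above can be reproduced with a small Python sketch equivalent to what tokentoolv2.py computes for 6 nodes (the +100 offset for DC2 is applied by hand):
{noformat}
# Evenly spaced RandomPartitioner tokens for 6 nodes (2**127 token space),
# matching the output of tokentoolv2.py for DC1.
dc1_tokens = [i * (2 ** 127) // 6 for i in range(6)]

# DC2 reuses the same positions shifted by +100 so tokens don't collide.
dc2_tokens = [t + 100 for t in dc1_tokens]

print(dc1_tokens)  # [0, 28356863910078205288614550619314017621, ...]
print(dc2_tokens)  # [100, 28356863910078205288614550619314017721, ...]
{noformat}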

*Steps to reproduce:*

*Step 1:*
Let's check the current configuration.

nodetool ring:                                                                       
{quote}
{noformat}
    <ip>     DC1         RAC1        Up     Normal  44,53 KB        33,33%  0
    <ip>     DC1         RAC1        Up     Normal  51,8 KB         33,33%  28356863910078205288614550619314017621
    <ip>     DC1         RAC1        Up     Normal  21,82 KB        33,33%  56713727820156410577229101238628035242
    <ip>     DC1         RAC1        Up     Normal  21,82 KB        33,33%  85070591730234615865843651857942052864
    <ip>     DC1         RAC1        Up     Normal  51,8 KB         33,33%  113427455640312821154458202477256070485
    <ip>     DC1         RAC1        Up     Normal  21,82 KB        33,33%  141784319550391026443072753096570088106
{noformat}   
{quote}

*Current schema:*
{quote}
    create keyspace benchmarks
      with placement_strategy = 'NetworkTopologyStrategy'
      *and strategy_options = \{DC1 : 2};*
    use benchmarks;
    create column family test_family
      with compaction_strategy = 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
      ...
      and compaction_strategy_options = \{'sstable_size_in_mb' : '20'}
      and compression_options = \{'chunk_length_kb' : '32', 'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};
{quote}

*STEP 2:*
Write the first part of the test data set (500 000 keys) to DC1 at the LOCAL_QUORUM consistency level.
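
For illustration, a minimal pycassa sketch of such a writer; the host name, key format and column layout are placeholders, not what our actual load generator uses:
{noformat}
from pycassa.pool import ConnectionPool
from pycassa.columnfamily import ColumnFamily
from pycassa.cassandra.ttypes import ConsistencyLevel

# Point the pool at DC1 coordinators only (placeholder host).
pool = ConnectionPool('benchmarks', server_list=['dc1-node1:9160'])
cf = ColumnFamily(pool, 'test_family')

# Write the first half of the data set at LOCAL_QUORUM.
for i in range(500000):
    cf.insert('key%d' % i, {'value': 'payload%d' % i},
              write_consistency_level=ConsistencyLevel.LOCAL_QUORUM)
{noformat}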

*STEP 3:*

Update cassandra.yaml and cassandra-topology.properties with the new IPs from DC2, and update
the current keyspace schema with *strategy_options = \{DC1 : 2, DC2 : 0};*
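
For clarity, cassandra-topology.properties maps each node's IP to its DC and rack in PropertyFileSnitch format; the addresses below are placeholders for our real ones:
{noformat}
# cassandra-topology.properties: <node IP>=<datacenter>:<rack>
10.0.1.1=DC1:RAC1
10.0.1.2=DC1:RAC1
...
10.0.2.1=DC2:RAC2
10.0.2.2=DC2:RAC2
...
# fallback for unlisted nodes
default=DC1:RAC1
{noformat}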

*STEP 4:*

Start all nodes from DC2.

Check that the nodes have started successfully:
{quote}
{noformat}
    <ip>     DC1         RAC1        Up     Normal  11,4 MB         33,33%  0
    <ip>     DC2         RAC2        Up     Normal  27,7 KB         0,00%   100
    <ip>     DC1         RAC1        Up     Normal  11,34 MB        33,33%  28356863910078205288614550619314017621
    <ip>     DC2         RAC2        Up     Normal  42,69 KB        0,00%   28356863910078205288614550619314017721
    <ip>     DC1         RAC1        Up     Normal  11,37 MB        33,33%  56713727820156410577229101238628035242
    <ip>     DC2         RAC2        Up     Normal  52,02 KB        0,00%   56713727820156410577229101238628035342
    <ip>     DC1         RAC1        Up     Normal  11,4 MB         33,33%  85070591730234615865843651857942052864
    <ip>     DC2         RAC2        Up     Normal  42,69 KB        0,00%   85070591730234615865843651857942052964
    <ip>     DC1         RAC1        Up     Normal  11,43 MB        33,33%  113427455640312821154458202477256070485
    <ip>     DC2         RAC2        Up     Normal  42,69 KB        0,00%   113427455640312821154458202477256070585
    <ip>     DC1         RAC1        Up     Normal  11,39 MB        33,33%  141784319550391026443072753096570088106
    <ip>     DC2         RAC2        Up     Normal  42,69 KB        0,00%   141784319550391026443072753096570088206
{noformat}
{quote}

*STEP 5:*

Update the keyspace schema with *strategy_options = \{DC1 : 2, DC2 : 2};*
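
The schema change itself can be made from cassandra-cli with an update keyspace statement, or programmatically; here is a sketch using pycassa's SystemManager, with a placeholder host:
{noformat}
from pycassa.system_manager import SystemManager

sm = SystemManager('dc1-node1:9160')  # placeholder host
# Raise the DC2 replication factor from 0 to 2.
sm.alter_keyspace('benchmarks',
                  strategy_options={'DC1': '2', 'DC2': '2'})
sm.close()
{noformat}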

*STEP 6:*

Write the last 500 000 keys of the test data set to DC1 at the *LOCAL_QUORUM* consistency level.


*STEP 7:*

Check that the first part of the test data set (first 500 000 keys) was written correctly to DC1.
Check that the last part of the test data set (last 500 000 keys) was written correctly to both
datacenters.
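
The check is essentially the following pycassa sketch, reading every key back at LOCAL_QUORUM through a coordinator in the datacenter being checked and counting misses (hosts and key format are placeholders):
{noformat}
from pycassa.pool import ConnectionPool
from pycassa.columnfamily import ColumnFamily
from pycassa.cassandra.ttypes import ConsistencyLevel, NotFoundException

def count_missing(host, first, last):
    # LOCAL_QUORUM reads through a coordinator in the target DC
    # are answered by that DC's replicas only.
    pool = ConnectionPool('benchmarks', server_list=[host])
    cf = ColumnFamily(pool, 'test_family')
    missing = 0
    for i in range(first, last):
        try:
            cf.get('key%d' % i,
                   read_consistency_level=ConsistencyLevel.LOCAL_QUORUM)
        except NotFoundException:
            missing += 1
    return missing

print(count_missing('dc1-node1:9160', 0, 500000))        # first half on DC1
print(count_missing('dc2-node1:9160', 500000, 1000000))  # second half on DC2
{noformat}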

*STEP 8:*

Run *nodetool repair* on each node of DC2 and wait until it completes.
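
The repair pass amounts to a loop like this one (DC2 host names are placeholders):
{noformat}
import subprocess

# Run a repair of the benchmarks keyspace on every DC2 node, one at a time.
for host in ['dc2-node1', 'dc2-node2', 'dc2-node3',
             'dc2-node4', 'dc2-node5', 'dc2-node6']:
    subprocess.check_call(['nodetool', '-h', host, 'repair', 'benchmarks'])
{noformat}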

*STEP 9:*

Current nodetool ring:
{quote}
{noformat}
    <ip>     DC1         RAC1        Up     Normal  21,45 MB        33,33%  0
    <ip>     DC2         RAC2        Up     Normal  23,5 MB         33,33%  100
    <ip>     DC1         RAC1        Up     Normal  20,67 MB        33,33%  28356863910078205288614550619314017621
    <ip>     DC2         RAC2        Up     Normal  23,55 MB        33,33%  28356863910078205288614550619314017721
    <ip>     DC1         RAC1        Up     Normal  21,18 MB        33,33%  56713727820156410577229101238628035242
    <ip>     DC2         RAC2        Up     Normal  23,5 MB         33,33%  56713727820156410577229101238628035342
    <ip>     DC1         RAC1        Up     Normal  23,5 MB         33,33%  85070591730234615865843651857942052864
    <ip>     DC2         RAC2        Up     Normal  23,55 MB        33,33%  85070591730234615865843651857942052964
    <ip>     DC1         RAC1        Up     Normal  21,44 MB        33,33%  113427455640312821154458202477256070485
    <ip>     DC2         RAC2        Up     Normal  23,46 MB        33,33%  113427455640312821154458202477256070585
    <ip>     DC1         RAC1        Up     Normal  20,53 MB        33,33%  141784319550391026443072753096570088106
    <ip>     DC2         RAC2        Up     Normal  23,55 MB        33,33%  141784319550391026443072753096570088206
{noformat}  
{quote}

Check that the full test data set has been written to both datacenters.
Result:
The full test data set was successfully written to DC1, but *24448* keys are missing from DC2.

Repeating *nodetool repair* doesn’t help. 

Conclusion:
The problem seems to be related to how the repair process identifies the keys that must be
repaired when the target datacenter already holds some keys.

If we start the empty DC2 nodes only after DC1 has received all 1 000 000 keys, *nodetool repair*
works fine, with no missing keys.

> Sometimes repair process doesn't work properly
> ----------------------------------------------
>
>                 Key: CASSANDRA-5178
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5178
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.1.7
>            Reporter: Vladimir Barinov

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
