cassandra-commits mailing list archives

From "Mikhail Stepura (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7317) Repair range validation and calculation is off
Date Wed, 04 Jun 2014 02:44:02 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14017327#comment-14017327 ]

Mikhail Stepura commented on CASSANDRA-7317:
--------------------------------------------

The current behavior, when {{-pr}} is specified, is to treat a multi-DC setup as a single ring.
This is because {{TokenMetadata.getPredecessor(Token)}} doesn't take a token's DC into account;
it simply searches for a predecessor across all tokens from all DCs.

I'm not sure whether that's expected.
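
To make the effect concrete, here is a minimal Python sketch (a toy model, not Cassandra code) built from the four tokens in the ring output of the ticket. The names RING, predecessor and primary_range, and the dc_aware flag, are hypothetical and only illustrate the two ways a predecessor could be looked up.

```python
from bisect import bisect_left

# Token -> (node, datacenter), copied from the nodetool ring output in the ticket.
RING = {
    -9223372036854775808: ("127.0.0.1", "dc1"),
    -10: ("127.0.0.2", "dc1"),
    -9223372036854775798: ("127.0.0.4", "dc2"),
    0: ("127.0.0.3", "dc2"),
}


def predecessor(token, tokens):
    """Return the token that precedes `token` in `tokens`, wrapping around the ring."""
    ordered = sorted(tokens)
    index = bisect_left(ordered, token)
    return ordered[index - 1]  # index 0 wraps to the largest token


def primary_range(token, dc_aware=False):
    """Return (left, right] as a tuple; `dc_aware` is a hypothetical alternative lookup."""
    if dc_aware:
        dc = RING[token][1]
        candidates = [t for t, (_, d) in RING.items() if d == dc]
    else:
        candidates = list(RING)  # single ring: all tokens from all DCs
    return (predecessor(token, candidates), token)


for token, (node, dc) in sorted(RING.items()):
    print(node, dc, primary_range(token), primary_range(token, dc_aware=True))
```

With the single-ring lookup, the two dc1 nodes get (0,-9223372036854775808] and (-9223372036854775798,-10], which matches the ranges in the repair output quoted below. With the hypothetical per-DC lookup, 127.0.0.1 would instead get (-10,-9223372036854775808].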



> Repair range validation and calculation is off
> ----------------------------------------------
>
>                 Key: CASSANDRA-7317
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7317
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Nick Bailey
>            Assignee: Yuki Morishita
>             Fix For: 2.0.9
>
>         Attachments: Untitled Diagram(1).png
>
>
> From what I can tell, the calculation (using the -pr option) and validation of tokens for repairing ranges is broken, or at least should be improved. Using an example with ccm:
> Nodetool ring:
> {noformat}
> Datacenter: dc1
> ==========
> Address    Rack        Status State   Load            Owns                Token
>                                                                           -10
> 127.0.0.1  r1          Up     Normal  188.96 KB       50.00%              -9223372036854775808
> 127.0.0.2  r1          Up     Normal  194.77 KB       50.00%              -10
> Datacenter: dc2
> ==========
> Address    Rack        Status State   Load            Owns                Token
>                                                                           0
> 127.0.0.4  r1          Up     Normal  160.58 KB       0.00%               -9223372036854775798
> 127.0.0.3  r1          Up     Normal  139.46 KB       0.00%               0
> {noformat}
> Schema:
> {noformat}
> CREATE KEYSPACE system_traces WITH replication = {
>   'class': 'NetworkTopologyStrategy',
>   'dc2': '2',
>   'dc1': '2'
> };
> {noformat}
> Repair -pr:
> {noformat}
> [Nicks-MacBook-Pro:21:35:58 cassandra-2.0] cassandra$ bin/nodetool -p 7100 repair -pr system_traces
> [2014-05-28 21:36:01,977] Starting repair command #12, repairing 1 ranges for keyspace system_traces
> [2014-05-28 21:36:02,207] Repair session f984d290-e6d9-11e3-9edc-5f8011daec21 for range (0,-9223372036854775808] finished
> [2014-05-28 21:36:02,207] Repair command #12 finished
> [Nicks-MacBook-Pro:21:36:02 cassandra-2.0] cassandra$ bin/nodetool -p 7200 repair -pr system_traces
> [2014-05-28 21:36:14,086] Starting repair command #1, repairing 1 ranges for keyspace system_traces
> [2014-05-28 21:36:14,406] Repair session 00bd45b0-e6da-11e3-98fc-5f8011daec21 for range (-9223372036854775798,-10] finished
> [2014-05-28 21:36:14,406] Repair command #1 finished
> {noformat}
> Note that repairing both nodes in dc1 leaves very small ranges unrepaired, for example (-10,0]. Repairing the 'primary range' in dc2 will repair those small ranges. Maybe that is the behavior we want, but it seems counterintuitive.
> The behavior when manually trying to repair the full range of 127.0.0.1 definitely needs improvement though.
> Repair command:
> {noformat}
> [Nicks-MacBook-Pro:21:50:44 cassandra-2.0] cassandra$ bin/nodetool -p 7100 repair -st -10 -et -9223372036854775808 system_traces
> [2014-05-28 21:50:55,803] Starting repair command #17, repairing 1 ranges for keyspace system_traces
> [2014-05-28 21:50:55,804] Starting repair command #17, repairing 1 ranges for keyspace system_traces
> [2014-05-28 21:50:55,804] Repair command #17 finished
> [Nicks-MacBook-Pro:21:50:56 cassandra-2.0] cassandra$ echo $?
> 1
> {noformat}
> system.log:
> {noformat}
> ERROR [Thread-96] 2014-05-28 21:40:05,921 StorageService.java (line 2621) Repair session failed:
> java.lang.IllegalArgumentException: Requested range intersects a local range but is not fully contained in one; this would lead to imprecise repair
> {noformat}
> * The actual output of the repair command doesn't really indicate that there was an issue, although the command does return a nonzero exit status.
> * The error here is invisible if you are using the synchronous JMX repair API. It will appear as though the repair completed successfully.
> * Personally, I believe that should be a valid repair command. For the system_traces keyspace, 127.0.0.1 is responsible for this range (and I would argue it is the 'primary range' of the node).
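
As a rough illustration of why the manual range is rejected, the Python sketch below (a toy model, not the actual validation code) splits the ring at every token from both DCs and checks a requested range against the resulting segments. Treating each segment as a "local range" of 127.0.0.1 is an assumption based on the replication settings above, and all helper names are hypothetical.

```python
RING_SIZE = 2**64
MIN_TOKEN = -2**63

# All four tokens from the nodetool ring output, across both DCs.
TOKENS = sorted([MIN_TOKEN, -10, -9223372036854775798, 0])

# Consecutive token pairs (left, right]; index -1 wraps the first token to the last.
SEGMENTS = [(TOKENS[i - 1], TOKENS[i]) for i in range(len(TOKENS))]


def length(rng):
    """Number of tokens in the half-open ring range (left, right]."""
    left, right = rng
    return (right - left) % RING_SIZE or RING_SIZE  # left == right means full ring


def overlaps(a, b):
    """True if ring ranges a and b share at least one token."""
    return ((b[0] - a[0]) % RING_SIZE < length(a)
            or (a[0] - b[0]) % RING_SIZE < length(b))


def contains(outer, inner):
    """True if `inner` lies entirely within `outer`."""
    offset = (inner[0] - outer[0]) % RING_SIZE
    return offset + length(inner) <= length(outer)


def check(requested):
    """Return (segments the request overlaps, whether a single segment contains it)."""
    hit = [seg for seg in SEGMENTS if overlaps(seg, requested)]
    return hit, any(contains(seg, requested) for seg in hit)


print(check((-10, MIN_TOKEN)))  # the manual -st/-et request from the ticket
print(check((0, MIN_TOKEN)))    # the range that repair -pr chose on 127.0.0.1
```

In this model the request (-10,-9223372036854775808] overlaps both (-10,0] and (0,-9223372036854775808] and is contained in neither, which is consistent with the wording of the IllegalArgumentException in system.log.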



--
This message was sent by Atlassian JIRA
(v6.2#6252)
