Return-Path:
X-Original-To: apmail-cassandra-commits-archive@www.apache.org
Delivered-To: apmail-cassandra-commits-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id E679917C8A for ; Fri, 8 May 2015 16:49:04 +0000 (UTC)
Received: (qmail 57517 invoked by uid 500); 8 May 2015 16:49:04 -0000
Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org
Received: (qmail 57477 invoked by uid 500); 8 May 2015 16:49:04 -0000
Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@cassandra.apache.org
Delivered-To: mailing list commits@cassandra.apache.org
Received: (qmail 57465 invoked by uid 99); 8 May 2015 16:49:04 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 08 May 2015 16:49:04 +0000
Date: Fri, 8 May 2015 16:49:04 +0000 (UTC)
From: "Kenneth Failbus (JIRA)"
To: commits@cassandra.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Comment Edited] (CASSANDRA-7317) Repair range validation is too strict
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/CASSANDRA-7317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14534858#comment-14534858 ]

Kenneth Failbus edited comment on CASSANDRA-7317 at 5/8/15 4:48 PM:
--------------------------------------------------------------------

Folks, I am seeing this error again in the 2.0.9 release. Vnodes are enabled in my cluster.
{code}
2015-05-08 15:01:56,021 [AntiEntropyStage:1] INFO Validator [repair #254edb00-f593-11e4-9397-51babce9f892] Sending completed merkle tree to /10.22.168.35 for CF1/Sequence
2015-05-08 15:01:58,518 [AntiEntropyStage:1] INFO Validator [repair #e3ca16e0-f592-11e4-bce3-6f1b5fa480b1] Sending completed merkle tree to /10.22.168.105 for system_auth/permissions
2015-05-08 15:01:58,791 [AntiEntropyStage:1] INFO Validator [repair #e3ca16e0-f592-11e4-bce3-6f1b5fa480b1] Sending completed merkle tree to /10.22.168.105 for system_auth/credentials
2015-05-08 15:01:58,980 [AntiEntropyStage:1] INFO Validator [repair #e3ca16e0-f592-11e4-bce3-6f1b5fa480b1] Sending completed merkle tree to /10.22.168.105 for system_auth/users
2015-05-08 15:02:00,640 [AntiEntropyStage:1] INFO Validator [repair #e0d31e00-f592-11e4-993e-c9cc22925782] Sending completed merkle tree to /10.22.168.97 for system_auth/credentials
2015-05-08 15:02:01,345 [AntiEntropyStage:1] INFO Validator [repair #e0d31e00-f592-11e4-993e-c9cc22925782] Sending completed merkle tree to /10.22.168.97 for system_auth/users
2015-05-08 15:02:01,577 [AntiEntropyStage:1] INFO Validator [repair #e0d31e00-f592-11e4-993e-c9cc22925782] Sending completed merkle tree to /10.22.168.97 for system_auth/permissions
2015-05-08 15:02:01,753 [AntiEntropyStage:1] INFO Validator [repair #27dba060-f593-11e4-873b-9d346bbba08e] Sending completed merkle tree to /10.22.168.87 for CF1/Sequence
2015-05-08 15:02:02,622 [AntiEntropyStage:1] INFO Validator [repair #dba213a0-f592-11e4-b745-192986bd7af2] Sending completed merkle tree to /10.22.168.117 for system_auth/credentials
2015-05-08 15:02:02,873 [AntiEntropyStage:1] INFO Validator [repair #dba213a0-f592-11e4-b745-192986bd7af2] Sending completed merkle tree to /10.22.168.117 for system_auth/users
2015-05-08 15:02:03,508 [AntiEntropyStage:1] INFO Validator [repair #dba213a0-f592-11e4-b745-192986bd7af2] Sending completed merkle tree to /10.22.168.117 for system_auth/permissions
2015-05-08 15:02:03,988 [AntiEntropyStage:1] INFO Validator [repair #d0a2ad70-f592-11e4-a5a2-b73fe73dbe79] Sending completed merkle tree to /10.22.168.109 for system_auth/credentials
2015-05-08 15:02:04,759 [AntiEntropyStage:1] INFO Validator [repair #d0a2ad70-f592-11e4-a5a2-b73fe73dbe79] Sending completed merkle tree to /10.22.168.109 for system_auth/users
2015-05-08 15:02:05,066 [AntiEntropyStage:1] INFO Validator [repair #d0a2ad70-f592-11e4-a5a2-b73fe73dbe79] Sending completed merkle tree to /10.22.168.109 for system_auth/permissions
2015-05-08 15:02:05,200 [Thread-227856] ERROR StorageService Repair session failed:
java.lang.IllegalArgumentException: Requested range intersects a local range but is not fully contained in one; this would lead to imprecise repair
    at org.apache.cassandra.service.ActiveRepairService.getNeighbors(ActiveRepairService.java:161)
    at org.apache.cassandra.repair.RepairSession.<init>(RepairSession.java:130)
    at org.apache.cassandra.repair.RepairSession.<init>(RepairSession.java:119)
    at org.apache.cassandra.service.ActiveRepairService.submitRepairSession(ActiveRepairService.java:97)
    at org.apache.cassandra.service.StorageService.forceKeyspaceRepair(StorageService.java:2628)
    at org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2564)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.lang.Thread.run(Thread.java:744)
{code}


was (Author: kenfailbus):
Folks, I am seeing this error again in the 2.0.9 release. Vnodes are enabled in my cluster.

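
For anyone reading the stack trace, here is a minimal Python sketch of the kind of check that produces this exception. It is not the Cassandra source. The function names, the `(left, right]` tuple representation, and the example `local_ranges` are illustrative assumptions; the ranges are derived from the four-token ring in the quoted issue below, assuming that with two replicas in each two-node datacenter every node holds every range.

```python
# Simplified model of a repair-range check; NOT the actual Cassandra code.
MIN = -2**63       # Murmur3 minimum token
MAX = 2**63 - 1    # Murmur3 maximum token


def segments(rng):
    """Unwrap a ring range (left, right] into half-open integer intervals [a, b)."""
    left, right = rng
    if left == right:            # treated here as the full ring
        return [(MIN, MAX + 1)]
    if left < right:             # ordinary, non-wrapping range
        return [(left + 1, right + 1)]
    # Wrap-around range: (left, MAX] plus [MIN, right]
    parts = [(left + 1, MAX + 1), (MIN, right + 1)]
    return [(a, b) for a, b in parts if a < b]


def contains(outer, inner):
    """True if every piece of `inner` lies inside one piece of `outer`."""
    return all(any(oa <= a and b <= ob for oa, ob in segments(outer))
               for a, b in segments(inner))


def intersects(x, y):
    """True if the two ranges share at least one token."""
    return any(max(a1, a2) < min(b1, b2)
               for a1, b1 in segments(x) for a2, b2 in segments(y))


def find_enclosing_local_range(local_ranges, requested):
    """Return the local range that fully contains `requested`, or reject it."""
    for local in local_ranges:
        if contains(local, requested):
            return local
        if intersects(local, requested):
            raise ValueError("Requested range intersects a local range "
                             "but is not fully contained in one")
    return None


# Hypothetical local ranges for 127.0.0.1 (assumes it replicates every range).
local_ranges = [(0, MIN), (MIN, MIN + 10), (MIN + 10, -10), (-10, 0)]

# The -pr range is itself a local range, so it is accepted.
print(find_enclosing_local_range(local_ranges, (0, MIN)))

# The manual -st -10 -et MIN range spans two local ranges, so it is rejected.
try:
    find_enclosing_local_range(local_ranges, (-10, MIN))
except ValueError as err:
    print("rejected:", err)
```

Under this model, a requested range that covers two adjacent local ranges is rejected even though the node replicates all of it. With vnodes a node holds many small ranges, so a hand-picked `-st`/`-et` span may well straddle a boundary in the same way.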
> Repair range validation is too strict
> -------------------------------------
>
>                 Key: CASSANDRA-7317
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7317
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Nick Bailey
>            Assignee: Yuki Morishita
>             Fix For: 2.0.9
>
>         Attachments: 7317-2.0.txt, Untitled Diagram(1).png
>
>
> From what I can tell, the calculation (using the -pr option) and validation of tokens for repairing ranges is broken, or at least should be improved. Using an example with ccm:
> Nodetool ring:
> {noformat}
> Datacenter: dc1
> ==========
> Address    Rack  Status  State   Load       Owns    Token
>                                                     -10
> 127.0.0.1  r1    Up      Normal  188.96 KB  50.00%  -9223372036854775808
> 127.0.0.2  r1    Up      Normal  194.77 KB  50.00%  -10
> Datacenter: dc2
> ==========
> Address    Rack  Status  State   Load       Owns    Token
>                                                     0
> 127.0.0.4  r1    Up      Normal  160.58 KB  0.00%   -9223372036854775798
> 127.0.0.3  r1    Up      Normal  139.46 KB  0.00%   0
> {noformat}
> Schema:
> {noformat}
> CREATE KEYSPACE system_traces WITH replication = {
>   'class': 'NetworkTopologyStrategy',
>   'dc2': '2',
>   'dc1': '2'
> };
> {noformat}
> Repair -pr:
> {noformat}
> [Nicks-MacBook-Pro:21:35:58 cassandra-2.0] cassandra$ bin/nodetool -p 7100 repair -pr system_traces
> [2014-05-28 21:36:01,977] Starting repair command #12, repairing 1 ranges for keyspace system_traces
> [2014-05-28 21:36:02,207] Repair session f984d290-e6d9-11e3-9edc-5f8011daec21 for range (0,-9223372036854775808] finished
> [2014-05-28 21:36:02,207] Repair command #12 finished
> [Nicks-MacBook-Pro:21:36:02 cassandra-2.0] cassandra$ bin/nodetool -p 7200 repair -pr system_traces
> [2014-05-28 21:36:14,086] Starting repair command #1, repairing 1 ranges for keyspace system_traces
> [2014-05-28 21:36:14,406] Repair session 00bd45b0-e6da-11e3-98fc-5f8011daec21 for range (-9223372036854775798,-10] finished
> [2014-05-28 21:36:14,406] Repair command #1 finished
> {noformat}
> Note that repairing both nodes in dc1 leaves very small ranges unrepaired, for example (-10,0]. Repairing the 'primary range' in dc2 will repair those small ranges. Maybe that is the behavior we want, but it seems counterintuitive.
> The behavior when manually trying to repair the full range of 127.0.0.1 definitely needs improvement though.
> Repair command:
> {noformat}
> [Nicks-MacBook-Pro:21:50:44 cassandra-2.0] cassandra$ bin/nodetool -p 7100 repair -st -10 -et -9223372036854775808 system_traces
> [2014-05-28 21:50:55,803] Starting repair command #17, repairing 1 ranges for keyspace system_traces
> [2014-05-28 21:50:55,804] Starting repair command #17, repairing 1 ranges for keyspace system_traces
> [2014-05-28 21:50:55,804] Repair command #17 finished
> [Nicks-MacBook-Pro:21:50:56 cassandra-2.0] cassandra$ echo $?
> 1
> {noformat}
> system.log:
> {noformat}
> ERROR [Thread-96] 2014-05-28 21:40:05,921 StorageService.java (line 2621) Repair session failed:
> java.lang.IllegalArgumentException: Requested range intersects a local range but is not fully contained in one; this would lead to imprecise repair
> {noformat}
> * The actual output of the repair command doesn't really indicate that there was an issue, although the command does return a non-zero exit status.
> * The error here is invisible if you are using the synchronous JMX repair API. It will appear as though the repair completed successfully.
> * Personally, I believe that should be a valid repair command. For the system_traces keyspace, 127.0.0.1 is responsible for this range (and, I would argue, it is the node's 'primary range').



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
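
The gap described in the issue can be reproduced with a short Python sketch. It assumes that `-pr` takes a node's primary range to be `(previous token on the whole ring, own token]` regardless of datacenter, which is consistent with the two ranges printed in the repair output above. The `ring` mapping and the helper name `primary_ranges` are illustrative, not Cassandra APIs.

```python
MIN = -2**63  # Murmur3 minimum token, -9223372036854775808

# token -> node, taken from the `nodetool ring` output in the issue
ring = {
    MIN: "127.0.0.1",       # dc1
    -10: "127.0.0.2",       # dc1
    MIN + 10: "127.0.0.4",  # dc2 (-9223372036854775798)
    0: "127.0.0.3",         # dc2
}


def primary_ranges(ring):
    """Map each node to (previous token, own token] over the whole ring."""
    tokens = sorted(ring)
    # tokens[i - 1] wraps around to the last token when i == 0
    return {ring[tok]: (tokens[i - 1], tok) for i, tok in enumerate(tokens)}


ranges = primary_ranges(ring)
dc1_nodes = {"127.0.0.1", "127.0.0.2"}

# Ranges covered by running `repair -pr` on the dc1 nodes only
repaired = {ranges[node] for node in dc1_nodes}
unrepaired = set(ranges.values()) - repaired

for node in sorted(ranges):
    print(node, ranges[node])
print("left unrepaired by dc1 -pr:", sorted(unrepaired))
```

In this model the two dc1 nodes cover `(0, MIN]` and `(MIN + 10, -10]`, so `(-10, 0]` and `(MIN, MIN + 10]` are only repaired when `-pr` is also run on the dc2 nodes.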