cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-9022) Node Cleanup deletes all its data after a new node joined the cluster
Date Mon, 23 Mar 2015 21:29:53 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benedict updated CASSANDRA-9022:
--------------------------------
    Attachment: 9022.txt

OK, so I think this is a very simple bug, one that CASSANDRA-8946 would have fixed had we
rolled it out to both constructors. Since only the other constructor is covered by extensive
unit tests, I'm attaching a patch that shares the behaviour between the two. The likely
explanation is simple: the normalize call yields a minimum() bound for the RHS of a range,
which would cause nothing to be returned for that interval. When I wrote it I didn't realise
that the minimum() bound was used to represent the RHS max (which is actually an unnecessary
complication for Range, since its RHS is inclusive, so we could consider changing that to
stop anyone else making the same mistake).
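
To illustrate the failure mode described above, here is a minimal sketch. It is not the Cassandra code and not the attached patch: tokens are modelled as plain longs, and the class, constant, and method names are hypothetical. It assumes only what is stated above, namely that a minimum() bound on the RHS of a range is meant to stand for the maximum, and shows that reading that bound literally makes the interval match nothing.

```java
// Hypothetical sketch: tokens are plain longs and MINIMUM stands in for the
// partitioner's minimum token. None of these names come from Cassandra itself.
final class RangeBoundSketch {
    static final long MINIMUM = Long.MIN_VALUE;

    // Naive check for the interval (left, right]: takes the right bound literally.
    // With right == MINIMUM no token can satisfy token <= right && token > left.
    static boolean naiveContains(long left, long right, long token) {
        return token > left && token <= right;
    }

    // Check that reads a MINIMUM right bound as "up to the end of the ring".
    // Other wrap-around cases (left > right) are deliberately left out of this sketch.
    static boolean ringAwareContains(long left, long right, long token) {
        if (right == MINIMUM) {
            return token > left;
        }
        return token > left && token <= right;
    }

    public static void main(String[] args) {
        long left = 100L;
        long right = MINIMUM; // the kind of bound normalize is said to yield
        long token = 5000L;
        System.out.println("naive:      " + naiveContains(left, right, token));
        System.out.println("ring-aware: " + ringAwareContains(left, right, token));
    }
}
```

With the naive check, every token falls outside the interval, which is consistent with cleanup treating all of a node's data as not owned.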

[~aboudreault]: could you try out with this patch?

> Node Cleanup deletes all its data after a new node joined the cluster
> ---------------------------------------------------------------------
>
>                 Key: CASSANDRA-9022
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9022
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Alan Boudreault
>            Assignee: Benedict
>            Priority: Critical
>             Fix For: 2.1.4
>
>         Attachments: 9022.txt, bisect.sh, results_cassandra_2.1.3.txt, results_cassandra_2.1_branch.txt
>
>
> I tried to add a node to my cluster, and running cleanup deleted all my data on a node.
> This leaves the cluster totally broken, since all subsequent reads seem unable to validate
> the data. Even a repair on the problematic node doesn't fix the issue. I've attached the
> bisect script used and the output of the procedure.
> Procedure to reproduce:
> {code}
> ccm stop && ccm remove
> ccm create -n 2 --install-dir=path/to/cassandra-2.1/branch demo
> ccm start
> ccm node1 stress -- write n=1000000 -schema replication\(factor=2\) -rate threads=50
> ccm node1 nodetool status
> ccm add -i 127.0.0.3 -j 7400 node3 # no auto-bootstrap
> ccm node3 start
> ccm node1 nodetool status
> ccm node3 repair
> ccm node3 nodetool status
> ccm node1 nodetool cleanup
> ccm node2 nodetool cleanup
> ccm node3 nodetool cleanup
> ccm node1 nodetool status
> ccm node1 repair
> ccm node1 stress -- read n=1000000 ## CRASH Data returned was not validated ?!?
> {code}
> bisect script output:
> {code}
> $ git bisect start cassandra-2.1 cassandra-2.1.3
> $ git bisect run ~/dev/cstar/cleanup_issue/bisect.sh
> ...
> 4b05b204acfa60ecad5672c7e6068eb47b21397a is the first bad commit
> commit 4b05b204acfa60ecad5672c7e6068eb47b21397a
> Author: Benedict Elliott Smith <benedict@apache.org>
> Date:   Wed Feb 11 15:49:43 2015 +0000
>     Enforce SSTableReader.first/last
>     
>     patch by benedict; reviewed by yukim for CASSANDRA-8744
> :100644 100644 3f0463731e624cbe273dcb3951b2055fa5d9e1a2 b2f894eb22b9102d410f1eabeb3e11d26727fbd3 M      CHANGES.txt
> :040000 040000 51ac2a6cd39bd2377c2e1ed6693ef789ab65a26c 79fa2501f4155a64dca2bbdcc9e578008e4e425a M      src
> bisect run success
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
