cassandra-commits mailing list archives

From "David Arena (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-2798) Repair Fails 0.8
Date Wed, 22 Jun 2011 11:14:47 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053185#comment-13053185 ]

David Arena commented on CASSANDRA-2798:
----------------------------------------

So, after a restart and a compaction, this is what I'm looking at. (It still doesn't seem
absolutely correct, but yes, you are right about the compaction problems.)

Address     Status  State   Load     Owns    Token
10.0.1.150  Up      Normal  2.61 GB  33.33%  0
10.0.1.152  Up      Normal  2.61 GB  33.33%  56713727820156410577229101238628035242
10.0.1.154  Up      Normal  3.16 GB  33.33%  113427455640312821154458202477256070485

Node1 and Node2 are now back to normal,
but Node3 did not return to 2.61 GB.
I've tried compact, flush, cleanup, etc., but it won't get any smaller. :(
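For reference, this is roughly what I have been running against node3 (assuming 10.0.1.154 from the
ring output above; the exact order has varied between attempts):

    nodetool -h 10.0.1.154 flush
    nodetool -h 10.0.1.154 compact
    nodetool -h 10.0.1.154 cleanup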

I still don't understand why a repair on node3 balloons the data on node1 and node2 in 0.8.
This should not happen, as far as I can tell.
My understanding is that node3 should copy the data from its replicas on the other nodes (hence
why we see roughly 2x the data size) and then run a compaction to aggregate it back down to a
proper replica for the cluster.
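Rough arithmetic, assuming my understanding above is right: with a replication_factor of 3 on a
3-node cluster, every node owns a full copy of the data, so ~2.61 GB each. A wiped node3 streams a
copy from each of the two remaining replicas (2 x 2.61 GB ~ 5.2 GB on disk temporarily), and
compaction should then merge the duplicates back down to ~2.61 GB. Node1 and node2 should stay at
~2.61 GB throughout.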

Node1 and Node2 really shouldn't be changing at all, should they?


> Repair Fails 0.8
> ----------------
>
>                 Key: CASSANDRA-2798
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2798
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.8.0
>            Reporter: David Arena
>            Assignee: Sylvain Lebresne
>
> I am seeing a fatal problem in the new 0.8.
> I'm running a 3-node cluster with a replication_factor of 3.
> On Node 3.. If i 
> # kill -9 cassandra-pid
> # rm -rf "All data & logs"
> # start cassandra
> # nodetool -h "node-3-ip" repair
> The whole cluster becomes duplicated.
> * e.g. Before
> node 1 -> 2.65GB
> node 2 -> 2.65GB
> node 3 -> 2.65GB
> * e.g. After
> node 1 -> 5.3GB
> node 2 -> 5.3GB
> node 3 -> 7.95GB
> -> nodetool repair never ends (96+ hours), even though there are no streams running,
> nor any CPU or disk activity.
> -> Manually killing the repair and restarting does not help. Restarting the server/Cassandra
> does not help.
> -> nodetool flush, compact, and cleanup all complete, but do not help.
> This is not occurring in 0.7.6. I have come to the conclusion that this is a major 0.8 issue.
> Running: CentOS 5.6, JDK 1.6.0_26
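For anyone reproducing the steps quoted above: on a default package install, the "rm -rf" step would
translate to roughly the commands below. The data and log paths are only illustrative and depend on
cassandra.yaml and the logging configuration.

    kill -9 <cassandra-pid>
    rm -rf /var/lib/cassandra/data /var/lib/cassandra/commitlog /var/lib/cassandra/saved_caches /var/log/cassandra
    <start cassandra>
    nodetool -h <node-3-ip> repair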

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
