incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Repair does not fix inconsistency
Date Thu, 04 Apr 2013 01:35:41 GMT
What version are you on?

Can you run a repair on the CF and check:

Does the repair detect differences in the CF and stream changes?
After the streaming, does it run a secondary index rebuild on the new sstable? (It should
be in the logs.)

Can you provide the full query trace?

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 3/04/2013, at 5:25 PM, Michal Michalski <michalm@opera.com> wrote:

> Hi,
> 
> TL;DR: I have inconsistent data (1 live row on node A & 1 tombstoned row on node B)
> that does not get fixed by repair. What could be the problem?
> 
> Long version:
> 
> I have a CF containing Users' info, which I sometimes query by key, and sometimes by
> indexed columns like email. I'm using RF=2. I write with CL.ONE, but this CF is very rarely
> updated, so C* has a lot of time to fix any inconsistencies that may occur, so I'm fine with
> this (at least in theory ;-) ).
> 
> To be clear:
> - I've run a successful cluster-wide repair on this CF before testing, so I do not expect
>   any inconsistency
> - All indexes are built; I rebuilt them manually before testing, so I expect them
>   to work properly (I mention it because the problem seems to be somehow related to indexes,
>   but I'm not sure - see below)
> 
> The problem is:
> 
> When I query (cqlsh) some rows by key (CL is default = ONE) I _always_ get a correct
> result. However, when I query the same rows by an indexed column, nothing is returned.
> 
> When tracing a query with CL.ALL in cqlsh, I get info that C* has:
> 
> Read 0 live cells and 1 tombstoned       // for first replica node
> Read 1 live cells and 0 tombstoned       // for second replica node
> 
> When CL is ONE, it never asks the second replica for data (possibly due to DynamicSnitch
> scores or so), so it returns nothing.
> 
> Switching to CL >= TWO obviously fixes this problem for us, but it's not the solution
> I'd like to use, as I'd rather rely on fast read/write requests with CL.ONE + frequent repairs,
> allowing some short-term inconsistency.
> 
> Any ideas why the data might still be inconsistent after repair? Is there something
> I could have missed?
> 
> I'm mainly surprised that repair does not fix this inconsistency in ANY way: either
> by pulling the missing data to the first replica _OR_ by tombstoning it on the second replica.
> The first would be correct (the delete was made a long time ago and then the row reappeared),
> but either would make sense, as both would make the data consistent. In its current state
> the data is definitely inconsistent and I don't understand it :-)
> 
> 
> M.
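
The CL.ONE versus CL.ALL behaviour described in the quoted message can be illustrated with a toy model of timestamp-based reconciliation. This is only a sketch and not Cassandra's actual read path: the `Cell` class, the `reconcile` and `read` functions, the timestamps and the email value are all hypothetical. It assumes the live cell carries a newer write timestamp than the tombstone, which matches the statement that the row reappeared after an old delete, and it assumes the newest timestamp wins when replica responses are merged.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for one replica's answer for a single cell."""
    value: Optional[str]  # None models a tombstone
    timestamp: int        # write timestamp (made-up numbers below)

    @property
    def is_tombstone(self) -> bool:
        return self.value is None


def reconcile(cells: List[Cell]) -> Cell:
    # Newest timestamp wins. On a tie this sketch prefers the tombstone;
    # that tie-break is an assumption and is not exercised by the example.
    return max(cells, key=lambda c: (c.timestamp, c.is_tombstone))


def read(responses: List[Cell]) -> Optional[str]:
    """Merge the responses the coordinator actually received."""
    winner = reconcile(responses)
    return None if winner.is_tombstone else winner.value


# First replica: old tombstone. Second replica: newer live cell.
replica_a = Cell(value=None, timestamp=100)
replica_b = Cell(value="user@example.com", timestamp=200)

result_cl_one = read([replica_a])             # only the first replica is asked
result_cl_all = read([replica_a, replica_b])  # both replicas are asked

print("CL.ONE:", result_cl_one)
print("CL.ALL:", result_cl_all)
```

Under these assumptions a read that consults only the first replica sees nothing, while a read that consults both returns the row. That matches the symptoms in the trace, but it does not explain why repair leaves the two replicas different.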

