cassandra-user mailing list archives

From Edmond Lau <edm...@ooyala.com>
Subject Re: read repair keeps occurring on every quorum read
Date Wed, 30 Sep 2009 18:57:47 GMT
Sweet.  Jonathan, I patched in your change and no longer see a read
repair for every quorum read in the debug logs.  I also reran a load
test consisting of only simple read operations, and my quorum read
throughput has doubled now that the extraneous read repairs aren't
occurring.  This makes sense since the read repair essentially doubled
the amount of work - one read to serve data and one read to repair.

Thanks for the fast response.
Edmond

On Wed, Sep 30, 2009 at 8:14 AM, Jonathan Ellis <jbellis@gmail.com> wrote:
> Since JIRA is mostly dead right now, here is the patch to test against 0.4.
>
> On Mon, Sep 28, 2009 at 4:30 PM, Edmond Lau <edmond@ooyala.com> wrote:
>> On Fri, Sep 25, 2009 at 8:10 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>> No, you're mixing two related concepts.
>>>
>>> When you do a quorum read it will fetch the actual data from one
>>> replica and do digest reads from the others.  If the data from the one
>>> does not match the hash from the others, then you have the
>>> digestmismatchexception Edmond is seeing and read repair is performed.
>>>  So this is not normal and probably a bug.
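The digest-read flow described above can be sketched roughly as follows. This is a simplified Python illustration, not Cassandra's actual code: the `Column` type, the function names, the `DigestMismatch` class, and the choice of SHA-256 as the hash are all assumptions made for the example.

```python
import hashlib
from typing import NamedTuple


class Column(NamedTuple):
    value: bytes
    timestamp: int


class DigestMismatch(Exception):
    """Raised when a replica's digest differs from the digest of the data read."""


def digest(col: Column) -> bytes:
    # The hash choice is arbitrary; SHA-256 is an assumption for illustration.
    return hashlib.sha256(col.value + str(col.timestamp).encode()).digest()


def check_digests(data: Column, digests: list[bytes]) -> Column:
    # Compare the hash of the full data against the digest-only responses.
    expected = digest(data)
    if any(d != expected for d in digests):
        raise DigestMismatch()
    return data


def quorum_read(replicas: list[Column]) -> tuple[Column, bool]:
    """Return (result, repaired). replicas[0] serves data, the rest serve digests."""
    try:
        digests = [digest(c) for c in replicas[1:]]
        return check_digests(replicas[0], digests), False
    except DigestMismatch:
        # Second round: read full data from every replica and keep the newest.
        # A real read repair would also write the newest version back.
        newest = max(replicas, key=lambda c: c.timestamp)
        return newest, True
```

In this sketch, replicas holding identical data never take the second round, so a mismatch on every read of unchanged data would point to a bug rather than to expected behavior.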
>>
>> Ok - I've filed a bug in jira: CASSANDRA-462.
>>
>>>
>>> P.S. Edmond: with 2 replicas, quorum is the same as all.  So you will
>>> not be able to perform reads if any node is unavailable.  This is why
>>> usually if you are going to do quorum reads you will have replication
>>> factor 3 or more.
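The arithmetic behind this point can be written out as a small sketch. It assumes the usual majority rule, where a quorum is floor(RF / 2) + 1 replicas; the function names are hypothetical.

```python
def quorum_size(replication_factor: int) -> int:
    # A quorum is a strict majority of the replicas.
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    # Replicas that can be down while a quorum is still reachable.
    return replication_factor - quorum_size(replication_factor)


for rf in (2, 3, 5):
    print(rf, quorum_size(rf), tolerated_failures(rf))
# prints:
# 2 2 0
# 3 2 1
# 5 3 2
```

With a replication factor of 2 the quorum is both replicas, so no replica can be down; a replication factor of 3 is the smallest that tolerates one failure.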
>>
>> Yep.  I only have 3 machines available at the moment as I'm testing
>> cassandra out.
>>
>>>
>>> -Jonathan
>>>
>>> On Fri, Sep 25, 2009 at 8:31 PM, Sandeep Tata <sandeep.tata@gmail.com> wrote:
>>>> This is a known issue, and we should perhaps open a JIRA on it.
>>>> The original Dynamo approach was to have 3 mechanisms --
>>>> HintedHandoff, read-repair, and Merkle trees to guarantee convergence
>>>> (eventual consistency). Cassandra only has the first two. There are
>>>> some corner cases where hinted-handoff alone can't be relied on to
>>>> guarantee convergence which is why there's read-repair on every read.
>>>>
>>>> Turning off read repair is a relatively simple (but risky) change to
>>>> the code. However, minimizing unnecessary read repair is a lot
>>>> trickier :)
>>>>
>>>>
>>>> On Fri, Sep 25, 2009 at 5:39 PM, Edmond Lau <edmond@ooyala.com> wrote:
>>>>> I have a 3 node cluster with a replication factor of 2, running on 0.4
>>>>> RC1.  I've set both my read and write consistency levels to use a
>>>>> quorum.
>>>>>
>>>>> I'm observing that quorum reads keep invoking read repair and log
>>>>> DigestMismatchExceptions from the StorageProxy.  Obviously, this
>>>>> significantly reduces my read throughput.  In the absence of any
>>>>> additional inserts, I'd expect that read repair would happen at most
>>>>> once before the 2 nodes responsible for the data both have fresh views
>>>>> of the data.
>>>>>
>>>>> Here's what I see in my debug log for one machine on two consecutive
>>>>> quorum reads for the data.  I get similar messages when querying any
>>>>> of the 3 nodes.  Similar messages are logged on subsequent queries for
>>>>> the exact same row/column.  The issue happens when reading either
>>>>> supercolumns or columns.  Restarting the cluster has no effect.
>>>>>
>>>>> DEBUG [pool-1-thread-1] 2009-09-26 00:26:20,317 CassandraServer.java
>>>>> (line 305) multiget
>>>>> DEBUG [pool-1-thread-1] 2009-09-26 00:26:20,360 StorageProxy.java
>>>>> (line 375) strongread reading data for
>>>>> SliceByNamesReadCommand(table='Analytics', key='test',
>>>>> columnParent='QueryPath(columnFamilyName='Domain',
>>>>> superColumnName='null', columnName='null')',
>>>>> columns=[www.ooyala.com,]) from 17@172.16.130.130:7000
>>>>> DEBUG [pool-1-thread-1] 2009-09-26 00:26:20,365 StorageProxy.java
>>>>> (line 382) strongread reading digest for
>>>>> SliceByNamesReadCommand(table='Analytics', key='test',
>>>>> columnParent='QueryPath(columnFamilyName='Domain',
>>>>> superColumnName='null', columnName='null')',
>>>>> columns=[www.ooyala.com,]) from 18@172.16.130.131:7000
>>>>> DEBUG [ROW-READ-STAGE:1] 2009-09-26 00:26:20,380 ReadVerbHandler.java
>>>>> (line 100) Read key test; sending response to
>>>>> EEF5BCFF-D592-F1DE-6DEE-B74029218A29@172.16.130.130:7000
>>>>> DEBUG [RESPONSE-STAGE:1] 2009-09-26 00:26:20,387
>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>> from EEF5BCFF-D592-F1DE-6DEE-B74029218A29@172.16.130.130:7000
>>>>> DEBUG [RESPONSE-STAGE:2] 2009-09-26 00:26:20,449
>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>> from EEF5BCFF-D592-F1DE-6DEE-B74029218A29@172.16.130.131:7000
>>>>> DEBUG [pool-1-thread-1] 2009-09-26 00:26:20,474
>>>>> ReadResponseResolver.java (line 84) Response deserialization time : 0
>>>>> ms.
>>>>> DEBUG [pool-1-thread-1] 2009-09-26 00:26:20,474
>>>>> ReadResponseResolver.java (line 84) Response deserialization time : 0
>>>>> ms.
>>>>>  INFO [pool-1-thread-1] 2009-09-26 00:26:20,475 StorageProxy.java
>>>>> (line 411) DigestMismatchException: test
>>>>> DEBUG [ROW-READ-STAGE:2] 2009-09-26 00:26:20,477 ReadVerbHandler.java
>>>>> (line 100) Read key test; sending response to 19@172.16.130.130:7000
>>>>> DEBUG [RESPONSE-STAGE:3] 2009-09-26 00:26:20,478
>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>> from 19@172.16.130.130:7000
>>>>> DEBUG [RESPONSE-STAGE:4] 2009-09-26 00:26:20,480
>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>> from 19@172.16.130.131:7000
>>>>> DEBUG [pool-1-thread-1] 2009-09-26 00:26:20,481
>>>>> ReadResponseResolver.java (line 84) Response deserialization time : 0
>>>>> ms.
>>>>> DEBUG [pool-1-thread-1] 2009-09-26 00:26:20,481
>>>>> ReadResponseResolver.java (line 84) Response deserialization time : 0
>>>>> ms.
>>>>>  INFO [pool-1-thread-1] 2009-09-26 00:26:20,482
>>>>> ReadResponseResolver.java (line 148) resolve: 1 ms.
>>>>> DEBUG [pool-1-thread-2] 2009-09-26 00:27:22,099 CassandraServer.java
>>>>> (line 305) multiget
>>>>> DEBUG [pool-1-thread-2] 2009-09-26 00:27:22,100 StorageProxy.java
>>>>> (line 375) strongread reading data for
>>>>> SliceByNamesReadCommand(table='Analytics', key='test',
>>>>> columnParent='QueryPath(columnFamilyName='Domain',
>>>>> superColumnName='null', columnName='null')',
>>>>> columns=[www.ooyala.com,]) from 224@172.16.130.130:7000
>>>>> DEBUG [pool-1-thread-2] 2009-09-26 00:27:22,100 StorageProxy.java
>>>>> (line 382) strongread reading digest for
>>>>> SliceByNamesReadCommand(table='Analytics', key='test',
>>>>> columnParent='QueryPath(columnFamilyName='Domain',
>>>>> superColumnName='null', columnName='null')',
>>>>> columns=[www.ooyala.com,]) from 225@172.16.130.131:7000
>>>>> DEBUG [ROW-READ-STAGE:1] 2009-09-26 00:27:22,103 ReadVerbHandler.java
>>>>> (line 100) Read key test; sending response to
>>>>> CD1A7545-F759-1CA7-4D17-87FA4A16E2E4@172.16.130.130:7000
>>>>> DEBUG [RESPONSE-STAGE:1] 2009-09-26 00:27:22,103
>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>> from CD1A7545-F759-1CA7-4D17-87FA4A16E2E4@172.16.130.130:7000
>>>>> DEBUG [RESPONSE-STAGE:2] 2009-09-26 00:27:22,107
>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>> from CD1A7545-F759-1CA7-4D17-87FA4A16E2E4@172.16.130.131:7000
>>>>> DEBUG [pool-1-thread-2] 2009-09-26 00:27:22,108
>>>>> ReadResponseResolver.java (line 84) Response deserialization time : 1
>>>>> ms.
>>>>> DEBUG [pool-1-thread-2] 2009-09-26 00:27:22,108
>>>>> ReadResponseResolver.java (line 84) Response deserialization time : 0
>>>>> ms.
>>>>>  INFO [pool-1-thread-2] 2009-09-26 00:27:22,109 StorageProxy.java
>>>>> (line 411) DigestMismatchException: test
>>>>> DEBUG [ROW-READ-STAGE:2] 2009-09-26 00:27:22,114 ReadVerbHandler.java
>>>>> (line 100) Read key test; sending response to 226@172.16.130.130:7000
>>>>> DEBUG [RESPONSE-STAGE:3] 2009-09-26 00:27:22,114
>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>> from 226@172.16.130.130:7000
>>>>> DEBUG [RESPONSE-STAGE:4] 2009-09-26 00:27:22,205
>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>> from 226@172.16.130.131:7000
>>>>> DEBUG [pool-1-thread-2] 2009-09-26 00:27:22,206
>>>>> ReadResponseResolver.java (line 84) Response deserialization time : 0
>>>>> ms.
>>>>> DEBUG [pool-1-thread-2] 2009-09-26 00:27:22,206
>>>>> ReadResponseResolver.java (line 84) Response deserialization time : 0
>>>>> ms.
>>>>>  INFO [pool-1-thread-2] 2009-09-26 00:27:22,207
>>>>> ReadResponseResolver.java (line 148) resolve: 1 ms.
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> Edmond
>>>>>
>>>>
>>>
>>
>
