cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jonathan Ellis <jbel...@gmail.com>
Subject Re: repeated timeouts on quorum reads
Date Thu, 22 Oct 2009 17:04:21 GMT
On Thu, Oct 22, 2009 at 10:59 AM, Edmond Lau <edmond@ooyala.com> wrote:
> Try:
>
> keyspace: "Analytics"
> key: "o:movie"
> column family: "movie"
> super column: "all"
>
> I was able to get timeouts with a few concurrent get_slice() calls
> (3-4 with debug logging, 6 without) on the super column, with start =
> "", finish = "", and count = 1.
>
> Edmond
>
> On Thu, Oct 22, 2009 at 7:42 AM, Jonathan Ellis <jbellis@gmail.com> wrote:
>> What is your columnfamily definition?  What query should I test with?
>>
>> On Wed, Oct 21, 2009 at 7:33 PM, Edmond Lau <edmond@ooyala.com> wrote:
>>> On Tue, Oct 20, 2009 at 2:03 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>>> Okay, so we have 2 problems:
>>>>
>>>>  - the read is simply too slow
>>>>  - we're doing unnecessary(?) read repairs
>>>>
>>>> For the first I think it's going to be easiest if you can export the
>>>> row in question so we can do some profiling.  (Eric will have a tool
>>>> to do this soon.)
>>>
>>> Ok - I've created a cassandra setup with just the row in question and
>>> tarred up the data + storage-conf.xml for the three nodes:
>>>
>>> http://demo.ooyala.com/cassandra/cassandra-1.tgz
>>> http://demo.ooyala.com/cassandra/cassandra-2.tgz
>>> http://demo.ooyala.com/cassandra/cassandra-3.tgz
>>>
>>>>
>>>> For the second, how sure are you that the repair is unnecessary?  Does
>>>> it repeat the repair when you do the query again after letting the
>>>> callbacks complete?
>>>
>>> I didn't do any writes as I was debugging the issue whereas I was
>>> reading the same data a whole bunch of times, so the repair should've
>>> been unnecessary.
>>>
>>> Thanks for your help,
>>> Edmond
>>>
>>>>
>>>> -Jonathan
>>>>
>>>> On Mon, Oct 19, 2009 at 11:38 PM, Edmond Lau <edmond@ooyala.com> wrote:
>>>>> A single local read with debug logging takes 3-4 secs on the node with
>>>>> 3 data files.  It actually takes roughly as long on the other nodes
as
>>>>> well (I was incorrect in saying that they were fast).
>>>>>
>>>>> DEBUG [pool-1-thread-22] 2009-10-20 04:26:06,953 CassandraServer.java
>>>>> (line 219) get_slice
>>>>> DEBUG [pool-1-thread-22] 2009-10-20 04:26:06,955 StorageProxy.java
>>>>> (line 468) weakreadlocal reading
>>>>> SliceFromReadCommand(table='Analytics', key='o:movie',
>>>>> column_parent='QueryPath(columnFamilyName='movie',
>>>>> superColumnName='[B@26e00675', columnName='null')', start='',
>>>>> finish='', reversed=false, count=1)
>>>>> DEBUG [pool-1-thread-22] 2009-10-20 04:26:10,382 SliceQueryFilter.java
>>>>> (line 115) collecting 1000036:false:515@1255978226411394
>>>>> DEBUG [CONSISTENCY-MANAGER:2] 2009-10-20 04:26:10,383
>>>>> ConsistencyManager.java (line 168) Reading consistency digest for
>>>>> o:movie from 2926@[172.16.129.71:7000, 172.16.129.72:7000]
>>>>> DEBUG [RESPONSE-STAGE:1] 2009-10-20 04:26:13,332
>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>> from 2926@172.16.129.71:7000
>>>>> DEBUG [RESPONSE-STAGE:3] 2009-10-20 04:26:13,675
>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>> from 2926@172.16.129.72:7000
>>>>> DEBUG [RESPONSE-STAGE:3] 2009-10-20 04:26:13,677
>>>>> ConsistencyManager.java (line 96) Performing read repair for o:movie
>>>>> to 2937@[172.16.129.71:7000, 172.16.129.72:7000, 172.16.129.75:7000]
>>>>>
>>>>> Also, it's unclear why there's a read repair going on above, since I
>>>>> haven't made any updates.
>>>>>
>>>>> Doing a quorum read on the node with 3 data files takes 7 seconds w/
>>>>> debug logging and leads to a DigestMismatchException, again even
>>>>> though I haven't made any updates.
>>>>>
>>>>> DEBUG [pool-1-thread-25] 2009-10-20 04:30:00,892 CassandraServer.java
>>>>> (line 219) get_slice
>>>>> DEBUG [pool-1-thread-25] 2009-10-20 04:30:00,894 StorageProxy.java
>>>>> (line 374) strongread reading data for
>>>>> SliceFromReadCommand(table='Analytics', key='o:movie',
>>>>> column_parent='QueryPath(columnFamilyName='movie',
>>>>> superColumnName='[B@2cf50c5f', columnName='null')', start='',
>>>>> finish='', reversed=false, count=1) from 3675@172.16.129.75:7000
>>>>> DEBUG [pool-1-thread-25] 2009-10-20 04:30:00,894 StorageProxy.java
>>>>> (line 381) strongread reading digest for
>>>>> SliceFromReadCommand(table='Analytics', key='o:movie',
>>>>> column_parent='QueryPath(columnFamilyName='movie',
>>>>> superColumnName='[B@2cf50c5f', columnName='null')', start='',
>>>>> finish='', reversed=false, count=1) from 3676@172.16.129.71:7000
>>>>> DEBUG [pool-1-thread-25] 2009-10-20 04:30:00,895 StorageProxy.java
>>>>> (line 381) strongread reading digest for
>>>>> SliceFromReadCommand(table='Analytics', key='o:movie',
>>>>> column_parent='QueryPath(columnFamilyName='movie',
>>>>> superColumnName='[B@2cf50c5f', columnName='null')', start='',
>>>>> finish='', reversed=false, count=1) from 3676@172.16.129.72:7000
>>>>> DEBUG [RESPONSE-STAGE:3] 2009-10-20 04:30:03,994
>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>> from 15445226-F6EF-C97B-F239-FC0316748C5E@172.16.129.71:7000
>>>>> DEBUG [RESPONSE-STAGE:2] 2009-10-20 04:30:04,107
>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>> from 15445226-F6EF-C97B-F239-FC0316748C5E@172.16.129.72:7000
>>>>> DEBUG [ROW-READ-STAGE:9] 2009-10-20 04:30:04,787 SliceQueryFilter.java
>>>>> (line 115) collecting 1000036:false:515@1255978226411394
>>>>> DEBUG [ROW-READ-STAGE:9] 2009-10-20 04:30:04,788 ReadVerbHandler.java
>>>>> (line 100) Read key o:movie; sending response to
>>>>> 15445226-F6EF-C97B-F239-FC0316748C5E@172.16.129.75:7000
>>>>> DEBUG [RESPONSE-STAGE:4] 2009-10-20 04:30:04,788
>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>> from 15445226-F6EF-C97B-F239-FC0316748C5E@172.16.129.75:7000
>>>>> DEBUG [pool-1-thread-25] 2009-10-20 04:30:04,789
>>>>> ReadResponseResolver.java (line 84) Response deserialization time : 0
>>>>> ms.
>>>>> DEBUG [pool-1-thread-25] 2009-10-20 04:30:04,790
>>>>> ReadResponseResolver.java (line 84) Response deserialization time : 0
>>>>> ms.
>>>>> DEBUG [pool-1-thread-25] 2009-10-20 04:30:04,791
>>>>> ReadResponseResolver.java (line 84) Response deserialization time : 1
>>>>> ms.
>>>>>  INFO [pool-1-thread-25] 2009-10-20 04:30:04,792 StorageProxy.java
>>>>> (line 410) DigestMismatchException: o:movie
>>>>> DEBUG [RESPONSE-STAGE:1] 2009-10-20 04:30:07,953
>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>> from 3690@172.16.129.71:7000
>>>>> DEBUG [RESPONSE-STAGE:3] 2009-10-20 04:30:08,130
>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>> from 3690@172.16.129.72:7000
>>>>> DEBUG [pool-1-thread-25] 2009-10-20 04:30:08,132
>>>>> ReadResponseResolver.java (line 84) Response deserialization time : 1
>>>>> ms.
>>>>> DEBUG [pool-1-thread-25] 2009-10-20 04:30:08,132
>>>>> ReadResponseResolver.java (line 84) Response deserialization time : 0
>>>>> ms.
>>>>> DEBUG [pool-1-thread-25] 2009-10-20 04:30:08,132
>>>>> ReadResponseResolver.java (line 149) resolve: 1 ms.
>>>>> DEBUG [ROW-READ-STAGE:10] 2009-10-20 04:30:08,235
>>>>> SliceQueryFilter.java (line 115) collecting
>>>>> 1000036:false:515@1255978226411394
>>>>> DEBUG [ROW-READ-STAGE:10] 2009-10-20 04:30:08,235 ReadVerbHandler.java
>>>>> (line 100) Read key o:movie; sending response to
>>>>> 3690@172.16.129.75:7000
>>>>>
>>>>> Edmond
>>>>>
>>>>> On Mon, Oct 19, 2009 at 8:04 PM, Jonathan Ellis <jbellis@gmail.com>
wrote:
>>>>>> On Mon, Oct 19, 2009 at 9:48 PM, Edmond Lau <edmond@ooyala.com>
wrote:
>>>>>>> 8 data files total.  3 nodes have 1, 1 has 2, the 3rd has 3.
>>>>>>>
>>>>>>>> Does it still take ~8s if you direct a CL.ONE query at one
of the
>>>>>>>> nodes you know has the data (i.e., a local read)?
>>>>>>>
>>>>>>> Local reads return quickly, but if you look at the logs below,
there's
>>>>>>> 6 secs from when cassandra prints that it's reading a consistency
>>>>>>> digest to when it processes a callback.
>>>>>>
>>>>>> That's from when it sends the digest command, to when it gets a callback.
>>>>>>
>>>>>> Something doesn't add up.  If computing &sending a digest response
>>>>>> takes 6s, either reading the row or computing the digest must be
>>>>>> taking substantial time.  But if local reads are fast, it's not
>>>>>> reading the row that is slow, and if resolve is fast (from your logs,
>>>>>> it is 2ms) then it's not computing the digest that is slow.
>>>>>>
>>>>>> Try doing a local read from the node with 3 data files, see if that
is slower.
>>>>>>
>>>>>> If you still see a fast read there, add some more debug logging in
>>>>>> ReadVerbHandler.java, where it says something like (I'm going from
>>>>>> trunk here but 0.4.1 is similar)
>>>>>>
>>>>>>            Row row = readCommand.getRow(table);
>>>>>>            ReadResponse readResponse;
>>>>>>            if (readCommand.isDigestQuery())
>>>>>>            {
>>>>>>                readResponse = new ReadResponse(row.digest());
>>>>>>
>>>>>> time the getRow and the row.digest() calls and log how long those
take
>>>>>> in the quorum scenario.
>>>>>>
>>>>>> -Jonathan
>>>>>>
>>>>>
>>>>
>>>
>>
>

Mime
View raw message