cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Edmond Lau <edm...@ooyala.com>
Subject Re: repeated timeouts on quorum reads
Date Tue, 20 Oct 2009 04:38:16 GMT
A single local read with debug logging takes 3-4 secs on the node with
3 data files.  It actually takes roughly as long on the other nodes as
well (I was incorrect in saying that they were fast).

DEBUG [pool-1-thread-22] 2009-10-20 04:26:06,953 CassandraServer.java
(line 219) get_slice
DEBUG [pool-1-thread-22] 2009-10-20 04:26:06,955 StorageProxy.java
(line 468) weakreadlocal reading
SliceFromReadCommand(table='Analytics', key='o:movie',
column_parent='QueryPath(columnFamilyName='movie',
superColumnName='[B@26e00675', columnName='null')', start='',
finish='', reversed=false, count=1)
DEBUG [pool-1-thread-22] 2009-10-20 04:26:10,382 SliceQueryFilter.java
(line 115) collecting 1000036:false:515@1255978226411394
DEBUG [CONSISTENCY-MANAGER:2] 2009-10-20 04:26:10,383
ConsistencyManager.java (line 168) Reading consistency digest for
o:movie from 2926@[172.16.129.71:7000, 172.16.129.72:7000]
DEBUG [RESPONSE-STAGE:1] 2009-10-20 04:26:13,332
ResponseVerbHandler.java (line 34) Processing response on a callback
from 2926@172.16.129.71:7000
DEBUG [RESPONSE-STAGE:3] 2009-10-20 04:26:13,675
ResponseVerbHandler.java (line 34) Processing response on a callback
from 2926@172.16.129.72:7000
DEBUG [RESPONSE-STAGE:3] 2009-10-20 04:26:13,677
ConsistencyManager.java (line 96) Performing read repair for o:movie
to 2937@[172.16.129.71:7000, 172.16.129.72:7000, 172.16.129.75:7000]

Also, it's unclear why there's a read repair going on above, since I
haven't made any updates.

Doing a quorum read on the node with 3 data files takes 7 seconds w/
debug logging and leads to a DigestMismatchException, again even
though I haven't made any updates.

DEBUG [pool-1-thread-25] 2009-10-20 04:30:00,892 CassandraServer.java
(line 219) get_slice
DEBUG [pool-1-thread-25] 2009-10-20 04:30:00,894 StorageProxy.java
(line 374) strongread reading data for
SliceFromReadCommand(table='Analytics', key='o:movie',
column_parent='QueryPath(columnFamilyName='movie',
superColumnName='[B@2cf50c5f', columnName='null')', start='',
finish='', reversed=false, count=1) from 3675@172.16.129.75:7000
DEBUG [pool-1-thread-25] 2009-10-20 04:30:00,894 StorageProxy.java
(line 381) strongread reading digest for
SliceFromReadCommand(table='Analytics', key='o:movie',
column_parent='QueryPath(columnFamilyName='movie',
superColumnName='[B@2cf50c5f', columnName='null')', start='',
finish='', reversed=false, count=1) from 3676@172.16.129.71:7000
DEBUG [pool-1-thread-25] 2009-10-20 04:30:00,895 StorageProxy.java
(line 381) strongread reading digest for
SliceFromReadCommand(table='Analytics', key='o:movie',
column_parent='QueryPath(columnFamilyName='movie',
superColumnName='[B@2cf50c5f', columnName='null')', start='',
finish='', reversed=false, count=1) from 3676@172.16.129.72:7000
DEBUG [RESPONSE-STAGE:3] 2009-10-20 04:30:03,994
ResponseVerbHandler.java (line 34) Processing response on a callback
from 15445226-F6EF-C97B-F239-FC0316748C5E@172.16.129.71:7000
DEBUG [RESPONSE-STAGE:2] 2009-10-20 04:30:04,107
ResponseVerbHandler.java (line 34) Processing response on a callback
from 15445226-F6EF-C97B-F239-FC0316748C5E@172.16.129.72:7000
DEBUG [ROW-READ-STAGE:9] 2009-10-20 04:30:04,787 SliceQueryFilter.java
(line 115) collecting 1000036:false:515@1255978226411394
DEBUG [ROW-READ-STAGE:9] 2009-10-20 04:30:04,788 ReadVerbHandler.java
(line 100) Read key o:movie; sending response to
15445226-F6EF-C97B-F239-FC0316748C5E@172.16.129.75:7000
DEBUG [RESPONSE-STAGE:4] 2009-10-20 04:30:04,788
ResponseVerbHandler.java (line 34) Processing response on a callback
from 15445226-F6EF-C97B-F239-FC0316748C5E@172.16.129.75:7000
DEBUG [pool-1-thread-25] 2009-10-20 04:30:04,789
ReadResponseResolver.java (line 84) Response deserialization time : 0
ms.
DEBUG [pool-1-thread-25] 2009-10-20 04:30:04,790
ReadResponseResolver.java (line 84) Response deserialization time : 0
ms.
DEBUG [pool-1-thread-25] 2009-10-20 04:30:04,791
ReadResponseResolver.java (line 84) Response deserialization time : 1
ms.
 INFO [pool-1-thread-25] 2009-10-20 04:30:04,792 StorageProxy.java
(line 410) DigestMismatchException: o:movie
DEBUG [RESPONSE-STAGE:1] 2009-10-20 04:30:07,953
ResponseVerbHandler.java (line 34) Processing response on a callback
from 3690@172.16.129.71:7000
DEBUG [RESPONSE-STAGE:3] 2009-10-20 04:30:08,130
ResponseVerbHandler.java (line 34) Processing response on a callback
from 3690@172.16.129.72:7000
DEBUG [pool-1-thread-25] 2009-10-20 04:30:08,132
ReadResponseResolver.java (line 84) Response deserialization time : 1
ms.
DEBUG [pool-1-thread-25] 2009-10-20 04:30:08,132
ReadResponseResolver.java (line 84) Response deserialization time : 0
ms.
DEBUG [pool-1-thread-25] 2009-10-20 04:30:08,132
ReadResponseResolver.java (line 149) resolve: 1 ms.
DEBUG [ROW-READ-STAGE:10] 2009-10-20 04:30:08,235
SliceQueryFilter.java (line 115) collecting
1000036:false:515@1255978226411394
DEBUG [ROW-READ-STAGE:10] 2009-10-20 04:30:08,235 ReadVerbHandler.java
(line 100) Read key o:movie; sending response to
3690@172.16.129.75:7000

Edmond

On Mon, Oct 19, 2009 at 8:04 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
> On Mon, Oct 19, 2009 at 9:48 PM, Edmond Lau <edmond@ooyala.com> wrote:
>> 8 data files total.  3 nodes have 1, 1 has 2, the 3rd has 3.
>>
>>> Does it still take ~8s if you direct a CL.ONE query at one of the
>>> nodes you know has the data (i.e., a local read)?
>>
>> Local reads return quickly, but if you look at the logs below, there's
>> 6 secs from when cassandra prints that it's reading a consistency
>> digest to when it processes a callback.
>
> That's from when it sends the digest command, to when it gets a callback.
>
> Something doesn't add up.  If computing &sending a digest response
> takes 6s, either reading the row or computing the digest must be
> taking substantial time.  But if local reads are fast, it's not
> reading the row that is slow, and if resolve is fast (from your logs,
> it is 2ms) then it's not computing the digest that is slow.
>
> Try doing a local read from the node with 3 data files, see if that is slower.
>
> If you still see a fast read there, add some more debug logging in
> ReadVerbHandler.java, where it says something like (I'm going from
> trunk here but 0.4.1 is similar)
>
>            Row row = readCommand.getRow(table);
>            ReadResponse readResponse;
>            if (readCommand.isDigestQuery())
>            {
>                readResponse = new ReadResponse(row.digest());
>
> time the getRow and the row.digest() calls and log how long those take
> in the quorum scenario.
>
> -Jonathan
>

Mime
View raw message