cassandra-user mailing list archives

From: Anton Winter <an...@myrddin.org>
Subject: Re: performance problems on new cluster
Date: Mon, 15 Aug 2011 07:34:04 GMT
> OK, node latency is fine and you are using some pretty low
> consistency. You said NTS with RF 2, is that RF 2 for each DC?

Correct, I'm using RF 2 for each DC.
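
For clarity, the keyspace is defined with NetworkTopologyStrategy and 2
replicas in every DC.  In cassandra-cli the definition looks roughly like
the following sketch (the DC names here are placeholders for our real
datacenter names, and the strategy_options syntax differs slightly
between versions):

    create keyspace ks1
      with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
      and strategy_options = [{DC1:2, DC2:2, DC3:2, DC4:2}];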



I was able to reproduce the CLI timeouts on the non-replica nodes.
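
The test query is just a sub-slice of one super column run from the cli,
roughly as below ('sc1' is a placeholder for the real super column name,
which only appears as a HeapByteBuffer in the logs; the cli's default
consistency level of ONE applies):

    use ks1;
    get cf1['userid1']['sc1'];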

The debug log output from dc1host1 (a non-replica node):

DEBUG [pool-2-thread-14] 2011-08-15 05:26:15,183 StorageProxy.java 
(line 518) Command/ConsistencyLevel is SliceFromReadCommand(table='ks1', 
key='userid1', column_parent='QueryPath(columnFamilyName='cf1', 
superColumnName='java.nio.HeapByteBuffer[pos=64 lim=67 cap=109]', 
columnName='null')', start='', finish='', reversed=false, 
count=1000000)/ONE
DEBUG [pool-2-thread-14] 2011-08-15 05:26:15,187 StorageProxy.java 
(line 546) reading data from /dc1host3
DEBUG [pool-2-thread-14] 2011-08-15 05:26:35,191 StorageProxy.java 
(line 593) Read timeout: java.util.concurrent.TimeoutException: 
Operation timed out - received only 1 responses from /dc1host3,  .


If the query is run again on the same node (dc1host1), 0 rows are 
returned and the following DEBUG messages are logged:


DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,513 StorageProxy.java 
(line 518) Command/ConsistencyLevel is SliceFromReadCommand(table='ks1', 
key='userid1', column_parent='QueryPath(columnFamilyName='cf1', 
superColumnName='java.nio.HeapByteBuffer[pos=64 lim=67 cap=109]', 
columnName='null')', start='', finish='', reversed=false, 
count=1000000)/ONE
DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,513 StorageProxy.java 
(line 546) reading data from /dc1host3
DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,513 StorageProxy.java 
(line 562) reading digest from /dc1host2
DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,514 StorageProxy.java 
(line 562) reading digest from /dc2host3
DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,514 StorageProxy.java 
(line 562) reading digest from /dc2host2
DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,514 StorageProxy.java 
(line 562) reading digest from /dc3host2
DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,514 StorageProxy.java 
(line 562) reading digest from /dc3host3
DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,514 StorageProxy.java 
(line 562) reading digest from /dc4host3
DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,514 StorageProxy.java 
(line 562) reading digest from /dc4host2
DEBUG [pool-2-thread-14] 2011-08-15 05:32:06,022 StorageProxy.java 
(line 588) Read: 508 ms.
ERROR [ReadRepairStage:2112] 2011-08-15 05:32:06,404 
AbstractCassandraDaemon.java (line 133) Fatal exception in thread 
Thread[ReadRepairStage:2112,5,main]
java.lang.AssertionError
         at 
org.apache.cassandra.service.RowRepairResolver.resolve(RowRepairResolver.java:73)
         at 
org.apache.cassandra.service.AsyncRepairCallback$1.runMayThrow(AsyncRepairCallback.java:54)
         at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
         at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
         at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
         at java.lang.Thread.run(Thread.java:662)


Subsequent queries show "reading data from /dc1host2", however the 
result returned remains 0 rows.


If I run the same query on a replica, I get the correct result back, 
but with two exceptions as follows:


DEBUG [pool-2-thread-5] 2011-08-15 05:45:49,792 StorageProxy.java (line 
518) Command/ConsistencyLevel is SliceFromReadCommand(table='ks1', 
key='userid1', column_parent='QueryPath(columnFamilyName='cf1', 
superColumnName='java.nio.HeapByteBuffer[pos=64 lim=67 cap=109]', 
columnName='null')', start='', finish='', reversed=false, 
count=1000000)/ONE
DEBUG [pool-2-thread-5] 2011-08-15 05:45:49,793 StorageProxy.java (line 
541) reading data locally
DEBUG [pool-2-thread-5] 2011-08-15 05:45:49,793 StorageProxy.java (line 
562) reading digest from /dc1host3
DEBUG [pool-2-thread-5] 2011-08-15 05:45:49,793 StorageProxy.java (line 
562) reading digest from dns.entry.for.dc3host2/dc3host2
DEBUG [pool-2-thread-5] 2011-08-15 05:45:49,793 StorageProxy.java (line 
562) reading digest from dns.entry.for.dc3host3/dc3host3
DEBUG [pool-2-thread-5] 2011-08-15 05:45:49,794 StorageProxy.java (line 
562) reading digest from dns.entry.for.dc2host2/dc2host2
DEBUG [pool-2-thread-5] 2011-08-15 05:45:49,794 StorageProxy.java (line 
562) reading digest from dns.entry.for.dc2host3/dc2host3
DEBUG [pool-2-thread-5] 2011-08-15 05:45:49,794 StorageProxy.java (line 
562) reading digest from dc4host2/dc4host2
DEBUG [pool-2-thread-5] 2011-08-15 05:45:49,794 StorageProxy.java (line 
562) reading digest from dc4host3/dc4host3
DEBUG [ReadStage:20102] 2011-08-15 05:45:49,793 StorageProxy.java (line 
690) LocalReadRunnable reading SliceFromReadCommand(table='ks1', 
key='userid1', column_parent='QueryPath(columnFamilyName='cf1', 
superColumnName='java.nio.HeapByteBuffer[pos=64 lim=67 cap=109]', 
columnName='null')', start='', finish='', reversed=false, count=1000000)
DEBUG [pool-2-thread-5] 2011-08-15 05:45:49,795 StorageProxy.java (line 
588) Read: 1 ms.
ERROR [ReadRepairStage:11] 2011-08-15 05:45:50,890 
AbstractCassandraDaemon.java (line 133) Fatal exception in thread 
Thread[ReadRepairStage:11,5,main]
java.lang.RuntimeException: java.lang.NullPointerException
         at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
         at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
         at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
         at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.NullPointerException
         at 
org.apache.cassandra.service.RowRepairResolver.resolve(RowRepairResolver.java:74)
         at 
org.apache.cassandra.service.AsyncRepairCallback$1.runMayThrow(AsyncRepairCallback.java:54)
         at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
         ... 3 more
ERROR [ReadRepairStage:11] 2011-08-15 05:45:50,892 
AbstractCassandraDaemon.java (line 133) Fatal exception in thread 
Thread[ReadRepairStage:11,5,main]
java.lang.RuntimeException: java.lang.NullPointerException
         at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
         at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
         at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
         at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.NullPointerException
         at 
org.apache.cassandra.service.RowRepairResolver.resolve(RowRepairResolver.java:74)
         at 
org.apache.cassandra.service.AsyncRepairCallback$1.runMayThrow(AsyncRepairCallback.java:54)
         at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
         ... 3 more

The above have been reproduced in each DC.


The testing that has been performing poorly happens to be against the 
non-replica nodes.  Initial testing against one of the replicas suggests 
that the replicas do not exhibit the performance problems that the 
non-replica nodes do.

Regards,
Anton



On Fri, 12 Aug 2011 14:11:06 +1200, aaron morton wrote:
>>
>> iostat doesn't show a request queue bottleneck.  The timeouts we are 
>> seeing are for reads.  The latency on the nodes I have temporarily used 
>> for reads is around 2-45ms.  The next token in the ring at an 
>> alternate DC is showing ~4ms with everything else around 0.05ms.  
>> tpstats doesn't show any active/pending.  Reads are at CL.ONE and 
>> writes use CL.ANY.
>
> OK, node latency is fine and you are using some pretty low
> consistency. You said NTS with RF 2, is that RF 2 for each DC?
>
> The steps below may help get an idea of what's going on…
>
> 1) use nodetool getendpoints to determine which nodes are the replicas for a key.
> 2) connect directly to one of the endpoints with the CLI, ensure CL
> is ONE and do your test query.
> 3) connect to another node in the same DC that is not a replica and
> do the same.
> 4) connect to another node in a different DC and do the same
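
A concrete sketch of steps 1 and 2, using the keyspace/CF/key from the
logs earlier in this thread ('sc1' is again a placeholder for the real
super column name):

    nodetool -h dc1host1 getendpoints ks1 cf1 userid1

and then, from the cli against one of the endpoints it returns:

    cassandra-cli -h dc1host3 -p 9160
    use ks1;
    get cf1['userid1']['sc1'];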
>
> Once you can repro it, try turning up the logging on the coordinator
> to DEBUG; you can do this via JConsole. Look for these lines…
>
> * Command/ConsistencyLevel is….
> * reading data locally... or reading data from…
> * reading digest locally… or reading digest from…
> * Read timeout:…
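
If JConsole isn't handy, the same effect can be had by setting the
StorageProxy logger to DEBUG in conf/log4j-server.properties on the
coordinator (depending on the version this needs a restart or a short
wait for log4j to re-read the file):

    # coordinator read path: "reading data from", digests, read timeouts
    log4j.logger.org.apache.cassandra.service.StorageProxy=DEBUG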
>
> You'll also see some lines about receiving messages from other nodes.
>
> Hopefully you can get an idea of which nodes are involved in a
> failing query. Getting a thrift TimedOutException on a read with CL
> ONE is pretty odd.
>
>> What can I do with regard to confirming whether this issue is still 
>> outstanding and/or whether we are affected by it?
> It's in 0.8 and will not be fixed. My unscientific approach was to
> repair a single CF at a time, hoping that the differences would be
> smaller and less data would be streamed.
> Minor compaction should help squish things down. If you want to get
> more aggressive, reduce the min compaction threshold and trigger a
> minor compaction with nodetool flush.
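
For reference, the per-CF forms of those commands look roughly like this
(the exact argument order of setcompactionthreshold differs between
versions; 2 and 32 here are a reduced min threshold and the default max):

    nodetool -h dc1host1 repair ks1 cf1
    nodetool -h dc1host1 setcompactionthreshold ks1 cf1 2 32
    nodetool -h dc1host1 flush ks1 cf1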
>
>> Version of failure detection?  I've not seen anything on this so I 
>> suspect this is the default.
> I was asking so I could see if there were any fixes in Gossip or the
> FailureDetector that you were missing. Check the CHANGES.txt file.
>
> Hope that helps.

