Date: Sat, 17 May 2014 09:28:17 +0000 (UTC)
From: "Benedict (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput

[ https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14000724#comment-14000724 ]

Benedict commented on CASSANDRA-4718:
-------------------------------------

[~xedin] why are you only counting the primary replica data? Requests will hit both replicas by default?

If you look at the results, there is a reasonable amount of variability in both runs, so it's not clear that either one is slower or faster: there are a number of points where 4718-sep is faster than 2.1, and vice versa. Given that the workload is disk bound, I am inclined to suggest that the patch is not what is making it perform worse. In fact, a majority of data points show higher throughput for 4718-sep, not for 2.1.

In your first test, every thread count below 271 is faster. 271 seems to be a blip due to a small number of very slow reads affecting the very last measurement (there's a "race" in stress' auto mode where some measurements are still accepted after it has decided enough have been taken, as can be seen by the final stderr being above the acceptability point). 2.1 showed a similar but smaller effect at this thread count, so this seems likely to be random chance. In the last test it is faster for all thread counts, despite some weird max latencies. It's only in the middle test that it appears to be marginally slower, and given that this test performs effectively the same amount of work as the first test, I'm not sure this demonstrates a great deal other than the variability.

It's also worth asking what your max read concurrency is. I'm surprised to see thread counts > 180 causing dramatic spikes in latency (on both branches) when I'd expect them to be saturating the read stage well before then.

> More-efficient ExecutorService for improved throughput
> ------------------------------------------------------
>
>                 Key: CASSANDRA-4718
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jonathan Ellis
>            Assignee: Benedict
>            Priority: Minor
>              Labels: performance
>             Fix For: 2.1.0
>
>         Attachments: 4718-v1.patch, PerThreadQueue.java, austin_diskbound_read.svg, aws.svg, aws_read.svg, backpressure-stress.out.txt, baq vs trunk.png, belliotsmith_branches-stress.out.txt, jason_read.svg, jason_read_latency.svg, jason_write.svg, op costs of various queues.ods, stress op rate with various queues.ods, stress_2014May15.txt, stress_2014May16.txt, v1-stress.out
>
>
> Currently all our execution stages dequeue tasks one at a time. This can result in contention between producers and consumers (although we do our best to minimize this by using LinkedBlockingQueue).
> One approach to mitigating this would be to make consumer threads do more work in "bulk" instead of just one task per dequeue. (Producer threads tend to be single-task oriented by nature, so I don't see an equivalent opportunity there.)
> BlockingQueue has a drainTo(collection, int) method that would be perfect for this. However, no ExecutorService in the jdk supports using drainTo, nor could I google one.
> What I would like to do here is create just such a beast and wire it into (at least) the write and read stages. (Other possible candidates for such an optimization, such as the CommitLog and OutboundTCPConnection, are not ExecutorService-based and will need to be one-offs.)
> AbstractExecutorService may be useful. The implementations of ICommitLogExecutorService may also be useful. (Despite the name these are not actual ExecutorServices, although they share the most important properties of one.)
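
For readers unfamiliar with the bulk-dequeue idea in the issue description, here is a minimal sketch of a consumer that blocks for one task and then uses BlockingQueue.drainTo(collection, int) to pick up more without blocking. It is only an illustration of the technique, not the attached patch: the class name DrainToSketch, the method runOneBatch, and the MAX_BATCH value of 32 are all hypothetical, and a real stage would also need shutdown and error handling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class DrainToSketch {
    // Hypothetical cap on tasks taken per dequeue; not a value from the ticket.
    static final int MAX_BATCH = 32;

    /**
     * Blocks until one task is available, then drains up to MAX_BATCH - 1
     * further tasks without blocking and runs the whole batch in order.
     * Returns the number of tasks executed.
     */
    static int runOneBatch(BlockingQueue<Runnable> queue) throws InterruptedException {
        List<Runnable> batch = new ArrayList<>(MAX_BATCH);
        batch.add(queue.take());              // wait for at least one task
        queue.drainTo(batch, MAX_BATCH - 1);  // take whatever else is queued, up to the cap
        for (Runnable task : batch) {
            task.run();
        }
        return batch.size();
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        AtomicInteger executed = new AtomicInteger();
        for (int i = 0; i < 50; i++) {
            queue.add(executed::incrementAndGet);
        }
        int first = runOneBatch(queue);
        int second = runOneBatch(queue);
        System.out.println(first + " + " + second + " = " + executed.get());
    }
}
```

With 50 tasks queued up front, the first call runs 32 tasks and the second runs the remaining 18, so the consumer touches the queue twice instead of 50 times. Whether that reduces contention in practice depends on the queue implementation and workload, which is what the stress runs discussed above are trying to measure.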