Date: Fri, 14 Mar 2014 23:02:48 +0000 (UTC)
From: "Benedict (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-6746) Reads have a slow ramp up in speed

    [ https://issues.apache.org/jira/browse/CASSANDRA-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935792#comment-13935792 ]

Benedict commented on CASSANDRA-6746:
-------------------------------------

I will do some empirical testing so we have some data to work with. It seems to me that "trickle" flushing would still be better than this, although we could still DONTNEED after trickle sync for compaction.
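For illustration only (this is not Cassandra's code): a minimal Python sketch of what "trickle" flushing followed by DONTNEED could look like, assuming a POSIX system where os.posix_fadvise is available. The names trickle_write, _drop_from_cache and sync_every are hypothetical, and the 4 MiB default is an arbitrary placeholder rather than a measured value.

```python
import os
import tempfile


def _drop_from_cache(fd, offset, length):
    """Hint that the given byte range is no longer needed in the page cache."""
    if length <= 0 or not hasattr(os, "posix_fadvise"):
        return  # nothing to drop, or platform without posix_fadvise
    try:
        os.posix_fadvise(fd, offset, length, os.POSIX_FADV_DONTNEED)
    except OSError:
        pass  # the advice is only a hint; ignore filesystems that reject it


def trickle_write(path, chunks, sync_every=4 * 1024 * 1024, drop_cache=True):
    """Write chunks to path, fsync-ing roughly every sync_every bytes.

    Syncing in small steps avoids one large burst of dirty pages at the
    end of the write. After each sync the freshly written range can be
    dropped from the page cache with DONTNEED (e.g. for compaction output).
    Returns the total number of bytes written.
    """
    written = 0  # bytes handed to the kernel so far
    synced = 0   # bytes known to be flushed to disk
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for chunk in chunks:
            view = memoryview(chunk)
            while view:  # os.write may write fewer bytes than requested
                n = os.write(fd, view)
                view = view[n:]
                written += n
            if written - synced >= sync_every:
                os.fsync(fd)
                if drop_cache:
                    _drop_from_cache(fd, synced, written - synced)
                synced = written
        # Final sync for whatever is left after the last trickle step.
        os.fsync(fd)
        if drop_cache:
            _drop_from_cache(fd, synced, written - synced)
    finally:
        os.close(fd)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "example.db")
        total = trickle_write(target, (b"x" * 65536 for _ in range(256)))
        print(total, os.path.getsize(target))
```

DONTNEED is issued only after fsync here because dirty pages generally cannot be dropped before they have been written back.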
WILLNEEDing a large file _after flush_ is potentially even worse behaviour, though. If the DONTNEED has been obeyed (or the pages have fallen out of cache because they were not read during the flush, which is likely during a large flush), we are just proactively inducing a period of high-intensity random seeks for data that would be read in naturally anyway if it is needed, and otherwise would not be read at all.

That said, it might be easier to just pick an approach (the one you suggest is certainly better than what we currently do), and then deliver iterative replacement, as it solves all of the above problems.

> Reads have a slow ramp up in speed
> ----------------------------------
>
>                 Key: CASSANDRA-6746
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6746
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Ryan McGuire
>            Assignee: Benedict
>              Labels: performance
>             Fix For: 2.1 beta2
>
>         Attachments: 2.1_vs_2.0_read.png, 6746-patched.png, 6746.txt, cassandra-2.0-bdplab-trial-fincore.tar.bz2, cassandra-2.1-bdplab-trial-fincore.tar.bz2
>
>
> On a physical four-node cluster I am doing a big write and then a big read. The read takes a long time to ramp up to respectable speeds.
> !2.1_vs_2.0_read.png!
> [See data here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.2.1_vs_2.0_vs_1.2.retry1.json&metric=interval_op_rate&operation=stress-read&smoothing=1]

--
This message was sent by Atlassian JIRA
(v6.2#6252)