Date: Thu, 30 Jul 2015 13:18:05 +0000 (UTC)
From: "T Jake Luciani (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Updated] (CASSANDRA-8894) Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size

     [ https://issues.apache.org/jira/browse/CASSANDRA-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

T Jake Luciani updated CASSANDRA-8894:
--------------------------------------
    Fix Version/s:     (was: 3.0 alpha 1)
                   3.0 beta 1

> Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8894
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8894
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Stefania
>              Labels: benedict-to-commit
>             Fix For: 3.0 beta 1
>
>         Attachments: 8894_25pct.yaml, 8894_5pct.yaml, 8894_tiny.yaml
>
>
> A large contributor to buffered reads being slower than mmapped reads is likely that we read a full 64KB at once, when average record sizes may be as low as 140 bytes in our stress tests. The TLB has only 128 entries on a modern core, and each read will touch 32 of these, meaning we will almost never hit in the TLB and will incur at least 30 unnecessary misses each time (as well as the other costs of larger-than-necessary accesses). When working with an SSD there is little to no benefit in reading more than 4KB at once, and in either case reading more data than we need is wasteful. So I propose selecting a buffer size that is the next power of 2 larger than our average record size (with a minimum of 4KB), so that we expect to read a record in one operation. I also propose that we create a pool of these buffers up front, and that we ensure they are all exactly aligned to a virtual page, so that the source and target operations each touch exactly one virtual page per 4KB of expected record size.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
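
The sizing rule proposed in the description can be sketched roughly as follows. This is only an illustration, not the patch attached to the ticket: the class name `BufferSizing`, both method names, and the 4096-byte page size are assumptions made for the example. It also reads "next larger power of 2" as "smallest power of 2 that is at least the average record size", and it assumes the figure of 32 TLB entries comes from 16 source pages plus 16 target pages for a 64KB read.

```java
// Hypothetical helper; names and constants are assumptions, not Cassandra APIs.
final class BufferSizing {
    // Assumed virtual page size, which is also the proposed minimum buffer size.
    static final int PAGE_SIZE = 4096;

    // Smallest power of two >= averageRecordSize, never below PAGE_SIZE.
    static int bufferSizeFor(int averageRecordSize) {
        if (averageRecordSize < 0 || averageRecordSize > (1 << 30)) {
            throw new IllegalArgumentException("unsupported record size: " + averageRecordSize);
        }
        if (averageRecordSize <= PAGE_SIZE) {
            return PAGE_SIZE;
        }
        int highest = Integer.highestOneBit(averageRecordSize);
        // Already a power of two: use it as is; otherwise round up to the next one.
        return highest == averageRecordSize ? highest : highest << 1;
    }

    // Pages touched by one read of a page-aligned buffer, counting both the
    // source pages and the target pages (assumed reading of the description).
    static int pagesTouched(int bufferSize) {
        int pagesPerSide = (bufferSize + PAGE_SIZE - 1) / PAGE_SIZE;
        return 2 * pagesPerSide;
    }

    public static void main(String[] args) {
        int[] recordSizes = {140, 4096, 5000, 65536};
        for (int size : recordSizes) {
            int buffer = bufferSizeFor(size);
            System.out.println("record=" + size + " buffer=" + buffer
                    + " pagesTouched=" + pagesTouched(buffer));
        }
    }
}
```

Under these assumptions a 140-byte average record maps to a 4096-byte buffer that touches 2 pages per read, whereas the current 64KB buffer touches 32, which matches the counts given in the description.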