Date: Wed, 10 Jul 2013 16:33:49 +0000 (UTC)
From: "Josh Elser (JIRA)"
To: notifications@accumulo.apache.org
Reply-To: jira@apache.org
Subject: [jira] [Commented] (ACCUMULO-884) Take advantage of short circuit read for local files

    [ https://issues.apache.org/jira/browse/ACCUMULO-884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13704723#comment-13704723 ]

Josh Elser commented on ACCUMULO-884:
-------------------------------------

bq. In my experiments on these solid state drives, enabling short-circuit reads more than doubled my read throughput! (TP measured in ops/s in a YCSB-derived read-only workload test.)

If I remember correctly, I read somewhere that you won't see any benefit from shortcircuit reads until you actually get to hadoop-2 (maybe 0.23 too?). I'll see if I can find that information again...

Interesting that you saw such a speedup. How much do you think your benchmark is influenced by the YCSB workload itself? Early this year I did a bunch of testing on spinning disks with these parameters on a 0.20 baseline and saw no performance gain from short-circuit reads. I think I was also experimenting with disabling checksums on local datanode reads, and I think the workload was an even mix of reads and writes. JD had some info on HDFS-2246, but I'll leave it up to you to come to your own conclusions.

Out of curiosity, in your read-only test, did you warm the Accumulo or OS caches before the test (or conversely, ensure they were cold)?

> Take advantage of short circuit read for local files
> ----------------------------------------------------
>
>                 Key: ACCUMULO-884
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-884
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: docs
>            Reporter: Billie Rinaldi
>            Assignee: Keith Turner
>
> This is a new feature in hadoop 1.0.x and some versions of 0.22 and 0.23. It allows a client to read directly from disk instead of through a DataNode when the data is stored locally. Enabling it involves setting two configuration parameters, the first in hdfs-site.xml and the second in accumulo-site.xml. We should make sure this works with Accumulo and recommend it in the documentation.
> - dfs.block.local-path-access.user is the key in datanode configuration to specify the user allowed to do short circuit read.
> - dfs.client.read.shortcircuit is the key to enable short circuit read at the client side configuration.
> See HDFS-2246 and http://hbase.apache.org/book/perf.hdfs.configs.html for more information.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira