accumulo-notifications mailing list archives

From "Ben Popp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-884) Take advantage of short circuit read for local files
Date Thu, 11 Jul 2013 15:25:48 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13705911#comment-13705911 ]

Ben Popp commented on ACCUMULO-884:
-----------------------------------

bq. Interesting that you saw such a speedup. How influenced do you think your benchmark is
by the YCSB workload itself?

Our test was specifically designed to do uniformly random reads across a dataset far too large
to fit in memory.  It also ran on solid-state drives, so we would expect the relative overhead
of pulling data through HDFS (as opposed to short-circuiting) to be larger than in a
spinning-disk scenario.

bq. Out of curiosity, in your read-only test, did you warm the Accumulo or OS caches before
the test (or conversely, ensure they were cold)?

I didn't do anything special to warm or empty the OS or Accumulo caches.  Anecdotally, there was
a pretty obvious warmup curve in read throughput over the first few minutes of the test whenever
Accumulo was not already warm with this table.
                
> Take advantage of short circuit read for local files
> ----------------------------------------------------
>
>                 Key: ACCUMULO-884
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-884
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: docs
>            Reporter: Billie Rinaldi
>            Assignee: Keith Turner
>
> This is a new feature in Hadoop 1.0.x and some versions of 0.22 and 0.23.  It allows
> a client to read directly from disk instead of through a DataNode when the data is stored
> locally.  Enabling it involves setting two configuration parameters, the first in hdfs-site.xml
> and the second in accumulo-site.xml.  We should make sure this works with Accumulo and recommend
> it in the documentation.
> - dfs.block.local-path-access.user is the key in the DataNode configuration that specifies the
> user allowed to do short circuit reads.
> - dfs.client.read.shortcircuit is the key that enables short circuit reads in the client-side
> configuration.
> See HDFS-2246 and http://hbase.apache.org/book/perf.hdfs.configs.html for more information.
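
For reference, the two settings named in the issue description might look roughly like the
following. This is only a sketch: the user name "accumulo" is an assumption (use whichever
user actually runs the tablet servers), and the exact keys and file placement should be
checked against the Hadoop version in use.

```xml
<!-- hdfs-site.xml (DataNode side) -->
<property>
  <name>dfs.block.local-path-access.user</name>
  <!-- assumption: the tablet servers run as a user named "accumulo" -->
  <value>accumulo</value>
</property>

<!-- accumulo-site.xml (client side, per the issue description) -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
```

Each property element goes inside the existing top-level configuration element of its file,
and the affected services would presumably need a restart to pick up the change.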

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
