accumulo-notifications mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-884) Take advantage of short circuit read for local files
Date Thu, 10 Oct 2013 00:17:42 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13791030#comment-13791030 ]

Todd Lipcon commented on ACCUMULO-884:
--------------------------------------

As of HDFS-347 (in branch-2 for a while now), we do use Unix domain sockets to pass an open
file descriptor from the DN to the client. It does rely on JNI, as you suspected. The client
also caches the open file handles, to avoid repeated round trips to the DN asking for fds.
I don't know whether this works on default SELinux setups, but it doesn't require any special
configs on typical el6, etc.

The files on the DN are owned by the DN only and have the typical permissions (data dirs and
blocks readable only by the hdfs user).

Performance-wise, there is a huge difference: 2x or more when the data is in block cache.
See HDFS-2246 (the original non-secure implementation that you are remembering above) and
HDFS-347 for some benchmarks.
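
For readers who want to try the HDFS-347 mechanism described above, a minimal sketch of the
relevant hdfs-site.xml settings follows. The property names dfs.client.read.shortcircuit and
dfs.domain.socket.path come from the Hadoop short-circuit read documentation, not from this
message, and the socket path shown is only an example value; adjust it for your own layout.

```xml
<!-- hdfs-site.xml, on both the DataNode and the client (sketch, not a tested config) -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <!-- Unix domain socket used to pass file descriptors from the DN to the client.
       The path below is an assumed example value. -->
  <name>dfs.domain.socket.path</name>
  <value>/var/lib/hadoop-hdfs/dn_socket</value>
</property>
```

The native Hadoop library has to be loadable on both sides for this to take effect, since
the descriptor passing relies on JNI as noted above.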

> Take advantage of short circuit read for local files
> ----------------------------------------------------
>
>                 Key: ACCUMULO-884
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-884
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: docs
>            Reporter: Billie Rinaldi
>            Assignee: Keith Turner
>
> This is a new feature in hadoop 1.0.x and some versions of 0.22 and 0.23.  It allows
> a client to read directly from disk instead of through a DataNode when the data is stored
> locally.  Enabling it involves setting two configuration parameters, the first in hdfs-site.xml
> and the second in accumulo-site.xml.  We should make sure this works with Accumulo and recommend
> it in the documentation.
> - dfs.block.local-path-access.user is the key in datanode configuration to specify the
>   user allowed to do short circuit read.
> - dfs.client.read.shortcircuit is the key to enable short circuit read at the client
>   side configuration.
> See HDFS-2246 and http://hbase.apache.org/book/perf.hdfs.configs.html for more information.
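
As a rough illustration of the two settings named in the description, the fragment below
shows how they might look in the two files. The user name "accumulo" is an assumption made
for this sketch (it should be whichever user runs the tablet servers); the property names
are the ones quoted in the issue.

```xml
<!-- hdfs-site.xml on the DataNode: user allowed to do short circuit reads.
     "accumulo" is an assumed user name for illustration. -->
<property>
  <name>dfs.block.local-path-access.user</name>
  <value>accumulo</value>
</property>

<!-- accumulo-site.xml on the client side: enable short circuit read -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
```

This is the older HDFS-2246 style of configuration, which gives the named user direct read
access to the block files on local disk.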



--
This message was sent by Atlassian JIRA
(v6.1#6144)
