hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6007) Update documentation about short-circuit local reads
Date Wed, 12 Mar 2014 18:52:44 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932178#comment-13932178 ]

Colin Patrick McCabe commented on HDFS-6007:

Thanks for looking at this.  I think we should limit the scope here to just adding a sentence
about shared-memory segments, and adding some documentation about the legacy short-circuit
implementation.

I think the zero-copy API should get its own document.  Putting it in here just seems like
information overload.

+  Client and DataNode uses shared memory segments
+  to communicate short-circuit read.

How about "The client and the DataNode exchange information via a shared memory segment."

+  if /dev/shm is not world writable or does not exist in your environment,
+  You can change the paths on which shared memory segments are created by
+  setting the value of <<<dfs.datanode.shared.file.descriptor.paths>>>
+  to comma separated paths like <<</dev/shm,/tmp>>>.
+  It tries paths in order until creation of shared memory segment succeeds.

Can we skip this section?  99.999% of users will never need to change that config value, and
there's documentation in hdfs-default.xml for those who do.  The number of UNIX systems without
/tmp must be pretty small indeed.
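
For readers unfamiliar with the behaviour the quoted patch text describes ("tries paths in
order until creation ... succeeds"), here is a minimal sketch of that fallback pattern in
plain Java. It is not the HDFS implementation: the class name SharedPathFallback, the helper
firstUsable, and the probe-file approach are all assumptions made for illustration. Only the
comma-separated list format and the example value /dev/shm,/tmp come from the patch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Illustrative only: a hypothetical helper, not code from Hadoop.
final class SharedPathFallback {

    // Returns the first directory in a comma-separated list in which a
    // temporary file can be created, or Optional.empty() if none qualifies.
    static Optional<Path> firstUsable(String commaSeparatedPaths) {
        for (String raw : commaSeparatedPaths.split(",")) {
            String trimmed = raw.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            Path dir = Paths.get(trimmed);
            if (!Files.isDirectory(dir)) {
                continue; // missing path: move on to the next candidate
            }
            try {
                // Probe writability by creating and removing a temp file.
                Path probe = Files.createTempFile(dir, "shm-probe-", ".tmp");
                Files.delete(probe);
                return Optional.of(dir);
            } catch (IOException | SecurityException e) {
                // Not writable: fall through to the next candidate.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Example value taken from the quoted patch text.
        System.out.println(firstUsable("/dev/shm,/tmp"));
    }
}
```

With the default-style value, the second entry is only consulted when the first is missing
or not writable, which is why most users never need to touch the setting.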

+  Legacy short-circuit local reads implementation
+  on which clients directly open HDFS block files is still available
+  for platforms other than Linux.

Missing 'the'

I think we need a sentence or two explaining that the old short-circuit implementation is
insecure, because it allows users to directly access the blocks.  We also need some explanation
about how you have to chmod the blocks into the correct UNIX group so that they are accessible.
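
To make the security concern concrete, here is a small sketch of the kind of check an
administrator might run on a block file: readable by its UNIX group, but not by everyone.
This is an illustration under assumptions, not part of HDFS; the class BlockFileAccessCheck
and the method isGroupOnlyReadable are hypothetical names, and the exact permissions a
deployment needs are not stated in this thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.PosixFilePermission;
import java.util.Set;

// Illustrative only: a hypothetical helper, not code from Hadoop.
final class BlockFileAccessCheck {

    // True when the file is readable by its group but not by "others".
    // Requires a POSIX file system; throws UnsupportedOperationException elsewhere.
    static boolean isGroupOnlyReadable(Path file) throws IOException {
        Set<PosixFilePermission> perms = Files.getPosixFilePermissions(file);
        return perms.contains(PosixFilePermission.GROUP_READ)
            && !perms.contains(PosixFilePermission.OTHERS_READ);
    }

    public static void main(String[] args) throws IOException {
        // Pass one or more file paths on the command line.
        for (String arg : args) {
            Path p = Paths.get(arg);
            System.out.println(p + " group-only readable: " + isGroupOnlyReadable(p));
        }
    }
}
```

Any user placed in that group can read the raw block files directly, bypassing HDFS
permission checks, which is the insecurity the documentation should call out.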

Please skip the configuration tables.  They just duplicate hdfs-default.xml.

> Update documentation about short-circuit local reads
> ----------------------------------------------------
>                 Key: HDFS-6007
>                 URL: https://issues.apache.org/jira/browse/HDFS-6007
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: documentation
>            Reporter: Masatake Iwasaki
>            Priority: Minor
>         Attachments: HDFS-6007-0.patch, HDFS-6007-1.patch, HDFS-6007-2.patch, HDFS-6007-3.patch
> updating the contents of "HDFS Short-Circuit Local Reads" based on the changes in HDFS-4538
> and HDFS-4953.

This message was sent by Atlassian JIRA
