hadoop-hdfs-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4817) make HDFS advisory caching configurable on a per-file basis
Date Thu, 23 May 2013 23:47:21 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665847#comment-13665847 ]

Hadoop QA commented on HDFS-4817:

{color:green}+1 overall{color}.  Here are the results of testing the latest attachment 
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 7 new or modified
test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of
javac compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version
1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number
of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in hadoop-common-project/hadoop-common
hadoop-hdfs-project/hadoop-hdfs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle.

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4432//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4432//console

This message is automatically generated.
> make HDFS advisory caching configurable on a per-file basis
> -----------------------------------------------------------
>                 Key: HDFS-4817
>                 URL: https://issues.apache.org/jira/browse/HDFS-4817
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 3.0.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>         Attachments: HDFS-4817.001.patch, HDFS-4817.002.patch, HDFS-4817.004.patch
> HADOOP-7753 and related JIRAs introduced some performance optimizations for the DataNode.
One of them was readahead.  When readahead is enabled, the DataNode starts reading the next
bytes it thinks it will need in the block file before the client requests them.  This helps
hide the latency of rotational media and send larger reads down to the device.  Another optimization
was "drop-behind."  Using this optimization, we could remove file data from the Linux page cache
once it was no longer needed.
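
To make the two optimizations above concrete, here is a minimal sketch of a sequential reader that issues both kinds of hint. It assumes the hints are expressed through posix_fadvise (POSIX_FADV_WILLNEED for readahead, POSIX_FADV_DONTNEED for drop-behind). The function name, chunk size, and readahead window are illustrative choices, not taken from the DataNode code.

```python
import os


def read_with_cache_hints(path, chunk_size=64 * 1024,
                          readahead=4 * 1024 * 1024, drop_behind=True):
    """Read a file sequentially while hinting the kernel about page-cache use.

    Hypothetical helper for illustration; returns the number of bytes read.
    """
    # posix_fadvise is not exposed on every platform, so the hints are optional.
    has_fadvise = hasattr(os, "posix_fadvise")
    offset = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            if has_fadvise and readahead > 0:
                # Readahead: ask the kernel to start loading the upcoming range
                # before we actually request it.
                os.posix_fadvise(fd, offset, readahead, os.POSIX_FADV_WILLNEED)
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            if has_fadvise and drop_behind:
                # Drop-behind: the bytes just consumed are not expected to be
                # read again, so their cached pages may be released.
                os.posix_fadvise(fd, offset, len(chunk), os.POSIX_FADV_DONTNEED)
            offset += len(chunk)
    finally:
        os.close(fd)
    return offset
```

Both calls are advisory: the kernel is free to ignore them, which is why the reader behaves identically (apart from cache effects) when the hints are disabled.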
> Using {{dfs.datanode.drop.cache.behind.writes}} and {{dfs.datanode.drop.cache.behind.reads}}
can improve performance substantially on many MapReduce jobs.  In our internal benchmarks,
we have seen speedups of 40% on certain workloads.  The reason is that if we know the block
data will not be read again any time soon, keeping it out of memory allows more memory to
be used by the other processes on the system.  See HADOOP-7714 for more benchmarks.
> We would like to turn on these configurations on a per-file or per-client basis, rather
than on the DataNode as a whole.  This will allow more users to actually make use of them.
It would also be good to add unit tests for the drop-cache code path, to ensure that it is
functioning as we expect.
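
One way to picture the per-file behavior requested above is a settings object in which each field may be left unset, with unset fields falling back to the DataNode-wide default. The sketch below is only an illustration of that fallback rule: the `CachingStrategy` class, its field names, and the `resolve` function are hypothetical and are not taken from the attached patches.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CachingStrategy:
    """Hypothetical per-request caching hints; None means "no preference"."""
    drop_behind: Optional[bool] = None
    readahead_bytes: Optional[int] = None


def resolve(requested: CachingStrategy,
            datanode_default: CachingStrategy) -> CachingStrategy:
    """Prefer the client's per-file setting; otherwise use the DataNode default."""
    def pick(client_value, default_value):
        return default_value if client_value is None else client_value

    return CachingStrategy(
        drop_behind=pick(requested.drop_behind, datanode_default.drop_behind),
        readahead_bytes=pick(requested.readahead_bytes,
                             datanode_default.readahead_bytes),
    )


# DataNode-wide defaults, standing in for values such as
# dfs.datanode.drop.cache.behind.reads (the numbers here are made up).
defaults = CachingStrategy(drop_behind=False, readahead_bytes=4 * 1024 * 1024)

# A client that scans a large file once asks for drop-behind on that stream only.
scan = resolve(CachingStrategy(drop_behind=True), defaults)
```

Using None for "no preference" keeps an explicit False distinguishable from an unset value, so a client can also switch drop-behind off for one file on a DataNode where it is on by default.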

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
