Return-Path: X-Original-To: apmail-hadoop-hdfs-issues-archive@minotaur.apache.org Delivered-To: apmail-hadoop-hdfs-issues-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 6F37810976 for ; Fri, 12 Jul 2013 01:18:47 +0000 (UTC) Received: (qmail 29226 invoked by uid 500); 12 Jul 2013 01:18:47 -0000 Delivered-To: apmail-hadoop-hdfs-issues-archive@hadoop.apache.org Received: (qmail 29195 invoked by uid 500); 12 Jul 2013 01:18:47 -0000 Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: hdfs-issues@hadoop.apache.org Delivered-To: mailing list hdfs-issues@hadoop.apache.org Received: (qmail 29186 invoked by uid 99); 12 Jul 2013 01:18:47 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 12 Jul 2013 01:18:47 +0000 Date: Fri, 12 Jul 2013 01:18:47 +0000 (UTC) From: "Hadoop QA (JIRA)" To: hdfs-issues@hadoop.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (HDFS-4817) make HDFS advisory caching configurable on a per-file basis MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/HDFS-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706545#comment-13706545 ] Hadoop QA commented on HDFS-4817: --------------------------------- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12591944/HDFS-4817.008.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4638//console This message is automatically generated. > make HDFS advisory caching configurable on a per-file basis > ----------------------------------------------------------- > > Key: HDFS-4817 > URL: https://issues.apache.org/jira/browse/HDFS-4817 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client > Affects Versions: 3.0.0 > Reporter: Colin Patrick McCabe > Assignee: Colin Patrick McCabe > Priority: Minor > Attachments: HDFS-4817.001.patch, HDFS-4817.002.patch, HDFS-4817.004.patch, HDFS-4817.006.patch, HDFS-4817.007.patch, HDFS-4817.008.patch > > > HADOOP-7753 and related JIRAs introduced some performance optimizations for the DataNode. One of them was readahead. When readahead is enabled, the DataNode starts reading the next bytes it thinks it will need in the block file, before the client requests them. This helps hide the latency of rotational media and send larger reads down to the device. Another optimization was "drop-behind." Using this optimization, we could remove files from the Linux page cache after they were no longer needed. > Using {{dfs.datanode.drop.cache.behind.writes}} and {{dfs.datanode.drop.cache.behind.reads}} can improve performance substantially on many MapReduce jobs. In our internal benchmarks, we have seen speedups of 40% on certain workloads. The reason is because if we know the block data will not be read again any time soon, keeping it out of memory allows more memory to be used by the other processes on the system. See HADOOP-7714 for more benchmarks. > We would like to turn on these configurations on a per-file or per-client basis, rather than on the DataNode as a whole. This will allow more users to actually make use of them. It would also be good to add unit tests for the drop-cache code path, to ensure that it is functioning as we expect. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira