Date: Thu, 21 Aug 2014 22:00:12 +0000 (UTC)
From: "Colin Patrick McCabe (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-6581) Write to single replica in memory

[ https://issues.apache.org/jira/browse/HDFS-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106025#comment-14106025 ]

Colin Patrick McCabe commented on HDFS-6581:
--------------------------------------------

Looks good overall. It's good to see progress on this. Some comments about the design doc:

* Why not use ramfs instead of tmpfs? ramfs can't swap.
** The problem with using tmpfs is that the system could move the data to swap at any time. In addition to performance problems, this could cause correctness problems later when we read back the data from swap (i.e. from the hard disk).
Since we don't want to verify checksums here, we should use a storage method that we know never touches the disk. Tachyon uses ramfs instead of tmpfs for this reason.
* An LRU replacement policy isn't a good choice. It's very easy for a batch job to kick out everything in memory before it can ever be used again (thrashing). An LFU (least frequently used) policy would be much better. We'd have to keep usage statistics to implement this, but that doesn't seem too bad.
* How is the maximum tmpfs/ramfs size per datanode configured? I think we should use the existing {{dfs.datanode.max.locked.memory}} property to configure this, for consistency. System administrators should not need to configure separate pools of memory for HDFS-4949 and this feature. It should be one memory size.
** I also think that cache directives from HDFS-4949 should take precedence over this opportunistic write caching. If we need to evict some HDFS-5851 cache items to finish our HDFS-4949 caching, we should do that.
* Related to that, we might want to rename {{dfs.datanode.max.locked.memory}} to {{dfs.data.node.max.cache.memory}} or something.
* You can effectively revoke access to a block file stored in ramfs or tmpfs by truncating that file to 0 bytes. The client can hang on to the file descriptor, but this doesn't keep any data bytes in memory. So we can move things out of the cache even if the clients are unresponsive. Also see HDFS-6750 and HDFS-6036 for examples of how we can ask the clients to stop using a short-circuit replica before tearing it down.
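
To make the LRU-versus-LFU point concrete, here is a minimal sketch of frequency-based eviction. It is not taken from the design doc: the class {{LfuReplicaTracker}} and all of its methods are hypothetical names, block sizes are arbitrary units, and a real datanode would presumably need thread safety and some aging of old counts. The idea is only that a block read several times survives a one-pass scan that would have flushed it under LRU.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: per-block read counts drive eviction (LFU). */
class LfuReplicaTracker {
    private final long capacity;
    private long used = 0;
    private final Map<String, Long> sizes = new HashMap<>();
    // Insertion-ordered so that ties are broken in favor of evicting the oldest block.
    private final Map<String, Long> reads = new LinkedHashMap<>();

    LfuReplicaTracker(long capacity) {
        this.capacity = capacity;
    }

    /** Adds a block, evicting least-frequently-read blocks until it fits. */
    boolean add(String blockId, long size) {
        if (size > capacity) {
            return false; // can never fit, so do not evict anything for it
        }
        if (sizes.containsKey(blockId)) {
            return true; // already tracked
        }
        while (used + size > capacity) {
            evictLeastFrequentlyRead();
        }
        sizes.put(blockId, size);
        reads.put(blockId, 0L);
        used += size;
        return true;
    }

    /** The usage statistic: one counter increment per read. */
    void recordRead(String blockId) {
        reads.computeIfPresent(blockId, (id, n) -> n + 1);
    }

    boolean contains(String blockId) {
        return sizes.containsKey(blockId);
    }

    private void evictLeastFrequentlyRead() {
        String victim = null;
        long fewest = Long.MAX_VALUE;
        for (Map.Entry<String, Long> e : reads.entrySet()) {
            if (e.getValue() < fewest) {
                fewest = e.getValue();
                victim = e.getKey();
            }
        }
        used -= sizes.remove(victim);
        reads.remove(victim);
    }

    public static void main(String[] args) {
        LfuReplicaTracker tracker = new LfuReplicaTracker(3);
        tracker.add("hot", 1);
        tracker.recordRead("hot");
        tracker.recordRead("hot");
        // A batch job touches ten blocks once each.
        for (int i = 0; i < 10; i++) {
            tracker.add("scan-" + i, 1);
            tracker.recordRead("scan-" + i);
        }
        System.out.println("hot still cached: " + tracker.contains("hot"));
    }
}
```

In this sketch the scan blocks only ever displace each other, because each has a lower read count than the block that was read twice. The linear search for a victim is for brevity only.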
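
The truncation point can be checked with a small experiment. The sketch below is an illustration under stated assumptions, not datanode code: {{TruncateRevokeDemo}} is a hypothetical name, an ordinary temp file stands in for a replica on ramfs or tmpfs, and two channels in one process stand in for the client and the datanode. It does not cover clients that have memory-mapped the file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch: truncating a file empties it for readers that already hold it open. */
class TruncateRevokeDemo {
    /** Returns {size seen by client before, size seen by client after, read result after}. */
    static long[] run() {
        try {
            // Stand-in for a block file; a real replica would live on ramfs/tmpfs.
            Path block = Files.createTempFile("blk_", ".data");
            try {
                Files.write(block, new byte[4096]);
                try (FileChannel client = FileChannel.open(block, StandardOpenOption.READ);
                     FileChannel datanode = FileChannel.open(block, StandardOpenOption.WRITE)) {
                    long sizeBefore = client.size();
                    // The "datanode" revokes the data while the client keeps its descriptor.
                    datanode.truncate(0);
                    long sizeAfter = client.size();
                    // A positional read at or past end-of-file returns -1.
                    long readAfter = client.read(ByteBuffer.allocate(4096), 0);
                    return new long[] {sizeBefore, sizeAfter, readAfter};
                }
            } finally {
                Files.deleteIfExists(block);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] r = run();
        System.out.println("size before=" + r[0] + ", size after=" + r[1] + ", read after=" + r[2]);
    }
}
```

The client's descriptor stays valid throughout, but after the truncate it refers to an empty file, so holding it open pins no data bytes.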

> Write to single replica in memory
> ---------------------------------
>
> Key: HDFS-6581
> URL: https://issues.apache.org/jira/browse/HDFS-6581
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Reporter: Arpit Agarwal
> Assignee: Arpit Agarwal
> Attachments: HDFSWriteableReplicasInMemory.pdf
>
>
> Per discussion with the community on HDFS-5851, we will implement writing to a single replica in DN memory via DataTransferProtocol.
> This avoids some of the issues with short-circuit writes, which we can revisit at a later time.

--
This message was sent by Atlassian JIRA
(v6.2#6252)